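Data Preprocessing


The item_price feature


Looking at the item_price feature, every price is prefixed with a $ character. To convert the column to numeric data, the $ character has to be removed first.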

chipo['item_price'].head()
0     $2.39 
1     $3.39 
2     $3.39 
3     $2.39 
4    $16.98 
Name: item_price, dtype: object
chipo['item_price']=chipo['item_price'].apply(lambda x : float(x[1:]))
chipo['item_price'].head()
0     2.39
1     3.39
2     3.39
3     2.39
4    16.98
Name: item_price, dtype: float64
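
The slice `x[1:]` assumes that the first character is always `$`. As a hedged alternative, the sketch below strips whitespace and the leading `$` with pandas string methods before converting. The `prices` Series is a small hypothetical sample, not the actual `chipo` data.

```python
import pandas as pd

# Hypothetical sample mimicking the raw strings ("$" prefix, trailing space).
prices = pd.Series(["$2.39 ", "$3.39 ", "$16.98 "], name="item_price")

# Remove surrounding whitespace, drop the leading "$", then convert to numbers.
cleaned = pd.to_numeric(prices.str.strip().str.lstrip("$"))

print(cleaned)
```

Unlike the slice, this version would leave a value without a `$` prefix intact instead of silently dropping its first digit.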

Exploratory Analysis


Printing the average amount paid per order


  1. Group the rows by order_id
  2. Apply sum() to the item_price feature
  3. Chain mean() onto the result
chipo.groupby('order_id')['item_price'].sum().mean()
18.81142857142869
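
To make the three steps easier to follow, here is a minimal sketch on a tiny hypothetical table (the `orders` frame and its values are invented for illustration), with the intermediate per-order totals kept in a separate variable.

```python
import pandas as pd

# Hypothetical data: three orders, some with more than one row.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3],
    "item_price": [2.0, 4.0, 10.0, 1.0, 1.0],
})

# Steps 1 and 2: group by order_id and sum item_price within each order.
per_order_total = orders.groupby("order_id")["item_price"].sum()

# Step 3: average those per-order totals.
average_per_order = per_order_total.mean()

print(per_order_total)
print(average_per_order)
```

The average is taken over orders, not over rows, which is why the sum comes before the mean.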

Printing the order IDs of orders that paid $10 or more


chipo.head()
  order_id quantity item_name choice_description item_price
0 1 1 Chips and Fresh Tomato Salsa NaN 2.39
1 1 1 Izze [Clementine] 3.39
2 1 1 Nantucket Nectar [Apple] 3.39
3 1 1 Chips and Tomatillo-Green Chili Salsa NaN 2.39
4 2 2 Chicken Bowl [Tomatillo-Red Chili Salsa (Hot), [Black Beans... 16.98
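# Sum only the numeric columns per order (text columns cannot be summed reliably)
chipo_order_id_group = chipo.groupby('order_id')[['quantity', 'item_price']].sum()
# Keep orders whose total is $10 or more
result = chipo_order_id_group[chipo_order_id_group['item_price'] >= 10]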
result
  quantity item_price
order_id    
1 4 11.56
2 2 16.98
3 2 12.67
4 2 21.00
5 2 13.70
... ... ...
1830 2 23.00
1831 3 12.90
1832 2 13.20
1833 2 23.50
1834 3 28.75

1834 rows × 2 columns
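
The result above is a whole DataFrame. If only the order IDs are needed, they can be read from the index of the filtered frame. This is a sketch on hypothetical data; the `orders` frame is invented for illustration.

```python
import pandas as pd

# Hypothetical data: order 1 totals 11.0, order 2 totals 10.0, order 3 totals 3.5.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "quantity": [1, 1, 2, 1],
    "item_price": [4.0, 7.0, 10.0, 3.5],
})

# Per-order totals of the numeric columns.
totals = orders.groupby("order_id")[["quantity", "item_price"]].sum()

# After groupby, order_id is the index, so the matching IDs are the filtered index.
order_ids = totals[totals["item_price"] >= 10].index.tolist()

print(order_ids)
```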

Finding the price of each item


# Select only the rows where a single unit of the item was purchased
chipo.head()
chipo_one_item = chipo[chipo['quantity']==1]

# Group by item_name and compute the lowest price of each item

price_per_item = chipo_one_item.groupby('item_name').min()

# Sort by item_price in descending order
price_per_item.sort_values(by = 'item_price',ascending=False)
  order_id quantity choice_description item_price
item_name        
Steak Salad Bowl 250 1 [Fresh Tomato Salsa, Lettuce] 9.39
Barbacoa Salad Bowl 501 1 [Fresh Tomato Salsa, Guacamole] 9.39
Carnitas Salad Bowl 468 1 [Fresh Tomato Salsa, [Rice, Black Beans, Chees... 9.39
Carnitas Soft Tacos 103 1 [Fresh Tomato Salsa (Mild), [Black Beans, Rice... 8.99
Carnitas Crispy Tacos 230 1 [Fresh Tomato Salsa, [Fajita Vegetables, Rice,... 8.99
Steak Soft Tacos 4 1 [Fresh Tomato Salsa (Mild), [Cheese, Sour Cream]] 8.99
Carnitas Salad 1500 1 [[Fresh Tomato Salsa (Mild), Roasted Chili Cor... 8.99
Carnitas Bowl 17 1 [Fresh Tomato (Mild), [Guacamole, Lettuce, Ric... 8.99
Barbacoa Soft Tacos 26 1 [Fresh Tomato Salsa, [Black Beans, Cheese, Let... 8.99
Barbacoa Crispy Tacos 75 1 [Fresh Tomato Salsa, Guacamole] 8.99
Veggie Salad Bowl 83 1 [Fresh Tomato Salsa, [Fajita Vegetables, Black... 8.75
Chicken Salad Bowl 20 1 [Fresh Tomato Salsa, Fajita Vegetables] 8.75
Steak Burrito 4 1 [Brown Rice] 8.69
Steak Crispy Tacos 40 1 [Fresh Tomato (Mild), [Lettuce, Cheese]] 8.69
Steak Salad 276 1 [Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou... 8.69
Carnitas Burrito 14 1 [Fresh Tomato (Mild), [Lettuce, Black Beans, G... 8.69
Steak Bowl 25 1 [Fresh Tomato (Mild), [Guacamole, Lettuce, Pin... 8.69
Barbacoa Burrito 11 1 [Fresh Tomato (Mild), [Black Beans, Rice, Sour... 8.69
Barbacoa Bowl 19 1 [Fresh Tomato (Mild), [Lettuce, Black Beans, R... 8.69
Chicken Soft Tacos 6 1 [Fresh Tomato Salsa (Mild), [Black Beans, Rice... 8.49
Veggie Bowl 28 1 [Fresh Tomato Salsa (Mild), [Pinto Beans, Blac... 8.49
Veggie Burrito 26 1 [Fresh Tomato Salsa (Mild), [Black Beans, Faji... 8.49
Veggie Soft Tacos 304 1 [Fresh Tomato Salsa (Mild), [Pinto Beans, Rice... 8.49
Chicken Crispy Tacos 6 1 [Fresh Tomato Salsa (Mild), Fajita Veggies] 8.49
Veggie Crispy Tacos 668 1 [Fresh Tomato Salsa (Mild), [Pinto Beans, Rice... 8.49
Veggie Salad 686 1 [Roasted Chili Corn Salsa (Medium), [Black Bea... 8.49
Chicken Salad 109 1 [Fresh Tomato Salsa (Mild), Black Beans] 8.19
Chicken Burrito 8 1 [Fresh Tomato (Mild), [Black Beans, Rice, Sour... 8.19
Chicken Bowl 3 1 [Fresh Tomato (Mild), [Guacamole, Rice]] 8.19
Crispy Tacos 217 1 [Adobo-Marinated and Grilled Chicken] 7.40
Burrito 214 1 [Adobo-Marinated and Grilled Chicken, Pinto Be... 7.40
Bowl 279 1 [Adobo-Marinated and Grilled Steak, [Sour Crea... 7.40
Salad 575 1 [Brown Rice, Adobo-Marinated and Grilled Chick... 7.40
6 Pack Soft Drink 129 1 [Coke] 6.49
Chips and Guacamole 5 1 NaN 3.89
Izze 1 1 [Blackberry] 3.39
Nantucket Nectar 1 1 [Apple] 3.39
Chips and Mild Fresh Tomato Salsa 279 1 NaN 3.00
Chips and Tomatillo Red Chili Salsa 49 1 NaN 2.95
Chips and Tomatillo Green Chili Salsa 18 1 NaN 2.95
Chips and Roasted Chili Corn Salsa 102 1 NaN 2.95
Chips and Tomatillo-Red Chili Salsa 130 1 NaN 2.39
Chips and Tomatillo-Green Chili Salsa 1 1 NaN 2.39
Chips and Roasted Chili-Corn Salsa 85 1 NaN 2.39
Chips and Fresh Tomato Salsa 1 1 NaN 2.29
Chips 19 1 NaN 1.99
Side of Chips 3 1 NaN 1.69
Canned Soft Drink 114 1 [Coke] 1.25
Canned Soda 14 1 [Coca Cola] 1.09
Bottled Water 17 1 NaN 1.09
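
Calling `min()` on the whole group takes the minimum of every column independently, so the order_id and choice_description shown in a row do not necessarily belong to the order with the lowest price. If only the price matters, one option is to select item_price before aggregating. The sketch below uses hypothetical item names and prices.

```python
import pandas as pd

# Hypothetical data: item names and prices are invented for illustration.
items = pd.DataFrame({
    "item_name": ["item_a", "item_a", "item_b", "item_c", "item_c"],
    "quantity": [1, 2, 1, 1, 1],
    "item_price": [2.0, 4.0, 3.5, 8.0, 10.5],
})

# Keep rows where exactly one unit was bought, so item_price is a unit price.
single = items[items["quantity"] == 1]

# Lowest unit price per item, sorted from most to least expensive.
price_per_item = (
    single.groupby("item_name")["item_price"].min().sort_values(ascending=False)
)

print(price_per_item)
```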

Checking how many items were sold in the most expensive order


Apply a sum to each order_id group, then sort the result by item_price with sort_values.

chipo.groupby('order_id')[['quantity', 'item_price']].sum().sort_values('item_price', ascending=False)
  quantity item_price
order_id    
926 23 205.25
1443 35 160.74
1483 14 139.00
691 11 118.25
1786 20 114.30
... ... ...
17 2 10.08
889 2 10.08
1014 2 10.08
1303 2 10.08
1602 2 10.08

1834 rows × 2 columns
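
In the sorted table the answer is the quantity in the first row. As an alternative to sorting, the top order can be looked up directly with `idxmax()`. This is a sketch on hypothetical data; the `orders` frame is invented for illustration.

```python
import pandas as pd

# Hypothetical data: order 2 has the largest total price.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "quantity": [1, 2, 5, 1],
    "item_price": [3.0, 6.0, 25.0, 4.0],
})

totals = orders.groupby("order_id")[["quantity", "item_price"]].sum()

# idxmax returns the index label (the order_id) of the largest total.
top_order_id = totals["item_price"].idxmax()

# Number of items sold in that order.
top_quantity = totals.loc[top_order_id, "quantity"]

print(top_order_id, top_quantity)
```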

Finding how many times 'Chicken Bowl' was ordered


chipo_chicken = chipo[chipo['item_name']=='Chicken Bowl']

# Drop rows that repeat the same item_name within a single order
chipo_chicken = chipo_chicken.drop_duplicates(['item_name','order_id'])
print(len(chipo_chicken))
615
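
Since the frame is already filtered to a single item, dropping duplicates and taking the length amounts to counting distinct order IDs. The sketch below expresses that with `nunique()` on hypothetical rows.

```python
import pandas as pd

# Hypothetical data: order 1 contains 'Chicken Bowl' on two separate rows.
rows = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "item_name": ["Chicken Bowl", "Chicken Bowl", "Chicken Bowl", "Izze"],
})

# Count the distinct orders that include at least one 'Chicken Bowl' row.
n_orders = rows.loc[rows["item_name"] == "Chicken Bowl", "order_id"].nunique()

print(n_orders)
```

Either way, the number counts orders that contain the item, not the total units sold; for the latter, the quantity column would have to be summed instead.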
ariz1623