Data exploration and visualization
Top 10 most-ordered items
The value_counts() function returns how often each value appears in a column, sorted in descending order.
item_count = chipo['item_name'].value_counts()[:10]
print(item_count)
Chicken Bowl 726
Chicken Burrito 553
Chips and Guacamole 479
Steak Burrito 368
Canned Soft Drink 301
Chips 211
Steak Bowl 211
Bottled Water 162
Chicken Soft Tacos 115
Chicken Salad Bowl 110
Name: item_name, dtype: int64
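To see what value_counts() does without loading the full dataset, here is a minimal sketch on a tiny made-up DataFrame. The `sample` data and the `top2` name are assumptions for illustration only and are not part of the chipotle data.

```python
import pandas as pd

# Hypothetical stand-in for the chipo DataFrame (not the real data).
sample = pd.DataFrame({
    "item_name": ["Chicken Bowl", "Chips", "Chicken Bowl",
                  "Steak Burrito", "Chicken Bowl", "Chips"],
})

# Count how often each item name appears; the most frequent comes first.
counts = sample["item_name"].value_counts()

# head(2) keeps the two most frequent items, the same idea as [:10] above.
top2 = counts.head(2)
print(top2)
```

Slicing with [:10] and calling head(10) select the same rows here, since value_counts() has already sorted the result.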
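Order count and total quantity per item
The groupby() function is used to compute the number of orders and the total quantity for each item.
# Print the number of orders per item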
item_count = chipo.groupby('item_name')['order_id'].count()
item_count
item_name
6 Pack Soft Drink 54
Barbacoa Bowl 66
Barbacoa Burrito 91
Barbacoa Crispy Tacos 11
Barbacoa Salad Bowl 10
Barbacoa Soft Tacos 25
Bottled Water 162
Bowl 2
Burrito 6
Canned Soda 104
Canned Soft Drink 301
Carnitas Bowl 68
Carnitas Burrito 59
Carnitas Crispy Tacos 7
Carnitas Salad 1
Carnitas Salad Bowl 6
Carnitas Soft Tacos 40
Chicken Bowl 726
Chicken Burrito 553
Chicken Crispy Tacos 47
Chicken Salad 9
Chicken Salad Bowl 110
Chicken Soft Tacos 115
Chips 211
Chips and Fresh Tomato Salsa 110
Chips and Guacamole 479
Chips and Mild Fresh Tomato Salsa 1
Chips and Roasted Chili Corn Salsa 22
Chips and Roasted Chili-Corn Salsa 18
Chips and Tomatillo Green Chili Salsa 43
Chips and Tomatillo Red Chili Salsa 48
Chips and Tomatillo-Green Chili Salsa 31
Chips and Tomatillo-Red Chili Salsa 20
Crispy Tacos 2
Izze 20
Nantucket Nectar 27
Salad 2
Side of Chips 101
Steak Bowl 211
Steak Burrito 368
Steak Crispy Tacos 35
Steak Salad 4
Steak Salad Bowl 29
Steak Soft Tacos 55
Veggie Bowl 85
Veggie Burrito 95
Veggie Crispy Tacos 1
Veggie Salad 6
Veggie Salad Bowl 18
Veggie Soft Tacos 7
Name: order_id, dtype: int64
# Total quantity ordered per item
item_quantity = chipo.groupby('item_name')['quantity'].sum()
item_quantity
item_name
6 Pack Soft Drink 55
Barbacoa Bowl 66
Barbacoa Burrito 91
Barbacoa Crispy Tacos 12
Barbacoa Salad Bowl 10
Barbacoa Soft Tacos 25
Bottled Water 211
Bowl 4
Burrito 6
Canned Soda 126
Canned Soft Drink 351
Carnitas Bowl 71
Carnitas Burrito 60
Carnitas Crispy Tacos 8
Carnitas Salad 1
Carnitas Salad Bowl 6
Carnitas Soft Tacos 40
Chicken Bowl 761
Chicken Burrito 591
Chicken Crispy Tacos 50
Chicken Salad 9
Chicken Salad Bowl 123
Chicken Soft Tacos 120
Chips 230
Chips and Fresh Tomato Salsa 130
Chips and Guacamole 506
Chips and Mild Fresh Tomato Salsa 1
Chips and Roasted Chili Corn Salsa 23
Chips and Roasted Chili-Corn Salsa 18
Chips and Tomatillo Green Chili Salsa 45
Chips and Tomatillo Red Chili Salsa 50
Chips and Tomatillo-Green Chili Salsa 33
Chips and Tomatillo-Red Chili Salsa 25
Crispy Tacos 2
Izze 20
Nantucket Nectar 29
Salad 2
Side of Chips 110
Steak Bowl 221
Steak Burrito 386
Steak Crispy Tacos 36
Steak Salad 4
Steak Salad Bowl 31
Steak Soft Tacos 56
Veggie Bowl 87
Veggie Burrito 97
Veggie Crispy Tacos 1
Veggie Salad 6
Veggie Salad Bowl 18
Veggie Soft Tacos 8
Name: quantity, dtype: int64
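The two groupby() calls above can also be combined into a single pass. The following is a minimal sketch on made-up data, assuming the same column names as chipo. The `sample` frame and the result column names (`order_count`, `distinct_orders`, `total_quantity`) are hypothetical.

```python
import pandas as pd

# Hypothetical sample with the same column names as chipo (not the real data).
sample = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "item_name": ["Chips", "Chips", "Izze", "Chips"],
    "quantity": [1, 2, 1, 1],
})

# One groupby pass with named aggregation:
#   count   -> number of rows (order lines) per item
#   nunique -> number of distinct orders containing the item
#   sum     -> total quantity ordered per item
summary = sample.groupby("item_name").agg(
    order_count=("order_id", "count"),
    distinct_orders=("order_id", "nunique"),
    total_quantity=("quantity", "sum"),
)
print(summary)
```

Note that count() counts rows, so an item that appears on two lines of the same order is counted twice. nunique() is the option to reach for when distinct orders are what matters.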
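Visualization
The x-axis uses the integer positions 0 through 49 (one per item, 50 items in total), and the y-axis uses each item's total ordered quantity.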
import matplotlib.pyplot as plt

# Item names from the grouped index, and one integer x position per item
item_name_list = list(item_quantity.index)
x_pos = list(range(len(item_name_list)))
# Total quantity ordered for each item (used as the bar heights)
order_cnt = list(item_quantity.values)

plt.figure(figsize=(10, 5))
plt.bar(x_pos, order_cnt, align='center')
plt.ylabel('ordered_item_count')
plt.title('Distribution of all ordered items')
plt.show()
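Because the chart above uses integer positions on the x-axis, the bars are not labeled with item names. One possible variation, sketched here on made-up totals, plots only the largest values and uses the item names as tick labels. The `item_quantity` values below, the `top_items` name, and the chart title are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical totals standing in for item_quantity (not the real data).
item_quantity = pd.Series({"Chips": 5, "Izze": 3, "Salad": 2, "Bowl": 4})

# Keep only the largest totals so the labels stay legible.
top_items = item_quantity.sort_values(ascending=False).head(3)

fig, ax = plt.subplots(figsize=(10, 5))
# Passing the item names as x values makes matplotlib label each bar.
bars = ax.bar(list(top_items.index), list(top_items.values), align="center")
ax.set_ylabel("ordered_item_count")
ax.set_title("Top items by total quantity")
ax.tick_params(axis="x", labelrotation=45)  # tilt labels to avoid overlap
fig.tight_layout()
```

In a notebook, drop the matplotlib.use("Agg") line and call plt.show() at the end. In a script, fig.savefig() with a filename of your choice writes the chart to disk.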