Multi-index Grouping

Multi-index Grouping¶

You are not limited to just using a single column for grouping, you can also pass in a list to do the grouping based on multiple groupings. Let’s take an example by bringing back the original dataset again. One thing to keep in mind is you will get back data with a multi-index based on these groupings.

In [9]:

#Create the data
sales = pd.DataFrame([[1, 1, 1, 10],
 [1, 1, 2, 74],
 [1, 1, 3, 27],
 [1, 2, 1, 41],
 [1, 2, 2, 66],
 [1, 2, 3, 95],
 [2, 1, 1, 1],
 [2, 1, 2, 23],
 [2, 1, 3, 67],
 [2, 2, 1, 86],
 [2, 2, 2, 87],
 [2, 2, 3, 19],
 [3, 1, 1, 6],
 [3, 1, 2, 30],
 [3, 1, 3, 55],
 [3, 2, 1, 68],
 [3, 2, 2, 32],
 [3, 2, 3, 50]],
columns=['Day', 'Store', 'Product', 'Sales'])

print(sales)

    Day  Store  Product  Sales
0     1      1        1     10
1     1      1        2     74
2     1      1        3     27
3     1      2        1     41
4     1      2        2     66
5     1      2        3     95
6     2      1        1      1
7     2      1        2     23
8     2      1        3     67
9     2      2        1     86
10    2      2        2     87
11    2      2        3     19
12    3      1        1      6
13    3      1        2     30
14    3      1        3     55
15    3      2        1     68
16    3      2        2     32
17    3      2        3     50

Now what if we wanted to find the total sales by day and product? We can use both of them in grouping like below:

In [10]:

#Group by day and product
group_data = sales.groupby(['Day', 'Product'])['Sales'].sum()
print(group_data)

Day  Product
1    1           51
     2          140
     3          122
2    1           87
     2          110
     3           86
3    1           74
     2           62
     3          105
Name: Sales, dtype: int64

You might also want something like average sales for each store-day combo.

In [11]:

group_data = sales.groupby(['Store', 'Day'])['Sales'].mean()
print(group_data)

Store  Day
1      1      37.000000
       2      30.333333
       3      30.333333
2      1      67.333333
       2      64.000000
       3      50.000000
Name: Sales, dtype: float64

Data Science

Data Science

Multi-index Grouping

Multi-index Grouping¶

Leave A Reply Cancel reply

Modal title