Python pandas groupby aggregate on multiple columns, then pivot
Python pandas groupby aggregate on multiple columns, then pivot
Edited for Pandas 0.22+ considering the deprecation of the use of dictionaries in a group by aggregation.
We set up a very similar dictionary where we use the keys of the dictionary to specify our functions and the dictionary itself to rename the columns.
rnm_cols = dict(size=Size, sum=Sum, mean=Mean, std=Std)
df.set_index([Category, Item]).stack().groupby(Category)
.agg(rnm_cols.keys()).rename(columns=rnm_cols)
Size Sum Mean Std
Category
Books 3 58 19.333333 2.081666
Clothes 3 148 49.333333 4.041452
Technology 6 1800 300.000000 70.710678
option 1
use agg
← link to docs
agg_funcs = dict(Size=size, Sum=sum, Mean=mean, Std=std)
df.set_index([Category, Item]).stack().groupby(level=0).agg(agg_funcs)
Std Sum Mean Size
Category
Books 2.081666 58 19.333333 3
Clothes 4.041452 148 49.333333 3
Technology 70.710678 1800 300.000000 6
option 2
more for less
use describe
← link to docs
df.set_index([Category, Item]).stack().groupby(level=0).describe().unstack()
count mean std min 25% 50% 75% max
Category
Books 3.0 19.333333 2.081666 17.0 18.5 20.0 20.5 21.0
Clothes 3.0 49.333333 4.041452 45.0 47.5 50.0 51.5 53.0
Technology 6.0 300.000000 70.710678 200.0 262.5 300.0 337.5 400.0
df.groupby(Category).agg({Item:size,shop1:[sum,mean,std],shop2:[sum,mean,std],shop3:[sum,mean,std]})
Or if you want it across all shops then:
df1 = df.set_index([Item,Category]).stack().reset_index().rename(columns={level_2:Shops,0:costs})
df1.groupby(Category).agg({Item:size,costs:[sum,mean,std]})
Python pandas groupby aggregate on multiple columns, then pivot
If I understand correctly, you want to calculate aggregate metrics for all shops, not for each individually. To do that, you can first stack
your dataframe and then group by Category
:
stacked = df.set_index([Item, Category]).stack().reset_index()
stacked.columns = [Item, Category, Shop, Price]
stacked.groupby(Category).agg({Price:[count,sum,mean,std]})
Which results in
Price
count sum mean std
Category
Books 3 58 19.333333 2.081666
Clothes 3 148 49.333333 4.041452
Technology 6 1800 300.000000 70.710678