python – Pandas count(distinct) equivalent

I believe this is what you want:

table.groupby('YEARMONTH').CLIENTCODE.nunique()

Example:

In [2]: table
Out[2]: 
   CLIENTCODE  YEARMONTH
0           1     201301
1           1     201301
2           2     201301
3           1     201302
4           2     201302
5           2     201302
6           3     201302

In [3]: table.groupby('YEARMONTH').CLIENTCODE.nunique()
Out[3]: 
YEARMONTH
201301       2
201302       3
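
For anyone who wants to reproduce the example, here is a minimal, self-contained sketch. The data is the table shown above; the output column name n_clients is only an illustrative choice, not something from the original question.

```python
import pandas as pd

# Rebuild the example table shown above.
table = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})

# Roughly: SELECT YEARMONTH, COUNT(DISTINCT CLIENTCODE) ... GROUP BY YEARMONTH
distinct_clients = table.groupby("YEARMONTH")["CLIENTCODE"].nunique()
print(distinct_clients)

# The same counts as a flat DataFrame, using named aggregation.
# "n_clients" is a hypothetical column name chosen for this sketch.
summary = table.groupby("YEARMONTH", as_index=False).agg(
    n_clients=("CLIENTCODE", "nunique")
)
print(summary)
```

The second form can be handy when the result needs to be merged back or exported, since YEARMONTH stays an ordinary column instead of becoming the index.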

Here is another, simpler method. Note that it counts how many rows each value has (like COUNT(*) with GROUP BY), not the number of distinct values. Let's say your dataframe is named daat and the column is YEARMONTH:

daat.YEARMONTH.value_counts()
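
To make the difference between value_counts() and nunique() concrete, here is a small sketch. It assumes daat holds the same rows as the example table earlier in this answer.

```python
import pandas as pd

# Assumption: daat contains the same rows as the example table.
daat = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})

# Rows per YEARMONTH value (duplicate clients are counted every time).
rows_per_month = daat.YEARMONTH.value_counts()
print(rows_per_month)

# Number of distinct YEARMONTH values in the whole column.
n_months = daat.YEARMONTH.nunique()
print(n_months)

# Distinct CLIENTCODE values within each YEARMONTH.
clients_per_month = daat.groupby("YEARMONTH").CLIENTCODE.nunique()
print(clients_per_month)
```

So value_counts() answers "how many rows per month", while the groupby with nunique() answers "how many different clients per month".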

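Interestingly enough, calling len() on the result of unique() is very often a few times (3x-15x) faster than nunique(). Keep in mind that unique() keeps NaN as a value, while nunique() ignores NaN by default, so the two counts can differ on columns with missing data.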
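
A rough way to check this on your own data is sketched below. The series here are made up for illustration, and the timings depend heavily on the pandas version, dtype and data size, so treat the printed numbers as indicative only.

```python
import timeit

import numpy as np
import pandas as pd

# Without missing values both approaches give the same count.
s = pd.Series([1, 1, 2, 3, 3, 3])
count_unique = len(s.unique())
count_nunique = s.nunique()

# With NaN present, unique() keeps NaN but nunique() drops it by default.
s_nan = pd.Series([1.0, 1.0, 2.0, np.nan])
with_nan_unique = len(s_nan.unique())
with_nan_nunique = s_nan.nunique()
with_nan_keep = s_nan.nunique(dropna=False)

# Illustrative timing on synthetic data (hypothetical size and value range).
rng = np.random.default_rng(0)
big = pd.Series(rng.integers(0, 1000, size=200_000))
t_unique = timeit.timeit(lambda: len(big.unique()), number=10)
t_nunique = timeit.timeit(lambda: big.nunique(), number=10)
print(f"len(unique()): {t_unique:.4f}s  nunique(): {t_nunique:.4f}s")
```

If the column may contain NaN and you want len(unique()) to match nunique(), drop the missing values first, for example with s_nan.dropna().unique().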
