python – How do I get the row count of a Pandas DataFrame?

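For a dataframe df, one can use any of the following:

len(df.index)
df.shape[0]
df[df.columns[0]].count()  (counts only the non-NaN rows in the first column)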
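To make the difference between these options concrete, here is a minimal sketch. The sample data and variable names are hypothetical and chosen only for illustration; the point is that count() skips missing values, while the other two count every row.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the first column contains one missing value.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4, 5, 6]})

rows_via_index = len(df.index)               # counts every row
rows_via_shape = df.shape[0]                 # counts every row
non_null_first = df[df.columns[0]].count()   # counts only non-NaN values in the first column

print(rows_via_index, rows_via_shape, non_null_first)  # prints: 3 3 2
```

If the column used for count() may contain NaN, the result is not the row count, so len(df.index) or df.shape[0] is the safer choice.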

Performance


Code to reproduce the plot:

import numpy as np
import pandas as pd
import perfplot

perfplot.save(
    "out.png",  # file the plot is written to
    setup=lambda n: pd.DataFrame(np.arange(n * 3).reshape(n, 3)),
    n_range=[2**k for k in range(25)],
    kernels=[
        lambda df: len(df.index),
        lambda df: df.shape[0],
        lambda df: df[df.columns[0]].count(),
    ],
    labels=["len(df.index)", "df.shape[0]", "df[df.columns[0]].count()"],
    xlabel="Number of rows",
)

Suppose df is your dataframe then:

count_row = df.shape[0]  # Gives number of rows
count_col = df.shape[1]  # Gives number of columns

Or, more succinctly,

r, c = df.shape
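As a quick illustration of the unpacking above, the following sketch uses a small made-up frame (the column names and values are assumptions, not from the original answer) and also shows that shape still works on a frame with no rows.

```python
import pandas as pd

# Hypothetical sample frame with 4 rows and 2 columns.
df = pd.DataFrame({"x": [10, 20, 30, 40], "y": ["a", "b", "c", "d"]})

r, c = df.shape      # shape is a (rows, columns) tuple
print(r, c)          # prints: 4 2

# An empty slice keeps the columns but has zero rows.
empty = df.iloc[0:0]
print(empty.shape)   # prints: (0, 2)
```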


Use len(df) :-).

__len__() is documented with "Returns length of index".

Timing info, set up the same way as in root's answer:

In [7]: timeit len(df.index)
1000000 loops, best of 3: 248 ns per loop

In [8]: timeit len(df)
1000000 loops, best of 3: 573 ns per loop

Because len(df) involves one additional function call, it is a bit slower than calling len(df.index) directly. But this should not matter in most cases. I find len(df) to be quite readable.
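For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard-library timeit module. The frame size and the number of calls are assumptions, not the setup used for the timings above, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed test frame: 1000 rows, 3 columns (not the original setup).
df = pd.DataFrame(np.arange(3000).reshape(1000, 3))

# Both expressions return the same row count.
same_count = len(df) == len(df.index) == 1000

n_calls = 100_000
t_index = timeit.timeit(lambda: len(df.index), number=n_calls)
t_frame = timeit.timeit(lambda: len(df), number=n_calls)

# Report the average time per call in nanoseconds.
print(f"len(df.index): {t_index / n_calls * 1e9:.0f} ns per call")
print(f"len(df):       {t_frame / n_calls * 1e9:.0f} ns per call")
```

The lambda wrapper adds a small constant overhead to both measurements, so only the difference between the two figures is meaningful.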
