python – Convert a Pandas DataFrame to a dictionary
The to_dict() method sets the column names as dictionary keys, so you'll need to reshape your DataFrame slightly. Setting the ID column as the index and then transposing the DataFrame is one way to achieve this.
to_dict() also accepts an orient argument, which you'll need in order to output a list of values for each column. Otherwise, a dictionary of the form {index: value} will be returned for each column.
These steps can be done with the following line:
>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
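To see what each part of that one-liner does, here is a minimal sketch that splits it into steps. The sample data is an assumption, chosen to match the output shown above; the intermediate names (indexed, transposed) are only for illustration.

```python
import pandas as pd

# Assumed sample data: an 'ID' column plus three value columns.
df = pd.DataFrame({
    'ID': ['p', 'q', 'r'],
    'A': [1, 4, 4],
    'B': [3, 3, 0],
    'C': [2, 2, 9],
})

# Step 1: make the ID column the index, so its values label the rows.
indexed = df.set_index('ID')

# Step 2: transpose, so each ID becomes a column name.
transposed = indexed.T

# Step 3: orient='list' maps each column name to a list of its values.
result = transposed.to_dict(orient='list')
print(result)
```

Note that set_index returns a new DataFrame here, so df itself is left unchanged.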
In case a different dictionary format is needed, here are examples of the possible orient arguments. Consider the following simple DataFrame:
>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
>>> df
        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125
Then the options are as follows.
dict – the default: column names are keys, values are dictionaries of index:data pairs
>>> df.to_dict('dict')
{'a': {0: 'red', 1: 'yellow', 2: 'blue'},
 'b': {0: 0.5, 1: 0.25, 2: 0.125}}
list – keys are column names, values are lists of column data
>>> df.to_dict('list')
{'a': ['red', 'yellow', 'blue'],
 'b': [0.5, 0.25, 0.125]}
series – like list, but values are Series
>>> df.to_dict('series')
{'a': 0       red
      1    yellow
      2      blue
      Name: a, dtype: object,
 'b': 0    0.500
      1    0.250
      2    0.125
      Name: b, dtype: float64}
split – the keys are 'columns', 'data' and 'index'; the values are the column names, the data by row, and the index labels respectively
>>> df.to_dict('split')
{'columns': ['a', 'b'],
 'data': [['red', 0.5], ['yellow', 0.25], ['blue', 0.125]],
 'index': [0, 1, 2]}
records – each row becomes a dictionary where the key is the column name and the value is the data in the cell
>>> df.to_dict('records')
[{'a': 'red', 'b': 0.5},
 {'a': 'yellow', 'b': 0.25},
 {'a': 'blue', 'b': 0.125}]
index – like records, but a dictionary of dictionaries with keys as index labels (rather than a list)
>>> df.to_dict('index')
{0: {'a': 'red', 'b': 0.5},
 1: {'a': 'yellow', 'b': 0.25},
 2: {'a': 'blue', 'b': 0.125}}
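If it helps to compare the options side by side, the following sketch runs all six orient values listed above against the same DataFrame and prints the type of each result. The results dictionary and the loop are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})

# The six orient values covered above.
orients = ['dict', 'list', 'series', 'split', 'records', 'index']

# Build one result per orient so they can be inspected together.
results = {orient: df.to_dict(orient=orient) for orient in orients}

for orient, value in results.items():
    # 'records' is the only one of these that returns a list rather than a dict.
    print(f"{orient:8} -> {type(value).__name__}")
```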
If you need a dictionary like:
{'red': 0.500, 'yellow': 0.250, 'blue': 0.125}
from a dataframe like:
        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125
the simplest way is:
dict(df.values)
A working snippet:
import pandas as pd
df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
dict(df.values)
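One caveat: dict(df.values) relies on every row having exactly two elements, so it raises a ValueError on a DataFrame with more than two columns. As a hedged sketch, here are two alternatives that name the key and value columns explicitly; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})

# Original approach: each row of df.values is treated as a (key, value) pair.
via_values = dict(df.values)

# Alternative 1: pair up the two columns explicitly.
via_zip = dict(zip(df['a'], df['b']))

# Alternative 2: index by the key column, then convert the value Series.
via_series = df.set_index('a')['b'].to_dict()

print(via_values)
print(via_zip)
print(via_series)
```

The two alternatives keep working if the DataFrame later gains extra columns, since they only touch 'a' and 'b'.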
Follow these steps:
Suppose your dataframe is as follows:
>>> df
   A  B  C ID
0  1  3  2  p
1  4  3  2  q
2  4  0  9  r
1. Use set_index to set the ID column as the dataframe index.
df.set_index('ID', drop=True, inplace=True)
2. Use the orient='index' parameter to have the index as dictionary keys.
dictionary = df.to_dict(orient='index')
The results will be as follows:
>>> dictionary
{'q': {'A': 4, 'B': 3, 'C': 2}, 'p': {'A': 1, 'B': 3, 'C': 2}, 'r': {'A': 4, 'B': 0, 'C': 9}}
3. If you need each sample as a list, run the following code. First decide on the column order:
column_order = ['A', 'B', 'C']  # Determine your preferred order of columns
d = {}  # Initialize the new dictionary as an empty dictionary
for k in dictionary:
    d[k] = [dictionary[k][column_name] for column_name in column_order]
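The three steps can also be combined into one small function. This is only a sketch: rows_as_lists is a hypothetical helper name, not part of pandas, and it uses set_index without inplace=True so the original dataframe is left untouched.

```python
import pandas as pd


def rows_as_lists(df, id_column, column_order):
    """Map each value of id_column to that row's values, in column_order.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Steps 1 and 2: index by the ID column, then build {id: {column: value}}.
    by_index = df.set_index(id_column).to_dict(orient='index')
    # Step 3: flatten each inner dictionary into a list in the chosen order.
    return {key: [row[col] for col in column_order] for key, row in by_index.items()}


df = pd.DataFrame({
    'A': [1, 4, 4],
    'B': [3, 3, 0],
    'C': [2, 2, 9],
    'ID': ['p', 'q', 'r'],
})

d = rows_as_lists(df, 'ID', ['A', 'B', 'C'])
reordered = rows_as_lists(df, 'ID', ['C', 'A', 'B'])
print(d)
print(reordered)
```

Passing the column order as an argument makes it easy to get the same rows in a different order without changing the dataframe.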