sorting dictionary python 3

sorting dictionary python 3

A modern and fast solution, for Python 3.7. May also work in some interpreters of Python 3.6.

TLDR

To sort a dictionary by keys use:

sorted_dict = {k: disordered[k] for k in sorted(disordered)}

Almost three times faster than the accepted answer; probably more when you include imports.

Comment on the accepted answer

The example in the accepted answer instead of iterating over the keys only – with key parameter of sorted() or the default behaviour of dict iteration – iterates over tuples (key, value), which suprisingly turns out to be much slower than comparing the keys only and accessing dictionary elements in a list comprehension.

How to sort by key in Python 3.7

The big change in Python 3.7 is that the dictionaries are now ordered by default.

  • You can generate sorted dict using dict comprehensions.
  • Using OrderedDict might still be preferable for the compatibility sake.
  • Do not use sorted(d.items()) without key.

See:

disordered = {10: b, 3: a, 5: c}

# sort keys, then get values from original - fast
sorted_dict = {k: disordered[k] for k in sorted(disordered)}

# key = itemgetter - slower
from operator import itemgetter
key = itemgetter(0)
sorted_dict = {k: v for k, v in sorted(disordered.items(), key=key)}

# key = lambda - the slowest
key = lambda item: item[0]
sorted_dict = {k: v for k in sorted(disordered.items(), key=key)} 

Timing results:

Best for {k: d[k] for k in sorted(d)}: 7.507327548999456
Best for {k: v for k, v in sorted(d.items(), key=key_getter)}: 12.031082626002899
Best for {k: v for k, v in sorted(d.items(), key=key_lambda)}: 14.22885995300021

Best for dict(sorted(d.items(), key=key_getter)): 11.209122000000207
Best for dict(sorted(d.items(), key=key_lambda)): 13.289728325995384
Best for dict(sorted(d.items())): 14.231471302999125

Best for OrderedDict(sorted(d.items(), key=key_getter)): 16.609151654003654
Best for OrderedDict(sorted(d.items(), key=key_lambda)): 18.52622927199991
Best for OrderedDict(sorted(d.items())): 19.436101284998585

Testing code:

from timeit import repeat

setup_code = 
from operator import itemgetter
from collections import OrderedDict
import random
random.seed(0)
d = {i: chr(i) for i in [random.randint(0, 120) for repeat in range(120)]}
key_getter = itemgetter(0)
key_lambda = lambda item: item[0]


cases = [
    # fast
    {k: d[k] for k in sorted(d)},
    {k: v for k, v in sorted(d.items(), key=key_getter)},
    {k: v for k, v in sorted(d.items(), key=key_lambda)},
    # slower
    dict(sorted(d.items(), key=key_getter)),
    dict(sorted(d.items(), key=key_lambda)),
    dict(sorted(d.items())),
    # the slowest 
    OrderedDict(sorted(d.items(), key=key_getter)),
    OrderedDict(sorted(d.items(), key=key_lambda)),
    OrderedDict(sorted(d.items())),
]

for code in cases:
    times = repeat(code, setup=setup_code, repeat=3)
    print(fBest for {code}: {min(times)})

dict does not keep its elements order. What you need is an OrderedDict: http://docs.python.org/library/collections.html#collections.OrderedDict

edit

Usage example:

>>> from collections import OrderedDict
>>> a = {foo: 1, bar: 2}
>>> a
{foo: 1, bar: 2}
>>> b = OrderedDict(sorted(a.items()))
>>> b
OrderedDict([(bar, 2), (foo, 1)])
>>> b[foo]
1
>>> b[bar]
2

sorting dictionary python 3

I dont think you want an OrderedDict. It sounds like youd prefer a SortedDict, that is a dict that maintains its keys in sorted order. The sortedcontainers module provides just such a data type. Its written in pure-Python, fast-as-C implementations, has 100% coverage and hours of stress.

Installation is easy with pip:

pip install sortedcontainers

Note that if you cant pip install then you can simply pull the source files from the open-source repository.

Then youre code is simply:

from sortedcontainers import SortedDict
myDic = SortedDict({10: b, 3:a, 5:c})
sorted_list = list(myDic.keys())

The sortedcontainers module also maintains a performance comparison with other popular implementations.

Leave a Reply

Your email address will not be published. Required fields are marked *