# hash – Time complexity of accessing a Python dict

## hash – Time complexity of accessing a Python dict

See Time Complexity. The python dict is a hashmap, its worst case is therefore O(n) if the hash function is bad and results in a lot of collisions. However that is a very rare case where every item added has the same hash and so is added to the same chain which for a major Python implementation would be *extremely* unlikely. The average time complexity is of course O(1).

The best method would be to check and take a look at the hashs of the objects you are using. The CPython Dict uses int PyObject_Hash (PyObject *o) which is the equivalent of `hash(o)`

.

After a quick check, I have not yet managed to find two tuples that hash to the same value, which would indicate that the lookup is O(1)

```
l = []
for x in range(0, 50):
for y in range(0, 50):
if hash((x,y)) in l:
print Fail: , (x,y)
l.append(hash((x,y)))
print Test Finished
```

CodePad (Available for 24 hours)

You are not correct. `dict`

access is unlikely to be your problem here. It is almost certainly O(1), unless you have some very weird inputs or a very bad hashing function. Paste some sample code from your application for a better diagnosis.

#### hash – Time complexity of accessing a Python dict

It would be easier to make suggestions if you provided example code and data.

Accessing the dictionary is unlikely to be a problem as that operation is O(1) on average, and O(N) amortized worst case. Its possible that the built-in hashing functions are experiencing collisions for your data. If youre having problems with has the built-in hashing function, you can provide your own.

Pythons dictionary implementation

reduces the average complexity of

dictionary lookups to O(1) by

requiring that key objects provide a

hash function. Such a hash function

takes the information in a key object

and uses it to produce an integer,

called a hash value. This hash value

is then used to determine which

bucket this (key, value) pair should

be placed into.

You can overwrite the __hash__ method in your class to implement a custom hash function like this:

```
def __hash__(self):
return hash(str(self))
```

Depending on what your data actually looks like, you might be able to come up with a faster hash function that has fewer collisions than the standard function. However, this is unlikely. See the Python Wiki page on Dictionary Keys for more information.