collections – Python defaultdict and lambda

I think the first line means that when I look up x[k] for a nonexistent key k (in a statement like v = x[k]), the key-value pair (k, 0) is automatically added to the dictionary, as if the statement x[k] = 0 had been executed first.

That's right. This is more idiomatically written as

x = defaultdict(int)
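To make the insert-on-read behavior concrete, here is a minimal sketch. The key names are made up for illustration and do not come from the original question.

```python
from collections import defaultdict

x = defaultdict(lambda: 0)   # the lambda form discussed in the question
z = defaultdict(int)         # the idiomatic equivalent; int() returns 0

v = x["missing"]             # reading a missing key calls the factory...
print(v)                     # 0
print(dict(x))               # {'missing': 0}  ...and stores the result

z["count"] += 1              # so a counter needs no initialization step
print(dict(z))               # {'count': 1}
```

Both forms behave the same here; int is simply shorter and avoids defining a function.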

In the case of y, when you do y["ham"]["spam"], the key "ham" is inserted in y if it does not exist. The value associated with it becomes a defaultdict in which "spam" is automatically inserted with a value of 0.

I.e., y is a kind of two-tiered defaultdict. If "ham" is not in y, then evaluating y["ham"]["spam"] is like doing

y["ham"] = {}
y["ham"]["spam"] = 0

in terms of an ordinary dict.
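The plain-dict version is only an approximation: the automatically created inner value is itself a defaultdict, so it keeps supplying defaults for further keys. A small sketch of the difference, using illustrative key names:

```python
from collections import defaultdict

y = defaultdict(lambda: defaultdict(lambda: 0))
y["ham"]["spam"]                      # creates both levels
print(type(y["ham"]).__name__)        # defaultdict
print(y["ham"]["eggs"])               # 0; the inner level also fills in defaults

# The ordinary-dict equivalent has no such fallback.
plain = {}
plain["ham"] = {}
plain["ham"]["spam"] = 0
try:
    plain["ham"]["eggs"]
except KeyError as exc:
    print("KeyError:", exc)           # KeyError: 'eggs'
```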

You are correct about what the first one does. As for y, it will create a defaultdict with default 0 when a key doesn't exist in y, so you can think of this as a nested dictionary. Consider the following example:

from collections import defaultdict

y = defaultdict(lambda: defaultdict(lambda: 0))
print(y["k1"]["k2"])   # 0
print(dict(y["k1"]))   # {'k2': 0}

To create an equivalent nested dictionary structure without defaultdict, you would need to create an inner dict for y["k1"] and then set y["k1"]["k2"] to 0, but defaultdict does all of this behind the scenes when it encounters keys it hasn't seen:

y = {}
y["k1"] = {}
y["k1"]["k2"] = 0

The following function may help when experimenting with this in an interpreter:

def to_dict(d):
    # Recursively convert nested defaultdicts into plain dicts.
    if isinstance(d, defaultdict):
        return {k: to_dict(v) for k, v in d.items()}
    return d

This returns the plain dict equivalent of a nested defaultdict, which is a lot easier to read. For example:

>>> y = defaultdict(lambda: defaultdict(lambda: 0))
>>> y['a']['b'] = 5
>>> y
defaultdict(<function <lambda> at 0xb7ea93e4>, {'a': defaultdict(<function <lambda> at 0xb7ea9374>, {'b': 5})})
>>> to_dict(y)
{'a': {'b': 5}}

The defaultdict constructor takes a zero-argument callable, which is called when a key is not found, as you correctly explained.
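A short sketch of when that callable actually runs. The recording factory below is hypothetical and exists only to make the calls visible; the point is that only a failed subscript lookup triggers it, while get() and membership tests do not.

```python
from collections import defaultdict

calls = []

def factory():
    # Hypothetical factory that records each invocation.
    calls.append("called")
    return 0

d = defaultdict(factory)

d["a"]                 # missing key: factory runs and ("a", 0) is stored
d["a"]                 # key now exists: factory is not called again
print(len(calls))      # 1

print(d.get("b"))      # None; get() does not trigger the factory
print("b" in d)        # False; membership tests do not insert keys
print(d.default_factory is factory)  # True
```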

lambda: 0 will of course always return zero, but the preferred method to do that is defaultdict(int), which will do the same thing.

As for the second part, the author would like to create a new defaultdict(int), or a nested dictionary, whenever a key is not found in the top-level dictionary.
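Combining the two points, the nested structure can be written with int as the inner factory. The sketch below uses made-up sample data to show a typical use, counting pairs without any setup code.

```python
from collections import defaultdict

y = defaultdict(lambda: defaultdict(int))   # inner default is int() == 0

# Hypothetical data: count (outer, inner) pairs.
pairs = [("ham", "spam"), ("ham", "spam"), ("ham", "eggs"), ("bacon", "spam")]
for outer, inner in pairs:
    y[outer][inner] += 1

print(y["ham"]["spam"])    # 2
print(dict(y["bacon"]))    # {'spam': 1}
print(sorted(y))           # ['bacon', 'ham']
```

The outer factory still has to be a lambda (or a named function), because defaultdict needs a callable that takes no arguments and returns a fresh inner defaultdict each time.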
