I’m often find myself parsing deep nested json documents in python, and often thought, this should be easier. For a structure such as:

{
  "alpha": {
    "beta": {
      "gamma": 1
    }
  }
}

I would like to be able to do something like this:

>>> import json
>>> doc = json.loads('{"alpha": {"beta": {"gamma": 1}}}')
>>> doc.get('alpha.beta.gamma')
1
>>> doc.get('alpha.beta.delta')
None
>>> doc.get('one.two.three')
None

When searching for solutions, I would constantly focus on the ‘json’ bit, it took me longer than I would like to admit, that once I call json.loads I now have a python dictionary, it’s no longer json. A quick google after that and I found dpath-python:

A python library for accessing and searching dictionaries via /slashed/paths ala xpath

There is a few minor differences to my desired example but nothing that I can’t overcome. Dpath works like so:


>>> import json
>>> from dpath.util import get as dget
>>> jstring = '{"alpha": {"beta": {"gamma": 1}}}'
>>> doc = json.loads(jstring)
>>> dget(doc, '/alpha/beta/gamma')
1
>>> dget(doc, '/alpha/beta/delta')
...
KeyError: '/alpha/beta/delta'

The three main differences are that it uses a / as the default seperator, requires a leading seperator and there is no way to provide a default in the case of key not being found. I could cope with the first two, but the last once is a deal breaker for me. So I might as well fix all three! I’d like to make it as close to a standard python dictionary workflow as possible, so I am going to create a NestedDict class, which will subclass dict and then overide get.

import dpath.util

class NestedDict(dict):

    def get(self, path, default=None, separator='.'):
        if path[0] != separator:
            path = separator + path
        try:
            return dpath.util.get(self, path, separator=separator)
        except KeyError:
            return default

As you can see, I have made the leading separator optional, defaulted it to . and provided a way to set a default in the case of a KeyError. The first example now becomes:

>>> import json
>>> doc = NestedDict(json.loads('{"alpha": {"beta": {"gamma": 1}}}'))
>>> doc.get('alpha.beta.gamma')
1
>>> doc.get('alpha.beta.delta')
None
>>> doc.get('one.two.three')
None

Job done!

One thing I could do is add a quiet=True|False flag to control the KeyError, as with the current setup it’s not possible to tell the difference between the returned value form nested dictionary being None and the default reposnse being None.