Tell me if this looks familiar.
some_dict = {
    "data" : [
        { "user" : { "name" : "Joshua Kehn" } }
    ]
}
name = some_dict["data"][0]["user"]["name"]
print("Gee, getting {name} was difficult!".format(name=name))
#=> Gee, getting Joshua Kehn was difficult!
Typically some_dict is found in a response from a REST API whose designers thought it was a fabulous idea to shrink-wrap everything in multiple objects and/or arrays[1]. In itself this isn’t a huge problem: you just remember where everything is laid out for every request and pray the service provider is consistent. The problem comes when the returned data is missing crucial stepping points. What happens if the inner array in the example above is empty?
>>> some_dict = { "data" : [] }
>>> name = some_dict["data"][0]["user"]["name"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
An exception, how nice. Right in the middle of our production application too! There are several options for handling this, some more sane than others.
1. Trust that the data will always be provided in the correct format.
2. Validate everything you receive.
3. Perform validation each time you touch data.
Option 1 is just stupid: you can’t trust anything you don’t control. Option 2 is probably the most sane, but it isn’t always the easiest to do, especially when you have ad-hoc requests for more data in the middle of your application logic. This leaves us with option 3, and it’s messy.
>>> some_dict = { "data" : [] }
>>> data = some_dict.get("data")
>>> if data and len(data) > 0:
...     first = data[0]
...     if first:
...         user = first.get("user")
...         if user:
...             name = user.get("name")
...
>>> name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'name' is not defined
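For comparison, option 2 would do this checking once, at the point where the response arrives, so the rest of the application can index into it freely. The sketch below is only an illustration under assumptions: validate_response is a hypothetical helper (not part of any library), and the response shape is assumed to be the one in the example above.

```python
def validate_response(payload):
    """Return every user name in ``payload`` or raise ValueError.

    Hypothetical helper: it assumes the {"data": [{"user": {"name": ...}}]}
    shape used in the examples above.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be a dict")
    data = payload.get("data")
    if not isinstance(data, list) or not data:
        raise ValueError("'data' must be a non-empty list")
    names = []
    for i, item in enumerate(data):
        # Each step falls back to None when the previous one was malformed.
        user = item.get("user") if isinstance(item, dict) else None
        name = user.get("name") if isinstance(user, dict) else None
        if not isinstance(name, str):
            raise ValueError("data[%d].user.name is missing" % i)
        names.append(name)
    return names


names = validate_response({"data": [{"user": {"name": "Joshua Kehn"}}]})
```

This works well when every request passes through one entry point. It gets awkward with the ad-hoc lookups described above, because each new field needs its own validation rule.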
Here’s where I would think about monads.
Then I looked at monads in Python and thought better of it.[2]
To solve this quickly I wrote a little function. It’s not super elegant, and it misses out on some of the great syntax of a monad solution, but it works so well that I don’t miss anything better.
def tree_get(obj, *args):
    val = obj
    for arg in args:
        if val is None:
            return None
        # Allow filtering functions to be executed against ``val``.
        if callable(arg):
            val = arg(val)
        # Treat ``arg`` as a key.
        elif isinstance(val, dict):
            val = val.get(arg, None)
        # Treat ``arg`` as an index.
        elif isinstance(val, (list, tuple)):
            try:
                val = val[arg]
            # IndexError: out of range. TypeError: ``arg`` is not an integer.
            except (IndexError, TypeError):
                return None
        # Treat ``arg`` as an attribute name on an arbitrary object.
        elif isinstance(val, object):
            val = getattr(val, arg, None)
        else:
            # ``val`` is something we can't operate on
            return None
        if val is None:
            return None
    return val
How about usage?
>>> some_dict = {
...     "data" : [
...         { "user" : { "name" : "Joshua Kehn" } }
...     ]
... }
>>> print(tree_get(some_dict, "data", 0, "user", "name"))
Joshua Kehn
>>> some_dict = { "data" : [] }
>>> print(tree_get(some_dict, "data", 0, "user", "name"))
None
Very straightforward to use, and it suppresses every dict/list/tuple error I can think of. As a special treat, the if callable(arg): branch allows lambda or filtering functions to be stacked in place.
>>> some_dict = { "data" : [ { "user" : { "name" : "Joshua Kehn" } } ] }
>>> tree_get(some_dict, "data", 0, "user", "name", lambda n: n.split(" "), 0)
'Joshua'
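The attribute branch of the function is not exercised by the examples above, so here is a small sketch that mixes dict keys, list indexes, attributes, and a filter function in one call. The User class is a hypothetical stand-in for whatever object a response might be parsed into, and tree_get is repeated in condensed form only so the snippet runs on its own.

```python
def tree_get(obj, *args):
    """Condensed copy of tree_get, repeated so this snippet is self-contained."""
    val = obj
    for arg in args:
        if val is None:
            return None
        if callable(arg):
            val = arg(val)
        elif isinstance(val, dict):
            val = val.get(arg)
        elif isinstance(val, (list, tuple)):
            try:
                val = val[arg]
            except (IndexError, TypeError):
                # Out-of-range index, or an index that is not an integer.
                return None
        else:
            # Anything else is treated as an object with attributes.
            val = getattr(val, arg, None)
    return val


class User(object):
    """Hypothetical stand-in for a parsed response object; not from the post."""
    def __init__(self, name):
        self.name = name


response = {"data": [User("Joshua Kehn")]}

# dict key -> list index -> attribute -> filter function -> list index
first_name = tree_get(response, "data", 0, "name", lambda n: n.split(" "), 0)
# A missing attribute short-circuits to None instead of raising AttributeError.
email = tree_get(response, "data", 0, "email")
# So does an index past the end of the list.
second = tree_get(response, "data", 1, "name")
```

Here first_name is 'Joshua', while email and second are both None. One caveat: a filter function is called with whatever value the path has reached, so any exception raised inside it still propagates to the caller.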