Rewriting Python Docstrings

... with a Metaclass

Jess Hamrick (@jhamrick)

http://www.jesshamrick.com

San Francisco Python Meetup

November 13th, 2013

Why metaclasses?

There are things you cannot do from inside an ordinary class definition, such as changing how the class itself is built.

Metaclasses let you do these things!

The Problem

Inherited docstrings aren't particularly informative:

In [1]:
class A(object):
    def my_func(self):
        """Do some stuff for class A."""
        pass

class B(A):
    pass

print A().my_func.__doc__
print B().my_func.__doc__ # this is doing stuff for class B!
Do some stuff for class A.
Do some stuff for class A.

More specifically...

The nose testing framework will print out the docstrings of test methods as it runs them.

Unfortunately, if you have a test suite class that inherits from another class, you won't be able to tell when it's running methods from the parent class vs. the subclass.

The Simple Solution

Just manually include information in the docstrings:

In [2]:
class A(object):
    def my_func(self):
        """A: Do some stuff."""
        pass

class B(A):
    def my_func(self):
        """B: Do some stuff."""
        super(B, self).my_func()

print A().my_func.__doc__
print B().my_func.__doc__
A: Do some stuff.
B: Do some stuff.

But that's a lot of work if you have many subclasses and/or many methods.

A Better Solution?

"Aha!", one might say. "I will just edit the docstrings in the __init__ of the superclass!"

In [3]:
class A(object):
    def __init__(self):
        old_doc = self.my_func.__doc__
        cls_name = type(self).__name__
        self.my_func.__doc__ = "%s: %s" % (cls_name, old_doc)
        
    def my_func(self):
        """Do some stuff."""
        
class B(A):
    pass

Unfortunately, method docstrings aren't writable:

In [4]:
print A().my_func.__doc__
print B().my_func.__doc__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-ddf68abe1a40> in <module>()
----> 1 print A().my_func.__doc__
      2 print B().my_func.__doc__

<ipython-input-3-d4d1b624ac26> in __init__(self)
      3         old_doc = self.my_func.__doc__
      4         cls_name = type(self).__name__
----> 5         self.my_func.__doc__ = "%s: %s" % (cls_name, old_doc)
      6 
      7     def my_func(self):

AttributeError: attribute '__doc__' of 'instancemethod' objects is not writable

Note: Function docstrings, in general, are writable -- it's just method docstrings that aren't.
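The difference can be checked directly. This is a small sketch written to run on both Python 2.7 and Python 3 (where the bound-method type is called `method` rather than `instancemethod`); the variable `writable` is only there for illustration.

```python
class A(object):
    def my_func(self):
        """Do some stuff."""

a = A()

# Assigning to the bound method's docstring raises AttributeError...
try:
    a.my_func.__doc__ = "A: Do some stuff."
    writable = True
except AttributeError:
    writable = False

# ...but the underlying function object accepts the assignment.
a.my_func.__func__.__doc__ = "A: Do some stuff."

print(writable)
print(a.my_func.__doc__)
```

Note that writing through `__func__` changes the one function object shared by A and all of its subclasses, so on its own it does not give each subclass a different docstring.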

So, is there any way to change the function's docstring before it becomes a method?

Taking a step back: what is a class?

A class is a special kind of object which creates new objects called instances.

A class is kind of like a form (e.g., tax form 1040).

An instance is kind of like your specific copy of the form.

type will tell us the class of an instance:

In [5]:
class A(object):
    def my_func(self):
        """Do some stuff."""
        pass
    
a_inst = A()
print "Instance `a_inst` has type:", type(a_inst).__name__
Instance `a_inst` has type: A

Remember: everything in Python is an object!

So, classes have types, too:

In [6]:
print "Class `A` has type:", type(A).__name__
Class `A` has type: type

In other words, classes are generated by a special type called type.

(Yes, the terminology is a bit confusing.)

A word on type

The type object actually does a few different things:

  1. It denotes a type of object (the type of classes, specifically).
  2. It tells you what type an object is.
  3. It can create new classes.
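A quick sketch of the three roles, one line each. It should run on Python 2.7 and Python 3; the class names `A` and `C` and the attribute `x` are placeholders.

```python
class A(object):
    pass

a_inst = A()

# 1. `type` is itself a type: every class is an instance of it.
print(isinstance(A, type))

# 2. Called with one argument, `type` reports the type of an object.
print(type(a_inst) is A)

# 3. Called with three arguments, `type` builds a brand-new class.
C = type('C', (A,), {'x': 1})
print(C.__name__)
print(issubclass(C, A))
```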

Creating a class on the fly

This is the type of class declaration you're used to:

In [7]:
class A(object):
    def my_func(self):
        """Do some stuff."""
        pass

But we can also use the type type to create new classes on demand:

In [8]:
def my_func(self):
    """Do some stuff."""
    pass

A_name = 'A'
A_parents = (object,)
A_methods = {'my_func': my_func}
A = type(A_name, A_parents, A_methods)

Modifying the docstring, take II

Let's try creating our new class programmatically.

This way, we can modify the function's docstring before it becomes a method:

In [9]:
def my_func(self):
    """Do some stuff."""
    pass
In [10]:
def make_class(name, parents, methods):
    """Create a new class and prefix its methods' docstrings to
    include the class name."""
    for f in methods:
        methods[f].__doc__ = "%s: %s" % (name, methods[f].__doc__)
    cls = type(name, parents, methods)
    return cls
In [11]:
A = make_class('A', (object,), {'my_func': my_func})
print A().my_func.__doc__

B = make_class('B', (A,), {'my_func': my_func})
print B().my_func.__doc__
A: Do some stuff.
B: A: Do some stuff.

Oops, that wasn't what we wanted! What happened?

Both calls to make_class modified the docstring of the same function object in memory.

A and B do not have two separate functions; they both point to the same one:

In [12]:
print A.my_func.__func__ is B.my_func.__func__
print my_func.__doc__
True
B: A: Do some stuff.

Creating functions on the fly

Luckily, we can programmatically create functions using the function type, too!

In [13]:
def my_func(self):
    """Do some stuff."""
    pass
In [14]:
def copy_function(f):
    """Create a new function in memory that is a duplicate of `f`."""
    func_type = type(f)
    new_func = func_type(
        f.func_code,      # bytecode
        f.func_globals,   # global namespace
        f.func_name,      # function name
        f.func_defaults,  # default argument values
        f.func_closure)   # closure variables
    new_func.__doc__ = f.__doc__
    return new_func
In [15]:
my_new_func = copy_function(my_func)
my_new_func.__doc__ = "modified: %s" % my_func.__doc__

print my_func.__doc__
print my_new_func.__doc__
Do some stuff.
modified: Do some stuff.
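The `func_*` attribute names above only exist in Python 2. As a sketch of a version that should run on Python 2.7 and Python 3 alike, the same copy can be written with the double-underscore attribute names and `types.FunctionType`. It is deliberately minimal: it does not copy keyword-only defaults, annotations, or the function's `__dict__`.

```python
import types

def copy_function(f):
    """Return a new function object that shares its code with `f`."""
    new_func = types.FunctionType(
        f.__code__,       # compiled bytecode
        f.__globals__,    # global namespace
        f.__name__,       # function name
        f.__defaults__,   # default argument values
        f.__closure__)    # closure cells
    new_func.__doc__ = f.__doc__
    return new_func

def my_func(self):
    """Do some stuff."""

my_new_func = copy_function(my_func)
my_new_func.__doc__ = "modified: %s" % my_func.__doc__

print(my_func.__doc__)
print(my_new_func.__doc__)
```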

Modifying the docstring, take III

Let's update our make_class function to copy the methods before changing their docstrings:

In [16]:
def my_func(self):
    """Do some stuff."""
    pass
In [17]:
def make_class(name, parents, methods):
    """Create a new class and prefix its methods' docstrings to
    include the class name."""
    for f in methods:
        # copy the function, overwrite the docstring, and replace the old method
        new_func = copy_function(methods[f])
        new_func.__doc__ = "%s: %s" % (name, methods[f].__doc__)
        methods[f] = new_func
    cls = type(name, parents, methods)
    return cls
In [18]:
# Now it works!

A = make_class('A', (object,), {'my_func': my_func})
B = make_class('B', (A,), {'my_func': my_func})

print A().my_func.__doc__
print B().my_func.__doc__
A: Do some stuff.
B: Do some stuff.

Hey, weren't we supposed to be learning about metaclasses?

Actually, we were! A metaclass is any callable that takes parameters for:

  1. the class name
  2. the class's bases (parent classes)
  3. the class's attributes (methods and variables)

The type type we were using before is just the default metaclass.

The function make_class is technically a metaclass, too!

  1. It takes three arguments for the class's name, bases, and attributes.
  2. It modifies the attributes by creating copies of the functions and editing their docstrings.
  3. It creates a new class using these modified attributes.
  4. It returns the new class.

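However, when Python processes a class statement, the attribute dictionary it hands to the metaclass contains more than just our methods: it also holds entries such as __module__ and any class variables.

We need to modify our make_class function to skip these other attributes (special names and non-functions):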

In [19]:
import types

def make_class(name, parents, attrs):
    """Create a new class and prefix its methods' docstrings to
    include the class name."""
    for a in attrs:
        # skip special methods and anything that is not a plain function
        # (copy_function only works on function objects)
        if a.startswith("__") or not isinstance(attrs[a], types.FunctionType):
            continue

        # copy the function, overwrite the docstring, and replace the old method
        new_func = copy_function(attrs[a])
        new_func.__doc__ = "%s: %s" % (name, attrs[a].__doc__)
        attrs[a] = new_func

    cls = type(name, parents, attrs)
    return cls

Now, all we need is a little special "syntactic sugar" in our class definition, and it works!

In [20]:
class A(object):
    __metaclass__ = make_class
    
    def my_func(self):
        """Do some stuff."""
        pass
    
print A().my_func.__doc__
A: Do some stuff.

Note that the __metaclass__ attribute is Python 2 syntax. In Python 3, the metaclass is passed as a keyword argument in the class statement instead: class A(metaclass=make_class).
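As a rough illustration of the Python 3 spelling, the sketch below repeats make_class in a self-contained form and applies it with the `metaclass` keyword argument. It uses the double-underscore function attributes (`__code__`, `__globals__`, and so on) because the `func_*` names do not exist in Python 3, and the `isinstance` check against `types.FunctionType` is one possible way to skip non-functions.

```python
import types

def make_class(name, parents, attrs):
    """Prefix the docstring of every public method with the class name."""
    for key, value in list(attrs.items()):
        # skip special names and anything that is not a plain function
        if key.startswith("__") or not isinstance(value, types.FunctionType):
            continue
        # copy the function so the original object is left untouched
        new_func = types.FunctionType(
            value.__code__, value.__globals__, value.__name__,
            value.__defaults__, value.__closure__)
        new_func.__doc__ = "%s: %s" % (name, value.__doc__)
        attrs[key] = new_func
    return type(name, parents, attrs)

# Python 3 only: the metaclass is a keyword argument of the class statement
class A(metaclass=make_class):
    def my_func(self):
        """Do some stuff."""

print(A().my_func.__doc__)
```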

So, what exactly did we do?

(i.e., getting meta about metaclasses)

Metaclasses intervene in class (not instance) creation.

This gives us an opportunity to modify the class's methods before the class is actually created:

  1. Copy each of the functions that will later become methods.
  2. Change the docstrings of these new functions.
  3. Create the class using these new functions instead of the ones that were originally given.

A side note...

Subclasses still won't actually rewrite the docstring correctly:

In [21]:
class A(object):
    __metaclass__ = make_class
    
    def my_func(self):
        """Do some stuff."""
        pass
    
class B(A):
    pass
    
print A().my_func.__doc__
print B().my_func.__doc__
A: Do some stuff.
A: Do some stuff.

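This happens for two reasons. First, my_func is not passed in as an attribute of B (it is already an attribute of A). Second, make_class is a plain function that returns an ordinary class built by type, so the type of A is still type, and Python never calls make_class when it creates B.

To really make this work, the metaclass has to be invoked for subclasses as well (for example, by writing it as a subclass of type), and it has to go through all the attributes of all the parent classes and copy them, too.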
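One possible way to do this is sketched below for Python 3. It is not necessarily the approach taken in the blog post. The metaclass is written as a subclass of type so that subclasses inherit it, and it walks each class's MRO to find inherited functions. The names `DocPrefixMeta` and `_original_doc` are hypothetical; `_original_doc` stores the unprefixed docstring so that prefixes do not pile up (no "B: A: ...").

```python
import types

def copy_function(f):
    """Return a new function object that shares its code with `f`."""
    return types.FunctionType(
        f.__code__, f.__globals__, f.__name__,
        f.__defaults__, f.__closure__)

class DocPrefixMeta(type):
    """Give every class its own copies of all public methods,
    inherited ones included, with the class name prefixed."""

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        for attr_name in dir(cls):
            if attr_name.startswith("__"):
                continue
            # find the raw attribute on the first class in the MRO defining it
            for klass in cls.__mro__:
                if attr_name in vars(klass):
                    value = vars(klass)[attr_name]
                    break
            else:
                continue
            if not isinstance(value, types.FunctionType):
                continue
            # reuse the stored unprefixed docstring if this is already a copy
            original = getattr(value, "_original_doc", value.__doc__)
            new_func = copy_function(value)
            new_func._original_doc = original
            new_func.__doc__ = "%s: %s" % (name, original)
            setattr(cls, attr_name, new_func)

class A(metaclass=DocPrefixMeta):
    def my_func(self):
        """Do some stuff."""

class B(A):
    pass

print(A().my_func.__doc__)
print(B().my_func.__doc__)
```

Because type(A) is now DocPrefixMeta rather than type, the metaclass runs again when B is created, which is what the function-based make_class could not do.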

My blog post (link) goes into this in more detail and includes the full code.

What else are metaclasses good for?

Django uses metaclasses to simplify its interface:

In []:
class Person(models.Model):
    name = models.CharField(max_length=30)
    age = models.IntegerField()
In []:
p = Person(name='Jess', age=24)
print(p.age) # this gives an int, not an IntegerField!

(Source: Classes as objects on StackOverflow)
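To give a feel for how a declarative interface like this can be built, here is a toy sketch in Python 3. It is not Django's implementation: `Field`, `ModelMeta`, and `Model` are hypothetical stand-ins, and the only idea illustrated is that a metaclass can move field declarations out of the class body before the class is created, so that instances end up holding plain values.

```python
class Field(object):
    """Hypothetical stand-in for a declarative field such as IntegerField."""

class ModelMeta(type):
    def __new__(mcls, name, bases, attrs):
        # move Field declarations out of the class body into a side table
        field_names = [k for k, v in attrs.items() if isinstance(v, Field)]
        attrs["_fields"] = dict((k, attrs.pop(k)) for k in field_names)
        return super().__new__(mcls, name, bases, attrs)

class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            if key not in self._fields:
                raise TypeError("unexpected field: %r" % key)
            # store the plain value on the instance
            setattr(self, key, value)

class Person(Model):
    name = Field()
    age = Field()

p = Person(name="Jess", age=24)
print(type(p.age).__name__)
print(hasattr(Person, "age"))
```

The class body reads like a schema, while the metaclass quietly turns it into bookkeeping data (`_fields`) that the base class can use.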

Beware!

Metaclasses can make code incredibly difficult to understand.

Only use them when you really need them!

In the words of Tim Peters:

"Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't (the people who actually need them know with certainty that they need them, and don't need an explanation about why)."

... unless you're like me, and you enjoy learning about obscure parts of Python.

But really, if you're writing code for anything anyone else will ever use, this is good advice.

Thanks!

Details are available on my website, http://www.jesshamrick.com/ (see also the first reference below). I'll be posting these slides for reference, too.

This presentation was created with the lovely IPython Notebook, using the ipython nbconvert subcommand to convert the notebook into reveal.js slides.

References