Python Code Documentation

Many coders are aware of Python documentation methods and of converting Python code to documentation. I’m sure you’ve encountered the old saying that a picture is worth a thousand words. High-quality programs document their code. Python lets developers turn code into documentation that helps others understand how the code is structured and how it works.

Documenting Python Code: A Complete Guide tells you everything you need to know to document your Python code. You will learn how to document both the API and your project, how to use the standard Python tools for working with those docs, and how to write them yourself. Documentation is an important part of software development, but it can be boring and hard. This guide makes it easier.

Learning how to code can be the best choice you’ll ever make. However, if you want your code to communicate with other programmers, you need to learn how to document it properly. Python code documentation is a method of using spacing, formatting and commenting in your Python code to create easily readable documentation. If you have any questions about Python documentation, please feel free to ask in the comment section!

Python is now a popular choice for many projects. It’s readable, concise, and not overly verbose. You can also use your Python code documentation to keep track of changes in your project.

Where Do I Start?

Project documentation follows a simple progression:

  1. No Documentation
  2. Some Documentation
  3. Complete Documentation
  4. Good Documentation
  5. Great Documentation

If you’re at a loss about where to go next with your documentation, look at where your project is now in relation to the progression above. Do you have any documentation? If not, then start there. If you have some documentation but are missing some of the key project files, get started by adding those.

pdoc3 — the semi-automatic solution

What does it do?

Python package pdoc provides types, functions, and a command-line interface for accessing public documentation of Python modules, and for presenting it in a user-friendly, industry-standard open format […]

pdoc extracts documentation of:

– modules (including submodules),

– functions (including methods, properties, coroutines …),

– classes, and

– variables (including globals, class variables, and instance variables)

pdoc only extracts public API documentation ([…] if their identifiers don’t begin with an underscore ‘_’)
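To make the underscore rule concrete, here is a minimal sketch of the same naming convention using only the standard library. The helper public_names is hypothetical and is not part of pdoc; pdoc's real discovery logic is more involved (for example, it also finds instance variables by reading the source of __init__).

```python
class NoDocStrings:
    class_variable = 2

    def __init__(self):
        self.instance_variable = 3

    def foo(self):
        pass

    def _private_method(self):
        pass

    def __name_mangled_method(self):
        pass


def public_names(obj):
    """Return the attribute names defined on obj that do not start with '_'.

    Hypothetical helper for illustration only; it mimics the naming
    convention, not pdoc's actual implementation.
    """
    return sorted(name for name in vars(obj) if not name.startswith("_"))


# Dunder, private and name-mangled names all start with an underscore,
# so only the public class variable and the public method remain.
print(public_names(NoDocStrings))  # ['class_variable', 'foo']
```

Name mangling stores __name_mangled_method as _NoDocStrings__name_mangled_method, which is why it is filtered out by the same check.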

So pdoc takes your code (modules, functions/methods, classes, variables) and creates browsable (HTML/plaintext) documentation. It’s semi-automatic because it uses your code to create the main docs, and it adds more useful information if you have docstrings.

Other features:

  • Docstrings for objects can be disabled, overridden, or whitelisted with a special module-level dictionary __pdoc__
  • Supports multiple docstring formats: pure Markdown (with extensions), numpydoc, Google-style and some reST directives
  • LaTeX math syntax is supported when placed between recognized delimiters
  • Linking to other identifiers in your modules
  • Programmatic usage — control pdoc using Python
  • Custom templates – override the built-in HTML/CSS or plaintext
  • With CLI params you can: change the output directory, omit the source code preview, target documentation at specific modules, filter identifiers that will be documented
  • Create output formatted in Markdown-Extra, compatible with most Markdown-(to-HTML-)to-PDF converters
  • Local HTTP server (*it was throwing exceptions for me)
  • Requires Python 3.5+
  • License GNU AGPL-3.0 (*make sure you double-check how you use pdoc3 in a commercial product)

Installation:

pip install pdoc3 

Usage (inside your Python project):

pdoc --html .

This will create a directory called html containing another directory (named the same way as your project directory), and inside it you will find .html files with your Python modules documented. Here is the output of pdoc run on my example Python code:

The blue “html” directory is the output of running “pdoc”.

The index.html file:

“pdoc” index.html file opened in a browser.

We have all our modules indexed on the left. It is important to notice that pdoc also documented code without docstrings. This is a big advantage: you don’t need docstrings to benefit from pdoc.

No docstring code:

module_variable = 1

class NoDocStrings:
    class_variable = 2

    def __init__(self):
        self.instance_variable = 3

    def foo(self):
        pass

    def _private_method(self):
        pass

    def __name_mangled_method(self):
        pass


def module_function():
    pass

The result:

Code without docstrings is also indexed by “pdoc”!

A class with docstrings:

class Foo:
    """
    This is a docstring of class Foo
    """
    class_variable = 3
    """This is a docstring for class_variable"""

    def __init__(self):
        self.instance_var_1 = 1
        """This is a docstring for instance_var_1"""
        self.instance_var_2 = 2

    def foo_method(self):
        """
        This is a docstring for foo_method.
        :param self:
        :return:
        """

    def bar_method(self):
        """
        This is a docstring for bar_method.
        :return:
        """

    def _private_method(self):
        """
        This is a docstring for _private_method
        :return:
        """

    def __name_mangled_method(self):
        """
        This is a docstring for __name_mangled_method
        :return:
        """

The result:

A class which has no docstrings, but inherits from a class with a docstring:

class InheritedFoo(Foo):

    def foo_method(self):
        pass

    def bar_method(self):
        """This is an overwritten docstring for bar_method"""
        pass

The result:

The bar_method docstring was overwritten.
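The same inherit-or-override behavior can be observed with the standard library alone. The sketch below uses trimmed-down versions of Foo and InheritedFoo and relies on inspect.getdoc, which falls back to the parent class when a method has no docstring of its own. It only illustrates the idea; it does not show how pdoc resolves inherited docstrings internally.

```python
import inspect


class Foo:
    def foo_method(self):
        """This is a docstring for foo_method."""

    def bar_method(self):
        """This is a docstring for bar_method."""


class InheritedFoo(Foo):
    def foo_method(self):
        pass

    def bar_method(self):
        """This is an overwritten docstring for bar_method"""


child = InheritedFoo()

# The overriding method has no docstring of its own...
print(child.foo_method.__doc__)          # None
# ...but inspect.getdoc() walks the class hierarchy to find one.
print(inspect.getdoc(child.foo_method))  # This is a docstring for foo_method.
# A docstring written on the subclass method wins over the parent's.
print(inspect.getdoc(child.bar_method))  # This is an overwritten docstring for bar_method
```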

Private and name-mangled methods are not documented but you can see them when you click “Expand source code”:

Nested classes:

class Baz:
    """
    This is a docstring for class Baz
    """

    class BazInner:
        """
        This is a docstring for BazInner
        """

The result:

Inner class “BazInner” was indexed as a variable. I wish pdoc would also indicate that it is a class 😉

Module-level variables and functions:

"""
This is a docstring for module_c
"""

module_variable = 100
"""
This is a docstring for module_variable
"""

def module_function():
    """
    This is a docstring for module_function
    :return:
    """
    function_variable = 10
    """
    This is a docstring for function_variable
    """

def _private_module_function():
    """
    This is a docstring for _private_module_function
    :return:
    """

def __name_mangled_function():
    """
    This is a docstring for __name_mangled_function
    :return:
    """

The result:

You can see more examples on pdoc3 docs page — they documented their own code with pdoc 🙂
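One detail worth knowing about the variable docstrings used in the examples above: Python itself does not attach a string that follows an assignment to the variable, so a tool has to read it from the source code. The sketch below shows one way this could be done with the standard ast module for module-level variables. The function variable_docstrings is hypothetical, and pdoc's own implementation may work differently.

```python
import ast

# A small module given as text; the string after the assignment is the
# "variable docstring" convention used in the examples above.
SOURCE = '''
module_variable = 100
"""
This is a docstring for module_variable
"""

undocumented_variable = 1
'''


def variable_docstrings(source):
    """Map module-level variable names to the string literal that follows them.

    Hypothetical helper for illustration; handles only simple
    `name = value` assignments at module level.
    """
    body = ast.parse(source).body
    docs = {}
    # Look at each statement together with the statement right after it.
    for node, following in zip(body, body[1:]):
        is_simple_assignment = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        is_string_literal = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if is_simple_assignment and is_string_literal:
            docs[node.targets[0].id] = following.value.value.strip()
    return docs


print(variable_docstrings(SOURCE))
# {'module_variable': 'This is a docstring for module_variable'}
```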

MkDocs — the manual solution

What does it do?

MkDocs is a fast, simple and downright gorgeous static site generator that’s geared towards building project documentation. Documentation source files are written in Markdown, and configured with a single YAML configuration file.

Of the tools I’m describing, this one is the least automatic: it only autogenerates a nice-looking documentation website. All of the content is created manually.

Installation:

pip install mkdocs

Usage:

mkdocs new mkdocs_test

Result:

  • mkdocs.yml — configuration
  • index.md — the default docs page

To run the dev server:

mkdocs serve

And go to http://127.0.0.1:8000 (by default)

The server will auto-reload the page whenever you change the configuration or documented pages.

Adding a new page:

Create a .md file in the docs/ directory and link it in the nav section of the configuration file:

nav:
- Home: index.md
- About: about.md

You also get “search”, “previous”, “next” buttons for free.

Changing the theme is as easy as (in the config file):

theme: readthedocs

Building the site (in cli):

mkdocs build

This will create a static HTML site in the site directory.

Deploying:

The documentation site that you just built only uses static files so you’ll be able to host it from pretty much anywhere. GitHub project pages and Amazon S3 may be good hosting options, depending upon your needs.

Alternatives:

What other tools are available in the Python ecosystem that help with documentation:

  • The official Python documentation pages use reStructuredText (as markup language) and Sphinx (*I find Markdown a bit simpler than rST but it’s a personal choice)
  • Doxygen —generates documentation from annotated sources
  • Portray — Python3 command-line tool and library that helps you create great documentation websites for your Python projects with as little effort as possible
  • Pycco — Python port of Docco: the original quick-and-dirty, hundred-line-long, literate-programming-style documentation generator. It produces HTML that displays your comments alongside your code.

If you are creating an API then Swagger-UI is a must.

With very little effort you can create module/class/function documentation using pdoc3. If the developers write docstrings then you will benefit even more.

Writing manual documentation takes more time, but things like the architecture overview, installation, etc. should be (at least briefly) described. MkDocs makes it easy to create simple and beautiful documentation.

Just remember that having some documentation is not an excuse for creating bad code. Self-documenting code is an absolute priority.

reStructuredText

Most Python documentation is written with reStructuredText. It’s like Markdown, but with all the optional extensions built in.

The reStructuredText Primer and the reStructuredText Quick Reference should help you familiarize yourself with its syntax.

Code Documentation Advice

Comments clarify the code, and they are added to make the code easier to understand. In Python, comments begin with a hash (number sign) (#).

In Python, docstrings describe modules, classes, and functions:

def square_and_rooter(x):
    """Return the square root of self times self."""
    ...

In general, follow the comments section of PEP 8 (the “Python Style Guide”). More information about docstrings can be found in PEP 257 (the Docstring Conventions guide).

Commenting Sections of Code

Do not use triple-quote strings to comment code. This is not a good practice, because line-oriented command-line tools such as grep will not be aware that the commented code is inactive. It is better to add hashes at the proper indentation level for every commented line. Your editor probably has the ability to do this easily, and it is worth learning the comment/uncomment toggle.

Docstrings and Magic

Some tools use docstrings to embed more-than-documentation behavior, such as unit test logic. Those can be nice, but you won’t ever go wrong with vanilla “here’s what this does.”

Tools like Sphinx will parse your docstrings as reStructuredText and render it correctly as HTML. This makes it very easy to embed snippets of example code in a project’s documentation.

Additionally, Doctest will read all embedded docstrings that look like input from the Python commandline (prefixed with “>>>”) and run them, checking to see if the output of the command matches the text on the following line. This allows developers to embed real examples and usage of functions alongside their source code. As a side effect, it also ensures that their code is tested and works.

def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b
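To see the check in action, the examples in a docstring can be run with the standard doctest module. A common way is the command line, python -m doctest -v your_module.py, where your_module.py stands for your own file. The sketch below does the same thing programmatically for a single function; check_examples is a hypothetical helper name, not part of doctest.

```python
import doctest


def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b


def check_examples(func):
    """Run the doctest examples found in func's docstring.

    Hypothetical helper for illustration. Returns a result object with
    `failed` and `attempted` counts.
    """
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples need to be able to look up the function by name.
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = check_examples(my_function)
print(results.attempted, results.failed)  # 2 0
```

If the implementation changed so that an example no longer matched, doctest would print the expected and actual output and the failed count would be non-zero.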

Docstrings versus Block comments

These aren’t interchangeable. For a function or class, the leading comment block is a programmer’s note. The docstring describes the operation of the function or class:

# This function slows down program execution for some reason.
def square_and_rooter(x):
    """Returns the square root of self times self."""
    ...

Unlike block comments, docstrings are built into the Python language itself. This means you can use all of Python’s powerful introspection capabilities to access docstrings at runtime, whereas comments are discarded when the code is compiled. Docstrings are accessible through the __doc__ dunder attribute of almost every Python object, as well as through the built-in help() function.
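A short sketch of what this runtime access looks like, reusing the function from the example above. The body is left as a placeholder because only the docstring matters here.

```python
import inspect


# This function slows down program execution for some reason.
def square_and_rooter(x):
    """Return the square root of self times self."""
    ...


# The docstring is stored on the function object itself.
print(square_and_rooter.__doc__)

# inspect.getdoc() returns the same text with indentation cleaned up,
# which matters for multi-line docstrings.
print(inspect.getdoc(square_and_rooter))

# The block comment above the function is not stored anywhere on the
# object. In an interactive session, help(square_and_rooter) shows the
# signature together with the docstring.
```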

While block comments are usually used to explain what a section of code is doing, or the specifics of an algorithm, docstrings are intended to explain to other users of your code (or to you in six months’ time) how a particular function can be used and the general purpose of a function, class, or module.

Writing Docstrings

Depending on the complexity of the function, method, or class being written, a one-line docstring may be perfectly appropriate. These are generally used for really obvious cases, such as:

def add(a, b):
    """Add two numbers and return the result."""
    return a + b

The docstring should describe the function in a way that is easy to understand. For simple cases like trivial functions and classes, simply embedding the function’s signature (i.e. add(a, b) -> result) in the docstring is unnecessary. This is because with Python’s inspect module, it is already quite easy to find this information if needed, and it is also readily available by reading the source code.
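For example, the signature of the add function above can be recovered with the inspect module, which is why repeating it in the docstring adds little. This is a minimal sketch using only the standard library.

```python
import inspect


def add(a, b):
    """Add two numbers and return the result."""
    return a + b


signature = inspect.signature(add)

print(signature)                   # (a, b)
print(list(signature.parameters))  # ['a', 'b']
```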

In larger or more complex projects however, it is often a good idea to give more information about a function, what it does, any exceptions it may raise, what it returns, or relevant details about the parameters.

For more detailed documentation of code, a popular style is the one used by the NumPy project, often called NumPy style docstrings. While it can take up more lines than the previous example, it allows the developer to include a lot more information about a method, function, or class.

def random_number_generator(arg1, arg2):
    """
    Summary line.

    Extended description of function.

    Parameters
    ----------
    arg1 : int
        Description of arg1
    arg2 : str
        Description of arg2

    Returns
    -------
    int
        Description of return value

    """
    return 42

The sphinx.ext.napoleon plugin allows Sphinx to parse this style of docstrings, making it easy to incorporate NumPy style docstrings into your project.
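Enabling it could look like the following fragment of a Sphinx conf.py. This is a sketch that assumes you also use autodoc to pull in the docstrings; the two napoleon_* options are optional and are shown only to make explicit which style is being parsed.

```python
# conf.py (fragment): a sketch, adjust to your own project.

extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings in from the code
    "sphinx.ext.napoleon",  # understands NumPy and Google style docstrings
]

# Both styles are parsed by default; here only NumPy style is kept on.
napoleon_numpy_docstring = True
napoleon_google_docstring = False
```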

At the end of the day, it doesn’t really matter what style is used for writing docstrings; their purpose is to serve as documentation for anyone who may need to read or make changes to your code. As long as it is correct, understandable, and gets the relevant points across then it has done the job it was designed to do.

For further reading on docstrings, feel free to consult PEP 257.

Other Tools

You might see these in the wild. Use Sphinx.

  • Pycco: a “literate-programming-style documentation generator” and a port of the node.js Docco. It turns code into side-by-side HTML code and documentation.
  • Ronn: builds Unix manuals. It converts human-readable text files to roff for terminal display, and also to HTML for the web.
  • Epydoc: discontinued. Use Sphinx instead.
  • MkDocs: a fast and simple static site generator that’s geared towards building project documentation with Markdown.

Reusing signatures and docstrings with autodoc

To use autodoc, first add it to the list of enabled extensions in docs/source/conf.py:

extensions = [
    'sphinx.ext.duration',
    'sphinx.ext.doctest',
    'sphinx.ext.autodoc',
]

Next, move the content of the .. py:function directive to the function docstring in the original Python file, lumache.py, as follows:

def get_random_ingredients(kind=None):
    """
    Return a list of random ingredients as strings.

    :param kind: Optional "kind" of ingredients.
    :type kind: list[str] or None
    :raise lumache.InvalidKindError: If the kind is invalid.
    :return: The ingredients list.
    :rtype: list[str]

    """
    return ["shells", "gorgonzola", "parsley"]

Finally, replace the .. py:function directive from the Sphinx documentation with autofunction in docs/source/usage.rst:

you can use the ``lumache.get_random_ingredients()`` function:

.. autofunction:: lumache.get_random_ingredients

If you now build the HTML documentation, the output will be the same, with the advantage that it is generated from the code itself. Sphinx took the reStructuredText from the docstring and included it, also generating proper cross-references.
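One practical point: autodoc imports your module to read its docstrings, so lumache has to be importable when Sphinx runs. If it is not installed in the environment, a common approach is to extend sys.path in conf.py. The sketch below assumes lumache.py sits two directories above docs/source, as the file paths in this tutorial suggest; adjust the relative path if your layout differs.

```python
# docs/source/conf.py (fragment): a sketch under the layout assumption above.
import os
import sys

# Sphinx executes conf.py with the working directory set to the folder
# that contains it (docs/source), so "../.." is the project root here.
PROJECT_ROOT = os.path.abspath(os.path.join("..", ".."))

# Put the project root first so that "import lumache" finds lumache.py.
sys.path.insert(0, PROJECT_ROOT)
```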

You can also autogenerate documentation from other objects. For example, add the code for the InvalidKindError exception to lumache.py:

class InvalidKindError(Exception):
    """Raised if the kind is invalid."""
    pass

And replace the .. py:exception directive with autoexception in docs/source/usage.rst, as follows:

or ``"veggies"``. Otherwise, :py:func:`lumache.get_random_ingredients`
will raise an exception.

.. autoexception:: lumache.InvalidKindError

And again, after running make html, the output will be the same as before.

Generating comprehensive API references

While using sphinx.ext.autodoc makes keeping the code and the documentation in sync much easier, it still requires you to write an auto* directive for every object you want to document. Sphinx provides yet another level of automation: the autosummary extension.

The autosummary directive generates documents that contain all the necessary autodoc directives. To use it, first enable the autosummary extension in docs/source/conf.py:

extensions = [
   'sphinx.ext.duration',
   'sphinx.ext.doctest',
   'sphinx.ext.autodoc',
   'sphinx.ext.autosummary',
]

Next, create a new file, docs/source/api.rst, with these contents:

API
===

.. autosummary::
   :toctree: generated

   lumache

Remember to include the new document in the root toctree in docs/source/index.rst:

Contents
--------

.. toctree::

   usage
   api

Finally, after you build the HTML documentation by running make html, it will contain two new pages:

  • api.html, corresponding to docs/source/api.rst and containing a table with the objects you included in the autosummary directive (in this case, only one).
  • generated/lumache.html, corresponding to a newly created reST file generated/lumache.rst and containing a summary of members of the module, in this case one function and one exception.
Summary page created by autosummary

Each of the links in the summary page will take you to the places where you originally used the corresponding autodoc directive, in this case in the usage.rst document.

Note

The generated files are based on Jinja2 templates that can be customized, but that is out of scope for this tutorial.

Conclusion

Generating HTML Documentation from a single source file can be more useful because it puts all of the relevant information in one place. Doxygen is the most popular solution for generating documentation from annotated C++ sources, but it also supports other popular programming languages such as C, Objective-C, C

Python is one of the most popular programming languages today, and it is widely used for applications and websites alike. It comes with a wide range of features and functionality. Even so, many people who write Python are not familiar with the tools available for documenting their code.
