Best Documentation Tools for Python

What is documentation? Documentation is a design element that explains how to use a product in an intuitive way. Good documentation makes code easier to understand and products easier to install. For example, it may explain how to build from source or how to run benchmarks, and it may include safety guidance and notes on maintenance and troubleshooting. So even if you are writing something only for yourself and will not share it with anyone else, it’s good practice to document your processes and results somewhere.

As a software project grows, it becomes increasingly challenging to provide good-quality documentation, and many open-source projects fall short here. To help developers and other contributors, you need to keep your development documentation up to date. Documentation tools are designed to help with that: they let you document your API more quickly and keep the docs close to the code as you develop and test it. In other words, these tools make your life easier as a developer.

Python is in high demand because of its ease of use, readability, and simplicity. With Python developers so sought after, several tools have emerged for documenting Python code.

Documentation is an essential part of software development and of becoming a good developer. You need to document your API so that other developers can easily understand how it works and how to use it, rather than spending time reading the source code. In this article, we will discuss some of the best documentation tools for Python.

Pros of creating good documentation:

  1. Increases information exchange between team members. This single reason is just so powerful!
  2. Decreases onboarding time of new members
  3. Helps to organize big projects (helps to see the big picture)
  4. Increases team member awareness of how the whole project is organized
  5. Increases development speed: finding information is faster and thus development is faster
  6. Promotes standards and consistency

Sphinx

By far the most recommended and comprehensive documentation generator. It supports reStructuredText in docstrings and produces HTML output with a clean visual style. Countless examples (including official Python libraries) can be found here: http://www.sphinx-doc.org/en/master/examples.html

About the only con I could find is that setting it up requires a bit of configuration (using Makefiles), and the getting-started documentation assumes you’re working with a fresh repo. You can also run it with a quickstart script that uses default configurations, but it still requires multiple steps. It works for Python 2 and 3 and loads docstrings dynamically through introspection.

pdoc

Probably the second-most popular Python-exclusive doc tool (Doxygen is more general), with 373 stars and 12 contributors at the time of writing. Its code is a fraction of Sphinx’s complexity and the output is not quite as polished, but it works with zero configuration in a single step. It also supports docstrings for variables through source code parsing; otherwise it uses introspection. Worth checking out if Sphinx is too complicated for your use case.

pydoctor

A successor to the popular epydoc. When this comparison was written it worked only for Python 2 (recent releases run on Python 3). The main benefit is that it traces inheritance particularly well, even across multiple interfaces. It works on static source and can pass the resulting object model to Sphinx if you prefer its output style. I actually prefer the clean look of pydoctor to Sphinx, however.

Doxygen

Not Python-exclusive, and its interface is crowded and ugly. It claims to be able to generate some documentation (mostly inheritance and dependency information) from undocumented source code. It should be considered because many teams already know this tool from its wide use in multiple languages (particularly C++).

Readability is a primary focus for Python developers, in both project and code documentation. Following some simple best practices can save both you and others a lot of time.

Project Documentation

A README file at the root directory should give general information to both users and maintainers of a project. It should be raw text or written in some very easy to read markup, such as reStructuredText or Markdown. It should contain a few lines explaining the purpose of the project or library (without assuming the user knows anything about the project), the URL of the main source for the software, and some basic credit information. This file is the main entry point for readers of the code.

An INSTALL file is less necessary with Python. The installation instructions are often reduced to one command, such as pip install module or python setup.py install, and added to the README file.

A LICENSE file should always be present and specify the license under which the software is made available to the public.

A TODO file or a TODO section in the README should list the planned development for the code.

A CHANGELOG file or section in the README should compile a short overview of the changes in the code base for the latest versions.

Project Publication

Depending on the project, your documentation might include some or all of the following components:

  • An introduction should give a very short overview of what can be done with the product, using one or two extremely simplified use cases. This is the thirty-second pitch for your project.
  • A tutorial should show some primary use cases in more detail. The reader will follow a step-by-step procedure to set up a working prototype.
  • An API reference is typically generated from the code (see docstrings). It will list all publicly available interfaces, parameters, and return values.
  • Developer documentation is intended for potential contributors. This can include code convention and general design strategy of the project.

reStructuredText

Most Python documentation is written with reStructuredText. It’s like Markdown, but with all the optional extensions built in.

The reStructuredText Primer and the reStructuredText Quick Reference should help you familiarize yourself with its syntax.

Code Documentation Advice

Comments clarify the code and are added with the purpose of making the code easier to understand. In Python, comments begin with a hash (number sign) (#).

In Python, docstrings describe modules, classes, and functions:

def square_and_rooter(x):
    """Return the square root of x times x."""
    ...

In general, follow the comments section of PEP 8 (the “Python Style Guide”). More information about docstrings can be found in the specification section of PEP 257 (the Docstring Conventions guide).

Commenting Sections of Code

Do not use triple-quote strings to comment code. This is not a good practice, because line-oriented command-line tools such as grep will not be aware that the commented code is inactive. It is better to add hashes at the proper indentation level for every commented line. Your editor probably has the ability to do this easily, and it is worth learning the comment/uncomment toggle.
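As a small illustration of the advice above, here is a sketch in which inactive lines are disabled with hashes at the same indentation as the surrounding code. The function name total_price and the discount logic are hypothetical, invented only for this example.

```python
def total_price(prices, tax_rate):
    """Return the sum of prices with tax applied."""
    subtotal = sum(prices)

    # Disabled while the (hypothetical) discount rules are reworked.
    # Each inactive line carries its own hash, so line-oriented tools
    # such as grep can tell that it is commented out.
    # discount = subtotal * 0.1
    # subtotal = subtotal - discount

    return subtotal * (1 + tax_rate)
```

Because every disabled line starts with a hash, a search for "discount" shows at a glance that none of the matches are live code.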

Docstrings and Magic

Some tools use docstrings to embed more-than-documentation behavior, such as unit test logic. Those can be nice, but you won’t ever go wrong with vanilla “here’s what this does.”

Tools like Sphinx will parse your docstrings as reStructuredText and render them correctly as HTML. This makes it very easy to embed snippets of example code in a project’s documentation.

Additionally, doctest will read all embedded docstrings that look like input from the Python command line (prefixed with “>>>”) and run them, checking to see if the output of the command matches the text on the following line. This allows developers to embed real examples and usage of functions alongside their source code. As a side effect, it also ensures that their code is tested and works.
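To make the Sphinx point concrete, here is a minimal sketch of a docstring written in reStructuredText. The function scale and its parameters are hypothetical; the :param:, :type:, :returns: and :rtype: fields and the code-block directive are standard Sphinx markup.

```python
def scale(values, factor):
    """Multiply every item in ``values`` by ``factor``.

    :param values: numbers to scale
    :type values: list[float]
    :param factor: multiplier applied to each item
    :type factor: float
    :returns: a new list with the scaled numbers
    :rtype: list[float]

    Example usage, which Sphinx can render as a highlighted snippet:

    .. code-block:: python

        scale([1.0, 2.0], 3.0)
    """
    # Build a new list so the caller's list is left unchanged.
    return [value * factor for value in values]
```

The docstring is still plain text to Python itself; the markup only takes effect when a tool such as Sphinx processes it.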

def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b
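One possible way to run these examples from Python is sketched below, using the standard library’s doctest module. The helper name run_examples is an assumption made for this sketch, and the one-line summary in the docstring is added for illustration.

```python
import doctest


def my_function(a, b):
    """Multiply a by b (for a string, repeat it b times).

    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b


def run_examples():
    """Run the docstring examples of my_function.

    Returns a (failed, attempted) pair of example counts.
    """
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Give the examples a namespace in which my_function is defined.
    tests = finder.find(my_function, "my_function",
                        globs={"my_function": my_function})
    for test in tests:
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted
```

For a whole file, the more common route is to run python -m doctest yourfile.py from the command line, which prints nothing when every example passes.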

Docstrings versus Block comments

These aren’t interchangeable. For a function or class, the leading comment block is a programmer’s note. The docstring describes the operation of the function or class:

# This function slows down program execution for some reason.
def square_and_rooter(x):
    """Return the square root of x times x."""
    ...

Unlike block comments, docstrings are built into the Python language itself. This means you can use all of Python’s powerful introspection capabilities to access docstrings at runtime, compared with comments which are optimized out. Docstrings are accessible from both the __doc__ dunder attribute for almost every Python object, as well as with the built in help() function.
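The difference can be seen in a short sketch. The function body here is an assumption (the original leaves it elided), and the docstring wording is a lightly adapted variant of the example above.

```python
import inspect


# This leading comment is a note for maintainers; it is not stored
# anywhere on the function object.
def square_and_rooter(x):
    """Return the square root of x times x."""
    return (x * x) ** 0.5


# The docstring is available at runtime through the __doc__ attribute.
raw_doc = square_and_rooter.__doc__

# inspect.getdoc() returns the same text with indentation cleaned up.
clean_doc = inspect.getdoc(square_and_rooter)

# help(square_and_rooter) would print the signature and the docstring.
```

Nothing equivalent exists for the comment: once the file is compiled, the comment text is gone.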

While block comments are usually used to explain what a section of code is doing, or the specifics of an algorithm, docstrings are intended to explain to other users of your code (or to you in six months’ time) how a particular function can be used and the general purpose of a function, class, or module.

Writing Docstrings

Depending on the complexity of the function, method, or class being written, a one-line docstring may be perfectly appropriate. These are generally used for really obvious cases, such as:

def add(a, b):
    """Add two numbers and return the result."""
    return a + b

The docstring should describe the function in a way that is easy to understand. For simple cases like trivial functions and classes, simply embedding the function’s signature (i.e. add(a, b) -> result) in the docstring is unnecessary. This is because with Python’s inspect module, it is already quite easy to find this information if needed, and it is also readily available by reading the source code.
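As a quick sketch of why repeating the signature is unnecessary, the inspect module can recover it directly from the function object:

```python
import inspect


def add(a, b):
    """Add two numbers and return the result."""
    return a + b


# inspect.signature() reads the parameters from the function itself,
# so the docstring does not need to restate them.
signature = inspect.signature(add)
parameter_names = list(signature.parameters)

print(f"add{signature}")  # prints: add(a, b)
```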

In larger or more complex projects however, it is often a good idea to give more information about a function, what it does, any exceptions it may raise, what it returns, or relevant details about the parameters.

For more detailed documentation of code, a popular style is the one used by the NumPy project, often called NumPy style docstrings. While it can take up more lines than the previous example, it allows the developer to include a lot more information about a method, function, or class.

def random_number_generator(arg1, arg2):
    """
    Summary line.

    Extended description of function.

    Parameters
    ----------
    arg1 : int
        Description of arg1
    arg2 : str
        Description of arg2

    Returns
    -------
    int
        Description of return value

    """
    return 42

The sphinx.ext.napoleon plugin allows Sphinx to parse this style of docstrings, making it easy to incorporate NumPy style docstrings into your project.
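Enabling the plugin is a small change to the Sphinx configuration file. The fragment below is a sketch of a conf.py, which Sphinx reads as ordinary Python; the project name is hypothetical, and sphinx.ext.autodoc is included on the assumption that the docstrings are pulled from importable modules.

```python
# conf.py (fragment)

# Hypothetical project name, used here only for illustration.
project = "example_project"

extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings out of importable modules
    "sphinx.ext.napoleon",  # converts NumPy and Google style sections to reST
]

# Napoleon parses both styles by default; these flags make the choice explicit.
napoleon_numpy_docstring = True
napoleon_google_docstring = False
```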

At the end of the day, it doesn’t really matter what style is used for writing docstrings; their purpose is to serve as documentation for anyone who may need to read or make changes to your code. As long as it is correct, understandable, and gets the relevant points across then it has done the job it was designed to do.

pdoc3 — the semi-automatic solution

What does it do?

Python package pdoc provides types, functions, and a command-line interface for accessing public documentation of Python modules, and for presenting it in a user-friendly, industry-standard open format […]

pdoc extracts documentation of:

– modules (including submodules),
– functions (including methods, properties, coroutines …),
– classes, and
– variables (including globals, class variables, and instance variables)

pdoc only extracts public API documentation ([…] if their identifiers don’t begin with an underscore ‘_’)

So pdoc takes your code (modules, functions/methods, classes, variables) and creates browsable (HTML/plaintext) documentation. It’s semi-automatic because it uses your code to create the main docs, but it will add more useful info if you have docstrings.

Other features:

  • Docstrings for objects can be disabled, overridden, or whitelisted with a special module-level dictionary __pdoc__
  • Supports multiple docstring formats: pure Markdown (with extensions), numpydoc, Google-style and some reST directives
  • LaTeX math syntax is supported when placed between recognized delimiters
  • Linking to other identifiers in your modules
  • Programmatic usage — control pdoc using Python
  • Custom templates – override the built-in HTML/CSS or plaintext
  • With CLI params you can: change the output directory, omit the source code preview, target documentation at specific modules, filter identifiers that will be documented
  • Create output formatted in Markdown-Extra, compatible with most Markdown-(to-HTML-)to-PDF converters
  • Local HTTP server (*it was throwing exceptions for me)
  • Requires Python 3.5+
  • License: GNU AGPL-3.0 (*make sure you double-check how you use pdoc3 in a commercial product)

Installation:

pip install pdoc3 

Usage (inside your Python project):

pdoc --html .

This will create a directory called html containing another directory (named after your project directory), and inside it you will find .html files documenting your Python modules.

[Screenshot: project tree after running pdoc on example Python code; the “html” directory is the generated output.]

Cons of creating documentation?

  1. Requires time (and money): sometimes a project can’t afford to spend time on documentation.
  2. It’s hard to keep it up to date, especially in startup projects with rapid changes.
  3. Creating documentation is not a “pleasant” activity for developers (compared to writing code): some developers don’t like writing documentation and will be demotivated when asked to do it.

Cons of bad documentation?

  1. “Out-of-date” documentation can lead to misunderstandings and slower development.
  2. Can get fragmented: it’s hard to maintain a single, consistent set of documentation.

Conclusion

Documentation is the bane of many developers’ existence. But documentation can also be a great help when you’re stuck and need to find out more. As a developer, you will typically work either in an agile or agile-like environment. Projects are moving quickly, users are getting frustrated because they can’t figure out how to accomplish their goals, and much blood is being spilt on the software development floor. This can be a massive headache for developers.

The fastest way to get new Python programmers up to speed on your code is through documentation. It’s convenient and accessible, especially with tools like Jupyter Notebooks. In many cases, you can use the same tools that the professionals use — even if you’re just starting out.
