Best Python Documentation Tools

Python was designed to reduce verbosity through clear syntax and simple data typing. Even so, Python code still needs reference documentation that is as clear and intuitive as possible. Good documentation helps a programmer grasp a new library or framework and start using it in a project quickly. There are many ways to produce documentation for Python code, and the right one often depends on your specific requirements.

A Python documentation tool is software that generates online documentation from Python files, typically by extracting docstrings from module source code. One approach is to start a web server and parse a module on the fly, with the file name passed in the URL.

Python documentation tools, pydoc, and docstrings are popular topics among Python developers. This page gives an overview of how to generate Python documentation and how to use pydoc, which ships with the standard library, to generate documentation for your own modules.
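As a minimal sketch of what pydoc does with a docstring, the snippet below renders the documentation for a small function as plain text. The function area is a hypothetical example, not something from a real library.

```python
import pydoc


def area(width, height):
    """Return the area of a rectangle."""
    return width * height


# render_doc() builds the same text that help() would display;
# plain() strips the terminal bold formatting from that text.
text = pydoc.plain(pydoc.render_doc(area))
print(text)
```

From the command line, python -m pydoc -p 8080 starts a local web server that serves documentation for importable modules, and python -m pydoc -w modulename writes an HTML page for one module.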

Readability is a primary focus for Python developers, in both project and code documentation. Following some simple best practices can save both you and others a lot of time.

Project Documentation

A README file at the root directory should give general information to both users and maintainers of a project. It should be raw text or written in some very easy to read markup, such as reStructuredText or Markdown. It should contain a few lines explaining the purpose of the project or library (without assuming the user knows anything about the project), the URL of the main source for the software, and some basic credit information. This file is the main entry point for readers of the code.

An INSTALL file is less necessary with Python. The installation instructions are often reduced to one command, such as pip install module or python setup.py install, and added to the README file.

A LICENSE file should always be present and specify the license under which the software is made available to the public.

A TODO file or a TODO section in README should list the planned development for the code.

A CHANGELOG file or section in README should compile a short overview of the changes in the code base for the latest versions.
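The file list above can be checked with a few lines of Python. This is only a sketch: the helper missing_project_files is hypothetical, it matches files by base name regardless of extension, and it cannot tell whether TODO or CHANGELOG content lives in a README section instead of its own file.

```python
from pathlib import Path

# Base names taken from the list above; the helper itself is hypothetical.
RECOMMENDED = ("README", "LICENSE", "TODO", "CHANGELOG")


def missing_project_files(root="."):
    """Return the recommended base names with no matching file in root."""
    present = {p.stem.upper() for p in Path(root).iterdir() if p.is_file()}
    return [name for name in RECOMMENDED if name not in present]
```

Calling missing_project_files() in a project root returns the names that still need attention.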

Project Publication

Depending on the project, your documentation might include some or all of the following components:

  • An introduction should give a very short overview of what can be done with the product, using one or two extremely simplified use cases. This is the thirty-second pitch for your project.
  • A tutorial should show some primary use cases in more detail. The reader will follow a step-by-step procedure to set up a working prototype.
  • An API reference is typically generated from the code (see docstrings). It will list all publicly available interfaces, parameters, and return values.
  • Developer documentation is intended for potential contributors. This can include code conventions and the general design strategy of the project.

reStructuredText

Most Python documentation is written with reStructuredText. It’s like Markdown, but with all the optional extensions built in.

The reStructuredText Primer and the reStructuredText Quick Reference should help you familiarize yourself with its syntax.

Code Documentation Advice

Comments clarify the code and are added to make it easier to understand. In Python, comments begin with a hash (number sign) (#).

In Python, docstrings describe modules, classes, and functions:

def square_and_rooter(x):
    """Return the square root of self times self."""
    ...

In general, follow the comments section of PEP 8 (the “Python Style Guide”). More information about docstrings can be found in PEP 257 (the Docstring Conventions guide).

Commenting Sections of Code

Do not use triple-quote strings to comment code. This is not a good practice, because line-oriented command-line tools such as grep will not be aware that the commented code is inactive. It is better to add hashes at the proper indentation level for every commented line. Your editor probably has the ability to do this easily, and it is worth learning the comment/uncomment toggle.
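A small sketch of the recommended approach, using a hypothetical function total: the disabled lines carry a hash at the same indentation as the surrounding code, so line-oriented tools see each one as a comment.

```python
def total(values):
    """Return the sum of values."""
    result = sum(values)
    # Disabled for now; each line is commented out with its own hash:
    # if result < 0:
    #     result = 0
    return result
```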

Docstrings and Magic

Some tools use docstrings to embed more-than-documentation behavior, such as unit test logic. Those can be nice, but you won’t ever go wrong with vanilla “here’s what this does.”

Tools like Sphinx will parse your docstrings as reStructuredText and render them correctly as HTML. This makes it very easy to embed snippets of example code in a project’s documentation.
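For illustration, here is a docstring written with reStructuredText markup: inline literals, a field list for parameters, and a literal block introduced by a double colon. The function clamp is a hypothetical example.

```python
def clamp(value, low, high):
    """Limit *value* to the inclusive range ``[low, high]``.

    :param value: The number to limit.
    :param low: The smallest allowed result.
    :param high: The largest allowed result.
    :returns: ``low``, ``high``, or ``value`` itself.

    Example::

        clamp(15, 0, 10)
    """
    return max(low, min(value, high))
```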

Additionally, doctest will read all embedded docstrings that look like input from the Python command line (prefixed with “>>>”) and run them, checking whether the output of each command matches the text on the following line. This allows developers to embed real examples and usage of functions alongside their source code. As a side effect, it also ensures that their code is tested and works.

def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b
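One way to run the examples above from Python is sketched below. The helper run_doctests is hypothetical; it only wires together the standard library's DocTestFinder and DocTestRunner.

```python
import doctest


def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b


def run_doctests(func):
    """Run the doctest examples in func's docstring and return the totals."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples need the function's own name in their namespace.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


result = run_doctests(my_function)
print(result.failed, result.attempted)
```

For a whole file, running python -m doctest yourfile.py from the shell or calling doctest.testmod() inside the module does the same job with less code.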

Docstrings versus Block comments

These aren’t interchangeable. For a function or class, the leading comment block is a programmer’s note. The docstring describes the operation of the function or class:

# This function slows down program execution for some reason.
def square_and_rooter(x):
    """Returns the square root of self times self."""
    ...

Unlike block comments, docstrings are built into the Python language itself. This means you can use Python’s introspection capabilities to access docstrings at runtime, whereas comments are discarded when the code is compiled. Docstrings are accessible through the __doc__ dunder attribute of almost every Python object, as well as through the built-in help() function.
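A short sketch of that difference, using a hypothetical function square: the docstring can be read back at runtime, while the comment above the function leaves no trace on the function object.

```python
import inspect


# This comment is dropped at compile time and cannot be read back.
def square(x):
    """Return x multiplied by itself."""
    return x * x


print(square.__doc__)          # the raw docstring
print(inspect.getdoc(square))  # the docstring with indentation cleaned up
```

Calling help(square) in an interactive session shows the same text along with the function's signature.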

While block comments are usually used to explain what a section of code is doing, or the specifics of an algorithm, docstrings are intended to explain to other users of your code (or to you in six months’ time) how a particular function can be used and the general purpose of a function, class, or module.

Writing Docstrings

Depending on the complexity of the function, method, or class being written, a one-line docstring may be perfectly appropriate. These are generally used for really obvious cases, such as:

def add(a, b):
    """Add two numbers and return the result."""
    return a + b

The docstring should describe the function in a way that is easy to understand. For simple cases like trivial functions and classes, simply embedding the function’s signature (i.e. add(a, b) -> result) in the docstring is unnecessary. This is because with Python’s inspect module, it is already quite easy to find this information if needed, and it is also readily available by reading the source code.
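As a brief illustration of that point, the inspect module can recover the signature of add without any help from the docstring:

```python
import inspect


def add(a, b):
    """Add two numbers and return the result."""
    return a + b


signature = inspect.signature(add)
print(f"add{signature}")            # prints add(a, b)
print(list(signature.parameters))   # prints ['a', 'b']
```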

In larger or more complex projects however, it is often a good idea to give more information about a function, what it does, any exceptions it may raise, what it returns, or relevant details about the parameters.

For more detailed documentation of code, a popular style is the one used by the NumPy project, often called NumPy style docstrings. While it can take up more lines than the previous example, it allows the developer to include a lot more information about a method, function, or class.

def random_number_generator(arg1, arg2):
    """
    Summary line.

    Extended description of function.

    Parameters
    ----------
    arg1 : int
        Description of arg1
    arg2 : str
        Description of arg2

    Returns
    -------
    int
        Description of return value

    """
    return 42

The sphinx.ext.napoleon plugin allows Sphinx to parse this style of docstrings, making it easy to incorporate NumPy style docstrings into your project.
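A Sphinx project enables the extension in its conf.py. The fragment below is a sketch that assumes an otherwise default configuration; pairing napoleon with sphinx.ext.autodoc is a common setup, not a requirement stated here.

```python
# conf.py (fragment)
extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings from imported modules
    "sphinx.ext.napoleon",  # converts NumPy and Google style sections to reST
]

# Both options default to True; they are listed here only to make the choice visible.
napoleon_numpy_docstring = True
napoleon_google_docstring = False
```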

At the end of the day, it doesn’t really matter what style is used for writing docstrings; their purpose is to serve as documentation for anyone who may need to read or make changes to your code. As long as it is correct, understandable, and gets the relevant points across then it has done the job it was designed to do.

Comparison of Python documentation generators

I will attempt here to compare the top Python documentation tools using publicly available information. The goal is to identify which tool will be best for generating HTML from the docstrings within the source code. Criteria include but are not limited to:

  • Visual appeal and ease-of-use (where case studies and/or screenshots are available)
  • Potential dependency fragility, most importantly which versions of Python
  • Community size/engagement and availability of tool support
  • Run-time introspection vs static analysis (important only if parts of the software interface are dynamically generated)

I started by looking at the list of tools here: https://wiki.python.org/moin/DocumentationTools

The tools currently supported and/or in active development are:

sphinx

By far the most recommended and comprehensive documentation generator. It supports reStructuredText in docstrings and produces HTML output with a clean visual style. Countless examples (including official Python libraries) can be found here: http://www.sphinx-doc.org/en/master/examples.html

About the only con I could find is that setting it up requires a bit of configuration (using Makefiles) and the documentation for getting started assumes you’re working with a fresh repo. You can also run it with a quickstart script that uses default configurations but it still requires multiple steps. Works for Python 2 and 3 and loads docstrings dynamically through introspection.

pdoc

Probably the second most popular Python-exclusive doc tool (Doxygen is more general), with 373 stars and 12 contributors at the time of writing. Its code has a fraction of Sphinx’s complexity and the output is not quite as polished, but it works with zero configuration in a single step. It also supports docstrings for variables through source code parsing. Otherwise it uses introspection. Worth checking out if Sphinx is too complicated for your use case.

pydoctor

A successor to the popular epydoc. At the time of this comparison it worked only for Python 2. Its main benefit is that it traces inheritance particularly well, even across multiple interfaces. It works on static source and can pass the resulting object model to Sphinx if you prefer Sphinx’s output style. I actually prefer the clean look of pydoctor to Sphinx, however.

doxygen

Not Python-exclusive and its interface is crowded and ugly. It claims to be able to generate some documentation (mostly inheritances and dependencies) from undocumented source code. Should be considered because many teams already know this tool from its wide use in multiple languages (particularly C++).

This page is primarily about tools that help, specifically, in generating documentation for software written in Python, i.e., tools that can use language-specific features to automate at least a part of the code documentation work for you. The last section also lists general documentation tools with no specific support for Python (though some of them are themselves written in Python).

Tools that support auto-documentation of code can be broadly classified into tools that:

  • import the code to generate documentation based on runtime introspection
  • parse and analyze the code statically (without running it)
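The two approaches can be sketched with the standard library alone. Both helpers below are hypothetical illustrations, not how any of the tools above are implemented: one parses source text with ast and never runs it, the other executes the source and inspects the resulting objects.

```python
import ast
import inspect
import types

SOURCE = '''
"""Module docstring."""

def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name
'''


def docstrings_static(source):
    """Collect docstrings by parsing the source without executing it."""
    tree = ast.parse(source)
    found = {"<module>": ast.get_docstring(tree)}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            found[node.name] = ast.get_docstring(node)
    return found


def docstrings_runtime(source):
    """Collect docstrings by executing the source and introspecting it."""
    module = types.ModuleType("demo")
    exec(source, module.__dict__)  # runs the code, including any side effects
    found = {"<module>": inspect.getdoc(module)}
    for name, obj in vars(module).items():
        if inspect.isfunction(obj) or inspect.isclass(obj):
            found[name] = inspect.getdoc(obj)
    return found


print(docstrings_static(SOURCE))
print(docstrings_runtime(SOURCE))
```

Introspection sees objects that are created dynamically at import time but has to run the code to do so. Static parsing is safe on code that cannot be imported, but it only sees what is written literally in the source.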

Hosting documentation online

If you intend to share your package with others, it will be useful to make your documentation accessible online. It’s common to host Python package documentation on the free online hosting service Read the Docs, which can automate the building, deployment, and hosting of your documentation. Read the Docs works by connecting to an online repository hosting your package documentation, such as a GitHub repository. When you push changes to your repository, Read the Docs automatically builds a fresh copy of your documentation (i.e., runs make html) and hosts it at the URL https://pkgname.readthedocs.io/ (you can also configure Read the Docs to use a custom domain name). This means that any changes you make to your documentation source files are deployed to your users as soon as the build finishes. If you need your documentation to be private (e.g., only available to employees of a company), Read the Docs offers a paid “Business plan” with this functionality.

The Read the Docs documentation will provide the most up-to-date steps required to host your documentation online. For our pycounts package, this involved the following steps:

  1. Visit https://readthedocs.org/ and click on “Sign up”.
  2. Select “Sign up with GitHub”.
  3. Click “Import a Project”.
  4. Click “Import Manually”.
  5. Fill in the project details by:
  6. Click “Next” and then “Build version”.

After following the steps above, your documentation should be successfully built by Read the Docs, and you should be able to access it via the “View Docs” button on the build page. For example, the documentation for pycounts is now available at https://pycounts.readthedocs.io/en/latest/. This documentation will be automatically re-built by Read the Docs each time you push changes to the specified default branch of your GitHub repository.

Conclusion

Documentation is an important aspect of software development that is often overlooked. It shouldn’t be, because without documentation your code is hard for anyone to use or understand. To make matters worse, the people who rely on your code to get things done (engineers, analysts, and even internal support staff) are often not programmers and may not know Python at all. They also have little time or patience for learning if they have to wait for an engineer to teach them.

Python documentation is an essential part of every good Python project. It doesn’t matter if you create a quick script or a complex application – without correct documentation, the code won’t be user-friendly and its quality will be low.
