Python Code Documentation Generator

Documentation is a field of study that encompasses several disciplines, which explains why there are different types of documentation. However, the most common form remains the same: it is text-based and organized into sections. In fact, when it comes to programming languages such as Python, documentation is absolutely essential because it helps programmers understand how a specific piece of code works.

There are many ways to document python code. You can write about every line in a readme file or you can use an external library like sphinx to format it for you. While the first method works well for small projects, it tends to look unprofessional and is mainly used by open-source projects. Libraries like sphinx, on the other hand, make documentation easier but aren’t always flexible enough to handle even simple projects.

Ever since open source became a thing, documentation has been a sensitive issue. I have experienced it first hand when working with various software. And frankly, that is why I am writing this post. I have spent a lot of time tweaking and fiddling to produce the best documentation I can for the people who use my tools, and I want to show you how to do it too because you deserve the best.

This tutorial uses a simple Python project (Sample Project) to demonstrates how to use Sphinx to generate HTML-based documents. The Sample Project is a collection of some types of binary search trees and binary tree traversal. It is well documented by following NumPy docstring style. The Sample Project’s primary purpose is to be a sample code for this Sphinx tutorial and demo how NumPy style docstrings translate to a real document via Sphinx.

Tips for technical documentation writing

With that in mind, here’s our top 10 list of documentation tips that will help get you started.

1 – Just write it!

You’ve just written code that adds a much-requested feature to your company’s application. It’s time to finish the job by documenting it, but you’ve got 20 items in the backlog…

Your mind will never be as clear about what you’ve done as it is right now. Each day that goes by will make it harder for you to remember what you’ve done and why you did it that way. Save yourself the time and hassle, and just write it now. And, of course, there’s no need to finish a feature before you begin writing. Do it a step at a time, and you’ll be glad you did.

The “fresh in your mind” version from today will give you the strongest framework to build on.

Write it now. Refine it later. Done is better than perfect.

2 – Write your technical documentation with a new dev in mind

Given the ever-moving nature of the high-tech industry, there’s a better than average chance that the next person to work on your code won’t be you. You will have moved on to a higher-profile project, and a new dev will come in to work with your old team. Or maybe your company has brought on a remote team to help keep up with growth.

So, write your code documentation with a new dev in mind.

The purpose of your internal technical documentation is to make your code more understandable. When deciding what to include, just keep that in mind.

3 – Keep code documentation simple

There’s no need to explain every line of code. Document what would you need explained if you were picking up this code to work on it for the first time.

Write in short sentences and keep explanations concise and straightforward. We don’t mean that you shouldn’t write advanced technical code documentation. On the contrary, it’s the elements that aren’t so straightforward that need the best explanations.

The Sample Project can be downloaded from Github.

$ git clone https://github.com/shunsvineyard/python-sample-code.git

The generated documents look like the picture below. It is also available at Read the Docs.

Assumptions and Requirements

This tutorial assumes the following environment:

Python 3.9
Sphinx 3.4.3
Ubuntu 20.04

Note: Sphinx can run on both Linux and Windows. Although this tutorial uses Ubuntu, the steps are the same in the Windows environment.

How to use Sphinx?

Sphinx uses reStructuredText as its markup language. The process of Sphinx generating documents is like this:

Project source code (Python or other supported languages) -> reStructuredText files -> documents (HTML or other supported format)

Sphinx provides two command-line tools: sphinx-quickstart and sphinx-apidoc.

sphinx-quickstart sets up a source directory and creates a default configuration, conf.py, and a master document, index.rst, which serves as a welcome page of a document.
sphinx-apidoc generates reStructuredText files to document from all found modules.

In short, we use these two tools to generate Sphinx source code, i.e., reStructuredText files, and we modify these reStructuredText files, and finally use Sphinx to build excellent documents.

Workflow

The same as software needs a developer’s maintenance, writing a software document is not a one-time job. It needs to be updated when the software changes. The picture below demonstrates the basic Sphinx workflow.

The following sections detail each step of the workflow.

Prepare

Before we start using Sphinx, we need to set up our working environment.

user@ubuntu:~$ python3.9 -m venv sphinxvenv
user@ubuntu:~$ source sphinxvenv/bin/activate
(sphinxvenv) user@ubuntu:~$ git clone https://github.com/shunsvineyard/python-sample-code.git
(sphinxvenv) user@ubuntu:~$ cd python-sample-code/
(sphinxvenv) user@ubuntu:~/python-sample-code$ python -m pip install -r requirements.txt

Now, we have the Sample Project and working environment for the Sphinx demo. Because the Sample Project already contains the docs folder, we need to delete it.

The layout of the Sample Project after we delete the docs folder looks like:

├── LICENSE
├── README.rst
├── requirements.txt
├── setup.py
├── tests
└── trees
    ├── __init__.py
    ├── __main__.py
    ├── bin
    │   ├── __init__.py
    │   └── tree_cli.py
    ├── binary_trees
    │   ├── __init__.py
    │   ├── avl_tree.py
    │   ├── binary_search_tree.py
    │   ├── binary_tree.py
    │   ├── red_black_tree.py
    │   ├── threaded_binary_tree.py
    │   └── traversal.py
    └── tree_exceptions.py

Step 1: Use sphinx-quickstart to generate Sphinx source directory with conf.py and index.rst

Assume we want to put all the document related files in the docs directory. So, we begin by creating a Sphinx documentation directory, docs. Then, we go to the docs directory and run sphinx-quickstart.

(sphinxvenv) user@ubuntu:~/python-sample-code$ mkdir docs
(sphinxvenv) user@ubuntu:~/python-sample-code$ cd docs/
(sphinxvenv) user@ubuntu:~/python-sample-code/docs$ sphinx-quickstart

Once we run sphinx-quickstart, it asks a few questions about this project. The followings are the example answers for these questions.

Welcome to the Sphinx 3.4.3 quickstart utility.

Please enter values for the following settings (just press Enter to
accept a default value, if one is given in brackets).

Selected root path: .

You have two options for placing the build directory for Sphinx output.
Either, you use a directory "_build" within the root path, or you separate
"source" and "build" directories within the root path.
> Separate source and build directories (y/n) [n]: y

The project name will occur in several places in the built documentation.
> Project name: Sample Project
> Author name(s): Author
> Project release []: 0.0.1

If the documents are to be written in a language other than English,
you can select a language here by its language code. Sphinx will then
translate text that it generates into that language.

For a list of supported codes, see
https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-language.
> Project language [en]: 

Creating file /home/shunsvineyard/workspace/python-sample-code/docs/source/conf.py.
Creating file /home/shunsvineyard/workspace/python-sample-code/docs/source/index.rst.
Creating file /home/shunsvineyard/workspace/python-sample-code/docs/Makefile.
Creating file /home/shunsvineyard/workspace/python-sample-code/docs/make.bat.

Finished: An initial directory structure has been created.

You should now populate your master file /home/shunsvineyard/workspace/python-sample-code/docs/source/index.rst and create other documentation
source files. Use the Makefile to build the docs, like so:
   make builder
where "builder" is one of the supported builders, e.g. html, latex or linkcheck.

At the end of the sphinx-quickstart, it shows how to build the documents.

You should now populate your master file ./source/index.rst and create other documentation
source files. Use the Makefile to build the docs, like so:
   make builder
where "builder" is one of the supported builders, e.g. html, latex or linkcheck.

If we do make html here, Sphinx will generate the default documents which contain nothing about the Sample Project.

Note: Sphinx is not a tool that offers fully automatic documents generation like Doxygen. sphinx-quickstart only generates some default files such as index.rst and conf.py with basic information answered by a user. Therefore, we need to do some work to make the documents real.

After running sphinx-quickstart, the layout of the docs folder looks like:

docs
├── Makefile
├── build
├── make.bat
└── source
    ├── _static
    ├── _templates
    ├── conf.py
    └── index.rst

Note that Makefile is for Linux, and make.bat is for Windows.

Step 2: Configure the conf.py

sphinx-quickstart generates a few files, and the most important one is conf.py, which is the configuration of the documents. Although conf.py serves as a configuration file, it is a real Python file. The content of conf.py is Python syntax.

Using Sphinx to generate a document is highly configurable. This section demonstrates the most basic configurations: the path to the project source code, the theme for the documents, and extensions.

Set the path to the project

To make Sphinx be able to find the project, we need to uncomment these three lines.

# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))

And update the path to the project.

import os
import sys
sys.path.insert(0, os.path.abspath('../..'))

Select the Theme

Sphinx provides many built-in themes. The default is alabaster.

html_theme = 'alabaster'

In this tutorial, we change it to bizstyle.

html_theme = 'bizstyle'

More themes and their configurations can be found at https://www.sphinx-doc.org/en/master/usage/theming.html

Add an extension for NumPy style

The Sample Project uses NumPy style for docstrings. Therefore, we need to add the extension (napoleon) for parsing NumPy style docstrings.

extensions = [
    'sphinx.ext.napoleon'
]

This extension (napoleon) supports NumPy and Google style docstrings and provides several configurable features. For the Sample Project, since we use NumPy style docstrings, we should disable Google style.

napoleon_google_docstring = False

Other settings for napoleon can be found at https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html#module-sphinx.ext.napoleon

The complete conf.py example can be found at https://github.com/shunsvineyard/python-sample-code/blob/master/docs/source/conf.py.

Besides, Sphinx has many built-in extensions and also supports custom extensions. To learn more, please visit https://www.sphinx-doc.org/en/master/usage/extensions/index.html

Now, we have the basic configuration for our project. Next, we use sphinx-apidoc to generate reStructuredText files from the Sample Project source code.

Step 3: Use sphinx-apidoc to generate reStructuredText files from source code

sphinx-apidoc is a tool for automatically generating reStructuredText files from source code, e.g. Python modules. To use it, run

$ sphinx-apidoc -f -o <path-to-output> <path-to-module>

-f means force overwriting of any existing generated files.

-o means the path to place the output files.

For the Sample Project, we run the following command.

(sphinxvenv) user@ubuntu:~/python-sample-code/docs$ sphinx-apidoc -f -o source/ ../trees/

Complete usage of sphinx-apidoc is at https://www.sphinx-doc.org/en/master/man/sphinx-apidoc.html.

In the Sample Project, sphinx-apidoc generates a few files, modules.rst, trees.bin.rst, trees.binary_trees.rst, and trees.rst. The layout of the project looks like the following:

docs
├── Makefile
├── build
│   └── doctrees
│       ├── environment.pickle
│       └── index.doctree
├── make.bat
└── source
    ├── _static
    ├── _templates
    ├── conf.py
    ├── index.rst
    ├── modules.rst
    ├── trees.bin.rst
    ├── trees.binary_trees.rst
    └── trees.rst

Step 4: Edit index.rst and the generated reStructuredText files

The other important file that sphinx-quickstart generates is index.rst. index.rst is the master document that serves as a welcome page and contains the root of the ‘’table of contents tree’’ (toctree). The toctree initially is empty when sphinx-quickstart creates index.rst.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

Add the modules to the index.rst

The generated modules.rst contains all the modules. So we need to add the modules.rst to index.rst.

To add document to be listing on the welcome page (index.rst), do

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   modules

Note: when to add another reStructuredText file, use the file name without extension. If there is a hierarchy of the file, use forward slash ‘’/’’ as directory separators.

Add the README.rst to index.rst

Since the Sample Project already has a readme file, README.rst, at the top level of the project, we can add it to the document’s welcome page.

Create a readme.rst file under docs/source and add the line .. include:: ../../README.rst. (See https://github.com/shunsvineyard/python-sample-code/blob/master/docs/source/readme.rst)
Add the readme to the index.rst, so the welcome page will include it.

After these two steps, the index.rst looks like:

.. Sample Project documentation master file, created by
   sphinx-quickstart on Wed Jan 13 23:24:14 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Sample Project's documentation!
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   readme
   modules

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

See https://raw.githubusercontent.com/shunsvineyard/python-sample-code/master/docs/source/index.rst for the complete example.

Note: when we add a new module, class, API, or any code change that affects the documents, we need to repeat Step 3 and Step 4 to update the documents.

Step 5: Build the documents

The last step to generate the documents is to issue make html (if we want to generate HTML-based documents).

(sphinxvenv) user@ubuntu:~/python-sample-code/docs$ make html

After we run the make html command, a build folder is created under docs. Also, the HTML-based documents are located at build/html. The generated document looks like this:

Software Documentation in a Post-Agile World

Software documentation doesn’t have a good reputation among certain software circles nowadays. There are people who think that worrying about documentation betrays the agile principles somehow. Working software, accompanied by a comprehensive suite of unit tests (they argue) should be all the documentation you need.

That couldn’t be further from the truth.

Yes, the Agile Manifesto’s authors do value working software more than documentation. And they’re right to do so. Documentation would be a complete waste of time if the thing it’s documenting doesn’t work.

But what if you do have working software? In that case, comprehensive, accurate, and up-to-date documentation is one of the key elements that will help you keep it that way.

Automated Documentation: What Does That Mean?

We’ve started this post by covering the need for software documentation, showing how many software teams commonly neglect this area and the benefits they leave on the table by doing so. Now it’s time to take things one step further. Keep reading to learn about the benefits of automation applied to documentation.

Automators Don’t Use Automation as Often as They Should

I like to say that automation is the raison d’?tre of software engineering. And I say that not only because I like to use French expressions in hopes of sounding intelligent, but also because it’s true. I think it’s fair to say that the whole purpose of our field is to employ automation in order to enable businesses and other organizations to solve their problems more efficiently.

So I think it’s ironic, and even a little bit sad, that our field has?until recently?failed to employ automation in order to solve our own problems more efficiently. Sure developers have been automating facets of their jobs since the dawn of software, but only recently have we as a field started to employ automation to its fullest potential. Nowadays automation isn’t only used in isolation, to help individual contributors perform specific and distinct tasks more efficiently. Instead, teams and companies as a whole design and execute comprehensive approaches that make use of automation at all stages in the software development cycle.

In such a scenario, automated documentation is definitely not the one odd out. On the contrary, you might say it’s even kind of obvious.

Defining Automated Documentation

People who don’t really buy into software documentation might have fair reasons not to do so. There are valid criticisms of documentation, and you might have heard some of them:

Writing documentation is slow, error-prone, and time-consuming.
It’s hard to prevent documentation from getting outdated.
Opportunity cost: the time spent creating and maintaining documentation could be better used on other tasks.

The list could go on and on, but you’ve got the picture. Writing and maintaining documentation is a high-cost endeavor. We’d be better off automating one (or most) parts of the process. That would not only cut down costs but also potentially improve the quality of the documentation itself. Finally, lots of additional person-hours would be suddenly available to be better employed elsewhere, on tasks that require human creativity the most. That pretty much sums up what automated documentation is about. So here goes our definition:

Automated documentation is the process of using automation to decrease the amount of human intervention required to create, maintain, and share software documentation, cutting down costs, improving quality and freeing humans from tedious, error-prone work.

Readability is a primary focus for Python developers, in both project and code documentation. Following some simple best practices can save both you and others a lot of time.

Project Documentation

A README file at the root directory should give general information to both users and maintainers of a project. It should be raw text or written in some very easy to read markup, such as reStructuredText or Markdown. It should contain a few lines explaining the purpose of the project or library (without assuming the user knows anything about the project), the URL of the main source for the software, and some basic credit information. This file is the main entry point for readers of the code.

An INSTALL file is less necessary with Python. The installation instructions are often reduced to one command, such as pip install module or python setup.py install, and added to the README file.

A LICENSE file should always be present and specify the license under which the software is made available to the public.

A TODO file or a TODO section in README should list the planned development for the code.

A CHANGELOG file or section in README should compile a short overview of the changes in the code base for the latest versions.

Project Publication

Depending on the project, your documentation might include some or all of the following components:

An introduction should give a very short overview of what can be done with the product, using one or two extremely simplified use cases. This is the thirty-second pitch for your project.
A tutorial should show some primary use cases in more detail. The reader will follow a step-by-step procedure to set-up a working prototype.
An API reference is typically generated from the code (see docstrings). It will list all publicly available interfaces, parameters, and return values.
Developer documentation is intended for potential contributors. This can include code convention and general design strategy of the project.

Conclusion

Python has one of the most powerful Integrated Development Environments [IDEs] out there, which in itself is a blessing. However, having such a robust, great and feature-rich IDE at our disposal can also be a curse sometimes. Things can be hard to find, in context or get lost within the massive amount of tools that it has to offer.

No need to write separate documentation for your python code again. Write comments in the style of Javadoc, PythonDoc or Pydoc and generate documentation for your programs in HTML, PDF and other formats.