Python was designed with the philosophy that a program should be readable, first and foremost. Even so, its self-documenting features, such as docstrings, can confuse anyone who has never read (or written) Python before. If you are learning Python in the interactive interpreter, this matters less, because the interpreter can show you information about an object directly. Once you start writing programs, however, nothing displays that information unless you document the code yourself.
Documentation is what lets users and other developers understand an application's code. Python is widely used for scripts and server-side applications, and its wide range of libraries has made it a popular programming language. Automatic documentation tools let programmers generate clean documentation directly from Python source code.
Documenting code is one of the first things that Python beginners try to learn. Most of us have set up automated documentation at least once, often in a hurry to get an idea up and running. If you write Python, you have probably thought about documenting your code at some point. Automatic documentation is not a new concept, and many tools can do it for you. This article will help you try some of them and choose the right one for your use case.
What makes for a good development tool?
Although the term “developer tool” is very general and can apply to a wide range of services, there are a few key features that your top developer tools should have.
It saves you time
Software developers have an endless list of tasks to complete, so the more time a tool saves us, the better.
Good documentation
Software development is complicated, and some software development tools are complex. I can accept this, and I can take a lousy UI and, in some cases, bad UX, but if the docs are lacking, I’m not going to use the tool.
It integrates well with other tooling
Most developers rely on a particular workflow to complete their tasks. These workflows can include several tools, such as GitHub, Slack, and AWS. When deciding on a dev tool, it is therefore critical to consider its integrations and how it will fit within your workflow, and improve it!
Good community
Sometimes you may get confused or run into an issue using your new open source development tool. A helpful community is often the answer to your problems. On top of that, a good community can propel a development tool forward, creating a plethora of plugins, themes, etc., with it.
Regular releases and updates
There are several open source development tools out there that are simply not active or maintained. Also, when requesting new features or reporting a bug, you want to be confident the maintainers are around to push a release.
Sphinx
By far the most recommended and comprehensive documentation generator. It supports reStructuredText in docstrings and produces HTML output with a clean visual style. Countless examples (including official Python libraries) can be found here: http://www.sphinx-doc.org/en/master/examples.html
About the only con I could find is that setting it up requires a bit of configuration (using Makefiles), and the getting-started documentation assumes you are working with a fresh repo. You can also run it with a quickstart script that uses default configurations, but it still requires multiple steps. Current releases require Python 3 (older releases also supported Python 2), and it loads docstrings dynamically through introspection.
Sphinx is far and away the most popular Python documentation tool. Use it. It converts reStructuredText markup language into a range of output formats including HTML, LaTeX (for printable PDF versions), manual pages, and plain text.
There is also great, free hosting for your Sphinx docs: Read The Docs. Use it. You can configure it with commit hooks to your source repository so that rebuilding your documentation will happen automatically.
When run, Sphinx imports your code and uses Python’s introspection features to extract all function, method, and class signatures. It also extracts the accompanying docstrings and compiles everything into well-structured, easily readable documentation for your project.
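The introspection step described above can be illustrated with the standard library alone. The following is a minimal sketch, not Sphinx's own implementation; `greet` and `describe` are hypothetical names used only for this example.

```python
import inspect


def greet(name: str, punctuation: str = "!") -> str:
    """Return a short greeting for ``name``."""
    return f"Hello, {name}{punctuation}"


def describe(obj) -> str:
    """Build a one-entry reference from an object's signature and docstring."""
    signature = inspect.signature(obj)     # parameters, defaults, annotations
    docstring = inspect.getdoc(obj) or ""  # cleaned docstring, or empty string
    return f"{obj.__name__}{signature}\n    {docstring}"


print(describe(greet))
```

Because the information is read from the live object, the module has to be importable, which is why a tool that works this way needs your code and its dependencies installed when the documentation is built.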
pdoc
Probably the second-most popular Python-exclusive doc tool (Doxygen is more general). At the time of writing it had 373 stars and 12 contributors. Its code is a fraction of Sphinx’s complexity and the output is not quite as polished, but it works with zero configuration in a single step. It also supports docstrings for variables through source code parsing; otherwise it uses introspection. Worth checking out if Sphinx is too complicated for your use case.
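Variable docstrings need source parsing because a string literal placed after an assignment is not attached to anything at runtime, so introspection cannot find it. The sketch below shows the general idea with the standard `ast` module. It is not pdoc's actual code, and `variable_docstrings` and `SOURCE` are hypothetical names.

```python
import ast

SOURCE = '''
TIMEOUT = 30
"""Seconds to wait before giving up."""

RETRIES = 3
'''


def variable_docstrings(source: str) -> dict:
    """Map module-level variable names to the string literal that follows them."""
    body = ast.parse(source).body
    docs = {}
    # Look at each statement together with the one that comes right after it.
    for node, following in zip(body, body[1:]):
        is_simple_assignment = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        is_string_literal = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if is_simple_assignment and is_string_literal:
            docs[node.targets[0].id] = following.value.value
    return docs


print(variable_docstrings(SOURCE))
```

Only `TIMEOUT` is reported here, since `RETRIES` has no string literal after it.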
pydoctor
A successor to the popular epydoc; at the time of writing it worked only for Python 2. Its main benefit is that it traces inheritance particularly well, even for multiple interfaces. It works on static source and can pass the resulting object model to Sphinx if you prefer that output style. I actually prefer the clean look of pydoctor to Sphinx, however.
Doxygen
Not Python-exclusive and its interface is crowded and ugly. It claims to be able to generate some documentation (mostly inheritances and dependencies) from undocumented source code. Should be considered because many teams already know this tool from its wide use in multiple languages (particularly C++).
Readability is a primary focus for Python developers, in both project and code documentation. Following some simple best practices can save both you and others a lot of time.
Project Documentation
A README file at the root directory should give general information to both users and maintainers of a project. It should be raw text or written in some very easy to read markup, such as reStructuredText or Markdown. It should contain a few lines explaining the purpose of the project or library (without assuming the user knows anything about the project), the URL of the main source for the software, and some basic credit information. This file is the main entry point for readers of the code.
An INSTALL file is less necessary with Python. The installation instructions are often reduced to one command, such as pip install module or python setup.py install, and added to the README file.
A LICENSE file should always be present and specify the license under which the software is made available to the public.
A TODO file or a TODO section in README should list the planned development for the code.
A CHANGELOG file or section in README should compile a short overview of the changes in the code base for the latest versions.
Project Publication
Depending on the project, your documentation might include some or all of the following components:
- An introduction should give a very short overview of what can be done with the product, using one or two extremely simplified use cases. This is the thirty-second pitch for your project.
- A tutorial should show some primary use cases in more detail. The reader will follow a step-by-step procedure to set up a working prototype.
- An API reference is typically generated from the code (see docstrings). It will list all publicly available interfaces, parameters, and return values.
- Developer documentation is intended for potential contributors. This can include code conventions and the general design strategy of the project.
reStructuredText
Most Python documentation is written with reStructuredText. It’s like Markdown, but with all the optional extensions built in.
The reStructuredText Primer and the reStructuredText Quick Reference should help you familiarize yourself with its syntax.
Code Documentation Advice
Comments clarify the code and are added with the purpose of making the code easier to understand. In Python, comments begin with a hash (number sign) (#).
In Python, docstrings describe modules, classes, and functions:
def square_and_rooter(x):
    """Return the square root of self times self."""
    ...
In general, follow the comment section of PEP 8 (the “Python Style Guide”). More information about docstrings can be found in PEP 257 (the Docstring Conventions guide).
Commenting Sections of Code
Do not use triple-quote strings to comment code. This is not a good practice, because line-oriented command-line tools such as grep will not be aware that the commented code is inactive. It is better to add hashes at the proper indentation level for every commented line. Your editor probably has the ability to do this easily, and it is worth learning the comment/uncomment toggle.
Docstrings and Magic
Some tools use docstrings to embed more-than-documentation behavior, such as unit test logic. Those can be nice, but you won’t ever go wrong with vanilla “here’s what this does.”
Tools like Sphinx will parse your docstrings as reStructuredText and render them correctly as HTML. This makes it very easy to embed snippets of example code in a project’s documentation.
Additionally, doctest will read all embedded docstrings that look like input from the Python command line (prefixed with “>>>”) and run them, checking whether the output of the command matches the text on the following line. This allows developers to embed real examples and usage of functions alongside their source code. As a side effect, it also ensures that their code is tested and works.
def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b
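To see the examples actually being checked, the docstring can be run through the standard `doctest` module. The sketch below uses `DocTestFinder` and `DocTestRunner` on a single function. `run_examples` is a hypothetical helper name; in everyday use, running `python -m doctest yourfile.py` from the command line is usually enough.

```python
import doctest


def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b


def run_examples(func):
    """Run the doctest examples found in one function's docstring."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples need the function itself in their namespace.
    globs = {func.__name__: func}
    for test in finder.find(func, name=func.__name__, globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_examples(my_function)
print(results.failed, results.attempted)
```

A failing example would be reported on standard output and counted in `results.failed`.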
Docstrings versus Block comments
These aren’t interchangeable. For a function or class, the leading comment block is a programmer’s note. The docstring describes the operation of the function or class:
# This function slows down program execution for some reason.
def square_and_rooter(x):
    """Returns the square root of self times self."""
    ...
Unlike block comments, docstrings are built into the Python language itself. This means you can use all of Python’s powerful introspection capabilities to access docstrings at runtime, whereas comments are discarded when the code is compiled. Docstrings are accessible both through the __doc__ dunder attribute of almost every Python object and through the built-in help() function.
While block comments are usually used to explain what a section of code is doing, or the specifics of an algorithm, docstrings are intended to explain to other users of your code (or to you in six months’ time) how a particular function can be used and the general purpose of a function, class, or module.
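A short sketch of that difference follows; `square` is a hypothetical function used only for illustration.

```python
import inspect


# This comment is a note for maintainers; it is not stored on the function object.
def square(x):
    """Return x multiplied by itself."""
    return x * x


print(square.__doc__)          # the raw docstring
print(inspect.getdoc(square))  # the same text with indentation cleaned up
help(square)                   # help page built from the signature and docstring
```

The comment above the function never appears in any of these outputs, because it does not exist at runtime.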
MkDocs & Material installation
MkDocs is a static site generator for building project documentation and together with the Material framework, it simply looks gorgeous. First, we need to install a heap of packages in order to use all of the functionalities of MkDocs. All of these packages are pip-installable.
MkDocs uses a configuration file, mkdocs.yml, where you can enable all of the functionalities and packages installed above. Mine includes references to the /docs and /docs_assets folders with the theme.
Automate type-hints to docstrings
Previously, I wrote on the importance of writing docstrings, with a focus on Sphinx documentation.
Docstrings are an essential tool to document your functions. Python 3.5 introduced type hints, a way to annotate the expected types of function arguments and return values directly in the function signature.
Several IDEs such as PyCharm, Visual Studio, and Sublime Text support automatic docstring generation. However, they do not yet infer variable types from type hints, which means that you have to fill in both the variable types and the descriptions in the docstrings.
PyCharm, for example, can generate Google-style docstring templates. You are free to use other styles (such as reStructuredText/Sphinx or NumPy), but I found a package that exclusively works with Google-style docstrings for our next automation steps.
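To make the duplication concrete, here is a small sketch of a function that carries the same type information twice: once as type hints and once in a Google-style docstring. `scale` is a hypothetical function, and the built-in generic syntax `list[float]` assumes Python 3.9 or newer.

```python
from typing import get_type_hints


def scale(values: list[float], factor: float = 2.0) -> list[float]:
    """Multiply every value by a constant factor.

    Args:
        values (list[float]): Numbers to scale.
        factor (float): Multiplier applied to each number.

    Returns:
        list[float]: The scaled numbers, in the original order.
    """
    return [value * factor for value in values]


# The annotations are available at runtime, which is what makes it possible
# for a tool to copy them into a docstring instead of typing them twice.
print(get_type_hints(scale))
```

A docstring generator that read `get_type_hints()` could fill in the types inside the parentheses, leaving only the descriptions to write by hand.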
Automate docstrings to MkDocs
The package mkgendocs automatically translates Google-style docstrings into pages with the description of Python functions. It uses a configuration file, mkgendocs.yml. The configuration file looks like this:
sources_dir: docs
templates_dir: docs/templates
repo: https://github.com/LouisdeBruijn/Medium
version: master
pages:
  - page: "scripts/base/base.md"
    source: "Python tips/base.py"
    functions:
      - parse_arguments
      - print_df
      - unescape_html
      - equal_array_items
Two manual steps for the use of this package are:
- Add the pages, sources, and functions to be documented to this mkgendocs.yml file.
- Run $ gendocs --config mkgendocs.yml to create the static MkDocs pages with the documentation of these functions.
Next up, we will automate both steps by first creating a script to pre-fill our configuration file, and then attaching both steps to a pre-commit Git hook.
Automate the documentation of new functions
First, I wrote a module automate.py with a function automate_mkdocs_from_docstring() to fill the mkgendocs.yml configuration file with all the Python functions in modules (scripts) in a repository.
automate_mkdocs_from_docstring() uses Pathlib to read the Python scripts in a directory and extract the function names. It saves both the module and the function names in a dictionary and uses this to overwrite the mkgendocs.yml. This way we can automatically fill the configuration file for the mkgendocs package.
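The author's automate.py is not reproduced here, so the following is only a sketch of the approach described above, under a few assumptions: function names are found with the standard `ast` module, private names (leading underscore) are skipped, and the `pages:` section is rendered as plain text in the layout shown earlier. `collect_functions` and `render_pages` are hypothetical names.

```python
import ast
import tempfile
from pathlib import Path


def collect_functions(root: Path) -> dict[str, list[str]]:
    """Map each Python file under ``root`` to its public top-level function names."""
    found = {}
    for path in sorted(root.rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        names = [
            node.name
            for node in tree.body
            if isinstance(node, ast.FunctionDef) and not node.name.startswith("_")
        ]
        if names:
            found[path.relative_to(root).as_posix()] = names
    return found


def render_pages(functions_by_file: dict[str, list[str]]) -> str:
    """Render a ``pages:`` section in the layout shown earlier (assumed)."""
    lines = ["pages:"]
    for source, names in functions_by_file.items():
        page = Path(source).with_suffix(".md").as_posix()
        lines.append(f'  - page: "{page}"')
        lines.append(f'    source: "{source}"')
        lines.append("    functions:")
        lines.extend(f"      - {name}" for name in names)
    return "\n".join(lines)


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "base.py").write_text(
        "def print_df(df):\n    pass\n\ndef _helper():\n    pass\n",
        encoding="utf-8",
    )
    pages = render_pages(collect_functions(root))

print(pages)
```

Parsing the source instead of importing it means the scripts do not have to be importable for their functions to be listed.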
Uses of Software Development Tools:
Open source software development tools serve several purposes:
- They help teams carry out and examine their processes, document how the software was developed, and optimize each procedure.
- Using them during development makes the resulting work more effective.
- They help a programmer maintain a steady workflow.
Conclusion
A documentation generator builds its output from the documentation strings (docstrings) in your code. Methods and classes that do not have docstrings are ignored, so the generated output can also be used to spot incomplete parts of your code.
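The same gap can be found directly, without generating documentation first. The sketch below lists functions and classes that have no docstring, using the standard `ast` module. `missing_docstrings` and `SAMPLE` are hypothetical names, and the check is not part of any tool mentioned in this article.

```python
import ast


def missing_docstrings(source: str) -> list[str]:
    """Return names of functions and classes in ``source`` that lack a docstring."""
    documentable = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return [
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, documentable) and ast.get_docstring(node) is None
    ]


SAMPLE = '''
def documented():
    """Explain what this does."""

def undocumented():
    pass

class Widget:
    def method(self):
        pass
'''

print(missing_docstrings(SAMPLE))
```

For the sample above, the result contains `undocumented`, `Widget`, and `method`, but not `documented`.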
Documenting Python projects by hand can be a tedious process, which is why automating it pays off. The generally accepted solution is Sphinx, an open source documentation generation framework written in Python and used to create software project documentation.