Published: Jan 31, 2022 by Liam Pattinson
Project info
Paramak is a Python package for generating 3D CAD models of fusion reactors. It is designed to rapidly produce tokamak geometries for parametric studies, and can export to numerous standard CAD formats such as STP, STL and Brep.
As Paramak is already a feature-rich and well-written library, our aim was to improve the packaging, testing, and continuous integration aspects of the project rather than adding any specific new features. It already featured automated testing, documentation generation, and PyPI publishing, so only some minor tinkering around the edges was necessary.
One particular area for improvement was the packaging methodology. Paramak was
designed to be installed via a direct invocation of setup.py, meaning the
recommended way to install the package was to call python setup.py install.
Similarly, source distributions and wheels for package publication were
generated using python setup.py sdist bdist_wheel. Python developers have been
dissuaded from this approach in recent years, and these functions have instead
been delegated away from setuptools and towards dedicated package management
and building tools such as pip and build.
The first step towards improving this aspect of the library was to move package
details away from setup.py into an INI-style configuration file, setup.cfg.
A second configuration file, this time in the TOML format, was added called
pyproject.toml, which contains information such as package building
dependencies. This approach is recommended by PEP 518, as it permits build
tools to detect and install necessary dependencies before attempting a build.
This solves a previous chicken-and-egg problem, in which build tools had to run
setup.py in order to determine a package's build dependencies, but needed to
already have those dependencies installed in order to run setup.py!
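As a rough illustration of why declarative configuration helps, the sketch below reads package metadata and build requirements without executing any project code. The file contents are hypothetical and are not taken from Paramak's repository; the `build-backend` line comes from PEP 517 rather than PEP 518, and `tomllib` is only available in the standard library from Python 3.11.

```python
import configparser

# Hypothetical setup.cfg contents: illustrative values, not Paramak's own.
SETUP_CFG = """\
[metadata]
name = example_package
description = An illustrative package
license = MIT

[options]
packages = find:
python_requires = >=3.7
install_requires =
    numpy
    scipy
"""

# Hypothetical pyproject.toml declaring build requirements (PEP 518).
PYPROJECT_TOML = """\
[build-system]
requires = ["setuptools>=45", "wheel"]
build-backend = "setuptools.build_meta"
"""


def read_setup_cfg(text):
    """Return (metadata dict, runtime requirements) from setup.cfg text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    metadata = dict(parser["metadata"])
    # Multi-line values are stored with embedded newlines; split on whitespace.
    requires = parser["options"]["install_requires"].split()
    return metadata, requires


def read_build_requires(text):
    """Return the build requirements, or None if tomllib is unavailable."""
    try:
        import tomllib  # standard library from Python 3.11 onwards
    except ImportError:
        return None
    return tomllib.loads(text)["build-system"]["requires"]


metadata, requires = read_setup_cfg(SETUP_CFG)
print(metadata["name"], requires)
print(read_build_requires(PYPROJECT_TOML))
```

Because both files are plain data, a build tool can learn what to install before any of the project's own Python code runs.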
This approach has advantages beyond simply adhering to the latest Python best practices. A significant advantage for Paramak was the inclusion of setuptools_scm, which allows version numbers to be automatically inferred from Git tags. The benefit of this is that the Git tags serve as the single source of truth for the project version number, and developers do not need to manually increment the version for each new release, an error-prone operation that can easily cause confusion further down the line. It also automatically appends a development suffix to the version after each commit, meaning developers won’t face ambiguity as to which version of the code they’re running when they’re between public releases.
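To make the idea of tag-derived versions concrete, here is a simplified sketch that turns the output of `git describe --tags --long` into a version string. It imitates the general shape of setuptools_scm's default scheme as I understand it (bump the last component and add a `.devN` suffix when commits follow the tag); it is not the library's implementation, and the function name `version_from_describe` is hypothetical. The real tool also handles uncommitted changes and other cases ignored here.

```python
import re


def version_from_describe(describe):
    """Derive a version from text such as 'v0.6.0-3-gabc1234'.

    The three parts are the latest tag, the number of commits since that
    tag, and the abbreviated commit hash prefixed with 'g'.
    """
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)-(\d+)-g([0-9a-f]+)", describe)
    if match is None:
        raise ValueError(f"unrecognised describe output: {describe!r}")
    tag = match.group(1)
    distance = int(match.group(2))
    node = match.group(3)
    if distance == 0:
        # Exactly on a tag: the tag is the release version.
        return tag
    # Between releases: guess the next version and mark it as a dev build.
    parts = [int(p) for p in tag.split(".")]
    parts[-1] += 1
    next_version = ".".join(str(p) for p in parts)
    return f"{next_version}.dev{distance}+g{node}"


print(version_from_describe("v0.6.0-0-gabc1234"))
print(version_from_describe("v0.6.0-3-gabc1234"))
```

A checkout sitting exactly on a tag reports the tagged version, while a checkout a few commits later reports a distinct development version, so the two cannot be confused.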
These updates were not as straightforward to implement as was initially
envisaged, as it proved quite difficult to automate version detection when
building a Docker container image. It is recommended to include only the
bare-minimum project files in Docker containers to keep their footprint as small
as possible, and as this excludes the Git repo files, some alternative method
was required. Several solutions were considered, such as temporarily mounting
the host machine’s repo using BuildKit features, or building from a Python wheel
generated by the GitHub action. In order to maintain multi-platform
compatibility and a user-friendly interface, the solution eventually settled
upon was to set the environment variable SETUPTOOLS_SCM_PRETEND_VERSION to the
latest version from within GitHub Actions. This environment variable exists
precisely to cater for this use case, as the difficulties in using
setuptools_scm alongside Docker are well known.
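The precedence this relies on can be sketched in a few lines: an explicitly supplied version wins, and Git is consulted only when none is given. This is an illustration of the idea rather than setuptools_scm's code; `resolve_version` and `no_git_metadata` are hypothetical names, and only the environment variable name comes from the text above.

```python
PRETEND_VAR = "SETUPTOOLS_SCM_PRETEND_VERSION"


def resolve_version(environ, detect_from_git):
    """Prefer an explicitly supplied version over Git-based detection.

    `environ` is a mapping such as os.environ, and `detect_from_git` is a
    callable that inspects the repository (a stand-in for the real lookup).
    """
    pretend = environ.get(PRETEND_VAR)
    if pretend:
        return pretend
    return detect_from_git()


def no_git_metadata():
    """Simulate a Docker build context that contains no .git directory."""
    raise LookupError("no Git metadata available")


# With the variable set, the missing repository is never consulted.
print(resolve_version({PRETEND_VAR: "0.6.0"}, no_git_metadata))
```

Without the variable, the same call would raise `LookupError`, which mirrors the failure seen when building inside a container that lacks the repository files.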
Another problem encountered with Paramak was the layout of unit tests. Several
tests included within the tests/ directory were not designed to be run with
pytest, and as a result calling pytest tests/ unexpectedly failed. The tests in
question were written to ensure that user examples, written as Jupyter
notebooks, compiled and ran successfully. Since pytest tests/ would not work,
the act of running all unit tests was delegated to a short bash script,
run_tests.sh, which calls pytest on each of the pytest-able files individually,
and runs the notebook tests via a simple python call. While this seems
innocuous at first glance, it causes surprising behaviour when coupled with
continuous integration, as it is possible for the run_tests.sh script to
execute successfully while many of the tests within it fail. The notebook
example tests were moved to their own directory, and were further updated to
make them pytest-able. This enables users to easily run all unit tests via a
call to pytest tests/, and to run notebook examples by calling pytest
examples_tests/. Some further changes were made to simplify the notebook
testing code and reduce code duplication, turning 3 test files and a utility
file into a single parametrized test!
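A single parametrized notebook test might look something like the sketch below. It is not Paramak's actual test code: the `examples` directory, the helper `find_notebooks`, and the choice of running notebooks through the `jupyter nbconvert --execute` command line are all assumptions made for illustration, and it requires pytest and Jupyter to be installed.

```python
import subprocess
from pathlib import Path

import pytest

# Hypothetical location of the example notebooks.
EXAMPLES_DIR = Path("examples")


def find_notebooks(root):
    """Return all notebooks under `root`, sorted, skipping checkpoint copies."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


# One test case is generated per notebook, so each shows up separately
# in the pytest report and any failure makes the whole run fail.
@pytest.mark.parametrize("notebook", find_notebooks(EXAMPLES_DIR), ids=str)
def test_notebook_runs(notebook, tmp_path):
    result = subprocess.run(
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",
            "--output-dir", str(tmp_path),
            str(notebook),
        ],
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stderr
```

Because pytest exits with a non-zero status when any case fails, continuous integration sees the failure directly. A shell script, by contrast, reports the status of its last command unless it is written to stop on errors, which is one plausible way a wrapper script could pass while tests inside it failed.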
Other changes to Paramak were mostly small additions here and there to improve
code style or developer usability. For example, when publishing on PyPI, twine
now checks that distributions are built correctly before uploading, and flake8
now ignores a large number of convenience imports in __init__.py.
As my first project with PlasmaFAIR, I found Paramak to be a very enjoyable library to work with. It is well-written, well-documented, and well-tested. A lot of the concepts I encountered in this project were brand new to me, but are likely to be useful for many other projects going forward. Paramak is preparing for a large update as CadQuery – one of its key dependencies – is expected to release a new version imminently, and it may be that we’ll be able to return to Paramak to assist with this move.