Published: Jan 12, 2022 by Peter Hill
Project info
Tokamesh is a Python package for generating and working with triangular meshes for tomographic inversion problems in tokamaks. Its features include barycentric linear-interpolation, and local mesh refinement, both of which enable creation of smaller meshes with a lower number of basis functions than more traditional approaches.
This is the second of two projects that I undertook for Dr. Bowman. By the time I started working on tokamesh, Chris had already implemented some of the features and techniques that I demonstrated in the first project, inference-tools, including expanding and automating the unit tests, and modernising and automating the packaging: an early success for PlasmaFAIR!
As a result, tokamesh was in very good state when I started on it, and as such, there wasn’t a great deal for me to do. I expanded the test suite, not just increasing the coverage, but also increasing the specificity by adding tests for functions already covered indirectly through being called by other functions being directly tested.
This left one last interesting thing to tackle: the use of a C library,
triangle
. The existing way this was used was to compile triangle
as an executable, write some input text files to disk, run triangle
, and
finally read in and parse some text files back to numpy
arrays.
The downside to this approach is it relies on a particular hardcoded compiler
(gcc
) being available, and so is not terribly portable, and would probably not
work on Apple or Windows machines. A secondary thing to be concerned about is
reading and writing text files to disk can be quite slow, especially for large
files. The solution is to use Cython to interface directly with the
library, and distribute a pre-compiled library.
Cython-ising the triangle
interface consists of four parts. The first part is
a .pxd
file, which is essentially a Python version of the C triangle.h
header file, containing a definition of the struct
used for input and output
to the library, and the function signature of the entry point.
The second part is a .pyx
file which provides the actual interface between the
Python code and the C library. Here, instead of writing numpy
arrays to text
files, I pass them directly in memory to triangle
(actually this not quite
true: they need to be manipulated somewhat to get them into the correct memory
layout, but once that’s done we can pass them by address). This ended up being
the hardest bit to implement, mostly due to the triangle
documentation being
quite vague about exactly which parts of the input and output structures need
allocating, and how they should be laid out in memory. Once I’d cracked that
nut, the resulting code was shorter and cleaner. It also had the added advantage
of avoiding truncation and rounding errors when reading/writing floating point
numbers to/from disk!
The last two parts are related to the packaging. One reason I wanted to use
Cython to interface to triangle
was that it can compile the C code as part of
the package. This requires a small addition to setup.py
to list the Cython
extensions to be built, their source files, libraries that need linking, and
preprocessor macros. Now, when we run setup.py
or pip install
, the
triangle
library and our Cython interface are compiled automatically. The very
last part is to automate this in CI.
Chris had already added the GitHub action from inference-tools
to publish the
package. I modified this to use cibuildwheel to build the
package for Linux, Windows, and Mac simultaneously, and for multiple Python
versions too (in addition to a “usual” source package). This means that users of
tokamesh
can install it without needed to compile the triangle
library
themselves.
With these changes, tokamesh
is now more portable, more readable, and
hopefully a little bit faster too!