tokamesh

Published: Jan 12, 2022 by Peter Hill

Project info

licence MIT

Tokamesh is a Python package for generating and working with triangular meshes for tomographic inversion problems in tokamaks. Its features include barycentric linear-interpolation, and local mesh refinement, both of which enable creation of smaller meshes with a lower number of basis functions than more traditional approaches.

This is the second of two projects that I undertook for Dr. Bowman. By the time I started working on tokamesh, Chris had already implemented some of the features and techniques that I demonstrated in the first project, inference-tools, including expanding and automating the unit tests, and modernising and automating the packaging: an early success for PlasmaFAIR!

As a result, tokamesh was in very good state when I started on it, and as such, there wasn’t a great deal for me to do. I expanded the test suite, not just increasing the coverage, but also increasing the specificity by adding tests for functions already covered indirectly through being called by other functions being directly tested.

This left one last interesting thing to tackle: the use of a C library, triangle. The existing way this was used was to compile triangle as an executable, write some input text files to disk, run triangle, and finally read in and parse some text files back to numpy arrays.

The downside to this approach is it relies on a particular hardcoded compiler (gcc) being available, and so is not terribly portable, and would probably not work on Apple or Windows machines. A secondary thing to be concerned about is reading and writing text files to disk can be quite slow, especially for large files. The solution is to use Cython to interface directly with the library, and distribute a pre-compiled library.

Cython-ising the triangle interface consists of four parts. The first part is a .pxd file, which is essentially a Python version of the C triangle.h header file, containing a definition of the struct used for input and output to the library, and the function signature of the entry point.

The second part is a .pyx file which provides the actual interface between the Python code and the C library. Here, instead of writing numpy arrays to text files, I pass them directly in memory to triangle (actually this not quite true: they need to be manipulated somewhat to get them into the correct memory layout, but once that’s done we can pass them by address). This ended up being the hardest bit to implement, mostly due to the triangle documentation being quite vague about exactly which parts of the input and output structures need allocating, and how they should be laid out in memory. Once I’d cracked that nut, the resulting code was shorter and cleaner. It also had the added advantage of avoiding truncation and rounding errors when reading/writing floating point numbers to/from disk!

The last two parts are related to the packaging. One reason I wanted to use Cython to interface to triangle was that it can compile the C code as part of the package. This requires a small addition to setup.py to list the Cython extensions to be built, their source files, libraries that need linking, and preprocessor macros. Now, when we run setup.py or pip install, the triangle library and our Cython interface are compiled automatically. The very last part is to automate this in CI.

Chris had already added the GitHub action from inference-tools to publish the package. I modified this to use cibuildwheel to build the package for Linux, Windows, and Mac simultaneously, and for multiple Python versions too (in addition to a “usual” source package). This means that users of tokamesh can install it without needed to compile the triangle library themselves.

With these changes, tokamesh is now more portable, more readable, and hopefully a little bit faster too!