Codon compiles Pythonic code into executables that support parallelism
April 25, 2023
Deployed System
Authors:
Article shepherded by:
Laura Nolan

I first learned about Codon by reading an article in MIT's campus news magazine [1] about a compiler for Python. The article's author, Rachel Gordon, stressed that Codon can compile Python scripts into native code that runs as fast as, and sometimes faster than, hand-crafted C/C++. But quotes from Ariya Shajii, one of Codon's two developers, mentioned that there were also limitations: some of the dynamic features of Python would not work, and not all of Python 3 was available.

I had read another article, Investigating Managed Language Runtime Performance [2], that explained why Python and JS (V8) are so much slower than C, and even slower than Java, another managed language. Managed languages have the advantage of being more secure, as they avoid the problems with pointers and strings that C/C++ programmers face. Java programs are compiled into bytecode, and the JDK's just-in-time (JIT) compiler turns that bytecode into fast machine code. JS is faster than Python because V8 also JIT-compiles its code into machine code, but the result is still slower than the code Java produces.

Both Python and JS share two performance limitations, as pointed out in the performance article [2] and the associated research [3]. The first is that every access to data or objects must be checked for its type before any operation can proceed. In Python, that checking imposes an overhead of over 40%.
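
As an illustration of that first limitation (plain CPython, not Codon; the function is just an example), the interpreter re-checks operand types on every call and dispatches accordingly:

# CPython checks operand types at run time before every operation,
# so the same '+' does different work depending on its arguments.
def add(a, b):
    return a + b

print(add(1, 2))        # integer addition -> 3
print(add("a", "b"))    # string concatenation -> "ab"
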
The second issue that both Python and JS share has to do with processing data in parallel. JS can have multiple threads, but only one thread executes at a time. Python has the global interpreter lock (GIL), which likewise limits a running Python process to a single thread at a time. These limitations do make programming easier, since they eliminate the possibility of concurrently modifying, and likely corrupting, the same data structure, but on modern multi-core CPUs, being confined to a single processor thread leaves most of the hardware idle.
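
To illustrate the second limitation (standard CPython threading; the workload is arbitrary): CPU-bound threads take turns holding the GIL, so adding threads adds no speed.

# Under CPython's GIL, these CPU-bound threads run one at a time, so four
# threads take roughly as long as one thread doing four times the work.
import threading

def burn(n: int) -> None:
    s = 0
    for i in range(n):
        s += i * i

threads = [threading.Thread(target=burn, args=(5000000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
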
Codon gets around all three of these issues: it's compiled, variables are typed at compile time, and it supports parallel execution. Codon is much faster than other managed languages, and in some cases faster than C/C++.
The Details

Codon sounds too good to be true: a version of Python that gets compiled into machine code and supports multiple threads of execution. But the "version of Python" part is an important point: the developers of Codon have built a compiler that accepts a large portion of Python, including the most commonly used parts, but not all of the language.

Codon got its start as Seq [4], a domain-specific language (DSL) written specifically for working with genomics data. A single sequenced genome consists of a five-gigabyte index and tens of gigabytes of hash tables, meaning that programs or scripts written to analyze genomes will always be working with enormous data sets. The authors of Seq had several goals:

  • Allow researchers, who are not programmers by training, to use Pythonic syntax
  • Support parallel execution
  • Include specific features helpful for working with genome sequences

I am going to quote the Seq paper here, as I think the authors did a fine job of explaining how they proceeded:

To achieve this, we designed a compiler with a static type system. It performs Python-style duck typing and runtime type checking at compile time, completely eliminating the substantial runtime overhead imposed by the reference Python implementation, CPython, and most other Python implementations alike. Unlike these, we reimplemented all of Python’s language features and built-in facilities from the ground up, completely independent of the CPython runtime. The Seq compiler uses an LLVM backend, and in general uses LLVM as a framework for performing general-purpose optimizations. Seq programs additionally use a lightweight (<200 LOC) runtime library for I/O and memory allocation; for the latter, CPython’s reference counting is replaced with the Boehm garbage collector, a widely-used conservative GC that is a drop-in replacement for malloc.

I want to expand on this a little, although you can learn a lot more by reading the Seq paper, or the later paper [4] that explains Codon in more detail.

Dynamic typing is very handy, and appropriate, for scripting languages like Python. Programmers can write prototypes of programs very quickly because Python abstracts away a lot of the detail in exchange for execution performance. And, for many purposes, you don't need Python to be ten or a hundred times faster when compiled because the script you are running is not processing enormous data files.

Duck typing means that the Codon compiler uses type hints found in the source, or deduces types from how values are used, and assigns each variable a static type at compile time. If you want to process data whose type is unknown before execution, this may not work for you, although Codon does support a union type as a possible workaround. In most cases of processing large data sets, the types are known in advance, so this is not an issue.
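
As a minimal sketch of what this looks like in practice (ordinary Python syntax; the function and values are just illustrative), Codon creates a separate, statically typed instantiation of a generic function for each argument type it encounters:

# Codon assigns static types at compile time: each call below produces its
# own statically typed instantiation of double().
def double(x):
    return x * 2

print(double(21))      # x inferred as int
print(double(1.5))     # x inferred as float
print(double("ab"))    # x inferred as str, printing "abab"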

Codon uses LLVM because of its flexibility and support for many platforms and types of hardware, which lets the Codon authors rely on it in the backend for code generation and general-purpose optimizations. Compilers begin by parsing the input file, using a set of rules to convert the code into an abstract syntax tree (AST). Later phases of the Codon compiler perform type checking, convert the AST into an intermediate representation (IR) that gets optimized, and finally produce machine code through LLVM.

Codon uses OpenMP, an API for shared-memory multiprocessing (https://openmp.org). Programmers using Codon mark the loops that are candidates for multiple threads with the @par decorator. @par accepts several parameters, similar to the pragmas used with OpenMP in C++, such as the scheduling policy, chunk size, and number of threads. You can find out more about Codon's multithreading in the documentation [5] and in [4].

// C++
#pragma omp parallel for schedule(dynamic, 10) num_threads(8)
for (int i = 0; i < N; i++)
    c[i] = a[i] + b[i];

# Codon
@par(schedule='dynamic', chunk_size=10, num_threads=8)
for i in range(N):
    c[i] = a[i] + b[i]

Codon's @par decorator can also be used alone, letting the compiler choose the parallelization parameters, or you can set them much as you would with OpenMP in C++ (see section 4.2 in [4]).
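
As a minimal sketch (the arrays here are made up for illustration), a bare @par that leaves those choices to the compiler looks like this:

# Codon: a bare @par lets the compiler pick the scheduling parameters.
N = 1000000
a = [float(i) for i in range(N)]
b = [float(i) for i in range(N)]
c = [0.0] * N

@par
for i in range(N):
    c[i] = a[i] + b[i]
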
Trials

I decided I would install Codon and try to use it. Installation is fairly easy for Linux and Mac users, including support for Apple silicon. At the time I wrote this article, Codon had not been ported to Windows, but Ibrahim Numanagić, one of the developers, said that the code doesn't have many system dependencies and should be easy to port.

I downloaded the Linux/Debian version and ran the Python setup.py script after installing two dependencies: Cython and astunparse. I wondered about the inclusion of Cython, a tool commonly used to wrap C/C++ libraries for use in Python scripts, and was told it would probably be removed in future versions.

Codon is not the same as Python, in that the developers have not yet implemented all the features you would find in Python 3.10, and this, along with duck typing, will likely cause problems if you simply try to compile existing scripts. I quickly ran into problems as I uncovered unsupported bits of Python, and, judging by the Issues section of their GitHub pages, so have other people.

Codon also supports a JIT feature: instead of attempting to compile a complete script, you can add a @codon.jit decorator to the functions that you think would benefit from being compiled or executed in parallel, making just those functions much faster to execute.
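
Here is a minimal sketch of that JIT mode (it assumes the codon Python module that accompanies Codon is installed so that import codon works; the function and numbers are just illustrative):

import codon

# Only this function is compiled by Codon; the rest of the script still runs under CPython.
@codon.jit
def sum_of_squares(n: int) -> int:
    s = 0
    for i in range(n):
        s += i * i
    return s

print(sum_of_squares(10000000))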

The developers of Codon have formed a company, Exaloop, to support the further development of Codon. They have also chosen a relatively restrictive license, the Business Source License, under which the software must be licensed for commercial use, although versions older than three years convert to an Apache license. Non-commercial users are welcome to experiment with Codon.
Conclusions

The developers of Codon have taken a unique approach to supporting Python, in that they have built a compiler and added optimizations that are not possible with other tools for Python. I had thought that CPython was also a compiler, and learned that CPython is the reference implementation of Python, with the 'C' meaning that it is written in C. NumPy is a math library for Python, and PyPy adds a form of JIT compilation to Python.

Deciding whether your projects will benefit from Codon will mean taking the time to read the documentation, because Codon is not exactly like Python: for example, it includes support for Nvidia GPUs, and I ran into a limitation when using a dictionary. I suspect that some potential users will appreciate that Codon takes Python as input and produces executables, making the distribution of code simpler while avoiding disclosure of the source. Codon, with its LLVM backend, also seems like a great solution for people wanting to use Python for embedded projects.

My uses of Python are much simpler: I can process millions of lines of nginx logs in seconds, so a reduction in execution time means little to me. I do think there will be others who can take full advantage of Codon.

Acknowledgements

I want to thank two of the developers of Codon, Ariya Shajii and Ibrahim Numanagić, for answering my many questions and reviewing a draft of this article for technical accuracy. The analysis and opinions are my own.

Appendix
References: