

CPython Compiler Tools



Parsers

Ply

PLY is an implementation of lex and yacc parsing tools for Python.

LALR(1)
PyParsing

The pyparsing module is an alternative approach to creating and executing simple grammars, vs. the traditional lex/yacc approach, or the use of regular expressions. The pyparsing module provides a library of classes that client code uses to construct the grammar directly in Python code.

LL(1)
Parsimonious

Parsimonious aims to be the fastest arbitrary-lookahead parser written in pure Python. It's based on parsing expression grammars (PEGs), which means you feed it a simplified sort of EBNF notation. Parsimonious was designed to undergird a MediaWiki parser that wouldn't take 5 seconds or a GB of RAM to do one page.

PEG
funcparserlib

funcparserlib is a parser combinator library.

LL(*)
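
To illustrate what "parser combinator" means, here is a minimal sketch in plain Python. It does not use funcparserlib's API. The helper names `token`, `seq`, `alt`, and `many` are hypothetical and exist only for this example. Each parser is a function that takes the input text and a position and returns a value plus the new position, or None on failure. Combinators build larger parsers out of smaller ones.

```python
# A parser is a function: (text, pos) -> (value, new_pos), or None on failure.
# All names here are hypothetical; this is not funcparserlib's API.

def token(literal):
    """Parser that matches one literal string."""
    def parse(text, pos):
        if text.startswith(literal, pos):
            return literal, pos + len(literal)
        return None
    return parse

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def alt(*parsers):
    """Try each parser in order and return the first success."""
    def parse(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser):
    """Apply a parser zero or more times and collect the values."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:  # stop on failure or empty match
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# Grammar built purely by composing functions.
digit = alt(*[token(d) for d in "0123456789"])
number = many(digit)
addition = seq(number, token("+"), number)

parsed = addition("12+34", 0)
rejected = addition("12-34", 0)
```

Here `parsed` holds the matched pieces together with the end position, and `rejected` is None because the input has no `+` after the first number.
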
Spark

SPARK stands for the Scanning, Parsing, and Rewriting Kit. It formerly had no name, and was referred to as the "little language framework."

Earley
pgen2

pgen2 is a pure Python implementation of the Python parser generator, pgen. It forms the basis for Mython, the extensible variant of the Python programming language.

LL(1)
pgenmodule.c

Much like the parser module exposes the Python parser, pgenmodule.c exposes pgen, the parser generator used to create the Python parser, to Python itself. Proposed in PEP 269.

ANTLR

ANTLR is a Java parser generator framework that can emit Python parsers.

LL(1+)

Syntax Definition

ASDL

The Zephyr Abstract Syntax Description Language (ASDL) is a language designed to describe the tree-like data structures in compilers. Its main goal is to provide a method for compiler components written in different languages to interoperate. ASDL makes it easier for applications written in a variety of programming languages to communicate complex recursive data structures.
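
As a rough illustration of the kind of tree-like structure ASDL describes, the sketch below hand-writes Python classes for a tiny, hypothetical ASDL-style definition (shown in the comment). The `Num`, `Add`, and `evaluate` names are invented for this example. The last line inspects CPython's own `ast` module, whose node classes are generated from an ASDL description.

```python
import ast
from dataclasses import dataclass
from typing import Union

# Hypothetical ASDL-style definition mirrored by the classes below:
#     expr = Num(int value) | Add(expr left, expr right)

@dataclass
class Num:
    value: int

@dataclass
class Add:
    left: "Expr"
    right: "Expr"

# The sum type: an expression is one of the constructors above.
Expr = Union[Num, Add]

def evaluate(node):
    """Walk the tree recursively, one case per constructor."""
    if isinstance(node, Num):
        return node.value
    return evaluate(node.left) + evaluate(node.right)

tree = Add(Num(1), Add(Num(2), Num(3)))
total = evaluate(tree)

# CPython's AST nodes expose the fields declared for them.
binop_fields = ast.BinOp._fields
```
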

asdl_py

Metaprogramming

Mython

Mython is an extensible variant of the Python programming language. Mython makes Python extensible by adding two things: a parametric quotation statement and compile-time metaprogramming. The parametric quotation statement is simply syntactic sugar for saying "run some function on this embedded string". Compile-time metaprogramming allows you to evaluate that function on the embedded string at compile time. This gives you added choice, both in terms of what your code looks like and when you want to evaluate that code.

Basil

Basil is a metaprogramming framework and playground for Python variants.

Cog

Cog is a Python source generation library. Cog transforms files in a very simple way: it finds chunks of Python code embedded in them, executes the Python code, and inserts its output back into the original file. The file can contain whatever text you like around the Python code.
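
The transformation described above can be sketched with the standard library alone. This is not Cog's implementation, and the `@@begin`, `@@output`, and `@@end` markers are hypothetical stand-ins chosen for this example; Cog has its own marker syntax. The hypothetical `expand` function finds each embedded chunk, runs it, and replaces whatever sat in the output region with what the chunk printed.

```python
import contextlib
import io
import re

# Hypothetical markers for this sketch only.
BLOCK = re.compile(r"(@@begin\n)(.*?)(@@output\n)(.*?)(@@end\n)", re.DOTALL)

def expand(text):
    """Run every embedded chunk and splice its printed output into the text."""
    def run_chunk(match):
        begin, code, output_marker, _stale, end = match.groups()
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # executes arbitrary code: trusted input only
        return begin + code + output_marker + buffer.getvalue() + end
    return BLOCK.sub(run_chunk, text)

source = (
    "Squares:\n"
    "@@begin\n"
    "for n in range(3):\n"
    "    print(n * n)\n"
    "@@output\n"
    "stale text\n"
    "@@end\n"
    "Done.\n"
)

expanded = expand(source)
```

Because the generating code stays in the file, running `expand` on its own output produces the same text again.
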

Code Generation

LLVMPy

llvmpy is a Python wrapper around the LLVM C++ library which allows simple access to compiler tools.

LLVM
llvm-cbuilder

llvm-cbuilder is a Python DSL for constructing higher-level LLVM logic.

LLVM
cgen

C/C++ source generation from an AST for CUDA and OpenCL.

C C++
CodePy

CodePy is a C/C++ metaprogramming toolkit for Python. It handles two aspects of native-code metaprogramming: generating C/C++ source code, and compiling this source code and dynamically loading it into the Python interpreter.

C C++

Compilers

Cython

The Cython language is a superset of the Python language that additionally supports calling C functions and declaring C types on variables and class attributes. This allows the compiler to generate very efficient C code from Cython code. The C code is generated once and then compiles with all major C/C++ compilers.

Target: C
Theano

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently and with transparent use of a GPU.

Target: C++ Target: CUDA
Numba

Numba is a NumPy-aware dynamic compiler for Python. It creates LLVM bitcode from Python syntax and then creates a wrapper around that bitcode to call from Python.

Target: LLVM
NumbaPro

NumbaPro is a proprietary Continuum Analytics product that compiles NumPy expressions to native code with support for parallel execution on multiple cores and GPU hardware. NumbaPro also comes with CUDA Python which supports CUDA programming with Python syntax.

Target: LLVM Target: CUDA PTX
Copperhead

Copperhead is a project to bring data parallelism to Python. Copperhead defines a small functional, data parallel subset of Python, which is then dynamically compiled and executed on parallel platforms. Currently, Copperhead targets NVIDIA GPUs, as well as multicore CPUs through OpenMP and Threading Building Blocks (TBB).

Target: C++
Shedskin

Shed Skin is an experimental compiler that can translate pure, but implicitly statically typed, Python programs into optimized C++. It can generate stand-alone programs or extension modules that can be imported and used in larger Python programs.

Target: C++
Parakeet

Parakeet is a runtime compiler for numerical Python. It creates specialized versions of a function for distinct input types and translates array expressions and NumPy library calls into data parallel operators. The current backend uses LLVM but GPU support is in the works.

Target: LLVM
LLPython

The primary goal of the llpython package is to provide a Python dialect/subset that maps directly to LLVM code.

Target: LLVM
Nuitka

Right now Nuitka is a good replacement for the Python interpreter and compiles every construct that CPython 2.6 and 2.7 offer. It translates the Python into a C++ program that then uses "libpython" to execute in the same way as CPython does, in a very compatible way.

Target: C++
ocl

ocl is a minimalist library that dynamically (at run time) converts decorated Python functions into C99, OpenCL, or JavaScript. In the C99 case, it also uses distutils to compile the functions to machine language and allow you to run the compiled ones instead of the interpreted ones. In the OpenCL case you can run the compiled ones using pyOpenCL.

Target: C Target: JavaScript
pythran

Pythran is a Python-to-C++ compiler for a subset of the Python language. It takes a Python module annotated with a few interface descriptions and turns it into a native Python module with the same interface, but (hopefully) faster.

Target: C++11

Interpreters

MyPy

The mypy programming language is an experimental Python variant that aims to combine the benefits of dynamic (or "duck") typing and static typing.

VM: Alore
PyPy

PyPy is a fast, compliant alternative implementation of the Python language supporting a variety of language extensions and code generation paths.

VM: PyPy runtime
tinypy

TinyPy is a minimalist implementation of Python in 64k of code.

VM: Custom

VMs

Byterun

Byterun is a pure-Python implementation of a Python bytecode execution virtual machine.

Python
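
To give a feel for what a bytecode execution virtual machine does, here is a toy stack-machine loop. It is only a sketch: the instruction set (`PUSH`, `LOAD`, `STORE`, `ADD`, `MUL`, `RETURN`) is invented for this example and is not CPython's bytecode, and the code is unrelated to Byterun's actual implementation.

```python
def run(program, variables=None):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    env = dict(variables or {})
    pc = 0  # index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":        # push a constant
            stack.append(arg)
        elif op == "LOAD":      # push the value of a named variable
            stack.append(env[arg])
        elif op == "STORE":     # pop a value into a named variable
            env[arg] = stack.pop()
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "RETURN":    # stop and hand back the top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    return None

# Hypothetical program computing x * 2 + 1.
program = [
    ("LOAD", "x"),
    ("PUSH", 2),
    ("MUL", None),
    ("PUSH", 1),
    ("ADD", None),
    ("RETURN", None),
]

result = run(program, {"x": 20})
```
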
falcon

Falcon is an extension module for Python which implements an optimized, register-machine-based interpreter inside of your interpreter.

C++

GPU Interfaces

PyOpenCL

PyOpenCL lets you access the OpenCL parallel computation API from Python.

OpenCL
PyCUDA

PyCUDA lets you access Nvidia's CUDA parallel computation API from Python.

CUDA

Bytecode Utilities

BytecodeAssembler

BytecodeAssembler is a simple bytecode assembler module that handles most low-level bytecode generation details like jump offsets, stack size tracking, line number table generation, constant and variable name index tracking, etc. That way, you can focus your attention on the desired semantics of your bytecode instead of on these mechanical issues.

Byteplay

Byteplay lets you convert Python code objects into equivalent objects which are easy to play with, and lets you convert those objects back into living Python code objects. It's useful for applying crazy transformations on Python functions, and is also useful in learning Python byte code intricacies.
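
The round trip described above (code object, to something editable, back to a live code object) can be approximated with the standard library. The sketch below uses only `dis` and `types`, not Byteplay's API; the `greet` and `patched` names are made up for the example. It lists a function's opcodes, swaps one constant in its code object, and wraps the edited code in a new function.

```python
import dis
import types

def greet():
    return "hello"

# Inspect: opcode names the compiler produced (these vary by Python version).
opnames = [instruction.opname for instruction in dis.get_instructions(greet)]

# Transform: build a new code object with one constant replaced.
old_code = greet.__code__
new_consts = tuple(
    "goodbye" if const == "hello" else const for const in old_code.co_consts
)
new_code = old_code.replace(co_consts=new_consts)

# Rebuild: wrap the edited code object in a callable function.
patched = types.FunctionType(new_code, greet.__globals__, "patched")

original_result = greet()
patched_result = patched()
```

`code.replace` requires Python 3.8 or later. Editing the instruction stream itself is much more version-sensitive.
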

Unwind

Unwind provides a universal disassembler that is able to disassemble *.pyc files from both Python 2 and Python 3.

Maynard

Maynard is a Python bytecode disassembler/assembler as well as a variety of utilities for working with Python bytecode.

AST Utilities

codegen.py

codegen.py is a small script to translate Python AST to Python source.
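
For comparison, recent Python versions (3.9 and later) ship the same AST-to-source capability in the standard library as `ast.unparse`. The sketch below uses that function rather than codegen.py itself; the sample source string is arbitrary.

```python
import ast

source = "total = price * (1 + tax)"

# Source -> AST -> source again.
tree = ast.parse(source)
regenerated = ast.unparse(tree)

# The regenerated text may differ in formatting, but it parses to the same tree.
same_tree = ast.dump(ast.parse(regenerated)) == ast.dump(tree)
```
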

Meta

A pure-Python module containing a framework to manipulate and analyze Python ASTs and bytecode.

astoptimizer

astoptimizer is an optimizer for Python code working on the Abstract Syntax Tree (AST, high-level representation). It does as much work as possible at compile time.
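
One classic example of doing work at compile time is constant folding. The sketch below is not astoptimizer's code. It is a minimal `ast.NodeTransformer`, with the hypothetical name `ConstantFolder`, that folds `+`, `-`, and `*` on numeric literals and leaves everything else alone. It assumes Python 3.9 or later for `ast.unparse`.

```python
import ast
import operator

# Only a few operators are folded in this sketch.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        fold = FOLDABLE.get(type(node.op))
        if (
            fold is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            value = fold(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node

tree = ast.parse("seconds = 60 * 60 * 24 + offset")
folded = ast.fix_missing_locations(ConstantFolder().visit(tree))
optimized_source = ast.unparse(folded)

# The folded tree still compiles and runs like the original.
namespace = {"offset": 5}
exec(compile(folded, "<folded>", "exec"), namespace)
```

The literal product is computed once during the rewrite, while the addition involving the variable `offset` is left for run time.
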

Type Utilities

RPython

RPython is a restricted subset of Python that is amenable to static analysis. RPython is a core part of the PyPy compiler infrastructure.

python-typelanguage

Python-typelanguage provides a type language for communicating about Python programs and values: humans communicating with other humans, humans communicating with the computer, and even the computer communicating with humans (via type inference and run-time contract checking).

Ad-hoc Local
python-type-inference

Python-type-inference is a Hindley-Milner type inference engine for Python with an OCaml implementation.

Hindley-Milner Global Local
starkiller

Optimization and Rewriting

PyRewrite

Pyrewrite aims to be a small term rewrite library written in pure Python. It is heavily inspired by the StrategoXT project and intended for rewriting ATerm-like expression grammars.

Strategic Combinators
strategies

Strategies is a library for control flow programming with higher-order functions that loosely resembles the Stratego language.

Strategic Combinators
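
As a rough sketch of the strategic-combinator style, the code below treats a strategy as a function from a term to a rewritten term, or None on failure, and builds traversals out of higher-order functions. Terms are plain nested tuples. All names (`attempt`, `choice`, `bottom_up`, `add_zero`, `mul_one`) are hypothetical and are not taken from the PyRewrite or strategies APIs.

```python
# A strategy maps a term to a new term, or returns None when it does not apply.
# Terms are nested tuples such as ("add", "x", 0).

def add_zero(term):
    """Rule: x + 0 -> x"""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "add" and term[2] == 0:
        return term[1]
    return None

def mul_one(term):
    """Rule: x * 1 -> x"""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "mul" and term[2] == 1:
        return term[1]
    return None

def choice(first, second):
    """Try the first strategy; fall back to the second if it fails."""
    def apply(term):
        result = first(term)
        return second(term) if result is None else result
    return apply

def attempt(strategy):
    """Never fail: return the term unchanged when the strategy fails."""
    def apply(term):
        result = strategy(term)
        return term if result is None else result
    return apply

def bottom_up(strategy):
    """Rewrite the children first, then the node itself."""
    def apply(term):
        if isinstance(term, tuple):
            head, *args = term
            term = (head, *[apply(arg) for arg in args])
        return strategy(term)
    return apply

simplify = bottom_up(attempt(choice(add_zero, mul_one)))

simplified = simplify(("add", ("mul", "x", 1), 0))
unchanged = simplify(("add", "y", 2))
```

The rewrite rules stay separate from the traversal order, so the same rules could be reused with a different traversal combinator.
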

Control Flow

LLPython has 0-CFA analysis for a subset of Python bytecode.

0-CFA

Static Analysis

PyLint

Pylint is a Python static checker.

Language Tools

bitey

Bitey is an LLVM import tool and ctypes wrapper.

nobitey

nobitey is a tool to load LLVM compiled bitcode and autogenerate a ctypes binding.

PyCParser

PyCParser is a C99-compatible parser written in pure Python, capable of parsing C source files and C header files.

Six

Six is a Python 2-3 compatibility layer that offers a variety of compatibility mappings for language-level features including AST, parsing, and bytecode.

Other Language Implementations


Lispy

Lispy is a Scheme interpreter in Python.
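
To show how small the core of such an interpreter can be, here is a minimal sketch of a Scheme-like evaluator. It is not Lispy's source. It supports only integers, symbols, `if`, `define`, `lambda`, and a handful of arithmetic primitives, and the function names (`tokenize`, `parse`, `evaluate`) are chosen for this example.

```python
import operator

def tokenize(source):
    """Split source text into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Consume tokens from the front and build a nested list expression."""
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return expr
    try:
        return int(token)
    except ValueError:
        return token  # anything else is a symbol

GLOBAL_ENV = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "<": operator.lt,
}

def evaluate(expr, env):
    if isinstance(expr, str):   # symbol lookup
        return env[expr]
    if isinstance(expr, int):   # literal
        return expr
    head = expr[0]
    if head == "if":
        _, test, then, otherwise = expr
        return evaluate(then if evaluate(test, env) else otherwise, env)
    if head == "define":
        _, name, value = expr
        env[name] = evaluate(value, env)
        return None
    if head == "lambda":
        _, params, body = expr
        # Parameters are bound in a copy of the defining environment at call time.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    func = evaluate(head, env)  # ordinary function application
    args = [evaluate(arg, env) for arg in expr[1:]]
    return func(*args)

env = dict(GLOBAL_ENV)
evaluate(parse(tokenize(
    "(define fact (lambda (n) (if (< n 2) 1 (* n (fact (- n 1))))))"
)), env)
factorial_of_5 = evaluate(parse(tokenize("(fact 5)")), env)
```
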

Bob Scheme

Bob is a suite of implementations of the Scheme language in Python.

Mini-C

Mini-C is a compiler for a subset of the C programming language written in Python.

Retroforth

Retro is a concatenative, stack-based language with roots in Forth.