Projects and Coding


A flexible, easy-to-integrate tool for tracking users on web applications, pawprint leverages pandas and SQLAlchemy to read, write, and filter user events quickly, allowing rapid analysis without constraining the user to a particular level of granularity or framework. It implements a light query language inspired by the Django ORM.
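To give a flavour of what a Django-inspired query language looks like, here is a minimal sketch of `field__lookup=value` filtering over a list of events. It is not pawprint's actual API: `filter_events`, the `LOOKUPS` table, and the event fields are all hypothetical names chosen for illustration, and plain dictionaries stand in for the pandas and SQLAlchemy layers.

```python
import operator

# Hypothetical lookup suffixes, modelled on Django's field lookups.
LOOKUPS = {
    "exact": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "in": lambda value, options: value in options,
    "contains": lambda value, fragment: fragment in value,
}


def filter_events(events, **conditions):
    """Return the events (dicts) that satisfy every field__lookup=value condition."""
    predicates = []
    for key, expected in conditions.items():
        # "timestamp__gt" splits into ("timestamp", "gt"); a bare "event" means "exact".
        field, _, lookup = key.partition("__")
        predicates.append((field, LOOKUPS[lookup or "exact"], expected))
    return [
        event
        for event in events
        if all(compare(event[field], expected) for field, compare, expected in predicates)
    ]


# Made-up events, purely for illustration.
events = [
    {"user_id": "alice", "event": "login", "timestamp": 1},
    {"user_id": "bob", "event": "purchase", "timestamp": 5},
    {"user_id": "alice", "event": "purchase", "timestamp": 9},
]

recent_purchases = filter_events(events, event="purchase", timestamp__gt=3)
```

The appeal of this style is that a filter reads as a set of keyword arguments, so simple queries need no SQL and no explicit boolean masks.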


A package for simulating stochastic dynamical models, scotch implements algorithms for continuous-time Markov chains in Python. It is a work-in-progress collaboration with Ruthie Birger; we hope to implement more advanced simulation algorithms, such as adaptive timestepping, as well as trace sampling for calculating summary statistics and parameter inference for stochastic models. Simulation algorithms will eventually be Cythonised or rewritten in C++. Pull requests are welcome!
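As an illustration of what simulating a continuous-time Markov chain involves, here is a minimal sketch of one standard exact method, Gillespie's direct method, applied to a toy SIR epidemic model. It is not taken from scotch and makes no claim about scotch's API: the `gillespie` function, the transition format, and the rate constants are assumptions made for this example.

```python
import random


def gillespie(initial_state, transitions, t_max, seed=None):
    """Simulate one trajectory of a continuous-time Markov chain.

    `transitions` is a list of (rate_function, state_change) pairs:
    rate_function(state) returns a non-negative rate, and state_change
    maps variable names to integer increments applied when the event fires.
    """
    rng = random.Random(seed)
    state = dict(initial_state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        rates = [rate(state) for rate, _ in transitions]
        total = sum(rates)
        if total <= 0:
            break  # absorbing state: no event can fire
        # The waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Pick which event fires, with probability proportional to its rate.
        _, change = rng.choices(transitions, weights=rates, k=1)[0]
        for name, delta in change.items():
            state[name] += delta
        trajectory.append((t, dict(state)))
    return trajectory


# Toy SIR model; beta and gamma are arbitrary illustrative values.
beta, gamma = 0.3, 0.1
sir = [
    # Infection: S + I -> 2I
    (lambda s: beta * s["S"] * s["I"] / (s["S"] + s["I"] + s["R"]), {"S": -1, "I": 1}),
    # Recovery: I -> R
    (lambda s: gamma * s["I"], {"I": -1, "R": 1}),
]

path = gillespie({"S": 99, "I": 1, "R": 0}, sir, t_max=100.0, seed=1)
```

Each entry of `path` is a `(time, state)` pair recorded at an event, so the trajectory is a piecewise-constant function of time. Every event costs one iteration of the Python loop, which is why moving the inner loop to Cython or C++ is attractive.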


Manatee is a wrapper around PySpark's DataFrames that makes them more pandas-like. I started it as a learning exercise for PySpark, and to make basic data manipulation tasks less painful when coming from pandas. It's currently in alpha, but I'm hoping to push it to a first release in the near future.


Pronounced like the French spirit génépi, genepy is a Python package for interactive sequence alignment and construction of phylogenetic trees, built on top of BioPython. Currently, it's simply a wrapper around ClustalO and PhyML aimed at facilitating visualisation and interactive work with short sequences. Pull requests absolutely welcome.


QED (Quantitative Evaluation of Distortion) is a MATLAB package for extracting quantitative information about the distortion in the ommatidia of Drosophila melanogaster from SEM images. It has a full GUI and provides an intuitive way to analyse images fairly rapidly. Why fly eyes? Drosophila is a fantastic model organism for genetics, and some phenotypes can be seen in the eye. This was the case for the tau and shaggy genes that we studied during my Master's. This package could potentially be applied to other organs or organisms, and extended to compute other measures of distortion.

Princeton University Python Community

I'm one of the founders of the Princeton University Python Community, an interdisciplinary community of Python enthusiasts of all levels at Princeton University. I design and deliver Python for Scientific Computing, an introductory course working through numpy, scipy, matplotlib, and pandas. Together with Paul Gauthier, I also organise biweekly sessions where members meet to demo packages or present concepts and programming ideas. We currently have 250 members, from undergraduates to faculty, across more than twenty departments. We operate in affiliation with the Princeton Institute for Computational Science and Engineering.