Projects

Check GitHub (and previously Bitbucket) for an accurate picture of what I’ve been working on.

Regardless, here’s a quick summary:

Pythological is a GitHub organization providing packages that bring logic/relational programming and symbolic mathematics together in Python.

More specifically, Pythological hosts more advanced Python implementations of unification and miniKanren (e.g. with constraint logic programming capabilities and non-stack-bound recursion).
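For readers unfamiliar with unification, here is a minimal sketch of the idea in plain Python. It is not the API of any Pythological package: the `Var` class and the `walk` and `unify` functions are hypothetical names chosen for illustration, the substitution is a plain dict, there is no occurs check, and the recursion is ordinary (stack-bound) Python recursion.

```python
class Var:
    """A logic variable (hypothetical helper for this sketch)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"~{self.name}"


def walk(term, subst):
    """Follow bindings until reaching a non-variable or an unbound variable."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def unify(u, v, subst=None):
    """Return a substitution that makes u and v equal, or None if none exists."""
    subst = {} if subst is None else subst
    u, v = walk(u, subst), walk(v, subst)
    if u is v:
        return subst
    if isinstance(u, Var):
        return {**subst, u: v}
    if isinstance(v, Var):
        return {**subst, v: u}
    both_seqs = isinstance(u, (list, tuple)) and type(u) is type(v)
    if both_seqs and len(u) == len(v):
        # Unify element by element, threading the substitution through.
        for a, b in zip(u, v):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if u == v else None


x, y = Var("x"), Var("y")
s = unify((x, 2, [y]), (1, 2, [x]))
print(walk(x, s), walk(y, s))  # prints "1 1"
```

A miniKanren implementation builds its goals and streams of results on top of a unifier like this one.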

A Python package with tools for the symbolic manipulation of the graphs created by PyMC.

These tools are designed to help automate the mathematics used in statistics and probability theory, especially the kind employed to produce specialized MCMC routines.

Hy is a wonderful dialect of Lisp that’s embedded in Python.

I’m a core developer on this project.

An implementation of microKanren with constraints in Hy.

hsplus is a Python library (with R bindings) that provides estimates for quantities involving the Horseshoe and Horseshoe+ shrinkage priors. It also contains general numerical estimation procedures for bivariate confluent hypergeometric functions, as well as symbolic SymPy implementations.

amimodels is a Python library that provides core implementations of models designed for use with Advanced Metering Infrastructure (AMI) data in eemeter. The implementations are fundamentally Bayesian state-space and mixture models that automatically account for systematic changes, missing data and varied observation frequencies. The models and custom MCMC estimation methods are written in PyMC2 and, as such, are easily extensible.

Bus Time is the open-source Java suite that provides real-time bus tracking to NYC. I designed and developed the statistical inference capabilities and helped build the production service components. The model handles free and constrained location tracking along street networks, inference for unobserved operational states (e.g. in layover, at a stop, in progress) and path-based states (e.g. current trip, route, run), as well as inference for faulty operator input (e.g. operator ids, sign codes).

In production the model handles real-time updates at ~30 second intervals for hundreds of routes and thousands of buses simultaneously. Its statistical specification is Bayesian and its estimation is performed by a custom particle filter.
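To give a sense of what a particle filter does, here is a minimal sketch of a generic bootstrap particle filter in Python. It is not the Bus Time model: it assumes a toy one-dimensional random-walk state observed with Gaussian noise, and the function name `bootstrap_filter` and all of its parameters are hypothetical.

```python
import math
import random


def bootstrap_filter(observations, n_particles=500, process_sd=1.0,
                     obs_sd=1.0, seed=0):
    """Return the filtered posterior mean of the state after each observation."""
    rng = random.Random(seed)
    # Assumed diffuse initial belief about the state.
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Propagate: move each particle through the random-walk dynamics.
        particles = [p + rng.gauss(0.0, process_sd) for p in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((y - p) / obs_sd) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # Every likelihood underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        means.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


estimates = bootstrap_filter([0.5, 1.0, 1.5, 2.0, 2.5])
print(estimates[-1])
```

The propagate, weight and resample steps are what make this a bootstrap filter; a production filter would replace the toy dynamics and likelihood with its own model.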

prox-methods is a very experimental R package with C++ implementations (via Rcpp) for some of the proximal optimization methods from the paper “Proximal Algorithms in Statistics and Machine Learning”.
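As an illustration of what a proximal method looks like, here is a minimal Python sketch. It does not reproduce the package's code or any particular algorithm from the paper: it uses the standard fact that the proximal operator of `lam * |x|` is soft-thresholding, and applies proximal gradient steps to the toy scalar problem `0.5 * (x - b)**2 + lam * |x|`. The names `soft_threshold` and `proximal_gradient` are hypothetical.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrink x toward zero by lam."""
    return max(x - lam, 0.0) - max(-x - lam, 0.0)


def proximal_gradient(b, lam, step=0.5, n_iter=50, x0=0.0):
    """Minimize 0.5 * (x - b)**2 + lam * |x| by proximal gradient descent."""
    x = x0
    for _ in range(n_iter):
        grad = x - b  # gradient of the smooth part, 0.5 * (x - b)**2
        # Gradient step on the smooth part, then prox step on the penalty.
        x = soft_threshold(x - step * grad, step * lam)
    return x


print(soft_threshold(3.0, 1.0))  # prints 2.0
print(proximal_gradient(b=3.0, lam=1.0))
```

For this toy problem the minimizer is `soft_threshold(b, lam)` itself, so the iteration can be checked against the closed form.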

open-tracking-tools is an open-source vehicle tracking library that implements custom Particle Filters to infer locations, paths and on/off-road states. Given a transit graph, open-tracking-tools provides robust real-time Bayesian inference for noisy GPS data. OpenTripPlanner graph support is built in, so street information encoded in OpenStreetMap can be used with fairly minimal effort.

Extensions to the Cognitive Foundry API including, but not limited to, specialized distributions, sampling techniques, and numerically stable computations for Dynamic Linear Models.

ParticleBayes is an R package implementing a collection of particle filters for hierarchical Bayesian models that perform sequential parameter estimation.

Java code for Bayesian models that are estimated by Particle Filters and implement parameter learning.

Big data simulation of Chicago’s public transportation to improve transit planning and reduce bus crowding.

An energy analytics tool to make commercial buildings more energy efficient.