Resume

(Also available as a PDF. This page is generated by parsing the LaTeX source for the PDF with a bunch of wild regexes. If you see any conversion errors, please let me know.)

Experienced in big data platforms and pipelines in Spark, Hadoop MapReduce, and Scalding; large-scale, zero-downtime data and service migrations; scaling services to support rapid user growth; monorepo build tools like Bazel and Pants; developer experience; geolocation problems of all sorts; and high-QPS full-text indexing and retrieval.

Experience

Foursquare, Staff Engineer

New York, NY, 2011–2024

Big Data

Authored dozens of terabyte- to petabyte-scale, production-critical pipelines in Spark, Scalding, and Hadoop MapReduce. Performed zero-downtime upgrade of large monorepo with hundreds of pipelines from AWS EMR 5 to 6, Hadoop 2 to 3, Spark 2 to 3, and many transitive dependencies. Wrote a compatibility layer to enable the migration of hundreds of pipeline jobs to Airflow from a legacy workflow scheduling system. Led migration of data pipelines from on-prem to AWS EMR.

Backend Engineering

Migrated our core data collection from MongoDB to a custom, hybrid solution with ⅓ the operational cost. Built numerous core features of the Foursquare and Swarm apps enjoyed by millions of users, including the Stickers achievement system, the weekly Leaderboard, the Year in Review, a personalized Trivia system, and social sharing features.

DevEx

Migrated a 2.5M LOC monorepo used by 100 developers to the Bazel build system, lowering CI build times from 3 hours to 20 minutes. Upgraded 2M LOC Scala codebase from Scala 2.11 to 2.12. Led upgrade of 200k LOC Python codebase from Python 2 to 3. Enabled type checking in several Python repos and refactored them to an error-free state. Experienced with Docker, Jenkins, GitHub Actions, and AWS CodeBuild.

Search

Built and maintained the point-of-interest search service used by Instagram, Uber, and others for several years, and scaled the service from 2k QPS to 40k QPS. Designed and built ML-based ranking that improved precision@1 from 50% to 80%. Performed zero-downtime migration of search datastore from Solr to Elasticsearch.

ExpanDrive, Founding Engineer

Cambridge, MA, 2007–2010

Codeveloper

Built Strongspace, an online file sharing and backup service written in Ruby on Rails. Implemented billing, user management, and a web-based file browser. Primary Objective-C programmer of ExpanDrive, a remote file system over FTP / SFTP for macOS.

Publications

Shaw, Blake, Jon Shea, Siddhartha Sinha, and Andrew Hogue. “Learning to rank for spatiotemporal search.” In Proceedings of the sixth ACM international conference on Web search and data mining, pp. 717–726. 2013. dl.acm.org/doi/abs/10.1145/2433396.2433485

Education

B.A., Physics, Dartmouth College, Hanover, NH, 2003

M.Sc., Solar-Terrestrial Physics (uncompleted), Thayer School of Engineering, Hanover, NH, 2006