Edifecs, Inc

lead Data Scientist for an innovative healthcare tech company

Edifecs builds software for health insurance companies, large provider networks, and US Medicaid and Medicare organizations. Their software is used by 31 of 52 state Medicaid programs and 25 of the 36 Blue plans. In the last few years, they’ve started a data science initiative, and I work as their lead Data Scientist. We use health data – medical claims, pharmacy claims, lab data, electronic health records, clinical text notes, etc. – and do really cool data science with it: opioid analytics, hospital readmission prediction, medication adherence, and more. Some of my projects at Edifecs: [Read More]

Downtown Emergency Service Center (DESC)

data science volunteer to help Seattle's homelessness crisis

Since the summer of 2017, I have been volunteering at DESC, an incredible local non-profit that runs 17 buildings throughout Seattle where they provide permanent supportive housing to the most vulnerable of Seattle’s homeless population. Their client database includes detailed information going back decades, including residency data as well as data from clinical programs (e.g. mental health, chemical dependency). I lead a team of two data science volunteers. Some of the projects we have worked on: [Read More]

Data4Democracy project lead

collaboration to analyze Indian health data

Data for Democracy is an international community of data professionals, trying to be helpful and bring about positive change. I am leading a team of 5-10 data scientists to analyze data coming from India’s National Family Health Survey. Conducted every ten years, these questionnaires provide rich datasets combining individual and family health, and household characteristics. We are mainly looking at women’s empowerment issues, for example trying to predict the sex of the reported head of the household from household characteristics. [Read More]

20 weeks of civic data analyses

Working weekly end-to-end through the DS pipeline, for good

Each week for 20 weeks, I worked through the data science pipeline with an interesting, culturally or politically relevant dataset, usually inspired by the news. Pose a question; find a dataset; clean it up; explore it to get a feel; build an ML model, or perform a statistical inference, or generate nice plots, or create an interactive Shiny app; communicate with a short report. The goal was to be quick and relevant. The project GitHub repo provides summaries, links to reports, and tags indicating which tools I used. [Read More]

Modulated Predator-Prey Dynamics

In a famous 2000 Science article, “Crossing the Hopf Bifurcation in a Live Predator-Prey System”, Fussmann et al. measure the population dynamics of a predator-prey system modulated by a third parameter. In this ongoing project, I conjecture that this dataset has interesting topology that can be detected using topological data analysis. [Read More]

Topological data analysis with simulated data

Topological Data Analysis (TDA) looks for interesting topological features of a dataset by measuring its “persistent homology”. Roughly speaking, the 0th dimensional homology counts how many clusters the data is in. The 1st dimensional homology counts how many circles or loops are in the data. The 2nd dimensional homology will detect and count bubbles, and higher dimensional homology will count higher-dimensional “bubbles”! [Read More]
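
To make the "0th dimensional homology counts clusters" idea concrete, here is a minimal sketch in plain Python. It is not the code from my reports: it assumes Euclidean distance and a Vietoris-Rips style filtration (two points are connected once the scale reaches their distance), and the names `h0_barcode` and `count_clusters` are made up for illustration. Every point starts as its own cluster at scale 0, and a bar "dies" whenever two clusters merge.

```python
import itertools
import math


def h0_barcode(points):
    """Return the 0-dimensional persistence bars (birth, death) of a point cloud.

    Illustrative sketch: processes pairwise distances in increasing order and
    merges clusters with a union-find structure (the same idea as Kruskal's
    minimum spanning tree algorithm).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two clusters merge at this scale, so one bar ends here.
            parent[root_i] = root_j
            bars.append((0.0, length))

    # One cluster survives at every scale.
    bars.append((0.0, math.inf))
    return bars


def count_clusters(bars, scale):
    """Number of clusters still alive at the given filtration scale."""
    return sum(1 for birth, death in bars if birth <= scale < death)
```

For a point cloud made of two well-separated blobs, `count_clusters` should report two clusters over a wide range of scales, which shows up in the barcode as two long bars.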

Interactive JavaPlexDemo with Barcodes

I wrote a Processing sketch to help demo and build intuition about how topological data analysis works (in two dimensions). It builds on the javaplexDemo.pde written by Mikael Vejdemo-Johansson. You can add and remove data points and see the persistent homology barcode generated and plotted in real time. You can save datasets and barcodes to compare them. It provides a nice way to visualize the filtration parameter. [Read More]

Rain and water level in Lake Superior

One of my first projects to apply topological data analysis to real-world data, this was a collaboration in 2016 with Josh Thompson at Northern Michigan University. We hunted for a minimal working example of how delayed oscillations can result in interesting topological features, and found it in datasets on precipitation and lake water levels around Lake Superior. [Read More]

Welcome.

This website will exhibit my data science portfolio. My data science projects involve machine learning on Indian population data, Shiny apps to explore biology data, various data visualizations of politically relevant data, and lots and lots of short reports on topological data analysis – finding the “shape of data”. [Read More]