machine learning, probabilistic inference, data science, online education
New: Check out the TechCrunch article on some of my recent deep learning projects at Google Research.
New: If you work at the intersection of signal processing or machine learning and education, please consider submitting to the IEEE Special Issue on Signal Processing and Machine Learning for Education and Human Learning at Scale! Here is the Call for Papers; the deadline is October 1st, 2016.
New: Astrophotography excursion: here's a panoramic photo that Avneesh Sud, Nathan Silberman and I took of the Milky Way arch off Hole-in-the-Wall trail in Mojave National Preserve. The galactic center lies to the right with thunderstorms covering the light domes of Vegas to the left and Bullhead City in the center, which were aglow with dramatic lightning flashes. I'm also quite proud of the timelapse that we created the same night :) Best viewed full screen at 1080p.
New: Here's a panoramic photo that I took recently of Queenstown, New Zealand.
Check out our visualization of 40,000 Octave/Matlab implementations of linear regression! This is part of my ongoing Codewebs project for analyzing and providing detailed feedback to students in a programming-based MOOC, with Chris Piech, Andy Nguyen, and Leo Guibas. The data comes from Andrew Ng's course on Machine Learning.
Also check out Ben Lorica's blog post, and Hal Hodson's article at New Scientist about our work!
I am a research scientist at Google working on machine learning, computer vision and NLP projects. Prior to Google, I was a postdoctoral fellow in the Computer Science Department at Stanford University, supported by an NSF/CRA CI (Computing Innovations) fellowship. At Stanford I was a member of the Geometric Computation Group, which is headed by Leonidas Guibas. I was also part of the Lytics Lab, a multidisciplinary group focused on Learning Analytics.
I received a Ph.D. in Robotics from the School of Computer Science at Carnegie Mellon University in 2011, where I worked with Carlos Guestrin. During graduate school, I was fortunate enough to spend two happy summers interning in Seattle, first with Intel Research working with Ali Rahimi, then at Microsoft Research working with Ashish Kapoor.
Before coming to CMU, I studied math, also at Stanford University. Before Stanford, I attended Oakton High School in Vienna, Virginia, and for a time, Lynbrook High School in San Jose, California.
My research interests in wordle form. The right wordle is generated from my most recent publications on online education and the left wordle is generated from my work on probabilistic inference and learning with combinatorially structured data.
I am interested in theoretical and applied problems in machine learning. My main interests lie in designing computationally efficient probabilistic reasoning and learning algorithms that allow computers to deal with the uncertainty and complexity inherent in real-world data. My work has focused on applications whose mathematical abstractions involve probabilistic reasoning with combinatorially structured objects such as matchings, rankings, and trees. These problems are challenging both statistically and computationally due to structural constraints (like mutual exclusivity), which cause interactions between objects that traditional techniques in machine learning have been ill-equipped to handle. Portions of my work thus address:
While dedicated to pushing on core research problems, I am also committed to problems with real-world applications and impact. My past work has contributed solutions to a variety of applications such as predicting preferences over webpages and in political elections, tracking with camera networks, and reconstructing temporal orderings of events (such as the onset of symptoms in neurodegenerative diseases) from noisy and incomplete data.
I now focus most of my energies on applications with educational impact. The recent surge in popularity of massive open online courses (MOOCs), with platforms such as Coursera and edX, has made it possible for almost anyone to take free university courses. However, while new technologies allow for scalable content delivery, we remain limited in our ability to scalably evaluate and give feedback on open-ended assignments. I approach these challenges fundamentally as machine learning (ML) problems, in which we can leverage the massive datasets now collected by online learning platforms. My work has thus focused on ML-driven education and has contributed algorithms for giving feedback in MOOCs via crowdsourcing or semi-automated methods.
Note: Every now and then, Tomasz and I get emails from people about this code. While we're always happy to help out, I would like to point out that we wrote this code many years ago. Nowadays it is much more popular (and effective) to use collapsed samplers or online algorithms rather than the mean-field variational EM algorithm that was proposed in the first LDA paper.
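For readers wondering what a collapsed sampler looks like, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. It is not the code mentioned in the note above, just an illustration. The function name `collapsed_gibbs_lda`, the symmetric Dirichlet hyperparameters `alpha` and `beta`, and the toy corpus are all assumptions made for this example.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, num_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of documents, each a list of integer word ids
          in [0, vocab_size).
    alpha, beta: assumed symmetric Dirichlet hyperparameters.
    Returns (doc_topic, topic_word, assignments).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # n[d][k]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n[k][w]
    topic_total = [0] * num_topics                              # n[k]
    assignments = []

    # Start from a uniformly random topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(num_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k_old = assignments[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1

                # Conditional over topics with the topic and word
                # distributions integrated out (unnormalized).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1

    return doc_topic, topic_word, assignments


# Toy corpus over a 6-word vocabulary (made up for illustration).
toy_docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3], [0, 2, 1, 1], [4, 5, 5, 3]]
doc_topic, topic_word, assignments = collapsed_gibbs_lda(
    toy_docs, num_topics=2, vocab_size=6, num_iters=50)
```

The sampler only keeps count tables and per-token topic assignments; the per-document topic proportions and per-topic word distributions are never represented explicitly, which is what "collapsed" refers to. Estimates of both can be read off the returned counts after smoothing with `alpha` and `beta`.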