My principal research interest is in developing data and information management systems that (1) help data scientists quickly build reliable data analytics pipelines and (2) help people navigate information pollution, using techniques at the intersection of natural language processing, machine learning, and databases.