• USA-2014

    For contributions to provenance management research and technology, and computational reproducibility.

My research is in the general area of data management. I work both on the development of core data management technologies and on applications that use these technologies. This gives me the opportunity to work with folks in cool areas such as Environmental Sciences, Physics, Ornithology, Networking and Graphics/Visualization. Problems I am currently working on or have worked on include: provenance management and analytics; large-scale data analysis and visualization; and retrieval, mining, querying and visualization of structured Web data.

Reproducibility in Science. We are building infrastructure to simplify the creation, review and sharing of computational experiments.

VisTrails. VisTrails is an open-source data analysis and visualization system. It captures detailed provenance for the data exploration process and uses this information to streamline the creation, execution, and sharing of computational processes (aka workflows, dataflows, pipelines), which are widely used to construct visualizations and to perform data analysis and mining.

Provenance Analytics.

BirdVis: Visualizing Geo-Temporal Data. BirdVis is an interactive visualization system that supports the analysis of spatio-temporal bird distribution models.

Finding and Querying Structured Data on the Web. In this project, we have addressed the problem of large-scale information integration to enable on-the-fly queries over structured Web data.

Uncovering Hidden Web Data. Our goal in this project is to develop a scalable infrastructure that automates, to a large extent, the process of discovering, organizing, and extracting data from hidden-Web sources. We have built DeepPeep, a new search engine specialized in Web forms. For more details about this project, see