His work centers on modeling, managing, and mining data, especially graph and text data. His contributions span data mining, database systems, natural language processing, and their applications in interdisciplinary areas such as bioinformatics. His work has been cited extensively, with over 15,000 citations according to Google Scholar, and his software has been downloaded thousands of times. He received the NSF CAREER Award, the IBM Invention Achievement Award, the ACM-SIGMOD Dissertation Runner-Up Award, and the IEEE ICDM 10-year Highest Impact Paper Award.