Leveraging Data and People to Accelerate Data Science

2017 IEEE 33rd International Conference on Data Engineering (ICDE), 2017

Abstract
Doing data science - extracting insight by analyzing data - is not easy. Data science is used to answer interesting questions that typically involve multiple diverse data sources, many different types of analysis, and often, large and messy data volumes. To answer one of these questions, several types of expertise may be needed to understand the context and domain being served, to import and transform individual data sets, to implement effective machine learning and/or statistical methods, to design and program applications and interfaces to extract and share data and insights, and to manage the data and systems used for analysis and storage. In the IBM Research Accelerated Discovery Lab, we are studying how data scientists work, and using what we learn to help them gain insights faster. In this talk, we will look at what we have learned to date, through user studies and experience with tens of analytics projects, and the environment that we've built as a result. In particular, I will describe how we capture information to enable contextual search, provenance queries, and other functionality to afford teams faster progress in data-intensive investigations. I will also touch on our efforts to leverage data and people to explain what happens during an investigation, with an ultimate goal of moving from descriptive to prescriptive analytics in order to accelerate data science and the analytic process. I will illustrate these various efforts using an ambitious current project on applying metagenomics to food safety, and will conclude with a discussion of where more work is needed and our future directions.
Keywords
data leveraging, data science acceleration, insight extraction, data analysis, machine learning, statistical methods, data extraction, data management, IBM Research Accelerated Discovery Lab, analytics projects, information capturing, contextual search, provenance queries, data-intensive investigations, descriptive analytics, prescriptive analytics, data analytic process, metagenomics, food safety