Supporting exploratory data analysis with live programming

2015 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) (2015)

Abstract
Data scientists often conduct exploratory data analysis in scripting environments with a read-eval-print loop (REPL), like R, IPython or MATLAB. This user experience requires diligent management of execution and generates lengthy histories of unwanted command responses. This paper explores the alternative of live programming, a user experience in which the user's edits immediately and automatically update the script results, a “ripple” effect familiar from spreadsheets. Which user experience provides better support for exploratory data analysis, REPL or ripple? We conducted a controlled lab study with 15 data-experienced professionals. Each participant explored four datasets, two in each experience. The REPL sessions left histories with both significantly more data results and significantly more errors than the live sessions. However, both experiences produced comparable numbers of data results that participants self-rated as insightful. Participants largely preferred the live experience for its responsiveness and ability to keep the script content clean, but missed the visible history that a REPL provides.
Keywords
Programming environments, read-eval-print loop (REPL), command loop, live programming, data analysis, data mining, data science