Scalable Access and Integration of Statistical Data for Digital Government

msra(2001)

Abstract
The massive amount of statistical and text data available from government agencies poses daunting challenges for both the research and analysis communities. These challenges include heterogeneity, size, distribution, and control of terminology. At the Digital Government Research Center (www.dgrc.org) we are investigating solutions to these key problems. In this paper we focus on two of them: scalable data integration across multiple databases and web sources, and ontology construction and mapping for terminology standardization. This collaboration between researchers from the Information Sciences Institute of the University of Southern California and the Department of Computer Science of Columbia University employs technology developed at both locations, in particular the SIMS multi-database access planner (AKH96, AK00), the SENSUS ontology (KL94, SPKR96, H98), and the LEXING automated dictionary and terminology analysis system (KM00, KW01). Our application targets gasoline data from the Bureau of Labor Statistics, the Energy Information Administration of the Department of Energy, the Census Bureau, and other government agencies (see (AAPH+01) for an overview of the project).
Keywords
heterogeneous data sources, efficient query processing, data warehousing, ontology construction, web wrappers, information integration, data integrity