IBEX: Harvesting Entities from the Web Using Unique Identifiers
WebDB, pp. 13-19, 2015.
In this paper, we study the prevalence of unique entity identifiers on the Web. These include ISBNs (for books), GTINs (for commercial products), DOIs (for documents), and email addresses, among others. We show how these identifiers can be harvested systematically from Web pages, and how they can be associated with human-readable names for the...
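As a rough illustration of what harvesting one identifier type from page text might involve, the sketch below finds 13-digit candidates and keeps those that pass the standard GTIN-13 / ISBN-13 check-digit test. It is not taken from the paper: the names `CANDIDATE_13`, `has_valid_gtin13_check_digit`, and `harvest_gtin13` are hypothetical, and the assumption that identifiers appear as unbroken digit runs (no hyphens or spaces) is a simplification.

```python
import re
from typing import List

# Assumption: identifiers appear as exactly 13 ASCII digits that are not
# part of a longer digit run. Hyphenated or spaced forms are not handled.
CANDIDATE_13 = re.compile(r"(?<!\d)\d{13}(?!\d)", re.ASCII)


def has_valid_gtin13_check_digit(digits: str) -> bool:
    """Return True if a 13-digit string passes the GTIN-13 / ISBN-13 checksum."""
    if len(digits) != 13 or not digits.isdigit():
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The check digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10 == int(digits[12])


def harvest_gtin13(text: str) -> List[str]:
    """Return checksum-valid 13-digit candidates in order of appearance."""
    return [m for m in CANDIDATE_13.findall(text) if has_valid_gtin13_check_digit(m)]
```

The checksum filter discards most random 13-digit numbers, but a passing candidate is only plausibly an identifier; associating it with a name would still require the surrounding page context.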