Linked Data Provenance: State of the Art and Challenges

Anam, S., Kang, B.H., Kim, Y.S. and Liu, Q.

    Linked Open Data (LOD) is rapidly emerging as a way to publish and share structured data over the Semantic Web using URIs and RDF in many application domains, such as fisheries, health, environment, education and agriculture. Because different schemas with the same semantics appear in different datasets of the LOD Cloud, the problem of managing semantic heterogeneity among schemas is growing. Schema-level mapping among the datasets of the LOD Cloud is necessary because instance-level mapping among the datasets is not feasible for making knowledge discovery easy and systematic. To interpret query results over the integrated dataset correctly, schema-level mapping provenance is necessary. In this paper, we review existing approaches to linked data provenance representation, storage and querying, as well as applications of linked data provenance, where mapping is at the instance level. This analysis helps reveal open research problems in linked data provenance where mapping is at the schema level. Furthermore, we explain how schema-level mapping provenance in linked data can be used to facilitate data integration and data mining, and to ensure quality and trust in data.
Cite as: Anam, S., Kang, B.H., Kim, Y.S. and Liu, Q. (2015). Linked Data Provenance: State of the Art and Challenges. In Proc. 3rd Australasian Web Conference (AWC 2015) Sydney, Australia. CRPIT, 166. Davis, J. G. and Bozzon, A. Eds., ACS. 19-28