PROV – Overview

Provenance can be defined as information about the entities, activities, and people involved in producing a piece of data or a thing, which can be used to assess its quality, reliability, or trustworthiness. Information provenance is needed so that provenance can be shared widely across the Internet and across multiple information systems. In other words, information provenance is a key to open information. Sharing provenance is important because it can increase the trustworthiness of the information.

PROV, in turn, can be seen as a framework for provenance. According to the W3C, PROV is a specification for expressing provenance records, which contain descriptions of the entities and activities involved in producing and delivering or otherwise influencing a given object.

Imagine that you are given a piece of information and have questions such as:
•Who created that content (author/attribution)?
•Was the content ever manipulated and, if so, by what processes/entities?
•Who is providing that content (repository)?
•What is the timeliness of that content?
•Can any of the answers to these questions be verified (e.g. e-signatures)?

The PROV framework can help you answer those questions. By building on existing technologies such as RDF and XML, it lets the provenance of information be tracked so that the validity of that information can be checked.
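To make this more concrete, here is a minimal sketch in Python of how a provenance record could answer the first two questions. It is not an official PROV implementation: the `ProvRecord` class, its method names, and the `ex:` identifiers are all made up for illustration, and the statements are only written in the style of PROV-N (with optional arguments left out) rather than validated against the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ProvRecord:
    """Hypothetical in-memory store of PROV-style statements (illustration only)."""
    statements: list = field(default_factory=list)

    def add(self, relation, *args):
        # Store each statement as (relation name, tuple of identifiers).
        self.statements.append((relation, args))

    def attributed_to(self, entity):
        # "Who created that content?" -> agents the entity is attributed to.
        return [args[1] for rel, args in self.statements
                if rel == "wasAttributedTo" and args[0] == entity]

    def derived_from(self, entity):
        # "Was the content ever manipulated?" -> entities it was derived from.
        return [args[1] for rel, args in self.statements
                if rel == "wasDerivedFrom" and args[0] == entity]

    def to_provn(self):
        # Render each statement in a PROV-N-like textual form.
        return [f"{rel}({', '.join(args)})" for rel, args in self.statements]


# Assumed example data: Alice edits a draft into a report.
record = ProvRecord()
record.add("entity", "ex:draft")
record.add("entity", "ex:report")
record.add("agent", "ex:alice")
record.add("activity", "ex:editing")
record.add("used", "ex:editing", "ex:draft")
record.add("wasGeneratedBy", "ex:report", "ex:editing")
record.add("wasAssociatedWith", "ex:editing", "ex:alice")
record.add("wasAttributedTo", "ex:report", "ex:alice")
record.add("wasDerivedFrom", "ex:report", "ex:draft")

print(record.attributed_to("ex:report"))
print(record.derived_from("ex:report"))
for line in record.to_provn():
    print(line)
```

The relation names (`entity`, `agent`, `activity`, `used`, `wasGeneratedBy`, `wasAssociatedWith`, `wasAttributedTo`, `wasDerivedFrom`) come from the PROV vocabulary. A real application would more likely rely on an existing PROV library and one of the standard serializations.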

The PROV family consists of 12 documents that explain the various aspects of information provenance. Those documents are:

  1. PROV-OVERVIEW: an overview of the PROV family of documents.
  2. PROV-PRIMER: a primer for the PROV data model.
  3. PROV-O: the PROV ontology.
  4. PROV-DM: the PROV data model, which defines the common vocabulary used to describe provenance.
  5. PROV-N: a notation for provenance.
  6. PROV-CONSTRAINTS: a set of constraints that define valid provenance.
  7. PROV-XML: an XML Schema for the PROV data model.
  8. PROV-AQ: mechanisms for accessing and querying provenance.
  9. PROV-DICTIONARY: introduces a specific type of collection, consisting of key-entity pairs.
  10. PROV-DC: provides a mapping between PROV-O and Dublin Core Terms.
  11. PROV-SEM: a declarative specification of the PROV data model in terms of first-order logic.
  12. PROV-LINKS: introduces a mechanism to link across bundles.

They are depicted below.

PROV Framework

Information provenance can be applied across many areas, such as:
•Open information systems (such as the Web) -> Making judgments about which web content to trust.
•Business practices -> Manufacturing processes and providers of a given product.
•Science applications -> How new results were obtained: from assumptions to conclusions and everything in between.
•Laws for IP and privacy protection -> Licensing and attribution of a document/software that combines permissions and rights of text, images, etc.; privacy of information as well as of its provenance.

