In the last post (PROV-DM Case Study (part 2)), we added some roles to our graph and demonstrated a revision of the dataset. Now, imagine that someone has quoted some text from the article, and we assert in the provenance that this text (exb:quoteInBlogEntry-20130326) is quoted from the article. The newspaper also anticipated that there could be revisions to the article, so it created an identifier for the article in general (exn:article), a URI that redirects to the first version of the article (exn:articleV1), allowing both to be referred to as entities in provenance data. Finally, after a new dataset arrived and a new chart was built, a new article, exn:articleV2, was created. The entities exn:articleV1 and exn:articleV2 are also related. Continue reading
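As a sketch, the quotation, specialization, and revision relations described above can be written in PROV-N (PROV-DM expresses quotation and revision as typed derivations; the exb and exn prefixes are the case study's namespaces):

```provn
entity(exn:article)
entity(exn:articleV1)
entity(exn:articleV2)
entity(exb:quoteInBlogEntry-20130326)

// quotation: the blog text was quoted from the first version of the article
wasDerivedFrom(exb:quoteInBlogEntry-20130326, exn:articleV1, [prov:type='prov:Quotation'])

// both versions are specializations of the general article URI
specializationOf(exn:articleV1, exn:article)
specializationOf(exn:articleV2, exn:article)

// the new article is a revision of the first version
wasDerivedFrom(exn:articleV2, exn:articleV1, [prov:type='prov:Revision'])
```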
After looking at PROV-DM (the PROV Data Model), let’s try to apply it through a case study. The case study is taken from the W3C: an online newspaper publishes an article with a chart about crime statistics, based on data (GovData) provided by a government portal. The article includes a chart based on the data, with data values composed (aggregated) by geographical region. From that description, we can derive five entities:
• The article (exn:article),
• An original dataset (exg:dataset1),
• A list of regions (exc:regionList),
• Data aggregated by region (exc:composition1),
• A chart (exc:chart1). Continue reading
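As a sketch, the five entities above can be declared in PROV-N. The namespace URIs here are placeholders, and the two derivations (the composition from the dataset, the chart from the composition) simply anticipate how the entities relate in the case study:

```provn
document
  prefix exn <http://example.org/newspaper#>
  prefix exg <http://example.org/government#>
  prefix exc <http://example.org/chartgenerator#>

  entity(exn:article)        // the published article
  entity(exg:dataset1)       // the original GovData dataset
  entity(exc:regionList)     // the list of geographical regions
  entity(exc:composition1)   // data aggregated by region
  entity(exc:chart1)         // the chart included in the article

  // how the entities relate (elaborated in the case study)
  wasDerivedFrom(exc:composition1, exg:dataset1)
  wasDerivedFrom(exc:chart1, exc:composition1)
endDocument
```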
PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. The following diagram provides a high-level overview of the structure of PROV records.
One purpose of information provenance is to provide users with trustworthy information. It helps users determine whether a piece of information can be trusted or not. Imagine that you find contradictory information and hesitate about its quality. Tim Berners-Lee once wrote:
“At the toolbar (menu, whatever) associated with a document there is a button marked “Oh, yeah?”. You press it when you lose that feeling of trust. It says to the Web, ‘so how do I know I can trust this information?’. The software then goes directly or indirectly back to meta-information about the document, which suggests a number of reasons.” – T. Berners-Lee, Web Design Issues, September 1997.
Provenance can be defined as information about the entities, activities, and people involved in producing a piece of data or a thing, which can be used to form assessments about its quality, reliability, or trustworthiness. Information provenance needs to be spread widely across the internet and across multiple information systems. In other words, information provenance is a key to open information. Spreading provenance is important because it can elevate the trustworthiness of the information it describes.
The past decades have seen rapid development in information technology, particularly in the growth of the world’s data. Research by scientists at the University of California, Berkeley in 2003 revealed that from the beginning of human history until 2002, the world had produced 5 exabytes (5 billion gigabytes) of new information. Surprisingly, by 2011 we were able to create the same amount of data within only 2 days, and by 2013 within only 10 minutes. Clearly, data is growing exponentially (Lyman and Varian, 2000).
To apply the Grounded Theory method, the data received from the interviews were processed in several steps: (i) transcribing, (ii) coding, (iii) memoing, (iv) insight refinement, and (v) saturation analysis (Adolph, Hall, & Kruchten, 2011). Using Grounded Theory, the author tries to gain new perspectives and insights by exploring the data received from the interviews (Batlajery, 2013).
According to Burke (2007), a recommender system can be defined as a “personalized information agent that provides recommendations: suggestions for items likely to be of use to a user“. Similarly, Van Setten, Pokraev, and Koolwaaij (2004) described recommender systems as systems capable of helping people quickly and easily find their way through large amounts of information by determining what is of interest to a user.