Provenance use case – News Aggregator

One purpose of information provenance is to give users information they can trust: it helps them decide whether a piece of information is trustworthy or not. Imagine that you come across contradictory information and are unsure of its quality. Tim Berners-Lee once wrote:

“At the toolbar (menu, whatever) associated with a document there is a button marked “Oh, yeah?”. You press it when you lose that feeling of trust. It says to the Web, ‘so how do I know I can trust this information?’. The software then goes directly or indirectly back to meta-information about the document, which suggests a number of reasons.” – T. Berners-Lee, Web Design Issues, September 1997.

Besides helping to establish trust, information provenance can also support the integration of information from diverse sources, help give credit to the originator of information, and help deal with the licensing of information.

Let’s have a quick look at the News Aggregator scenario, a use case provided by W3C. In this scenario, a News Aggregator website assembles news items from many different sources (such as news sites, blogs, and tweets). Here, PROV can help the News Aggregator with verification, credit, and licensing. This is achieved by examining the process that created, processed, and delivered the information (i.e., the news item), and all of it can be done automatically. It helps determine whether a web document or resource can be used, based on the original source of the content, the licensing information associated with the resource, and any usage restrictions on the content.

For example, the News Aggregator finds that #Panda is a trending topic on Twitter, with the tweet “#panda being moved from Chicago Zoo to Florida! Stop it from sweating”. Many people have retweeted it across many platforms. The News Aggregator would like to find the whistleblower who first got the word out.
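Finding the first poster can be pictured as walking a chain of derivation links backwards. The following is a minimal Python sketch under stated assumptions, not how the W3C scenario or any PROV tool implements it: the tweet identifiers and the `derived_from` mapping are hypothetical, loosely standing in for provenance records such as PROV's wasDerivedFrom relation.

```python
# Hypothetical provenance records: each retweet points to the tweet it was
# derived from (loosely modelled on PROV's wasDerivedFrom relation).
derived_from = {
    "tweet:retweet-3": "tweet:retweet-1",
    "tweet:retweet-2": "tweet:original",
    "tweet:retweet-1": "tweet:original",
}


def find_originator(item, derived_from):
    """Follow derivation links back until an item with no recorded source."""
    seen = set()
    while item in derived_from:
        if item in seen:
            # Guard against inconsistent records that loop back on themselves.
            raise ValueError(f"cycle in provenance records at {item!r}")
        seen.add(item)
        item = derived_from[item]
    return item


print(find_originator("tweet:retweet-3", derived_from))  # tweet:original
```

In this sketch the "originator" is simply the first item for which no earlier source has been recorded, so the answer is only as good as the provenance records themselves.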

In the process of finding the originator of the tweet, the News Aggregator finds a website protesting the panda's move. Next, the News Aggregator wants to find the organization responsible for the website, so that its name can be shown next to the text the News Aggregator runs. For that text, the News Aggregator would also like to know whether it is original or was quoted from another site. In addition, the News Aggregator would like to automatically use a thumbnail version of the image on its site; therefore, it needs to determine whether the image license has expired, and whether it allows this use at all. Using all the information it could find about the content, it creates a new aggregated news item.
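The thumbnail decision comes down to two checks: the license must still be valid, and it must permit a derived version of the image. Here is a rough Python sketch of that rule; the `ImageLicense` class, its fields, and the example values are all illustrative assumptions rather than PROV terms or a real licensing vocabulary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ImageLicense:
    # All fields are illustrative assumptions, not PROV terms.
    holder: str
    allows_derivatives: bool          # a thumbnail is a derived image
    expires: Optional[date] = None    # None means no expiry date recorded


def may_use_thumbnail(lic: ImageLicense, today: date) -> bool:
    """True if the license is still valid and permits derived versions."""
    not_expired = lic.expires is None or today <= lic.expires
    return not_expired and lic.allows_derivatives


# Hypothetical license record attached to the image.
lic = ImageLicense("Example Photo Agency", allows_derivatives=True,
                   expires=date(2012, 12, 31))
print(may_use_thumbnail(lic, date(2012, 6, 1)))   # True
print(may_use_thumbnail(lic, date(2013, 1, 1)))   # False
```

A real aggregator would read these facts from the licensing metadata linked to the image's provenance instead of hard-coding them.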

By using PROV, the News Aggregator can hope to produce more trusted content, attract new business, avoid legal issues, and acknowledge the right people so that they receive credit.

