1-P1-Semantic Web_short-with-RDF_ used.pdf
Kasetsart University
KE: Semantic Web
Hutchatai Chanlekha
Department of Computer Engineering, Kasetsart University
Email: [email protected]
Knowledge Representation

Web 1.0: Web of documents
- The WWW was invented by Sir Tim Berners-Lee in 1989.
- The key technology: the hyperlink.
- Information in the form of web documents is connected through hyperlinks.
- A user can click on a link and immediately go to the document identified by that link.
- The great advantage of Web 1.0 is that it abstracts away the tedious physical layer.

Building a web site
- Site editors surf the Web for new facts and update the site manually.
- Without constant checking, the site soon gets out of date.
Let's change the approach ...
- Editors roam the Web for new data published on Web sites.
- They "scrape" the sites with a program to extract the information.
- A change in a web site's structure might result in incorrect extraction.
How about this approach ...
- Editors roam the Web for new data via APIs.
- They write some code to incorporate the new data.

What about APIs?
- Various data sources also expose their data via Web Services.
- However, each has a different API, a different logic, and a different structure.
- To use these data, we are forced to reinvent the wheel many times because there is no standard way of doing things.
- We want to extend the current Web with a standard way of building a "Web of data".

What would we like to have?
- Use the data on the Web the same way as we do with documents:
  - Be able to link to data (independently of their presentation).
  - Use that data the way we want (present it, mine it, etc.).
  - Agents, programs, scripts, etc. should be able to interpret part of that data.
- One possible way is to extend the current Web to a "Web of data".
- Allow applications to exploit the data directly.

Data on the Web
- There are more and more data on the Web: government data, health-related data, general knowledge, company information, flight information, restaurants, ...
- More and more applications rely on the availability of that data.
- But data are often in isolation, in "silos".

Data on the Web is not enough ...
- We need a proper infrastructure for a real Web of Data:
  - Data is available on the Web, accessible via standard Web technologies.
  - Data are interlinked over the Web, i.e., data can be integrated over the Web.
- This is where Semantic Web technologies come in.

https://www.w3.org/2017/12/odi-study/

Connecting Data
- Instead of having URLs (or URIs) of documents, we will have URIs of facts or data.
- Links between data.
Image from http://www.cambridgesemantics.com/semantic-university/introduction-semantic-web

Building the Data Web
- Build the Data Web by connecting data silos.
- The advantage of the Semantic Web is that it abstracts away the tedious document and application layers.
Image from http://www.cambridgesemantics.com/semantic-university/introduction-semantic-web

Web of Data
- In the Web of Data, we should be able to publish the data to make it known on the Web.
- The approach is analogous to documents: give URIs to the data.
- Make it possible to "link" to that URI from other sources of data (not only Web pages).

Web 3.0: Connecting Data
- The idea of Web 3.0 is to connect the data.
- The Semantic Web:
  - A means of sharing and connecting data among systems and documents.
  - Specific data elements can be referenced between documents.
- The Semantic Web emerged in the late 1990s. It was envisioned as a web of data that could be easily interpreted by both humans and machines.

Link of data
[Figure: a graph of interlinked concepts, among them Plant, Animal, Mammal, Elephant, African Elephant, Gorilla, Pig, Cell, DNA, Genome, Enterprise, Restaurant, Vegetarian restaurant, Hotel, Airline, Airport, Vacation, Earth, Continent, Asia, Europe, Africa, India, China, Ceylon, Angola, Zambia, Egypt, Alexandria, Memphis, Alexander the Great, Aristotle, Lao Tse, Philosophy]
From slide "Introduction to Ontology Application Development", Dr. Marut Buranarach

Link for Data
- In the traditional web, a web link has a "context" that a person may use for understanding.
- But machines can't make sense of the link alone.
- To enable machine understanding, extra information (e.g. a label) must be added to a link:
  - It should be machine readable.
  - It should characterize both the link and its target.
- What we need for a Web of Data:
  - Use URIs to publish data instead of full documents.
  - Allow the data to link to other data.
  - Characterize the data and the links to convey some extra meaning.
  - Use standards for all of these.

Standards that apply to the Semantic Web
The Semantic Web consists primarily of three technical standards:
- RDF (Resource Description Framework): the data modeling language for the Semantic Web. All Semantic Web information is stored and represented in RDF.
- SPARQL (SPARQL Protocol and RDF Query Language): the query language of the Semantic Web. It is specifically designed to query data across various systems.
- OWL (Web Ontology Language): the schema language of the Semantic Web. OWL enables you to define concepts composably so that these concepts can be reused as much and as often as possible.

Why these standards
What makes each of those standards unique:
- RDF: The graph nature of RDF means that it is open-ended by nature, so new data and new relationships can always be added.
- SPARQL: The distributed nature of queries across data sources requires extremely flexible and powerful JOIN-like and dynamic translation capabilities.
- OWL: This language is descriptive, so ontologies are independent of the data that they describe, unlike a traditional database schema, which directly determines the data it describes.
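
The open-ended graph model and the JOIN-like querying described above can be illustrated with a minimal sketch. It is not RDF or SPARQL tooling: the store is a plain Python set of (subject, predicate, object) tuples, and the names `match`, `located_in_a`, `located_in`, `borders`, and `is_a` are hypothetical, chosen only for this illustration. The facts are the lecture's Angola and Zambia example.

```python
# Hypothetical toy triple store: a set of (subject, predicate, object) tuples.
# Real systems would use an RDF library and SPARQL instead.
triples = {
    ("Angola", "located_in", "Africa"),
    ("Zambia", "located_in", "Africa"),
    ("Angola", "borders", "Zambia"),
}


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return {
        (ts, tp, to)
        for (ts, tp, to) in store
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    }


# Open-endedness: a new kind of relationship is added without changing
# any schema or any of the existing triples.
triples.add(("Africa", "is_a", "Continent"))


def located_in_a(store, kind):
    """JOIN-like query: every x with (x located_in y) and (y is_a kind)."""
    return sorted(
        subject
        for (subject, _, place) in match(store, p="located_in")
        if (place, "is_a", kind) in store
    )


countries_in_continents = located_in_a(triples, "Continent")
```

The second pattern in `located_in_a` reuses the value bound by the first one, which is the same idea as joining two triple patterns on a shared variable in a query language.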
Semantic Web Landscape
Clip: https://cambridgesemantics.com/blog/semantic-university/intro-semantic-web/semantic-web-landscape/

RDF: RESOURCE DESCRIPTION FRAMEWORK

Knowledge: Angola and Zambia are located in Africa. Angola shares a border with Zambia.
[Figure: graph with the nodes Angola, Zambia, Africa, Country, Continent]
From slide "Introduction to Ontology Application Development", Dr. Marut Buranarach

RDF: Resource Description Framework
- Standard model for data interchange on the Web.
- The data is a set of triples; each triple consists of a subject, a predicate, and an object.
- Example: "Angola is located in Africa."
  - Subject: Angola
  - Predicate: Located_in
  - Object: Africa
  - As a graph: Angola --Located_in--> Africa
- We need to refer to the objects by using URIs.

URIs
- URIs are "Uniform Resource Identifiers".
- IRI: Unicode-based "Internationalized Resource Identifiers".
- Every URI identifies one entity.
- Semantic Web URIs usually use HTTP (HyperText Transfer Protocol).
  - Ideally, they can be resolved to get more data (linked data).
- Example URI: http://semanticweb.org/id/Africa (namespace followed by local name).
- QName (Qualified Name) as an abbreviation: Thing:Africa (prefix standing for the namespace, followed by the local name).
From slide "Introduction to Ontology Application Development", Dr. Marut Buranarach

[Figure: the Angola graph with URIs. Nodes: http://ontoworld.org/id/Angola, http://ontoworld.org/id/Zambia, http://ontoworld.org/id/Africa, http://ontoworld.org/id/Category:Country, http://ontoworld.org/id/Category:Continent. Edge labels: Located in, Borders. Also shown: http://www.w3.org/1999/02/22-rdf-syntax-ns#type and http://www.w3.org/2000/01/rdf-schema#label, with the labels Angola, Zambia, Africa, Country, Continent.]
From slide "Introduction to Ontology Application Development", Dr. Marut Buranarach

[Figure: the same graph and URIs with Thai labels: แองโกลา (Angola), แซมเบีย (Zambia), แอฟริกา (Africa), ประเทศ (Country), ทวีป (Continent), ตั้งอยู่ (located in), ชายแดน (borders).]
From slide "Introduction to Ontology Application Development", Dr. Marut Buranarach

Applying the Semantic Web: Two Camps
Two perspectives on applying the Semantic Web:
- First viewpoint: Semantic Web as the future of AI
  - Using Semantic Web technologies to enable machines to infer new facts from existing facts and data.
- Second viewpoint: Semantic Web as a data model
  - Focuses more on the flexibility of the data model.
  - It is easy for Semantic Web systems to incorporate new facts as needed, including new kinds of data not anticipated at the beginning.

Example Semantic Web Applications
- Biogen Idec for supply chain management [link]
- BBC for media management [link] [link]
- Chevron for data integration in oil and gas [link] [link]
- Best Buy [link]

Examples of Semantic Web data published on the WWW
- Friend-of-a-Friend (FOAF) project (http://www.foaf-project.org/)
- Bio2RDF (http://bio2rdf.org/)
- NeuroCommons project (http://neurocommons.org/)
- DBpedia (http://dbpedia.org/)

Usage of Semantic Web Technology
The vision of the Semantic Web led to widely adopted concepts:
- Schema.org: Millions of web pages are now tagged with semantic annotations using Schema.org. These annotations enhance web search experiences by providing context and meaning to the content. Example: https://schema.org/Country
- Linked Open Data (LOD): A cloud of interlinked structured datasets published across thousands of servers without centralized control.
- Knowledge graphs (KG): KGs came later, but quickly became a powerful driver for the adoption of Semantic Web standards and of the semantic technologies implementing them.

Knowledge Graph (KG)
- The term "Knowledge Graph" was popularized by Google in 2012.
- Google embraced semantic technology because crawling and categorizing content on the web is a very hard problem to solve without semantics and metadata.
- This, together with the widespread adoption of schema.org, marked the beginning of the meteoric rise of graph technology and knowledge graphs.
Ref: https://yearofthegraph.xyz/knowledge-graphs/

Knowledge Graph
- A knowledge graph is a graph-based knowledge representation that captures entities and their relationships. It is a vast interconnected network of information.
- In a knowledge graph:
  - Nodes represent entities (things).
  - Directed labeled edges represent relationships between connected entities.
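
As a rough sketch of the definition above, a knowledge graph can be held as a list of directed labeled edges plus two lookup indexes. The data is the lecture's film example (Tom Hanks, Robert Zemeckis, Cast Away); the names `edges`, `outgoing`, `incoming`, and `people_linked_to` are hypothetical and belong to no particular graph library.

```python
from collections import defaultdict

# Each edge is (source entity, relationship label, target entity).
edges = [
    ("Tom Hanks", "acted_in", "Cast Away"),
    ("Robert Zemeckis", "directed", "Cast Away"),
]

# Index edges in both directions so that a node can be explored
# from either end of a relationship.
outgoing = defaultdict(list)  # node -> [(label, target), ...]
incoming = defaultdict(list)  # node -> [(label, source), ...]
for source, label, target in edges:
    outgoing[source].append((label, target))
    incoming[target].append((label, source))


def people_linked_to(film):
    """Return every entity with an edge pointing at the given film."""
    return sorted(source for (_, source) in incoming[film])
```

Keeping the label on every edge is what separates this from a plain link graph: the same two nodes could be connected by both `acted_in` and `directed`, and a program can tell the two relationships apart.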
[Figure: Tom Hanks --acted_in--> Cast Away; Robert Zemeckis --directed--> Cast Away]

Knowledge Graph and Ontology
- An ontology is a structured way to represent concepts and their relationships.
- An ontology can be thought of as the foundation: a set of formal definitions for concepts (nodes) and relationships (edges) within a domain.
- Ontologies provide the structure and semantics, as well as a formal specification of the relationships, for KGs, making the interconnected concepts more understandable and accessible.
- KGs rely on ontologies to organize data. The ontology, i.e. the data model, is applied to individual data points to construct a KG.
- Basically, Ontology + Data = Knowledge Graph.

ONTOLOGY

Ontology as Knowledge Representation
- Knowledge engineering analyzes knowledge into an explicit fact structure that a computer can understand.
- Facts are structured by:
  - Defining concepts relevant to the knowledge.
  - Relating concepts so that they are linked in the form of a graph.
  - Extending conceptual facts with real-world individuals.
- Mostly, an ontology contains descriptive knowledge.

What is ontology?
[Figure: class level: City with the properties Name (string), Population (integer), and locatedIn (Country). Individual level: Bangkok isa City, with Name กรุงเทพมหานคร (Bangkok), Population 10,820,921, and locatedIn Thailand; Thailand isa Country.]
- Ontology is one of the building blocks of Semantic Web technology.
- An ontology is a formal description of knowledge as a set of concepts within a domain and the relationships that hold between them.
- It includes machine-interpretable definitions of basic concepts in the domain and relations among them.
- An ontology represents knowledge by formally specifying the following components:
  - Concepts, or classes of objects, in a domain of discourse.
  - Properties or relations of the classes, describing various features and attributes of the concepts.
  - Restrictions on the properties, including rules and axioms.
  - Individuals (instances of objects).
- An ontology together with a set of individual instances of classes constitutes a knowledge base.
https://www.ontotext.com/knowledgehub/fundamentals/what-are-ontologies/
https://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
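
The four components listed above (classes, properties, restrictions, individuals) can be made concrete with a small sketch built on the lecture's City and Country example. This is an illustration under stated assumptions, not OWL: `Ontology`, `Individual`, and `check` are hypothetical names, and the only restriction modelled is a domain and range per property.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    classes: set = field(default_factory=set)
    # property name -> (domain class, range); the range is either a
    # Python type (literal value) or the name of another class.
    properties: dict = field(default_factory=dict)


@dataclass
class Individual:
    name: str
    cls: str
    values: dict = field(default_factory=dict)


def check(onto, individuals):
    """Return a list of violations of the ontology's restrictions."""
    by_name = {ind.name: ind for ind in individuals}
    problems = []
    for ind in individuals:
        if ind.cls not in onto.classes:
            problems.append(f"{ind.name}: unknown class {ind.cls}")
        for prop, value in ind.values.items():
            if prop not in onto.properties:
                problems.append(f"{ind.name}: unknown property {prop}")
                continue
            domain, rng = onto.properties[prop]
            if ind.cls != domain:
                problems.append(f"{ind.name}: {prop} does not apply to {ind.cls}")
            if isinstance(rng, type):
                ok = isinstance(value, rng)  # literal, e.g. str or int
            else:
                ok = value in by_name and by_name[value].cls == rng
            if not ok:
                problems.append(f"{ind.name}: bad value for {prop}")
    return problems


onto = Ontology(
    classes={"City", "Country"},
    properties={
        "name": ("City", str),
        "population": ("City", int),
        "locatedIn": ("City", "Country"),
    },
)

# Ontology + individuals = knowledge base.
knowledge_base = [
    Individual("Thailand", "Country"),
    Individual(
        "Bangkok",
        "City",
        {"name": "กรุงเทพมหานคร", "population": 10820921, "locatedIn": "Thailand"},
    ),
]
```

Because the ontology is kept separate from the individuals, the same `onto` object could validate any other list of cities and countries, which mirrors the point that an ontology is independent of the data it describes.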