WP2: Linked Data and Semantics Artem Revenko Semantic Web Company
Introduction
Contributions
Conclusion
Introduction Contributions Collecting Data Quality Assessment Data Processing Knowledge Graph Visualizations Crowd-Sourcing Conclusion
WP2 Overview
Goal
- “... to develop and to establish a methodology for a linked data lifecycle.”
- “... information integration and aggregation of incoming sources will become more precise and can be managed in a more agile way ...”
Platform Functionalities
Objectives
- Utilize expert knowledge to create thesauri and ontologies to
  - Compare materials
  - Investigate trends, topics
  - Assess quality of contributions
  - Formalize and extract user preferences
- Harvest and process relevant data. Annotations enable semantic search
- Use visualization tools to demonstrate the data and facilitate deeper analysis
- Use crowdsourcing/interactive tools to improve data and create collective awareness
Deliverables
✓ D2.1 PROFIT core knowledge model (M6)
✓ D2.2 Data and information streams - assessment tools (M9)
✓ D2.3 Data crawlers, adaptors and extractors (M12)
✓ D2.4 PROFIT Knowledge Graph (M12)
✓ D2.5 Visualizations and information widgets (M18)
→ D2.6 Sharing activities and crowdsourcing mechanisms (M24)
Crawling News Articles
- UnifiedViews pipelines: scheduled harvesting
- 11 sources, 30 named graphs, ≈ 250 articles weekly
- Different methods: RSS feeds, API, permalinks
Statistics: https://custom_apps.poolparty.biz/ProfitViz/page/statistics
GraphSearch: https://profit.poolparty.biz/GraphSearch/
In platform: http://profit-demo.eea.sk:88/articles/
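One of the harvesting methods listed above, RSS feeds, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the UnifiedViews pipeline itself; the sample feed, the `Article` record and the `parse_rss` helper are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    link: str  # the permalink of the article


def parse_rss(feed_xml: str) -> list[Article]:
    """Extract title and permalink from every <item> of an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    articles = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # skip items without a permalink
            articles.append(Article(title, link))
    return articles


# Hypothetical sample feed; a scheduled job would download this from a source.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example finance news</title>
  <item><title>ECB keeps rates unchanged</title><link>http://example.org/a1</link></item>
  <item><title>Oil prices fall</title><link>http://example.org/a2</link></item>
</channel></rss>"""

articles = parse_rss(SAMPLE_FEED)
```

In a scheduled setting the permalink would typically serve as the key that prevents the same article from being stored twice.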
Document Assessment and Classification
Dimensions of Quality
- Readability
- Relatedness to Topic
- Sentiment
- Extracted Concepts
Document Classification
Classify documents into predefined categories.
Demo: https://artem.semantic-web.at/text_assessment/
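The slides do not say which readability measure the assessment tool uses. As one possible illustration of the readability dimension, the sketch below computes the well-known Flesch Reading Ease score; the syllable counter is a rough heuristic and both function names are assumptions made for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


easy = flesch_reading_ease("The bank cut the rate. Markets went up.")
hard = flesch_reading_ease(
    "Macroeconomic uncertainty necessitates accommodative monetary interventions."
)
```

Short sentences with short words score higher than a single long sentence of polysyllabic terms, so `easy` comes out above `hard`.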
Topic Transitions
Topics
Topics represent combinations of concepts that often and meaningfully occur together. Transitions show how different topics develop.
Demo: https://artem.semantic-web.at/text_assessment/topic_transitions
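The notion of concepts that often occur together can be sketched with simple co-occurrence counting. This is a simplified illustration, not the method behind the demo: it only looks at pairs of concepts and at frequency, not at how meaningful a combination is. The sample documents and the `min_support` threshold are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs: list[set[str]]) -> Counter:
    """Count in how many documents each unordered pair of concepts appears together."""
    counts = Counter()
    for concepts in docs:
        for pair in combinations(sorted(concepts), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(docs: list[set[str]], min_support: int) -> set[tuple[str, str]]:
    """Return the concept pairs that co-occur in at least min_support documents."""
    return {pair for pair, n in cooccurrence_counts(docs).items() if n >= min_support}


# Hypothetical annotated documents from two time periods.
period_1 = [{"oil", "OPEC", "price"}, {"oil", "price"}, {"oil", "OPEC"}]
period_2 = [{"oil", "price", "shale"}, {"oil", "shale"}, {"price", "shale"}]

topics_1 = frequent_pairs(period_1, min_support=2)
topics_2 = frequent_pairs(period_2, min_support=2)
appeared = topics_2 - topics_1  # pairs that became frequent
vanished = topics_1 - topics_2  # pairs that stopped being frequent
```

Comparing the frequent pairs of consecutive periods gives a crude view of a transition: which combinations emerge and which fade.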
Bright Concepts
Idea
- Sometimes something is not mentioned on purpose
- If there is a change in context, how do we detect it?
Example
- A corpus about US politics with dates
- We create several corpora from it, separated by dates
- We analyze the corpora. Findings: “White House”, “politics”, “president”, “Obama” are very often found together before 2016.
- In the fresh (2017) corpus “Obama” disappears.
Outcome: We signal to the user that in the context “White House”, “politics”, “president” the concept “Obama” was substituted by “Trump”.
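The example can be sketched as a comparison of two date-separated corpora. This is a minimal sketch under stated assumptions, not the project's implementation: documents are reduced to sets of annotated concepts, and `min_share` (the share of context documents a concept must appear in to count as a regular companion) is a hypothetical parameter.

```python
from collections import Counter


def companions(docs: list[set[str]], context: set[str]) -> Counter:
    """Count concepts that appear in documents containing the whole context."""
    counts = Counter()
    for concepts in docs:
        if context <= concepts:
            counts.update(concepts - context)
    return counts


def detect_substitution(old_docs, new_docs, context, min_share=0.5):
    """Return (vanished, newcomers): regular companions of the context only before / only after."""
    def frequent(docs):
        counts = companions(docs, context)
        n = sum(1 for d in docs if context <= d)
        return {c for c, k in counts.items() if n and k / n >= min_share}

    before, after = frequent(old_docs), frequent(new_docs)
    return before - after, after - before


context = {"White House", "politics", "president"}
# Hypothetical corpora separated by date.
old_corpus = [context | {"Obama"}, context | {"Obama", "Congress"}, context | {"Obama"}]
new_corpus = [context | {"Trump"}, context | {"Trump", "Congress"}, context | {"Trump"}]

vanished, newcomers = detect_substitution(old_corpus, new_corpus, context)
```

A concept in `vanished` paired with one in `newcomers` is a candidate substitution that could be signalled to the user.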
PROFIT Knowledge Graph
Motivation
Thesauri and ontologies help to
- Set the focus by defining concepts of interest
- Formalize expert knowledge about semantic relations between entities
- Facilitate integration and processing of information. Example: document similarities.
PROFIT Thesaurus
- Reuses EuroVoc, STW; extended by experts
- Statistics:
    # Concepts           10860
    # Broader/narrower   11232
    # Related            27970
    # Languages          25
    # Labels             226255
- Published: http://profit.poolparty.biz/profit_thesaurus.html
Document Similarity
[Figure: a thesaurus fragment connects two documents that share no words. Thesaurus concepts: Money and Markets, ECB, Mario Draghi, Monetary Policy, Interest Rate. Document 1: “Mario Draghi was not expected to announce any changes to monetary policy” (concepts: Mario Draghi, Monetary Policy). Document 2: “European Central Bank leaves interest rate at record-low 1%” (concepts: ECB, Interest Rate). Weights shown in the figure: 0.8, 0.5.]
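The idea of the figure, that two documents without shared terms can still be similar through related thesaurus concepts, can be sketched as follows. This is a sketch under stated assumptions: the slides do not give the similarity formula, so a symmetric best-match average is used here, and assigning the weights 0.8 and 0.5 to these two concept pairs is a guess at what the figure shows.

```python
# Hypothetical pairwise concept similarities derived from the thesaurus.
# Assumption: the figure's weights 0.8 and 0.5 belong to these two pairs.
CONCEPT_SIM = {
    frozenset({"Mario Draghi", "ECB"}): 0.8,
    frozenset({"Monetary Policy", "Interest Rate"}): 0.5,
}


def concept_similarity(a: str, b: str) -> float:
    """Similarity of two concepts: 1.0 if identical, else a table lookup (default 0.0)."""
    if a == b:
        return 1.0
    return CONCEPT_SIM.get(frozenset({a, b}), 0.0)


def directed(src: set[str], dst: set[str]) -> float:
    """Average, over the concepts in src, of the best match found in dst."""
    return sum(max(concept_similarity(a, b) for b in dst) for a in src) / len(src)


def document_similarity(doc_a: set[str], doc_b: set[str]) -> float:
    """Symmetric best-match average between two sets of annotated concepts."""
    if not doc_a or not doc_b:
        return 0.0
    return (directed(doc_a, doc_b) + directed(doc_b, doc_a)) / 2


doc1 = {"Mario Draghi", "Monetary Policy"}
doc2 = {"ECB", "Interest Rate"}
score = document_similarity(doc1, doc2)
```

A plain overlap of the two concept sets would be empty here; the thesaurus relations are what make the score non-zero.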
User Multi-Classification
User Cycle
1. User registers and fills out some info about himself
2. User assigns himself and gets assigned to a level/category
3. User uses the platform, leaves comments, tries quizzes
4. User takes educational materials and tests
When does a user reach a “new” level?
Taking the user’s activity on the platform into account, we can automatically reassign the user based on the other users’ statistics.
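Reassignment based on other users' statistics could look like the following sketch. The slides do not specify the classifier, so a nearest-centroid rule is assumed here; the activity features and level names are hypothetical.

```python
from statistics import mean


def level_profiles(users: list[dict]) -> dict[str, list[float]]:
    """Average activity vector of the users currently assigned to each level."""
    by_level: dict[str, list[list[float]]] = {}
    for u in users:
        by_level.setdefault(u["level"], []).append(u["activity"])
    return {lvl: [mean(col) for col in zip(*rows)] for lvl, rows in by_level.items()}


def reassign(activity: list[float], profiles: dict[str, list[float]]) -> str:
    """Pick the level whose average activity is closest (squared Euclidean distance)."""
    def dist(profile):
        return sum((a - b) ** 2 for a, b in zip(activity, profile))
    return min(profiles, key=lambda lvl: dist(profiles[lvl]))


# Hypothetical activity features: [comments written, quizzes passed, tests passed]
users = [
    {"level": "beginner", "activity": [1, 0, 0]},
    {"level": "beginner", "activity": [3, 1, 0]},
    {"level": "advanced", "activity": [10, 6, 4]},
    {"level": "advanced", "activity": [12, 8, 6]},
]
profiles = level_profiles(users)
new_level = reassign([9, 5, 4], profiles)
```

A user whose activity resembles that of the advanced group is moved there, regardless of the level chosen at registration.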
Experiment
Ontologies
- Finance Ontology: http://profit.poolparty.biz/PROFIT_Finance.html
- User Ontology: http://profit.poolparty.biz/PROFIT-User.html
- Initial Knowledge Graph (≈ 500 statements about 50 entities)
Visualization of Data Graph
Goal
Enable navigation in a huge data graph such that
1. No information is hidden
2. Not too much information on the page
Demo: https://custom_apps.poolparty.biz/ProfitViz/page/thesaurus
- Vertical hierarchy: more natural to scroll down
- Hide intermediate nodes, only show how many of them exist
- Show concept schemes / top concepts (plural in case of poly-hierarchies)
- Show total number of children and number of direct children of this concept, parent concept, top concept
- Show all the concepts that are in any relation to this concept
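The two child counts shown per concept can be computed as in the sketch below. It assumes the hierarchy is available as a mapping from each concept to its narrower concepts; the `NARROWER` fragment, including the concept "Banking", is hypothetical. A set is used so that a concept with several parents is counted once.

```python
def descendants(concept: str, narrower: dict[str, list[str]]) -> set[str]:
    """All concepts reachable via narrower links, each counted once."""
    seen: set[str] = set()
    stack = list(narrower.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(narrower.get(node, []))
    return seen


def child_counts(concept: str, narrower: dict[str, list[str]]) -> tuple[int, int]:
    """Return (number of direct children, total number of children)."""
    return len(set(narrower.get(concept, []))), len(descendants(concept, narrower))


# Hypothetical fragment; "Interest Rate" has two parents (poly-hierarchy).
NARROWER = {
    "Money and Markets": ["Monetary Policy", "Banking"],
    "Monetary Policy": ["Interest Rate"],
    "Banking": ["Interest Rate", "ECB"],
}

direct, total = child_counts("Money and Markets", NARROWER)
```

The difference between the total and the direct count is the number of hidden intermediate and lower-level nodes that the page can report without drawing them.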
Numerical and Topical Trends
- Visualize time series and trends
- Intuitive, interactive, user-friendly
Demo: http://profit-demo.eea.sk:88/forecasting/oil/
Crowd-Sourcing
- Users may suggest new elements: concepts, assign classes to concepts, add relations between concepts
- Other users may vote on the extensions; when enough votes are collected, the input is added to the KG.
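The suggest-and-vote workflow can be sketched as a small state machine. This is an illustration only: the class and method names, the vote threshold, and the rule that the suggesting user counts as the first vote are all assumptions, not details from the platform.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Suggestion:
    triple: Triple
    voters: set[str] = field(default_factory=set)


class CrowdGraph:
    def __init__(self, votes_needed: int):
        self.votes_needed = votes_needed
        self.graph: set[Triple] = set()              # accepted statements
        self.pending: dict[Triple, Suggestion] = {}  # suggestions awaiting votes

    def suggest(self, user: str, triple: Triple) -> None:
        """Register a suggestion; the suggesting user counts as its first vote."""
        self.pending.setdefault(triple, Suggestion(triple))
        self.vote(user, triple)

    def vote(self, user: str, triple: Triple) -> bool:
        """Record one vote per user; accept the triple once enough votes are in."""
        suggestion = self.pending.get(triple)
        if suggestion is None:
            return False
        suggestion.voters.add(user)
        if len(suggestion.voters) >= self.votes_needed:
            self.graph.add(triple)
            del self.pending[triple]
            return True
        return False


kg = CrowdGraph(votes_needed=3)
t = ("Bitcoin", "related", "Cryptocurrency")
kg.suggest("alice", t)
kg.vote("bob", t)
kg.vote("bob", t)  # duplicate vote, not counted twice
accepted = kg.vote("carol", t)
```

Storing voters as a set keeps a single user from pushing a suggestion over the threshold alone.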
Conclusion
- Linked data lifecycle established
- Harvesting pipelines up and running, data gets integrated
- Processing tools facilitate data consumption
Outcomes
1. Harvesting of news articles: 250 weekly with annotations
2. Assessment: quality, classification, topic trends
3. PROFIT thesaurus := EuroVoc + STW + expert input
4. PROFIT ontologies and knowledge graph
5. Visualization of thesaurus: unique tool
6. Numerical and topical visualization tools
7. Crowd-sourcing: collaborative workflows on data graph
Thank you!
⇒ projectprofit.eu ⇐