Lectures

University of Leipzig

An Introduction to the Linked Data Web     

Over the last few years, the Linked Data Web has developed into a compendium of several billion facts pertaining to a multitude of domains. This data-centric complement of the current document-driven Web is based on similar principles, including decentralized publishing and governance. The aim of this tutorial is to introduce the basic ideas and technologies that underlie the Web of Data. To this end, we will introduce the Linked Data lifecycle. Each step of this lifecycle will be fleshed out with basic considerations, descriptions of tools that target it, and the current challenges that still need to be addressed.
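To make the notion of a data-centric Web concrete, the following minimal sketch (using the Python library rdflib, chosen here purely for illustration; the example namespace and facts are invented) shows how Linked Data facts are published as RDF triples and retrieved with a SPARQL query.

# Minimal illustration of Linked Data facts as RDF triples. rdflib is used
# only for illustration; the tutorial does not prescribe a particular tool,
# and http://example.org/ is a made-up namespace.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
g = Graph()

# Publish a few (fictional) facts about a resource.
g.add((EX.Leipzig, RDF.type, EX.City))
g.add((EX.Leipzig, EX.population, Literal(550000)))
g.add((EX.Leipzig, EX.locatedIn, EX.Germany))

# Query the data with SPARQL, the standard query language of the Data Web.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?city ?pop WHERE {
        ?city a ex:City ;
              ex:population ?pop .
    }
""")
for city, pop in results:
    print(city, pop)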

CITEC, Bielefeld University

CITEC, Bielefeld University

National University of Ireland

Question Answering over Linked Data     

As more and more structured data is published on the Web, the question of how typical Web users can access this body of knowledge becomes crucially important. Over the past years, a growing body of research has investigated interaction paradigms that allow end users to profit from the expressive power of Semantic Web standards while hiding their complexity behind an intuitive and easy-to-use interface. Natural language interfaces in particular, such as question answering systems, have received wide attention, as they allow users to express arbitrarily complex information needs in an intuitive fashion and, at least in principle, in their own language. The key challenge is to translate the users' information needs into a form in which they can be evaluated using standard Semantic Web query processing and inferencing techniques.

This tutorial will provide an introduction to the growing field of question answering over linked data. In particular, it will outline the main challenges involved in getting from natural language questions to structured queries and answers, how state-of-the-art approaches address these challenges, and which open issues remain.
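As a toy illustration of the translation step described above, the following sketch (a hypothetical single-template mapping written in Python, using the SPARQLWrapper library against the public DBpedia endpoint; the chosen DBpedia property and resource names are assumptions for the example, and real systems use far richer linguistic analysis) turns one natural language question into a SPARQL query and evaluates it.

# Toy template-based translation of a natural language question into SPARQL,
# executed against DBpedia with SPARQLWrapper. Illustrative sketch only.
import re
from SPARQLWrapper import SPARQLWrapper, JSON

def question_to_sparql(question):
    # Hypothetical single template: "Who wrote <Work>?"
    m = re.match(r"Who wrote (.+)\?", question)
    if not m:
        raise ValueError("question not covered by this toy template")
    work = m.group(1).strip().replace(" ", "_")
    return f"""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        PREFIX dbr: <http://dbpedia.org/resource/>
        SELECT ?author WHERE {{ dbr:{work} dbo:author ?author }}
    """

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery(question_to_sparql("Who wrote The Hobbit?"))
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["author"]["value"])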

Google Research, Mountain View

IMIS, Athena Research Center

Query Processing for RDF Databases     

Query processing is one of the most active research areas in databases. Part of its allure is the fact that the area is "reborn" every few years, mostly due to the arrival of new data models. And in spite of the large body of work that already exists, newer models often have novel requirements that either necessitate a revision of past query processing techniques or, in the "worst" case, lead to the development of new ones. This simple fact will be the starting point of our tutorial. We will begin with some background on RDF (the new kid on the "data model" block), and closely examine its characteristics (and requirements). We will attempt to explain why and when past query processing techniques (e.g., those developed for the relational or XML models) need to be revised, and identify areas where new techniques should be developed.
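To make the contrast with relational query processing concrete before turning to distributed settings below, here is a minimal sketch (plain Python, invented for illustration; real RDF stores rely on SPO/POS/OSP-style indexes and optimized join algorithms) of the basic access pattern of an RDF store: the data is one large set of subject-predicate-object triples, and a query is a conjunction of triple patterns joined on shared variables.

# A toy in-memory triple store: all data lives in one ternary relation, and a
# query is a set of triple patterns joined on shared variables (illustrative
# sketch only; the data is invented).
triples = {
    ("ex:alice", "ex:knows",   "ex:bob"),
    ("ex:bob",   "ex:knows",   "ex:carol"),
    ("ex:bob",   "ex:worksAt", "ex:acme"),
}

def match(pattern, binding):
    """Yield extended variable bindings for one triple pattern (variables start with '?')."""
    for triple in triples:
        new = dict(binding)
        for p, v in zip(pattern, triple):
            if p.startswith("?"):
                if new.setdefault(p, v) != v:
                    break
            elif p != v:
                break
        else:
            yield new

def evaluate(patterns):
    """Nested-loop join of triple patterns, the basic building block of SPARQL evaluation."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
    return bindings

# Where do the people Alice knows work?
print(evaluate([("ex:alice", "ex:knows", "?x"),
                ("?x", "ex:worksAt", "?y")]))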

Our journey will take us through query processing for both centralized and distributed RDF stores. We will also navigate the emerging paradigm of cloud computing and, after introducing some basics of NoSQL databases and MapReduce, show how to handle massive amounts of RDF data using such frameworks.
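The following toy simulation (plain Python, illustrative only; actual deployments run on Hadoop- or Spark-like frameworks over partitioned data, and the triples are invented) sketches the MapReduce style of processing mentioned above: mappers emit key/value pairs from triples and reducers aggregate them, here to count how often each predicate occurs.

# Toy simulation of a MapReduce job over RDF triples (illustration only).
from collections import defaultdict

triples = [
    ("ex:alice", "ex:knows",   "ex:bob"),
    ("ex:bob",   "ex:knows",   "ex:carol"),
    ("ex:bob",   "ex:worksAt", "ex:acme"),
]

def map_phase(triple):
    s, p, o = triple
    yield (p, 1)                      # emit one key/value pair per triple

def reduce_phase(key, values):
    return key, sum(values)           # aggregate all values for one key

# Shuffle: group intermediate pairs by key, exactly what the framework does.
groups = defaultdict(list)
for t in triples:
    for k, v in map_phase(t):
        groups[k].append(v)

print(dict(reduce_phase(k, vs) for k, vs in groups.items()))
# {'ex:knows': 2, 'ex:worksAt': 1}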

Polytechnic University of Catalonia

Polytechnic University of Catalonia

Sparsity Technologies

Graph Databases and their Applications     

The use of graphs in analytic settings is becoming more and more widespread, with applications in many different areas such as social network analysis, fraud detection, industrial management and knowledge analysis. Graph databases are an important solution to consider for the management of large datasets.

The course will tackle four important aspects of graph management: first, understanding the different environments in which graphs are important and how they can be used to solve real-life problems; second, reviewing the technologies for graph management, with a focus on the particular case of Sparksee; third, analysing in depth one of the key applications, social network analysis, and how graphs are used to address it; and fourth, understanding how benchmarking can keep user requirements in step with the evolution of graph management technologies.
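As a small taste of the social network analysis part, the following sketch (written with the Python library networkx purely for illustration; it is not Sparksee's API, and the toy network is invented) computes degree centrality and communities on a hand-made graph.

# Small social-network-analysis sketch using networkx (illustration only;
# the course itself works with Sparksee).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
G.add_edges_from([
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("carol", "dave"), ("dave", "erin"),
])

# Who is most central, and which densely connected groups exist?
centrality = nx.degree_centrality(G)
print(sorted(centrality, key=centrality.get, reverse=True)[:3])

communities = greedy_modularity_communities(G)
print([sorted(c) for c in communities])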

Birkbeck College

An Introduction to Description Logics     

Description Logics (DLs) are a family of knowledge representation formalisms that form a basis of the Web Ontology Language (OWL). In recent years, along with the traditional reasoning services of consistency checking and classification, conjunctive query answering has emerged as the main task in the context of Ontology-Based Data Access (OBDA). In the OBDA paradigm, an ontology gives the user a high-level unified view of multiple data sources (relational databases, triple stores, etc.) and enriches incomplete data with background knowledge. OBDA is widely believed to be a key to the new generation of information systems on the Web.

This lecture will cover the syntax and semantics of DLs along with the basics of reasoning algorithms. It will focus on query answering techniques tailored to lightweight DLs: in particular, on query rewriting for DL-Lite and the combined approach for EL. Practical aspects of OBDA will be illustrated with the open-source system Ontop.
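The following deliberately simplified Python sketch (invented for illustration; it covers only atomic concept inclusions, whereas real rewriting algorithms such as the one implemented in Ontop handle roles, existential restrictions and full conjunctive queries) conveys the idea behind query rewriting: the TBox is compiled into the query, so that the rewritten query can be evaluated over the data (ABox) alone.

# Simplified illustration of query rewriting in DL-Lite: a query over a concept
# is expanded with all TBox-implied subconcepts, so evaluation needs only the
# ABox. The TBox and ABox below are invented toy examples.

tbox = {                               # atomic concept inclusions A subsumed-by B
    ("Professor", "Teacher"),
    ("Teacher",   "Person"),
    ("Student",   "Person"),
}
abox = {("mary", "Professor"), ("john", "Student")}   # concept assertions

def subconcepts(concept):
    """All concepts whose instances the TBox implies to be instances of `concept`."""
    result = {concept}
    changed = True
    while changed:
        changed = False
        for sub, sup in tbox:
            if sup in result and sub not in result:
                result.add(sub)
                changed = True
    return result

def answer(concept):
    """Rewrite q(x) <- concept(x) into a union over subconcepts, then evaluate on the ABox."""
    rewriting = subconcepts(concept)
    return {ind for ind, c in abox if c in rewriting}

print(answer("Person"))    # {'mary', 'john'}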

Hamburg University of Technology

Hamburg University of Technology

Ontology Based Data Access on Temporal and Streaming Data     

Although processing time-dependent data has been investigated for a long time, research on temporal and especially stream reasoning over Linked Open Data and ontologies is only now reaching its peak.

In this tutorial, we give an overview of state-of-the-art query languages and engines for temporal and stream reasoning. In more detail, we discuss the new language STARQL (Streaming and Temporal ontology Access with a Reasoning-based Query Language), which is being developed within the EU-funded project OPTIQUE.

STARQL is designed as an expressive and flexible stream query framework that makes it possible to embed different (temporal) description logics as filter query languages over ontologies. Hence it can be used both within the OBDA paradigm (ontology-based data access in the narrow sense) and within the ABDEO paradigm (accessing Big Data over expressive ontologies).
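The following plain-Python sketch (illustrative only; it is neither STARQL syntax nor its semantics, and the sensor readings are invented) conveys the underlying idea of window-based querying over a stream of timestamped assertions: group incoming data into temporal windows and evaluate a continuous filter condition on each window.

# Plain-Python sketch of window-based querying over a stream of timestamped
# measurements (illustration only; not the STARQL language).
stream = [                                  # (time, sensor, value) measurements
    (1, "sensor1", 88), (2, "sensor1", 91),
    (3, "sensor1", 97), (4, "sensor1", 95),
]

def sliding_windows(stream, width):
    """Group the stream into overlapping windows of the given temporal width."""
    times = [t for t, _, _ in stream]
    for now in times:
        yield now, [x for x in stream if now - width < x[0] <= now]

# Continuous query: report every window whose average value exceeds a threshold.
for now, window in sliding_windows(stream, width=2):
    values = [v for _, _, v in window]
    average = sum(values) / len(values)
    if average > 90:
        print(f"t={now}: average {average:.1f} above threshold")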

University of Montpellier, Inria

TU Dresden

Ontology-Based Query Answering with Existential Rules     

The need for an ontological layer on top of data, associated with advanced reasoning mechanisms able to exploit the semantics encoded in ontologies, has been acknowledged in the database, knowledge representation and Semantic Web communities. We focus here on the ontology-based data querying problem, which consists of querying data while taking ontological knowledge into account. To tackle this problem, we consider a logical framework based on existential rules, also called Datalog+/-.

In this tutorial, we will introduce fundamental notions on ontology-based query answering with existential rules. We will present basic reasoning techniques, explain the relationships with other languages such as lightweight description logics, and review decidability, complexity and algorithmic results. We will end with ongoing research and some challenging issues.
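To give a flavour of the central reasoning tool, the following toy Python sketch (invented for illustration, with a single hand-written rule and fact) performs one chase step: applying an existential rule invents a fresh labelled null for the individual required by the rule head, which is how existential rules enrich incomplete data.

# Toy chase step for a single existential rule (illustrative sketch only):
#   worksFor(x, y) -> EXISTS z . supervises(z, x)
import itertools

facts = {("worksFor", "alice", "acme")}
fresh = itertools.count()

def chase_step(facts):
    new_facts = set()
    for name, x, y in facts:
        if name == "worksFor" and not any(
                f[0] == "supervises" and f[2] == x for f in facts):
            null = f"_:n{next(fresh)}"            # fresh labelled null
            new_facts.add(("supervises", null, x))
    return facts | new_facts

saturated = chase_step(facts)
print(saturated)
# The Boolean query "is alice supervised by someone?" is now entailed:
print(any(f[0] == "supervises" and f[2] == "alice" for f in saturated))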

IBM Research, Ireland

Large-Scale Semantic and Reasoning Systems for Cities     

The Semantic Web has left the lab. In this tutorial, we will go through some real-world industrial Semantic Web projects and discuss the costs and benefits of this disruptive technology. We will describe some key challenges that semantic technologies are uniquely capable of addressing in order to provide critical functionality in traditionally very challenging domains, such as processing and reasoning over Big Data coming from various entities in cities and beyond. Finally, we will show demonstrations of industrial research projects and take a deep dive into the architecture and design principles of the related systems.

University of Antwerp

Yahoo Labs, Barcelona

Querying and Learning in Probabilistic Databases      

Probabilistic Databases (PDBs) lie at the expressive intersection of databases, first-order logic, and probability theory. PDBs employ logical deduction rules to process Select-Project-Join queries, which form the basis for a variety of query languages such as Relational Algebra, Datalog, and SQL. They employ logical consistency constraints to resolve data inconsistencies, and they represent query answers via logical lineage formulas (a.k.a. "data provenance") that trace the dependencies between these answers and the input tuples that led to their derivation. Although the literature on PDBs reflects more than 25 years of research, the key role of lineage in establishing a closed and complete representation model for relational operations over this kind of probabilistic data was discovered only fairly recently. And although PDBs benefit from efficient and scalable database infrastructures for data storage and indexing, they also need to couple data computation with probabilistic inference, which remains a #P-hard problem in the context of PDBs as well.
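The following brute-force Python sketch (invented for illustration; the tuple probabilities and the lineage formula are made up) makes the possible-worlds semantics concrete by computing the probability of one query answer from its lineage over independent input tuples. The enumeration is exponential in the number of tuples, which is precisely why practical systems typically resort to lifted inference and knowledge compilation.

# Brute-force possible-worlds semantics for a tuple-independent probabilistic
# database (illustration only; exponential in the number of tuples).
from itertools import product

prob = {"t1": 0.6, "t2": 0.5, "t3": 0.8}    # independent input tuples

def lineage(world):
    # Lineage of one (hypothetical) query answer: t1 OR (t2 AND t3).
    return world["t1"] or (world["t2"] and world["t3"])

answer_prob = 0.0
for bits in product([True, False], repeat=len(prob)):
    world = dict(zip(prob, bits))
    weight = 1.0
    for t, present in world.items():
        weight *= prob[t] if present else 1 - prob[t]
    if lineage(world):
        answer_prob += weight

print(round(answer_prob, 4))   # 0.6 + 0.4 * 0.5 * 0.8 = 0.76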

The first part of the lecture reviews the key concepts of probabilistic databases. In the second part, we provide an overview of our own recent research results in this field. We conclude with an outlook on the application of PDBs to Semantic-Web-style reasoning tasks and describe their application to a large-scale information extraction setting based on the YAGO knowledge base.

GraphLab

GraphLab: Large-Scale Machine Learning on Graphs  

From social networks to protein molecules and the Web, graphs encode structure and context, enable advanced machine learning, and are rapidly becoming the future of big data. In this talk we will present the next generation of GraphLab, an open-source platform and machine learning framework designed to process graphs with hundreds of billions of vertices and edges on hardware ranging from a single Mac mini to the cloud.
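As a plain-Python sketch of the vertex-centric style of graph computation that such frameworks parallelise (this is not GraphLab's actual API; the toy graph and parameters are invented), the following code runs PageRank by repeatedly gathering ranks from in-neighbours and applying the rank update at every vertex.

# Plain-Python sketch of vertex-centric PageRank (gather from in-neighbours,
# apply the rank update). Illustration only, not GraphLab's API.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
vertices = {v for e in edges for v in e}
out_degree = {v: sum(1 for s, _ in edges if s == v) for v in vertices}

rank = {v: 1.0 / len(vertices) for v in vertices}
damping = 0.85

for _ in range(30):                       # synchronous iterations
    gathered = {v: sum(rank[s] / out_degree[s] for s, t in edges if t == v)
                for v in vertices}        # gather phase
    rank = {v: (1 - damping) / len(vertices) + damping * gathered[v]
            for v in vertices}            # apply phase

print({v: round(r, 3) for v, r in sorted(rank.items())})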

We will present the GraphLab programming abstraction that blends a vertex and edge centric view of computation to enable users to express algorithms that can be efficiently executed on hardware ranging from multi-core to the cloud. We will describe some of the technical innovations that form the foundation of the GraphLab runtime and enable unprecedented scaling performance. Using recommender systems and PageRank as a running examples we will show how to design, implement, and execute graph analytics on real-world twitter-scale graphs. Finally, we will present the GraphLab machine learning frameworks and demonstrate how they can be used to identify communities and important individuals, target customers, and extract meaning from text data.