Sessions by Track

Business Use Cases

Beamery is a funded startup, based in London and the US, that sells talent management software to enterprises and is an industry leader in the sector. As a company, we have invested heavily in practical AI solutions underpinned by knowledge graph technology. By far the largest knowledge base is the Beamery Talent Graph – a 20-billion-fact graph of artifacts from the HR domain. All AI solutions are built on top of the knowledge graph, which is a global representation of skills, people and companies.

Speaker: Vaishali Raghvani

As industries grapple with the need for Digital Transformation and the various approaches to implementing it, a shift in thinking about data technology and data culture within organizations is required to realize its full potential. Knowledge graph technology presents an emerging approach to managing and integrating complex scientific data in a flexible way. This session introduces the application of ontology and knowledge graph technology within biopharmaceutical manufacturing and how this approach can be expanded to other manufacturing use cases or industries.

Speaker: Stephen Kahmann

This talk presents graph technology as a solution for detecting transitive relationships between indicators of fraud and helping uncover new networks of fraudsters before they scale. We will talk about building an in-house ecosystem that stores fraud events as graph data and emits graph features to power our fraud controls. We will end by discussing how to scale the technology to solve problems across multiple fraud domains and the challenges involved.

Speaker: Rajat Saxena

Companies are increasingly using artificial intelligence (AI) applications for decision-making today. However, due to a lack of context, AI systems have not yet achieved their full potential as reliable solutions for complex problems. Enter knowledge graphs – the modern way to capture relationships and convey their meaning. Knowledge graphs drive intelligence into the data itself and give AI the context to be more explainable, accurate, and repeatable.

Speaker: Maya Natarajan

In this presentation we will show how we are using curated information stored in the IKEA Knowledge Graph to generate valuable product data. The manual effort is done on a general level and is small in size compared to the generated product data. This approach proves that “a little semantics goes a long way” and helps IKEA to serve its customers better information on its various products.

Speaker: Katariina Kari

Personalization is the key to success for consumer product organizations. In this talk, you’ll learn the insights and design patterns needed to build a scalable & polyglot Knowledge Graph that provides personalized experiences to end customers. It will also cover practical tips and architectural know-how for creating and running a platform that can withstand the test of time for internal as well as external customers.

Speaker: Gautam Gupta

Turning dross into gold. Knowledge graphs, with their capacity for surfacing vast hidden networks, can help detect looted art from the ownership history – or provenance – of artworks. The cultural heritage sector and art industry have explored named entity recognition with an event-based approach using CIDOC-CRM. However, Nazi-looted art poses a particular challenge, in part due to the passage of time, and in part due to unreliable data, as attempts to conceal and distort information which began in the Nazi era continue into the digital age. Missing, confusing or badly coded entities, false dates, names, events, places, and the mixing of speculation and fact occur with such frequency in Nazi-looted art that it is useful to view errors, not as anomalies to be cleansed from the dataset, but as primary features to be analyzed. This presentation focuses on strategies and methods to quantify, classify, code and exploit this unreliable information in order to detect looted art and the patterns and networks which underlie its commercialisation.

Speaker: Laurel Zuckerman

Biomedical data and knowledge are expanding exponentially in the 21st century. In this talk, I will present how Knowledge Graph technologies are increasingly being developed and leveraged in academia and industry to tackle complex data integration challenges in life sciences and healthcare. I will showcase how these technologies are currently being used in several different areas, such as drug discovery and safety, clinical search and decision support, biomedical research, and disease monitoring. These technologies, in conjunction with machine learning methods, enable the capture, representation, and provision of ever evolving and expanding biological and medical data and knowledge. I will end the talk with my perspective on some of the major bottlenecks that currently hinder the widespread adoption of knowledge graphs and discuss some of the next directions and technical solutions to mitigate these bottlenecks. Eventually, these technologies will enable the development of scalable platforms that can support artificial intelligence methods for augmenting human intelligence to achieve better clinical outcomes for patients, to enhance the quality of biomedical research, and to improve our understanding of living systems.

Speaker: Maulik Kamdar

Content Knowledge Graphs

We started our Knowledge Graph journey with a content knowledge graph that helps us unify and connect various media types for our multi-purpose streaming platform (RTL+). We include media, entities & enriched metadata from Movies, Series, Music, Podcasts and Audiobooks. But we soon realised that our knowledge graph has relevant applications in many more areas across the organisation and can serve as a single source of truth enabling use cases well beyond the streaming app. In this presentation, we would like to share our business use case, experience, and learnings from our knowledge graph development journey.

Speaker: Sidharth Ramachandran

The talk focuses on the use of large language models (LLMs) for generative AI, and how incorporating symbolic knowledge (attributes from a knowledge graph of an eCommerce website) can improve the accuracy and usefulness of generated content.
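
To illustrate the general approach (our sketch, not code from the talk; the attribute values and the `generate` call are placeholders):

```python
# Illustrative sketch: ground an LLM prompt in product attributes pulled
# from a knowledge graph. The attributes dict stands in for a real KG
# lookup, and `generate` stands in for whatever LLM client is used.
def build_grounded_prompt(product_uri: str, attributes: dict) -> str:
    facts = "\n".join(f"- {k}: {v}" for k, v in attributes.items())
    return (
        f"Write a short product description for {product_uri}.\n"
        f"Use only these facts from the knowledge graph:\n{facts}\n"
        "Do not invent attributes that are not listed."
    )

attributes = {  # in practice, fetched via SPARQL or a graph API
    "brand": "Acme",
    "material": "organic cotton",
    "color": "navy blue",
}
prompt = build_grounded_prompt("https://example.com/product/123", attributes)
# completion = generate(prompt)  # call your LLM of choice here
```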

Speaker: Andrea Volpini

This presentation will explore the potential of using structured data to improve the way we organize and access information. The talk will introduce the concept of knowledge graphs and discuss their potential benefits for both content creators and consumers. It will also touch on the role of headless content management systems (CMS) and document stores in supporting the creation and maintenance of knowledge graphs. Finally, it will discuss advances in AI/ML and provide some perspective on how this can supercharge content knowledge graphs. Overall, the presentation will provide a comprehensive introduction to the field of content knowledge graphs, the benefits they can offer, and the advantages of integration with AI & ML.

Speaker: Gavin Mendel-Gleason

There’s been growing discussion recently concerning “content graphs” that utilize graph relationships to manage content for online publication. Some have observed the similarities between knowledge graphs and headless content management and have sought to combine these approaches. Yet how compatible are these approaches in practice? After examining the overlaps between knowledge graphs and headless CMS, the presentation will explore the assumptions embedded in making them work together, and how these assumptions may run up against scenarios they are poorly equipped to handle. The talk will address the differences in using knowledge graphs to support editorial planning compared with relying on them to support delivery orchestration.

Speaker: Michael Andrews

This talk will look at Knowledge Graphs in publishing and broadcast media. Where have things moved on since pioneering projects using Linked Data? Have knowledge graphs significantly changed the landscape? We explore the topic through a set of case studies looking at how knowledge graphs are changing publishing across different channels, media types and business models.

Speaker: Silver Oliver

Machine-mediated interactions on the Web call for different approaches towards marketing communication and its artifacts: content and data objects. Linked Data and Content Now! is about the potential and real use of schema.org for marketing communication on the Web. In this talk you will learn how schema.org can be a means towards richer content experiences and data-centric marketing communication. I will show how we can use the affordances of the Semantic Web to enhance the rhetorical powers of content and its ability to inform, persuade, educate and entertain. Then we will have a look at how the top 10 companies in the S&P 500 ESG index use, or don’t use, the potential of Linked Data for content.

Speaker: Teodora Petkova

Content personalization and recommendation engines are pervasive in today’s society. They power some of our most used platforms, including Amazon, Spotify, Netflix, and more. However, many organizations struggle to provide their employees and customers with the same ease of content and information discovery, and they are turning to semantic recommendation engines to solve these business challenges. How do you get started in designing and implementing an enterprise content recommendation engine to connect people with relevant content at the time of need? How do you define a knowledge graph that will effectively connect people to the right experts? Do you take a deterministic or statistical approach with your semantic foundation? Nash and Duane have led numerous recommendation engine implementations, including with organizations in the healthcare, finance, and government sectors. During this presentation, they will share case studies, lessons learned, and success stories of enterprise semantic recommendation engines, a clear approach for establishing a recommender, and best practices to drive a successful implementation.

Speakers: Sara Duane; Sara Nash

Data Architecture

A multi-model database allows multiple data representations to coexist in the same instance. In simpler times, that meant two models — such as document and graph — in one. In this presentation, I tackle the multi-model dream of tomorrow: a single model with data spanning many and various database services: key-value, document, relational, lake, in-memory, search service, graph, and more. Cloud providers have all these services on the truck, and it’s never been easier to get them all up and running. There’s just one complication: drawing the model! Graph to the rescue! I show, using the example of movies, a straightforward ontological approach to model the structure of data spanning database services across “the truck.”
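
As a hedged illustration of what such an ontological layer over many services could look like (the vocabulary below is invented for the example, not the presenter’s actual model):

```python
# Minimal sketch (assumed vocabulary): an ontology in which each class is
# annotated with the database service holding its instances, so one graph
# model spans data living across "the truck" of cloud services.
from rdflib import Graph, Namespace, RDF, RDFS, Literal

EX = Namespace("http://example.org/movies#")
g = Graph()
g.bind("ex", EX)

# Classes spanning services: relational for movies, document store for
# reviews, key-value for session state, graph for the social network.
for cls, service in [
    (EX.Movie, "relational"),
    (EX.Review, "document"),
    (EX.WatchSession, "key-value"),
    (EX.Viewer, "graph"),
]:
    g.add((cls, RDF.type, RDFS.Class))
    g.add((cls, EX.storedIn, Literal(service)))

g.add((EX.Review, EX.about, EX.Movie))    # cross-service relationship
g.add((EX.Viewer, EX.watched, EX.Movie))  # modeled once, in the graph

print(g.serialize(format="turtle"))
```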

Speaker: Michael Havey

On average, onboarding a data professional at any large organization takes around three months before they really understand the business and data models within the core applications; only then do they become productive. Allianz GraphXL solves this challenge and is now used by Business Analysts, Data Modellers, Developers, the Data Warehouse team, and others who write SQL queries as part of their day-to-day job, helping them find the relevant data within a few minutes.

Speaker: Naveed Ahamed

Built correctly, an organisational Knowledge Graph can combine the power of data, the cloud, and AI in one unified structure.

This talk will outline four main contentions:

• Networked Data. That network-shaped data can model complex structures including circular feedback loops and abstract models, and that networks (or graphs) make the connections between things as important as the things themselves.
• Networked Cloud. That the network-shaped cloud of connected computers means that ALL the important data in an organisation can be joined together regardless of where that information is stored. Moreover, it states that an organisation’s Knowledge Graph is not just one big centralised database, but rather a distributed and interconnected ecosystem.
• Networked AI. That network-shaped AI lets us make predictions about connections, loops and abstractions and embed the generated insights directly back as an integral part of the Knowledge Graph. And that this very active branch of machine learning is starting to outperform ‘traditional’ AI in complex tasks.
• Unified Network. And finally, it states that these three networks (data, cloud & AI) can be joined into one Knowledge Graph that has the powers of each component but is also more than just the sum of those parts.

The talk will then ground this in reality by outlining three practical tools:

• The Graph Adapter. Which sits on top of the existing databases, APIs and files in your organisation and converts 2D sets of tabular data into 3D graphs of data. The key intuition here is that the underlying databases, files and APIs do not need to change — the adapter just exposes a network-shaped layer on top of all other data structures (see the sketch after this list).
• The Data Service. Which is a specialisation of an existing and well-established architectural pattern called a microservice (but where data itself is treated as a first-class citizen). An individual data service can use the graph adapters to publish graph fragments into the cloud using an HTTP server. Each data item is given a unique resolvable network address in the form of a URL. The data services (or Data Products) combine to form a peer-to-peer network (or Data Mesh).
• The Graph Neural Network. This allows each data service to mirror the passive graph-shaped data with an active graph-shaped machine learning model that gives each node the potential to also learn and predict. The data service publishes these learned node embeddings back into the network as pure data.
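
To make the Graph Adapter idea concrete, here is a minimal sketch, our illustration rather than the speaker’s implementation, of exposing one tabular row as graph triples whose node identifiers are resolvable URLs (the base URL and column-to-predicate mapping are assumptions):

```python
# Minimal "graph adapter" sketch: expose a tabular row as graph triples
# with URL-addressable node identifiers. BASE and the column-to-predicate
# mapping are illustrative assumptions.
BASE = "https://data.example.org"

def row_to_triples(table: str, key: str, row: dict) -> list[tuple]:
    subject = f"{BASE}/{table}/{row[key]}"  # unique, resolvable node address
    return [
        (subject, f"{BASE}/schema/{column}", value)
        for column, value in row.items()
        if column != key
    ]

row = {"id": "42", "name": "Ada", "works_for": "acme"}
for s, p, o in row_to_triples("employee", "id", row):
    print(s, p, o)
# A data service would publish these fragments over HTTP so that
# dereferencing a subject URL returns its triples.
```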

Speaker: Tony Seale

After 10+ years implementing knowledge graph solutions across many industries, we created graph.build to greatly lower the time and cost of creating graph models, semantic or property, from any data. First we will demonstrate our highly configurable, highly available, cloud-ready, horizontally scalable architectures, ready for any production workload. Then we will demonstrate our groundbreaking UI:
• Design your target graph models visually
• LPG or RDF
• Target SPARQL, openCypher or Gremlin
• Source data from SQL, Kafka, JSON/XML/CSV and many more
• One UI/UX; multiple sources, multiple mappings, multiple users
• Massive workloads

Speaker: George Richard Loveday

Graphs help answer complex questions, but they have traditionally been far too slow to be used in high-volume streaming data applications. While graph databases have served batch-processing use cases for decades, a new streaming architecture is showing profound results for modern high-volume data pipelines. Quine (https://quine.io) is a new open-source “streaming graph” with a fundamentally new architectural design that allows the common property graph data model to easily scale beyond millions of events per second. This talk will explore the design and applications of the Quine streaming graph for modern high-volume data pipelines.

Speaker: Ryan Wright

The EU Knowledge Graph at the European Commission

The European Commission maintains a Knowledge Graph using Wikibase, the same software that runs behind Wikidata (one of the most successful public Knowledge Graphs). In this talk we describe:
– the content of the EU Knowledge Graph
– why and how it reuses Wikibase and software that integrates with it (like OpenRefine)
– how data is ingested and kept fresh
– what services are served over this Knowledge Graph

It is generally difficult to see a production Knowledge Graph from the inside. In this talk we will show you its main components and describe how it is operated behind the scenes.

This presentation will be given together with Max De Wilde, information architect at DG CNECT, European Commission.

Speaker: Dennis Diefenbach

Graph-based technologies are appealing because they promise more flexibility than other database technologies. However, it turns out that graph-based databases still require data to be structured and are not that different from traditional databases, therefore creating disappointment for end users whose expectations are towards more flexibility. We will discuss why this disconnect exists and present technical solutions to fill the gap.

Speaker: Michel Biezunski

Web APIs have become a de-facto standard to enable HTTP-based access to machine-processable data. In this talk, I will present the SPARQL micro-service architecture, a lightweight approach that bridges web APIs and RDF knowledge graphs by making it possible to enforce a uniform data model over multiple sources, while using SPARQL as a unique query language. I will illustrate how we apply these techniques in the biodiversity area where multiple data sources provide complementary yet often conflicting data, and how we can integrate in-house and public data sources in a seamless knowledge graph supporting our applications and enriching the users’ experience.
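
For readers new to the pattern, a SPARQL micro-service is typically invoked from a federated query via a SERVICE clause. The sketch below is illustrative only; both endpoint URLs and the photo-service vocabulary are hypothetical placeholders, not the speaker’s actual services:

```python
# Hedged sketch: query a (hypothetical) SPARQL micro-service endpoint that
# wraps a Web API and serves its response as RDF.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://example.org/sparql")  # local triple store
sparql.setReturnFormat(JSON)
sparql.setQuery("""
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
SELECT ?name ?photo WHERE {
  ?taxon dwc:scientificName ?name .
  # Federate out to a micro-service wrapping a photo-sharing Web API:
  SERVICE <https://example.org/service/flickr/getPhotosByName> {
    ?taxon <http://schema.org/image> ?photo .
  }
}
LIMIT 10
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["name"]["value"], row["photo"]["value"])
```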

Speaker: Franck Michel

When people think of knowledge graphs and graph analytics, they reflexively think they need a specialized database. That answer is often incomplete or just wrong for their workload. We’ll analyze the 11 access patterns involved in graph analytics and build a logical and physical reference architecture referencing market leaders as of 2023. If we have time, we’ll give a quick demo of a Databricks Lakehouse doing valuable graphy stuff from a recent customer-related engagement.

Speaker: Douglas Moore

Deep Learning for and with Knowledge Graphs

In this hands-on masterclass, we show how to do this with examples from RelationalAI’s Rel language. We also preview several other interesting tasks that use MLMs, such as semantic search and automatic labeling of features with concepts from an ontology. This combination of the formal and informal is the future of AI. Jupyter Python notebooks will be made available ahead of the class showing how to do (a) through (d) above with OpenAI’s GPT3.5/ChatGPT models. Participants wishing to run the code during class will need their own accounts on OpenAI. Examples will be demonstrated in real use cases.

Speakers: Vijay Saraswat, Nikolaos Vasiloglou

Knowledge graph construction, which aims to extract knowledge from text corpora, has appealed to researchers. Previous decades have witnessed remarkable progress in knowledge graph construction on the basis of neural models; however, those models often require massive computation or labeled-data resources. Recently, numerous approaches have been explored to mitigate these efficiency issues for knowledge graph construction, such as prompt learning. In this talk, we aim to bring interested researchers up to speed on recent and ongoing techniques for efficient knowledge graph construction with pre-trained language models.

Speaker: Ningyu Zhang

One of the current key challenges in Explainable AI is correctly interpreting activations of hidden neurons. It seems evident that accurate interpretations would provide insights into what a deep learning system has internally detected as relevant on the input, thus lifting some of the black-box character of deep learning systems. The state of the art indicates that hidden node activations appear to be interpretable in a way that makes sense to humans, at least in some cases. Yet systematic automated methods that could first hypothesize an interpretation of hidden neuron activations, and then verify it, are mostly missing. In this presentation, we provide such a method and demonstrate that it yields meaningful interpretations. It is based on using large-scale background knowledge – a class hierarchy of approx. 2 million classes curated from the Wikipedia Concept Hierarchy – together with a symbolic reasoning approach called concept induction, based on description logics, that was originally developed for applications in the Semantic Web field. Our results show that we can automatically attach meaningful labels from the background knowledge to individual neurons in the dense layer through a hypothesis-and-verification process.

Speaker: Abhilekha Dalal

Graph embeddings can be used for a variety of applications, including recommendation, fraud detection, and other machine learning tasks. In this work, we walk through various embedding techniques, starting with spectral approaches, moving to graph neural networks, and finally to newer, inductive techniques such as NodePiece. Throughout the tutorial, we will implement the algorithms using the open-source TigerGraph Machine Learning Workbench, a tool for easily training machine learning algorithms on large-scale graph datasets. Using a variety of datasets, we will discuss the pros and cons of each technique, demonstrate them working, and examine future directions for the field of graph embedding research from the lens of industry.
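
As a taste of the spectral end of that progression, here is a generic Laplacian-eigenmap embedding, independent of any particular platform (our illustration, not the Workbench code used in the tutorial):

```python
# Generic illustration of a spectral embedding (Laplacian eigenmaps).
import networkx as nx
import numpy as np

G = nx.karate_club_graph()                       # small demo graph
L = nx.laplacian_matrix(G).toarray().astype(float)
eigvals, eigvecs = np.linalg.eigh(L)             # eigh: symmetric matrix

# Skip the trivial constant eigenvector; take the next k as coordinates.
k = 2
embedding = eigvecs[:, 1 : k + 1]
for node in list(G.nodes())[:5]:
    print(node, embedding[node])
```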

Speaker: Parker Erickson

In this talk, we explore how hierarchical ontological components in knowledge graphs are incorporated into KG representation learning. We present multiple practical machine learning methods, such as hierarchical graph modeling, graph neural networks, self-supervised learning, and language models, that can effectively and efficiently capture ontological information, given different knowledge graph formulations. As a result, our proposed approaches address various real-world challenges in multiple domains, from knowledge graphs themselves to diverse disciplines including natural language processing (language models), recommender systems, bioinformatics, and societal studies, and expand ML frontiers from knowledge graphs to multi-modal applications.

Speaker: Junheng Hao

Knowledge Graphs (KGs) are often generated automatically or manually, which leads to KGs being incomplete. Recent years have witnessed many studies on link prediction using KG embeddings, which is one of the mainstream tasks in KG completion. Most of the existing methods learn the latent representations of entities and relations, whereas only a few of them consider contextual information or the textual/numeric descriptions of the entities. This talk will cover deep learning based methods for performing KG completion tasks such as link prediction and entity type prediction.
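
For concreteness, the classic TransE model, a textbook example of such latent representations (not necessarily one of the methods covered in the talk), scores a triple (h, r, t) by how well h + r ≈ t holds in embedding space:

```python
# Textbook TransE scoring sketch: a triple (h, r, t) is plausible when the
# translated head embedding h + r lies close to the tail embedding t.
import numpy as np

rng = np.random.default_rng(0)
dim = 50
entity_emb = {e: rng.normal(size=dim) for e in ["paris", "france", "berlin"]}
relation_emb = {"capital_of": rng.normal(size=dim)}

def transe_score(h: str, r: str, t: str) -> float:
    # Lower distance = more plausible triple (after training).
    return float(np.linalg.norm(entity_emb[h] + relation_emb[r] - entity_emb[t]))

# Link prediction: rank candidate tails for ("paris", "capital_of", ?).
candidates = ["france", "berlin"]
ranked = sorted(candidates, key=lambda t: transe_score("paris", "capital_of", t))
print(ranked)  # with trained embeddings, "france" should rank first
```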

Speaker: Mehwish Alam

Visual AI has made incredible progress in basic vision tasks using deep learning techniques that can detect concepts in visual scenes accurately and quickly. However, the existing techniques rely on labelled datasets that lack common sense knowledge about visual concepts and have biased distributions of visual semantic relationships. As a result, these techniques have limited visual relationship prediction performance, limiting the expressiveness and accuracy of semantic representation and downstream reasoning. We employed deep neural networks to predict visual concepts, including objects and visual relationships, and linked them to generate symbolic image representations. To alleviate the challenges above, we leveraged rich and diverse common sense knowledge in heterogeneous knowledge graphs to systematically refine and enrich the generated image representation. As a result, we observed significant improvement in recall rates of visual relationship prediction (7% increase in Recall@100), expressiveness of the representation, and the performance of downstream visual reasoning tasks, including image captioning (15% increase in SPICE score) and image reconstruction. The encouraging results depict the effectiveness of the proposed approach and the impact on downstream visual reasoning.

Speaker: Muhammad Jaleed Khan

We present a novel approach for learning embeddings of concepts from knowledge bases expressed in the ALC description logic. They reflect the semantics in such a way that it is possible to compute an embedding of a complex concept from the embeddings of its parts by using appropriate neural constructors. Embeddings for different knowledge bases are vectors in a shared vector space, shaped in such a way that approximate subsumption checking for arbitrarily complex concepts can be done by the same neural network for all the knowledge bases.
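
A hedged sketch of how such neural constructors might be realized (our reading of the idea; the architecture below is illustrative, not the authors’ networks):

```python
# Hedged sketch of compositional concept embeddings for ALC: each logical
# constructor is realized as a small neural module. All architectural
# details here are illustrative assumptions.
import torch
import torch.nn as nn

DIM = 64

class ConceptConstructors(nn.Module):
    def __init__(self, dim: int = DIM):
        super().__init__()
        self.conj = nn.Linear(2 * dim, dim)    # C ⊓ D
        self.neg = nn.Linear(dim, dim)         # ¬C
        self.exists = nn.Linear(2 * dim, dim)  # ∃r.C (role + filler)

    def intersect(self, c, d):
        return torch.tanh(self.conj(torch.cat([c, d], dim=-1)))

    def negate(self, c):
        return torch.tanh(self.neg(c))

    def some(self, role, c):
        return torch.tanh(self.exists(torch.cat([role, c], dim=-1)))

ctor = ConceptConstructors()
person, employed = torch.randn(DIM), torch.randn(DIM)
works_for = torch.randn(DIM)
# Embedding of: Person ⊓ ∃worksFor.¬Employed
complex_concept = ctor.intersect(person, ctor.some(works_for, ctor.negate(employed)))
# A separate subsumption network would then score pairs of embeddings.
print(complex_concept.shape)
```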

Speaker: Jedrzej Potoniec

Conversational Systems (CSys) represent practical and tangible outcomes of advances in NLP and AI. CSys see continuous improvements through unsupervised training of large language models (LLMs) on humongous amounts of generic training data. However, when these CSys are suggested for use in domains like mental health, they fail to match the acceptable standards of clinical care, such as the clinical process in the Patient Health Questionnaire (PHQ-9). The talk will present Knowledge-infused Learning (KiL), a paradigm within NeuroSymbolic AI that focuses on making machine/deep learning models (i) learn over knowledge-enriched data, (ii) learn to follow guidelines in process-oriented tasks for safe and reasonable generation, and (iii) learn to leverage multiple contexts and stratified knowledge to yield user-level explanations. KiL established Knowledge-Intensive Language Understanding, a set of tasks for assessing safety, explainability, and conceptual flow in CSys.

Speaker: Manas Gaur

Graphs are ubiquitous: they form a language for describing entities and their interactions. They are emerging as a powerful data analytic tool for addressing difficult real-world problems, and more recently graph representation learning has revolutionized AI and modern data science tasks. At Optum, we investigated the marriage of graph databases (knowledge graphs) with state-of-the-art graph ML models (e.g. GNNs) to address challenging healthcare-related problems such as fraud detection.

Speaker: Amir Yazdavar

The emerging landscape of deep learning and knowledge graph technologies provides vast opportunities to use public repositories as a source of knowledge for recommendations. Often, the knowledge is represented as a knowledge graph, and a recommendation regarding an entity instance is extracted by querying it. For example, a knowledge graph representation can be used to describe cybersecurity domain entities, such as adversarial techniques and their countermeasures. We can use this graph to recommend a specific countermeasure given a technique detected in a specific system. This approach raises a few challenges. First, how do we correlate the instance entity with the suitable object in the knowledge graph? Second, how do we extract the recommendations from the graph? We developed a hybrid AI approach that addresses these challenges by utilizing deep learning based language models and graph traversal algorithms. In this session we will demonstrate how we automatically categorized vulnerability descriptions discovered in specific systems according to their adversarial techniques and recommended relevant countermeasures.

Speaker: Hodaya Binyamini

Environmental, Social and Governance

This presentation covers a technology used to automate knowledge extraction from text. This novel technology blends neural language models, semantic tech, rule systems, and linguistic theory to achieve reliable extraction performance. Specifically, the discussion will focus on the work done together with Dow/Factiva, involving the extraction of facts buried in news articles, newsletters, reports, etc. in the subject area of ESG (Environmental, Social and Governance), through the application of a homegrown ESG taxonomy and ontology. Extracted facts are output as RDF triples and ingested into a semantic Knowledge Graph stored in a triple store. The Knowledge Graph also supports BI and reporting over these facts (including facts inferred with reasoning!) through standard graph queries.

Speaker: Prasad Yalamanchi

240 terawatts and counting: mobile network infrastructures consume a significant portion of the world’s energy. This is the story of Georg Geiger, who, as the head of Nokia’s mobile network software supply chain, identified knowledge graphs as a means to automate his business, and ended up finding billions of dollars’ worth of potential energy savings.

Speaker: Chris Brockmann

While 2.5 quintillion bytes of data were produced every day in 2021, this number is expected to grow exponentially in the coming years. With the constantly increasing flow of data, organisations, particularly those in the Global South, are struggling to process, structure, use, and share the data produced, leading to missed opportunities and an inability to track progress and monitor successes and failures. Graph databases can help prevent poor-quality data through standardization and connected data.

Speaker: Firuza Nahmadova

ESG (Environmental, Social, Governance) is increasingly becoming the central guiding principle for companies in all industries. ESG need not remain an imposed compliance constraint for companies, but can also provide a wealth of opportunities and ideas for how organizations can evolve. However, once ESG is recognized as a major opportunity, it quickly becomes clear that this set of issues is extremely multi-layered, complex and interconnected. Knowledge-based enterprise recommender systems promise welcome support in enabling personalized views of ESG-relevant content. This presentation will report on the development of an ESG knowledge graph that serves as the core of several applications, including an enterprise recommender system that generates personalized views of relevant ESG content.

Speaker: Andreas Blumauer

In this talk, we present the SustainGraph, a Knowledge Graph developed within the framework of the ARSINOE Horizon Europe project to track information related to progress towards the targets defined in the United Nations Sustainable Development Goals (SDGs) at national and regional levels. The SustainGraph aims to act as a unified source of knowledge around the SDGs, by taking advantage of the power provided by graph databases and the exploitation of Machine Learning (ML) techniques for data population, knowledge production and analysis. The main concepts represented in the SustainGraph will be detailed, and indicative usage scenarios will be provided.

Speaker: Eleni Fotopoulou

TellFinder is a counter-human-trafficking application with origins in a DARPA R&D project, now nearing 10 years of active deployment with various agencies. Based on a knowledge-graph-like structure extracted from enriched adult service ads, we have learned from numerous unique challenges over the life of this application, its users, and the technology, and we want to share our business successes and challenges.

Speaker: Emily Wyatt

In the field of sustainability reporting and auditing, sustainability data presents some challenges and opportunities. Due to the difficulty of technically analyzing this data and the lack of a central database, we show in this presentation how data from different sources on the suppliers and supply chains of the textile companies Adidas, H&M and Nike can be transferred into a knowledge graph, and what advantages this offers for sustainability reporting and auditing.

Speaker: Julian Gruemmer

Metadata

In general, property graphs are very flexible since we can associate any number of properties with nodes and edges. To add more structure, nodes and/or edges are often typed (via a label). In that case, a labeled node (or edge) of a particular type is expected to have specific properties. This works fine if node types are well defined and remain relatively stable. But what if we want to define relationships between any kind of nodes (existing or future node types)? For instance, in a metadata graph, we may be interested in the data lineage between various node types (“entities”), but in reality it doesn’t matter whether the node type is a dataset, the input to (or output of) a machine learning model, a physical device or digital twin that provides real-time data, etc. To model data lineage, all nodes need to include a group of properties that we would refer to as a database schema, but the actual type of those nodes is irrelevant. In general, how nodes can be related to other nodes, or how any service can observe or interact with nodes in a graph, merely depends on shared groups of properties, which are often referred to as aspects or facets.

In our presentation, we provide numerous examples of the benefits that facet-based graph models provide. In particular, we will focus on actionable property graphs that can be utilized for self-governing data management and various optimizations via a pattern that is very popular in game programming, namely Entity Component Systems (ECS). Instead of defining nodes of a particular type, nodes are merely modeled as UIDs plus sets of facets (aspects, components) that are standardized and can be added dynamically. For a metadata graph, this could include the logical model (via schema and ontology facets), physical aspects (facets for data formats and locations), statistics and usage, and governance (e.g. facets stating details about the inclusion of personally identifiable information).

To make a graph actionable, external processes (so-called systems) operate on arbitrary nodes that happen to include certain facets. One system would operate on nodes that contain a schema facet, ensure that data lineage is maintained, and provide an impact analysis if changes are necessary. Another system continually monitors access restrictions for nodes that represent datasets and contain a facet that specifies personally identifiable information. Other systems automate the data placement of datasets, but operate on nodes that include multiple facets (for data location, but also usage statistics and PII). With this information, the system can find the optimal location of a dataset while taking usage and legal restrictions into consideration.

While the concept of facets or aspects is not new, the purpose of the presentation is to raise awareness of the benefits of facets; in particular, we show how facets can help turn property graphs into “active” property graphs.
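
A compressed sketch of the pattern described above (facet names and the system logic are illustrative assumptions):

```python
# Compressed ECS sketch for an "active" metadata graph: nodes are just UIDs
# plus facets; systems operate on any node exposing the facets they need.
# All facet names and the policy below are illustrative.
import uuid

nodes: dict[str, dict[str, dict]] = {}  # uid -> {facet_name: facet_data}

def add_node(**facets) -> str:
    uid = str(uuid.uuid4())
    nodes[uid] = facets
    return uid

dataset = add_node(
    schema={"columns": ["user_id", "email"]},
    location={"format": "parquet", "region": "us-east-1"},
    pii={"columns": ["email"]},
)
model_input = add_node(schema={"columns": ["user_id"]})

def pii_monitor_system():
    """A 'system' that watches any node carrying pii and location facets."""
    for uid, facets in nodes.items():
        if "pii" in facets and "location" in facets:
            if facets["location"]["region"] != "eu-west-1":
                print(f"ALERT: PII node {uid} stored outside permitted region")

pii_monitor_system()  # flags the dataset node above
```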

Speaker: Jens Doerpmund

Sourcing valuable data has become a competitive advantage for many market-leading companies in the past several years. But finding that data and putting it to use efficiently and effectively is an opaque and complicated process with innumerable steps and gatekeepers. Jordan Hauer has been an expert in this space for over a decade and will educate the audience on methodologies to increase an organization’s capacity for onboarding new external datasets and for structuring research regarding external data.

Speaker: Jordan Hauer

We spend a lot of time optimising our digital platforms, websites, and apps to rank higher in Google and for customers to click our links and come to our website from search engines. And sometimes we feel our job is done when our websites rank higher than our competitors’ in Google (and other search engines) and a lot of quality traffic enters our digital ecosystem. Yes, our job on SEO is done. But the bigger piece has only just started. How do we ensure that this quality traffic converts? That visitors find the right content, products or services to meet their individual needs? That it’s easy for them to look for the right information? And that, through that, we as a business are able to meet our goals: selling a product, having them apply for a service, getting them to complete a task (e.g. subscribe to a newsletter), etc.? Putting the right strategies and principles in place to bridge this gap between SEO and CRO can help businesses save on the marketing dollars they spend to attract quality customers to their website.

Speaker: Kanika Bhatia

In this presentation we will show how we are using curated information stored in the IKEA Knowledge Graph to generate valuable product data. The manual effort is done on a general level and is small in size compared to the generated product data. This approach proves that “a little semantics goes a long way” and helps IKEA to serve its customers better information on its various products.

Speaker: Adam Keresztes

UBS is building a new firm-wide data platform that will empower every employee to harness AI, data and analytics to gain valuable insights for our clients. Knowledge Graph technology plays a key role in this next-generation data ecosystem by helping to better find, understand, trust and use relevant data assets.

Speaker: Gregor Wobbe

This presentation deals with the creation of a Knowledge Graph for a leading online news publisher in Germany. We used the technology behind Knowledge Graphs to semantically link tags and entities to articles, injecting content all over the website with context-rich information that could talk to search engines and appeal to users. In our analysis we will present results from Organic Search and showcase the validation process we used to ensure the correctness of the information provided at all stages, to constantly improve the knowledge base. The aim of this presentation is to showcase the power of Knowledge Graphs in the field of modern SEO.

Speaker: Beatrice Gamba

Natural Language Processing (NLP)

Drugs4Covid is based on the preliminary work of 15 people from the Ontological Engineering Group of the Universidad Politécnica de Madrid (OEG-UPM) in two hackathons organized by the Community of Madrid and the European Commission (‘Vence al Virus’ and ‘EUvsVirus’). During the peak of the pandemic, the Madrid Health Service and some hospital pharmacies reported drug shortages and the need to find available substitutes in their pharmacies. In addition, pharmaceutical companies needed to know the active ingredients used in clinical trials in order to be able to supply and manufacture different dosage forms of them in case of success. To meet these challenges, we had repositories of scientific documentation related to the coronavirus, thanks to the fact that in March 2020 the White House Office of Science and Technology Policy called on the AI community to develop techniques and resources that could answer scientific questions about COVID-19 from scientific texts.

Speaker: Carlos Badenes-Olmedo

Despite the excitement about Large Language Models (LLMs), these models suffer from hallucination problems, e.g., generating factually incorrect text. These problems restrict the development of production-ready applications. This talk will highlight the importance of combining Knowledge Graphs with Large Language Models to develop industry-ready applications. We will present different approaches, from pragmatic ones to those still under research, to treat and handle hallucination problems using Knowledge Graphs at different phases of the LLM lifecycle. We will accompany our presentation with use cases that Fraunhofer is working on with partners from major German industries under the OpenGPT-X project.

Speaker: Diego Collarana-Vargas

Many industries store vast amounts of information as natural language. Current methods for composing this text into knowledge graphs parse a small set of relations from within a larger document. The author’s specific diction is approximated by the vocabulary of the model. In domains where precise communication is critical, this approach is not sufficient. We propose a novel approach to encode natural language into a knowledge graph without any loss of context.

Speaker: Michael Miller

Natural language search over a knowledge graph presents unique challenges as the entities of a knowledge graph differ in structure compared to traditional documents. In this talk, we discuss methods of implementing natural language search over entity space within a knowledge graph using such techniques as entity expansion.

Speaker: Michael Iannelli

Signals are emerging pieces of information relevant to a given context and offer potential for strategic advantage in a multitude of domains. However, sorting the signal from the noise in large textual data is a very tedious process for humans. We introduce a scalable approach that extracts signals from hundreds of crawled sources and maps their metadata to a knowledge graph by exploiting state-of-the-art neural models for natural language understanding.

Speaker: Tommaso Soru

Knowledge graphs (KGs) play a crucial role in many modern applications. Industrial knowledge is scattered across large volumes of both structured and unstructured data sources, and bringing it into a unified knowledge graph can deliver a lot of value. However, automatically constructing a KG from natural language text is challenging due to the ambiguity and imprecision of natural language. Recently, many approaches have been proposed to transform natural language text into triples for constructing KGs. Among these, approaches based on transformer language models lead in many subtasks of knowledge graph construction, such as entity and relation extraction. In this presentation, we will focus on the state of the art in transformer-based methods, techniques and tools for constructing knowledge graphs from text, along with their capabilities, limitations and current challenges. The talk summarizes research progress on KG construction from text, with a specific focus on the information acquisition branch entailing entity and relation extraction, covering state-of-the-art transformer methods and tools. This will be useful for any practitioner who is interested in building knowledge graphs for their organization.
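
As a hedged illustration of this family of tools, one publicly available seq2seq relation-extraction model can be driven through the Hugging Face pipeline API; the usage below is schematic, since real use requires the model card’s dedicated decoding of its triplet markers:

```python
# Hedged sketch: end-to-end triple extraction with a seq2seq transformer.
# "Babelscape/rebel-large" is one public example of such a model. Note that
# the pipeline's default decoding may strip the model's <triplet>/<subj>/
# <obj> markers; see the model card for the full decoding recipe.
from transformers import pipeline

extractor = pipeline("text2text-generation", model="Babelscape/rebel-large")

text = "Marie Curie was born in Warsaw."
generated = extractor(text)[0]["generated_text"]
print(generated)  # linearized subject / object / relation spans

# A production pipeline would parse the generated markers into
# (subject, relation, object) triples and load them into the KG.
```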

Speakers: Jennifer D’Souza; Nandana Mihindukulasooriya

Significant portions of the data generated in enterprises are unstructured and text-based. This can span the entire product lifecycle, from early research to post-launch analysis. A major challenge for companies is managing these vast amounts of text data and extracting hidden and valuable information to gain actionable insights and make important business decisions. Thanks to recent advances in the field of natural language processing (NLP), this is now possible.

Speaker: Harsha Gurulingappa

Ontologies, Taxonomies, Data Modeling

Golden is building an open, transparent, distributed knowledge graph of triples representing canonical reference data for billions of entities, using Web3-based community reward structures to incentivize adding and verifying triples. It promises to address several of the issues that have stymied past open data efforts, including skills, understanding and motivation.

This presentation shares the lessons learned from building an extensible ontology to support such a community-based graph, which is both broader and more constrained than typical Enterprise Knowledge Graphs, while providing a service to them.
Key aspects include:
– Balancing business focus with logic and precision; and use of AI-based ingestion
– Choice of predicates and entity types (classes), supplemented with taxonomies
– Entity disambiguation via disambiguation predicates and external identifier mapping
– Use of qualifiers for supplementing the triples with temporal and provenance information
– Working with the community for governing triple verification and ontology extension

Speaker: Pete Rivett

Business-impact datasets from supply chains are modelled as KGs with an emphasis on rich relationship context. The KGs vary in scale from millions to billions of entities; they require parallel processing techniques for building, traversal, and computation, and are built on an in-memory, horizontally scalable graph computation platform developed at BASF in cooperation with DerwenAI. Visualisations of relationship context are presented using Graphistry.

Speaker: Janez Ales

Fraud hurts the integrity of US federal programs and erodes the public’s trust in the government. To assist agencies with combatting fraud, and to improve its measurement through common definitions, GAO has developed the GAO Fraud Ontology. The model addresses the key elements of what occurs in a fraud scheme affecting the federal government, related elements, and their implications. It also serves as the basis for the AntiFraud Resource, a site focused on educating federal program officials about fraud and strategies for assessing and managing their fraud risks. This presentation will detail the process used to develop the ontology, how it supports GAO’s work, and future directions the work may take.

Speaker: Leia Dickerson

Recommendation systems are at the heart of many products we use today, helping us discover new music, expand our wardrobes, and navigate the massive amounts of information on the Internet to answer our search queries. In a world where efficiency and accuracy are paramount, and where processing power and the availability of user data vary across industries, graph-powered engines and their ability to deliver continued performance at scale provide many advantages.

How are these recommendations determined? What conditions must exist in order to build a graph-based system that produces accurate, relevant results? What technologies are at play under the hood? Through the business use case of a leading national learning management system (LMS) in the healthcare industry, this presentation will answer these questions by providing an overview of the problem statement and the solution architecture, including technical dives into NLP taxonomy enrichment, knowledge graph development, and recommender logic.

Speakers: Fernando Aguilar; Holly Maykow

This presentation shows the approach of using Knowledge Graphs in Data Spaces and Data Markets to foster data and semantic interoperability. Interoperability is the enabler of efficient and sustainable data sharing between organisations, either within a certain industry or across industries, either in the form of data trading or as data collaborations. This talk will explain the basic principles of Data Spaces, draw up the problem statement of interoperability for value-added data sharing, and present the solution approach, including real-world examples, of using Knowledge Graphs to foster interoperability in Data Spaces.

Speaker: Martin Kaltenböck

Knowledge Graphs (KG) are rapidly gaining ground as a representation of choice for modeling highly linked data! If done well, using OWL to define the KG’s metamodel allows for the meaning of the concepts to be sharable, unambiguous, deeply logical, consistent, and capable of inferring new knowledge via automated reasoners. However, “doing an OWL model well” is often pretty difficult. That’s where gist can help. gist is Semantic Arts’ minimalist upper ontology, designed to provide the maximum coverage of typical business concepts with the fewest number of primitives. As such, it creates a great starting point for any new semantic KG project.

In this talk, we will:
– Introduce gist
– Give some small case studies of how it has jump-started many of our real-world KG projects
– Offer some Q&A time to ask questions about gist
– Explain how you can get involved in using and contributing to gist

Speaker: Mark Wallace

In this work, we present MatKG, the largest knowledge graph in the field of materials science. It contains over 80,000 unique entities and over 5 million statements covering several topical fields such as inorganic oxides, functional materials, battery materials, metals and alloys, polymers, cements, high entropy alloys, biomaterials, and catalysts. The triples are generated autonomously through data-driven natural language processing pipelines and extracted from a corpus of around 4 million published scientific articles. Several informational entities such as materials, properties, application areas, synthesis information, and characterization methods are integrated with a hierarchical ontological schema, where the base relations are extracted through statistical correlations, to which higher-level ontologies are appended. We show that, using a graph representation model, we are able to perform link prediction, allowing the correlation of materials with novel properties/applications and vice versa.

Speaker: Vineeth Venugopal

Regulatory complexity places a heavy burden on financial institutions, especially when markets expect more rapid innovation to serve their needs. Meanwhile, regulators keep placing more and more expectations on institutions to protect the financial, economic and social systems. Despite AI’s great strides in text processing, the compliance burden stands to benefit the most from simpler, structured ways of encoding and sharing knowledge that fill the gap left by modern risk-based, implementation-specific approaches.

Speaker: Rashif Rahman

Large enterprises maintain a multitude of data assets pertaining to their businesses. It’s arduous for engineers and data scientists not only to find the information they need, but also to ensure it’s accurate and up to date. This can lead to data duplication, misuse of assets, and conflicting results. We address this problem by leveraging an enterprise ontology, which we assume users in a domain will intuitively understand. Our approach is novel in that we allow users to navigate over their data assets semantically, using the concepts and relationships of this ontology. We present a case study involving a client that uses an ontology comprising hundreds of concepts to efficiently search for and manage a set of data assets that number in the tens of thousands. This solution uses the RelationalAI Knowledge Graph Management System to power the search process. We include a live demonstration of the working solution and discuss some important lessons learned.

Speaker: Márton Búr

Semantic Layer

In financial services, a common language and data model are essential not only to meet regulatory needs but also to stay competitive by creating more products more quickly and monetizing massive amounts of accumulated, heterogeneous data. In fact, we see an increasing number of semantic layer and modeling tools, such as Legend, Morphir, and others, coming into the open source realm and gaining adoption amongst other institutions to try to address this. Historically, however, there have been challenges with integrating and executing these semantic layers within an existing data infrastructure ecosystem at scale. This often results in obstacles to adoption and difficulties in transitioning efforts to production.

In this talk, we will provide a specific example of how we use relational semantic layers to solve this challenge through a financial services use case. You’ll learn about semantic layers in financial services and how a relational semantic layer fits in a modern data stack. You’ll also get a technical review of an applied financial services use case involving PURE/Legend, and find out how the business benefits from having a generic model of representation and execution that spans all data sources and types (e.g., semistructured, graph, tabular, etc.). The talk will end with forward-looking thoughts on the industry and a chance for you to ask questions of some of the experts implementing these solutions.

Speakers: Gerald Berger; Michelle Yi

Tired of wondering why your critical business numbers don’t match? Overhead of propagating data knowledge in your organization getting you down? Hear from dbt Labs Director (Data & Community) Anna Filippova on how to apply a knowledge graph approach to managing data in an organization to help you 1) reduce decision-making complexity, 2) minimize organizational inefficiencies and 3) empower more people in an organization to access and use data effectively. Using the example of the dbt Semantic Layer, the session will cover how to think about data like a structured graph (rather than a data swamp), why this will help you break down information silos in your organization, and practical advice for how your organization should evolve to embrace this transformation.

Speaker: Anna Filippova

Semantic layers are a way for us as data practitioners to codify our knowledge of what the data means in a way that allows other people to self-serve insights better. This presentation is a survey of the current state of semantic layers, including LookML, AtScale, dbt, MetricFlow and others. We’ll also look at how those semantic layers are being adopted by presentation layers such as Lightdash, Hex, and others, and try to find what we should do today to be ready for tomorrow.

Speaker: Dylan Watt

Business application developers put a wealth of business knowledge into their code. Too often, aside from executing, that information is left untapped for knowledge purposes. In fact, well-designed code is rife with valuable information that can be extracted directly from the code for use in a variety of knowledge-based technologies. This can have direct benefits for regulated industries, like finance, that face a growing burden to demonstrate proper management of their data. This presentation explores the implementation of an entire U.S. financial regulation to demonstrate how well-designed code can be leveraged to integrate with knowledge tools like semantic, data catalog, data lineage, audit, and graph technologies.

Speaker: Stephen Goldbaum

Systems and Scale

Our framework is built on top of a distributed Hadoop/MapReduce/HBase infrastructure, capturing both “low-level” graph database operations and “higher-level” algorithmic aspects such as vertex-vertex similarity, graph clustering, and “robust” vertex-id stamping. We have been using the framework to de-duplicate the Goldman Sachs Knowledge Graph; in this talk we will report a few experimental results from applying the framework to public datasets.
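
As a generic illustration of the vertex-vertex similarity ingredient (not the speakers’ distributed implementation), Jaccard similarity over neighbor sets is a common building block for duplicate detection:

```python
# Generic illustration: Jaccard vertex-vertex similarity over neighbor
# sets, a common building block for graph de-duplication. Data is invented
# for the example.
adj = {
    "acct_1": {"addr_9", "phone_3", "email_7"},
    "acct_2": {"addr_9", "phone_3", "email_8"},
    "acct_3": {"addr_4"},
}

def jaccard(u: str, v: str) -> float:
    a, b = adj[u], adj[v]
    return len(a & b) / len(a | b) if a | b else 0.0

# Candidate duplicates: vertex pairs above a similarity threshold.
for u, v in [(u, v) for u in adj for v in adj if u < v]:
    s = jaccard(u, v)
    if s >= 0.5:
        print(f"possible duplicate: {u} ~ {v} (jaccard={s:.2f})")
```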

Speakers: Christos Boutsidis; Malik Magdon-Ismail

General Track

Personal Knowledge Graphs (PKGs) are a new and still marginal breed of knowledge graphs. But maybe they hold the biggest potential to revolutionize the way we work and think. We can even rethink personal computing. In this talk, Ivo will review the current state of play of PKGs, bring some new perspectives, and share visions for a possible future. Current applications of PKGs range from user-controlled personal recommendations to healthcare systems to knowledge management. This talk will focus on the latter. The way PKGs can support thinking, research and creativity can be transformative for personal and inter-personal knowledge management.

Speaker: Ivo Velitchkov