The Case for KGs: Private vs. Public Graphs – The Knowledge Graph Conference

By Paco Nathan

It’s an interesting time to be working with knowledge graphs, a time of stark contrasts. On the one hand, I talk with many people who are otherwise quite expert in machine learning use cases who say, “Yeah, well, no one really uses KGs in production.” On the other hand, there are so many instances of large, private KGs used in industry.

We could tour through KG use cases at almost all of the well-known tech companies, especially within ecommerce and social networks: Google – yes, yes, we’ve all heard about that one. Bing – ok, that makes sense too. Both of those companies use KGs for search, we get it. Amazon and eBay – um, sure, more product-oriented search. Microsoft/LinkedIn – obviously. Facebook, Twitter, Pinterest. IBM. Hey Siri. See also Uber, Lyft, Airbnb, Netflix, etc. For comparisons among some of these larger practices, refer to “Industry-scale Knowledge Graphs: Lessons and Challenges” by Natasha Noy, et al., in ACM Queue 17:2 (2019). Typical use cases include discovery, recommendations, data governance, compliance, conversational agents, and so on.

There’s a whole other realm of private KGs in scholarly infrastructure used for research, such as Scopus and Dimensions. Arguably, other reference tools such as Wolfram Alpha fit here too, even though their usage is free.

Throughout private industry, there is also quite a range of KG use cases among the different business sectors … for example, Refinitiv in FinTech and AstraZeneca in Pharma.

While almost all of the examples mentioned above are built and maintained by private firms for commercial uses, there are other large KGs based on public information, intended for public use. Often these result from community projects or government agencies.

Common Crawl maintains an open repository of the world’s web pages and Wikidata serves as the central storage for structured data in Wikipedia and its sister projects. While not quite Google or Bing search, they provide analogous data+metadata at scale.

In terms of research tools and scholarly infrastructure, Semantic Scholar, ResearchGate, PubMed, OpenAIRE, Crossref, Unpaywall, and similar services are all free to use, with open APIs – and quite good. Plus, ORCID has some analogies to LinkedIn among researchers and has been growing. There’s an open-source Python library, richcontext.scholapi, that federates searches and API integration across most of these services.

Last month here we looked at knowledge graphs in government, especially for geospatial data. While the following do not provide one-to-one comparisons with sophisticated financial services such as Refinitiv, there are open/community projects such as RePEc (economic research) and GDELT (worldwide news in 100+ languages).

The point of drawing those contrasts is that much of the knowledge work mentioned above is driven by commercial needs. While there are public/government/community analogs in many cases, the private KGs tend to be larger and better-funded. This is the point where we should discuss the Underlay at MIT, a project within the Knowledge Futures Group, led by Danny Hillis, Joel Gustafson, Samuel Klein, et al.

The Underlay is a global, distributed graph of public knowledge. Initial hosts will include universities and individuals, such that no single group controls the content. This is an attempt to replicate the richness of private knowledge graphs in a public, decentralized manner.
Underlay.org

The naming pun is that an “underlay” is the opposite of an “overlay”. Think of it as a counterbalance for the large private KGs driven by commercial entities. Underlay’s premise is that a KG can be constructed from distributed transactions which the project calls assertions: immutable statements that specify provenance, cryptographically signed for trust- or context-based filtering. These assertions are combined through a process called reduction: modifying the graph to produce a consistent state. The results are curated into groupings called collections: containers for useful scopes of sub-graphs. This “ARC” protocol is not quite a blockchain, although it is not far from blockchain’s foundational concept of a distributed ledger.
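To make the assertion / reduction / collection idea a bit more concrete, here is a minimal Python sketch. It is not Underlay's actual code or data model: the `Assertion` fields, the "latest verified statement wins" reduction rule, the `ex:` identifiers, and the host names are all assumptions made up for illustration. HMAC with a shared secret also stands in for real public-key signatures, just to keep the example within the standard library.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, replace
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)  # frozen: an assertion is immutable once created
class Assertion:
    subject: str
    predicate: str
    obj: str
    source: str        # provenance: who is making the statement (hypothetical field)
    timestamp: int     # logical time, used only by the toy reduction rule below
    signature: str = ""

    def payload(self) -> bytes:
        """Canonical bytes covered by the signature."""
        fields = [self.subject, self.predicate, self.obj, self.source, self.timestamp]
        return json.dumps(fields).encode("utf-8")


def sign(assertion: Assertion, key: bytes) -> Assertion:
    """Return a signed copy (HMAC is a stand-in for a real signature scheme)."""
    digest = hmac.new(key, assertion.payload(), hashlib.sha256).hexdigest()
    return replace(assertion, signature=digest)


def verify(assertion: Assertion, key: bytes) -> bool:
    expected = hmac.new(key, assertion.payload(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, assertion.signature)


Key = Tuple[str, str]


def reduce_assertions(assertions: Iterable[Assertion],
                      trusted_keys: Dict[str, bytes]) -> Dict[Key, Assertion]:
    """Toy reduction: drop untrusted or unverifiable assertions, then keep the
    latest remaining one per (subject, predicate) to get a consistent state."""
    state: Dict[Key, Assertion] = {}
    for a in sorted(assertions, key=lambda a: a.timestamp):
        key = trusted_keys.get(a.source)
        if key is None or not verify(a, key):
            continue  # trust-based filtering
        state[(a.subject, a.predicate)] = a
    return state


def collection(state: Dict[Key, Assertion], predicates: Set[str]) -> Dict[Key, str]:
    """Toy collection: a sub-graph scoped to a chosen set of predicates."""
    return {k: a.obj for k, a in state.items() if k[1] in predicates}


# Hypothetical hosts and statements, for illustration only.
keys = {"univ_a": b"key-a", "univ_b": b"key-b"}
a1 = sign(Assertion("ex:paper1", "ex:title", "Draft title", "univ_a", 1), keys["univ_a"])
a2 = sign(Assertion("ex:paper1", "ex:title", "Final title", "univ_b", 2), keys["univ_b"])
a3 = Assertion("ex:paper1", "ex:title", "Spam", "unknown", 3, "forged")
a4 = sign(Assertion("ex:paper1", "ex:cites", "ex:paper2", "univ_a", 1), keys["univ_a"])

state = reduce_assertions([a3, a2, a4, a1], keys)
citations = collection(state, {"ex:cites"})
```

In this sketch the forged statement from the unknown source is filtered out, the later title from a trusted host supersedes the earlier one, and the collection exposes only the citation sub-graph. A real reduction policy would be far richer than "latest wins"; that rule is chosen here only because it is easy to follow.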

Imagine if many universities or other community projects around the world collaborated to host a distributed Underlay of the world’s knowledge. Then we wouldn’t necessarily rely on the good graces of advertising companies, e.g., Google, Facebook, etc., for representing unbiased knowledge resources to the public. Also, the governance for Underlay has been engineered to be much more sophisticated than earlier crowd-sourced resources such as Wikipedia – by using RFCs, much like how the Interwebs were originally built.

Overall, check out the full project repos at https://github.com/underlay/
