Joshua Shinavier is a Research Scientist at Uber and holds a PhD in Web Science from RPI's Tetherless World Constellation. He co-founded what is now Apache TinkerPop, contributing to the first common APIs for graph databases, the RDF-based query language that preceded Gremlin, and the first tools that aligned the property graph and RDF data models, starting with neo4j-rdf-sail in 2008. As of 2017, he is part of the knowledge graph team at Uber, where he also leads a company-wide effort to unify data models and schemas across RPC, streaming, and storage. He feels, now as ever, that the research, business, and open-source communities have a lot to learn from each other with respect to graphs and knowledge representation.
2019 Talk: Building an Enterprise Knowledge Graph at Uber: Lessons from Reality
The origins of graph databases, like the origins of digital knowledge representation and inference, can be traced as far back as the 1960s. So why do knowledge graphs seem like a new thing? While there is a compelling vision that has powered decades of research as well as the development of robust standards like RDF and SPARQL, it is only fairly recently that large technology companies have taken the first tentative steps toward making knowledge graphs core to their business. This entails an intricate process of adapting high-level design goals to the realities of existing data infrastructure, available tools, and developer culture. Controlled vocabularies, well-understood data models and query languages, and some form of rules or reasoning are all essential, but there is frequently a mismatch between research and standards on the one hand and practical constraints on the other. In this talk, we will present an overview of Uber’s knowledge graph and its use cases, together with a discussion of the demands of very large and rapidly changing datasets, the data modeling practices that work best in our environment, and the need to uphold user data rights.