OntoSpeak

A free fortnightly newsletter on Taxonomy, Thesauri & Ontology and Semantic Publishing



Uber Expert on Building Knowledge Graphs for Uber

May 20, 2019

Joshua Shinavier, a research scientist at Uber, spoke at a two-day conference on “knowledge graphs” hosted by Columbia University’s School of Professional Studies. In his talk, he described how graph tools are used to manage entities and relationships for the huge data management tasks at Uber. His presentation focused on the organizational challenges of building a knowledge graph in an enterprise.

The scale of data at Uber is huge. The ride-sharing company has 200,000 individually managed data sets. Having passed the ‘ten-billion-trip’ milestone last year, it amasses “low-thousands of entities” every day, all of which have to be included in its knowledge graph. The incoming data is also messy, because Uber drivers enter it manually on their phones.

The first step in developing the knowledge graph was establishing “some kind of system for a shared vocabulary.” Uber used only a few off-the-shelf tools, relying instead on its own dedicated infrastructure and teams to build the graph.

The data Uber had was unique. Most of it was in relational schemas, and the company decided to use knowledge graphs to handle alerts, notifications, migrations and related functions. Uber created a three-layer knowledge graph. The first layer was an “OLTP graph” that used the open-source Cassandra data store. The second was an “analytics-based graph” that used the Hadoop file system with Cypher and Apache Spark. A third layer, containing “graph embeddings,” followed the second.

One ongoing challenge for Uber is protecting the privacy of user data, especially in light of the European “GDPR” privacy legislation. The company is addressing this by building solid policies. This is tricky, however, because “it’s fairly hard to define” what constitutes data that needs to be kept private.

Brought to you by Scope e-Knowledge Center, an SPi Global Company, a trusted global partner for Digital Content Transformation Solutions, Knowledge Modeling (Taxonomies, Thesauri and Ontologies), Abstracting & Indexing (A&I), Metadata Enrichment and Entity Extraction.
