How knowledge graphs can be used to assess relationship risk by finding the shortest path to risky entities.

In mathematical academia, there exists a quantity called the Erdős Number. It is the smallest number of connections you have to hop through in a knowledge graph of co-author relationships to reach the famously collaborative mathematician Paul Erdős. For example, you can use this “Collaboration Distance” tool to see that I have an Erdős Number of 4, because I coauthored a paper with someone (1), who coauthored a paper with someone (2), who coauthored a paper with someone (3), who coauthored a paper with Paul Erdős (4).


Or, in other notation: (Stockton – Wiseman – Ambainis – Schulman – Erdős)

The tool allows you to change either end of the chain, and I can just as easily compute my Einstein number (5), or the “shortest path” between any two mathematicians.
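Under the hood, a number like this is just the hop count of a shortest path in an unweighted graph, which a breadth-first search finds directly. The sketch below is only an illustration: the function names `build_graph` and `shortest_path` are my own, not the tool's API, and the edge list contains nothing but the single chain quoted above rather than a real co-authorship data set.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

# Toy data: only the chain from the example above, not a full graph.
COAUTHOR_EDGES = [
    ("Stockton", "Wiseman"),
    ("Wiseman", "Ambainis"),
    ("Ambainis", "Schulman"),
    ("Schulman", "Erdős"),
]

coauthors = build_graph(COAUTHOR_EDGES)
path = shortest_path(coauthors, "Stockton", "Erdős")
erdos_number = len(path) - 1         # hops = nodes on the path minus one
print(" – ".join(path), erdos_number)
```

Swapping either endpoint, as the tool allows, only changes the `start` and `goal` arguments; the search itself stays the same.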

A more popular version of this concept is the Bacon Number, from the game Six Degrees of Kevin Bacon, which posits that most actors are within six hops of the famously collaborative actor Kevin Bacon in the graph of movie co-appearances. An example from Wikipedia shows that Elvis Presley has a Bacon Number of 2 because:

  • Elvis Presley was in Change of Habit (1969) with Edward Asner
  • Edward Asner was in JFK (1991) with Kevin Bacon

Or, in other notation: (Presley – Asner – Bacon)

And of course there is the inevitable Erdős–Bacon Number, which takes these non-serious concepts to their even less serious conclusion.

However, outside of academia and movies, financial relationships, and the risks that come with them, can have much more serious consequences. Banks have recently scrambled to comply with sanctions against Russian oligarchs by screening their customer and partner relationships for any connections to the sanctioned entities. There is also a burgeoning industry in supply chain security, where governments and companies are trying to understand the risks presented by each link in the chains of dependencies that determine the viability of critical operations.

Just as we can create an Erdős Number from academic articles, we can also create a Putin Number by finding the shortest path between any person or organization and Vladimir Putin. This can be done by creating a graph of relationships extracted from open source data, both public and commercial. For example, using our notation and our graph, we can see that a Western lawyer has a Putin Number of 2, because of the following relationship chain:

(Alastair Tulloch [British Lawyer] linked to Alexander Mamut [Oligarch] linked to Vladimir Putin)

Although these relationships are sourced here from news articles, they can also be found in more trustworthy commercial sources.

Instead of asking “What’s the shortest path to Putin?” or to some other specific individual, an investigator may also want to ask “What’s the shortest path to any risk?”, whether that means a particular risk like “Weapons Trafficking” or the collection of all risks in a risk taxonomy.
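One way to picture the "path to any risk" question is a breadth-first search that stops at the first entity carrying a matching risk tag, rather than at one named individual. The following is a minimal sketch under stated assumptions, not a description of how Graphyte works: the entity names, the `RISK_TAGS` map, the "Sanctions" label, and the `nearest_risk` function are all hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Undirected adjacency lists; insertion order keeps the search deterministic."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph

def nearest_risk(graph, risk_tags, start, wanted=None):
    """Return (path, matching_tags) for the closest risk-tagged entity, or None.

    wanted=None matches any tag in the taxonomy; otherwise pass a set of
    specific risk labels to look for.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        tags = risk_tags.get(node, set())
        hits = tags if wanted is None else tags & wanted
        if hits:
            path = []
            while node is not None:   # rebuild the chain back to the start
                path.append(node)
                node = parents[node]
            return path[::-1], set(hits)
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical entities and risk tags, for illustration only.
EDGES = [
    ("Customer A", "Supplier B"),
    ("Supplier B", "Broker C"),
    ("Customer A", "Director D"),
    ("Director D", "Holding E"),
    ("Holding E", "Trader F"),
]
RISK_TAGS = {
    "Broker C": {"Weapons Trafficking"},
    "Trader F": {"Sanctions"},
}

graph = build_graph(EDGES)
any_path, any_tags = nearest_risk(graph, RISK_TAGS, "Customer A")
sanction_path, sanction_tags = nearest_risk(
    graph, RISK_TAGS, "Customer A", wanted={"Sanctions"}
)
print(any_path, any_tags)
print(sanction_path, sanction_tags)
```

In this toy graph the nearest risk of any kind is two hops away, while the nearest entity with the specific "Sanctions" tag is three hops away along a different chain.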

Quantifind’s Graphyte platform was built to unlock the power of pre-computed, shared knowledge graphs. By fusing many different data sources, unstructured and structured, we can see connections that other technologies cannot see. We can help our customers discover relationship risks that may not have been apparent by considering only each entity in isolation. One link in the chain may come from a public news data set and another may come from a private commercial registration data set, but both are critical for making the connection.

Of course, the concept of “relationship risk” comes with many caveats and potential pitfalls that need to be taken seriously by any solution. Every person in the world is going to be within six or so degrees of even the worst malign actors, Bin Laden or otherwise, and users may misinterpret the significance of a long chain. Some types of relationships do not imply consent or responsibility by both parties. Creating a high-quality graph is difficult, and errors due to entity resolution or misattribution will propagate. (In other words, any chain is only as strong as its weakest link.) There are many different types of relationships, and some are stronger than others. Any comprehensive graph should fuse many heterogeneous types of data sources and normalize them into a common schema. And, finally, sometimes key links in the chain will come from sensitive information, which must be properly protected.
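To make the point about link strength concrete, here is one possible way to rank chains by reliability instead of raw hop count. It is a sketch under loud assumptions: each link carries a hypothetical confidence score between 0 and 1, the links are treated as independent so that a chain's confidence is the product of its links, and Dijkstra's algorithm is run on the cost -log(confidence) so that the cheapest path is the most reliable one. The entity names, scores, and the `most_reliable_path` function are invented for illustration.

```python
import heapq
import math

def most_reliable_path(edges, start, goal):
    """Dijkstra on -log(confidence); returns (path, chain_confidence) or None.

    edges: iterable of (a, b, confidence) with 0 < confidence <= 1.
    Assumes link confidences are independent, so they multiply along a chain.
    """
    graph = {}
    for a, b, conf in edges:
        cost = -math.log(conf)        # product of confidences -> sum of costs
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {start: 0.0}
    parents = {start: None}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1], math.exp(-cost)
        if cost > best.get(node, math.inf):
            continue                  # stale heap entry, a cheaper route was found
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                parents[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return None

# Hypothetical links: a weak direct link versus two well-sourced links.
LINKS = [
    ("Entity A", "Entity Z", 0.5),    # e.g. a single unverified mention
    ("Entity A", "Entity B", 0.9),
    ("Entity B", "Entity Z", 0.9),
]

reliable_path, reliable_conf = most_reliable_path(LINKS, "Entity A", "Entity Z")
print(reliable_path, round(reliable_conf, 2))
```

Here the two-hop chain wins because 0.9 × 0.9 = 0.81 beats the single weak link at 0.5, which is the kind of distinction a plain hop count hides. A stricter reading of "weakest link" would score a chain by its minimum link confidence instead of the product.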

These challenges all make graph-based path-finding a difficult technology, design, and policy problem. However, these challenges can be overcome with a responsible balance of improved technical performance, transparency, data controls, policy review, and user training. Quantifind’s Graphyte platform, and its underlying knowledge graph, is the only risk-finding solution that addresses these challenges effectively and responsibly. Please email to learn more.