Why Graph Databases?


on April 20, 2020

The biggest growth in PLM complexity lies in the connections between pieces of information. In many complex products today, the number of potential configurations is greater than the number of parts. This means that resolving such structures requires a large number of relationships, each carrying critical values. We saw that growing complexity, read the studies on graph databases, and concluded that this is the right technology for PLM.

While most legacy PLM solutions rely on SQL-based systems built with tables, we challenged this status quo and studied why graph would beat SQL for PLM in the future.

Scalability

Scalability is the first long-term advantage we see in graph databases. It follows from the core concept of graph databases, well described by Neo4j:

Like a complex system grid or an air-traffic-control map, a graph database is represented as a network of nodes and connections called a labeled property graph. The nodes, which appear as circles or squares, represent entities such as people, products, companies or orders. In Neo4j, the connections between database nodes are called relationships, and those relationships are as important as the nodes they connect. Each relationship is directional and knows its beginning and ending node, and each node knows about all other nodes with which it has an inbound or outbound relationship—an advantage known as index-free adjacency.

Source: The Native Path to Graph Performance

The value of this core model grows with the amount of data and relationships. Let's take a node with 5 relationships. If you add 1 million similar relationships connecting other nodes in the same database, traversing the graph from this node is not affected by those other relationships. From your starting node, you still have only 5 relationships to browse, while in a SQL database the relation table would grow from 5 to 1,000,005 entries.
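The 5-versus-1,000,005 argument can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the relation table is scanned without any index (a real SQL database would normally use one, at its own lookup and maintenance cost), and a Python dictionary stands in for the direct references a native graph store keeps on each node. The names `build`, `neighbours_by_scan` and `neighbours_by_adjacency` are hypothetical.

```python
from collections import defaultdict

def build(extra_relationships):
    """Return (edge_table, adjacency) for a start node with 5 relationships
    plus `extra_relationships` relationships between unrelated nodes."""
    edge_table = [("start", f"child{i}") for i in range(5)]
    edge_table += [(f"a{i}", f"b{i}") for i in range(extra_relationships)]
    adjacency = defaultdict(list)
    for source, target in edge_table:
        adjacency[source].append(target)
    return edge_table, adjacency

def neighbours_by_scan(edge_table, node):
    """Unindexed relation table: every row is examined."""
    examined = 0
    found = []
    for source, target in edge_table:
        examined += 1
        if source == node:
            found.append(target)
    return found, examined

def neighbours_by_adjacency(adjacency, node):
    """Relationships stored with the node: only its own are examined."""
    found = list(adjacency.get(node, []))
    return found, len(found)

for label, extra in (("small", 0), ("large", 100_000)):
    table, adj = build(extra)
    _, scan_cost = neighbours_by_scan(table, "start")
    _, adj_cost = neighbours_by_adjacency(adj, "start")
    print(label, "rows scanned:", scan_cost, "relationships followed:", adj_cost)
```

The scan cost grows with the total number of relationships in the database, while the adjacency cost depends only on the starting node.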

Here's a comparison table:

[Table image: Graph Scalability. Source: Connected Data Cripples Relational Performance]

Key Value: Ganister can be used by anywhere from 2 users to thousands of users. The more complex the data becomes, the more the graph pays off on large Ganister instances.

Agility

Agility is something we live every day at Ganister. Every concept we work on is quickly discussed on a whiteboard and translated into Cypher queries to either create or browse data. Graph databases are the quickest database technology for moving from whiteboard to implementation, because the language is the same: nodes and edges.

Let's take a quick example:

[Figure: Graph Sample]

The Challenge

One typical challenge here is the following: for any part in this diagram, retrieve the list of top assembly parts that are used in a System.

The Cypher

Assume that every node has a unique reference property called ref.

// [:consumes*] follows one or more consumes relationships
MATCH (s:system)-[]->(tp:part)-[:consumes*]->(sp:part)
WHERE sp.ref = 'Part I'
RETURN tp

This is just an example, but think about all the relationships you will navigate and all the complex queries required to process options and variants in relation to requirements management. The easier it is to move from whiteboard to querying the graph, the more agile you are. This will certainly show in our ability to deliver new features along the digital thread in the coming months and years.

Key Value: Getting data out of complex structures gets easier, and adding new features to Ganister is much faster than it is for any other software vendor.
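To make the traversal concrete without a database, the same question can be mirrored in plain Python. This is a minimal sketch on made-up data: the original diagram is not reproduced here, so the systems, parts and links below are hypothetical, as are the names `consumes_transitively` and `top_assemblies_using`.

```python
# Hypothetical data: these systems, parts and links are illustrative only.
SYSTEM_TOP_PARTS = {"System 1": ["Part A"], "System 2": ["Part B"]}
CONSUMES = {
    "Part A": ["Part C", "Part D"],
    "Part B": ["Part D"],
    "Part C": ["Part I"],
    "Part D": ["Part E"],
}

def consumes_transitively(part, target, seen=None):
    """True if `target` is reachable from `part` via one or more consumes hops."""
    seen = set() if seen is None else seen
    for child in CONSUMES.get(part, []):
        if child == target:
            return True
        if child not in seen:
            seen.add(child)
            if consumes_transitively(child, target, seen):
                return True
    return False

def top_assemblies_using(ref):
    """Top assembly parts (directly linked to a system) that consume `ref`."""
    tops = {tp for parts in SYSTEM_TOP_PARTS.values() for tp in parts}
    return sorted(tp for tp in tops if consumes_transitively(tp, ref))

print(top_assemblies_using("Part I"))
print(top_assemblies_using("Part E"))
```

Unlike the Cypher query, which returns one row per matching path, this sketch returns each top assembly part once. The point of the comparison is how much bookkeeping the single MATCH pattern replaces.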

Access & Security

Initially, we had not planned to get so much value from the graph database for access and security management. But we did, as all our access management is defined by the graph itself. By looking at the graph database, you can easily audit who does or does not have access to certain data. For this, we apply the following access mechanism, which is validated on every request.

[Figure: Graph Sample]

Today, that's how we test every access in Ganister. Thanks to native graph storage and processing, adding this check to every business object access does not affect performance. It also makes access management more dynamic: if someone removes an access or modifies a team, the next query picks up the change immediately.

Key Value: Access management is efficient and auditable.
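The idea of deriving access from the graph on every request can be sketched as follows. The model here is an assumption, not Ganister's actual mechanism: users are linked to teams, teams are linked to business objects, and access exists when a user-to-team-to-object path exists at the moment of the request. All names (`member_of`, `can_access`, `has_access`, `who_can_access`) are hypothetical.

```python
# Hypothetical access graph: relationship names and data are illustrative only.
member_of = {"alice": {"design-team"}, "bob": {"purchasing"}}
can_access = {"design-team": {"Part A", "Part B"}, "purchasing": {"Part B"}}

def has_access(user, business_object):
    """True if a path user -> team -> business_object exists right now."""
    return any(
        business_object in can_access.get(team, set())
        for team in member_of.get(user, set())
    )

def who_can_access(business_object):
    """Audit helper: every user that currently has a path to the object."""
    return sorted(u for u in member_of if has_access(u, business_object))

print(who_can_access("Part B"))
print(has_access("alice", "Part A"))

# Someone modifies a team: the very next check sees the change,
# because nothing is cached outside the graph itself.
member_of["alice"].discard("design-team")
print(has_access("alice", "Part A"))
```

Because the check and the audit both read the same relationships, there is no separate permission table to keep in sync.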

Conclusion

As a quick conclusion before we leave you with an introduction video from Neo4j:
Every week we run Ganister presentations, and at some point we need to list the key differentiators. One of them is performance and scalability. That's when we need to explain what technology we use. Graphs are growing, and sooner or later all PLM solutions will include such database technologies. The more natively the graph database is used, the greater the benefits for your PLM use cases. Check out this archive from 2016: Towards an enhancement of relationships browsing in mature PLM systems

What is Neo4j?