Sunday, 11 November 2012

Getting Started with Neo4j - a Beginner's Tutorial

I've worked with databases for a long time. Recently, I came across Neo4j and cannot believe how awesome it is. I want to devote this blog post to helping people get it installed and showing the fun you can have with it.

First, a little bit about graph databases. Graph databases are substantially different from relational (RDBMS) systems. As one person puts it, if you can write, you can write code; if you can draw, you can draw graphs. It is really that simple. A graph database starts with a root node. The database is composed of nodes, relationships, indexes and properties. A simplification of this is the following chart:


A simple graph database.
A graph database keeps track of nodes and relationships as well as indexes (we'll get into those later). For now, think of this in Kinderspiel (child's play) terms. A graph is simply a drawing that shows how things are connected. Each connection might have a name. Each node might also have a name. Here is a simplistic view of a graph.
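The idea of named nodes joined by a named connection can be sketched in a few lines of plain Python. This is not Neo4j's API, only a minimal illustration; the class names `Node` and `Relationship` and the helper `describe` are assumptions made for this sketch, and the names Duane, LOVES and Neo4j are borrowed from the example discussed next.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A named thing in the graph, with optional properties."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed connection from one node to another."""
    start: Node
    type: str
    end: Node


# Two named nodes joined by one named, directed connection.
duane = Node("Duane")
neo4j = Node("Neo4j")
loves = Relationship(duane, "LOVES", neo4j)


def describe(rel: Relationship) -> str:
    """Render a relationship in arrow notation."""
    return f"{rel.start.name} -[{rel.type}]-> {rel.end.name}"


print(describe(loves))
```

Because the relationship records a start and an end, it has a direction: nothing in this structure says anything about a connection running the other way.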

In this simple graph, Duane -[LOVES]-> Neo4j, and the latter has a property of being binary. Psychology aside (this would in fact be an unhealthy physical relationship), this captures several important concepts yet leaves several very relevant ontological questions unanswered. Relationships organize nodes into structures that allow a graph to resemble many natural structures including a list, a tree, a map, or a compound entity, any of which can be combined into yet more complex, richly interconnected structures. It is obvious that Duane loves Neo4j, but here are some questions that are left unanswered.
1. Does Neo4j love Duane back?
2. Is Neo4j even aware that Duane loves it?
3. Is Duane able to see that Neo4j is in fact a binary node and probably not suited for a proper relationship?
All are possible but undefined in this scenario. That is why the next concept that must be introduced is a traversal mechanism. Traversals allow navigation of graphs via statements that select exact routes between many of these objects. These can be written in several languages, such as Cypher, and allow a filter to be applied to find a path through the nodes and relationships that answers certain questions. Such a question in the real world may be "How many friends do I have who enjoy eating spumoni ice cream while reading up on graph databases on Technoracle?" In reality that subset of the population is likely very small, but when applied to something like Facebook or Google Plus, such questions become highly relevant.
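To make the ice cream question concrete, here is a rough sketch in plain Python rather than Cypher or Neo4j's own API. The people, the relationship types (FRIEND, ENJOYS, READS) and the helper names are all invented for illustration. It starts at one node, follows the FRIEND relationships, and keeps only the friends that pass the filter.

```python
# Hypothetical data: (start, relationship type, end) triples.
relationships = [
    ("me", "FRIEND", "alice"),
    ("me", "FRIEND", "bob"),
    ("me", "FRIEND", "carol"),
    ("alice", "ENJOYS", "spumoni"),
    ("alice", "READS", "Technoracle"),
    ("bob", "ENJOYS", "spumoni"),
    ("carol", "READS", "Technoracle"),
]


def neighbours(start, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return {end for s, t, end in relationships if s == start and t == rel_type}


def matching_friends(person):
    """Friends who both enjoy spumoni and read Technoracle."""
    return sorted(
        friend
        for friend in neighbours(person, "FRIEND")
        if "spumoni" in neighbours(friend, "ENJOYS")
        and "Technoracle" in neighbours(friend, "READS")
    )


print(matching_friends("me"))
```

With this toy data only alice satisfies both conditions, which matches the intuition that the subset is small.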
Here is a depiction of how traversals work. Again this is rudimentary.


Again, this is very simple, but you get the idea. The traversal mechanism can take a set of instructions, then use them to find the data it requires very efficiently. As an example, suppose you use Facebook. When you log in, it starts with the node of "you". As the page loads, the JavaScript on the page creates a backend query that says: find all the nodes that are related to the user, down to a layer X deep. Neo4j's Java API supports depth limits, making it ideal for this sort of operation. Unlike an RDBMS, where an entire table might have to be walked, Neo4j allows you to set limits and take actions based on the current state. Paths are predefined statements, often written in Cypher.
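A depth-limited traversal of the "start at you, go X layers deep" kind might look roughly like the following. This is a plain Python breadth-first walk, not Neo4j's Java API; the graph contents and the function name `traverse` are assumptions made for the sketch.

```python
from collections import deque

# Hypothetical adjacency list: node -> directly related nodes.
graph = {
    "you": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["carol", "dave"],
    "carol": ["erin"],
    "dave": [],
    "erin": [],
}


def traverse(start, max_depth):
    """Breadth-first walk that stops expanding at max_depth hops."""
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_depth:
            continue  # depth limit reached: do not look further out
        for neighbour in graph.get(node, []):
            if neighbour not in depth_of:
                depth_of[neighbour] = depth_of[node] + 1
                queue.append(neighbour)
    return depth_of


print(traverse("you", 2))
```

With a limit of 2, erin (three hops away) is never visited. The work done depends on the neighbourhood of the start node, not on the total size of the graph.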
Keeping with the Facebook example, an index is often a useful tool. When a certain node is required as a start point over and over again, you can index it and start from there. By contrast, RDBMS systems use a table and row lookup to find the start point. The index is simply a context-based starting point. Indexes can map directly to a node or a relationship, or backwards from a property. Instead of saying:
SELECT * FROM TABLES WHERE * EQUALS "Duane Nickull"....  you can tell a graphDB to "get Duane Nickull" then traverse outwards from him.  Simple and efficient.
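The difference between scanning for a start point and looking it up in an index can be sketched as follows. This is again plain Python with invented data, where a dictionary stands in for the index; the WRITES relationship, the node ids and the function names are assumptions for illustration, not part of Neo4j.

```python
# Hypothetical data: node id -> properties, plus outgoing relationships.
nodes = {
    1: {"name": "Duane Nickull"},
    2: {"name": "Neo4j"},
    3: {"name": "Technoracle"},
}
outgoing = {1: [("LOVES", 2), ("WRITES", 3)], 2: [], 3: []}

# The index maps a property value straight to a node id, built once.
name_index = {props["name"]: node_id for node_id, props in nodes.items()}


def find_by_scan(name):
    """Scan-style lookup: examine every record until one matches."""
    for node_id, props in nodes.items():
        if props["name"] == name:
            return node_id
    return None


def find_by_index(name):
    """Index lookup: jump directly to the start node."""
    return name_index.get(name)


# "Get Duane Nickull", then traverse outwards from him.
start = find_by_index("Duane Nickull")
for rel_type, end_id in outgoing[start]:
    print(f"{nodes[start]['name']} -[{rel_type}]-> {nodes[end_id]['name']}")
```

Both lookups return the same node; the index simply avoids examining every record on the way there.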
Neo4j is a commercially supported, free and open source graph database that is going to rock the world. Trust me on this. The next post will cover getting started: all the sordid details (at least 3 easy steps) it takes to get up and running.



