1.0.0.M3
This document is the reference guide for Spring Data Graph. It explains the concepts, usage, and infrastructure of the framework, as well as the semantics of the underlying graph database.
For an introduction to graph databases, Spring, or Spring Data examples, please refer to Chapter 3, Getting Started. This documentation refers only to Spring Data Graph and assumes that the reader is familiar with Spring concepts.
NOSQL stores provide storage solutions that are more tailored to the specific data storage requirements of each project than just using a relational database as a "one-size-fits-all" solution.
Graph databases provide excellent support for graph-like data (networks), that is, data that can easily be structured as connected nodes. Property graph databases like Neo4j support an arbitrary number of named properties on both nodes and relationships. Neo4j is highly performant when traversing large, complex datasets with millions of nodes and relationships, even on commodity hardware.
Neo4j is an open source graph database written in Java. It has excellent performance characteristics while providing ACID semantics and transactional support (both JTA and XA transactions). Neo4j can run as a lightweight embedded database as well as a standalone server that exposes the API via a rich REST interface.
The Spring Data Graph framework makes it easy to integrate graph databases into existing or new Spring applications. It provides infrastructure that reduces the amount of boilerplate data access code and uses common patterns and idioms that are well known in the Spring Framework community, such as declarative transaction management. Those practices are based on a simple POJO programming model that leverages annotations to add metadata. It can be integrated in any part of a Spring application, like the web or service layers.
A special use case of Spring Data Graph is the cross-store functionality that can extend existing JPA data models with new, graph database backed parts (properties, entities, relationships). These parts are stored exclusively in the graph database while being transparently integrated with the JPA entities. This enables easy and seamless addition of new features that have not been available to JPA-based applications previously.
The Spring Data Graph 1.x binaries require JDK level 6.0 or higher, and Spring Framework 3.0.x or higher.
For the graph database binary, Neo4j version 1.2 or higher is required. Neo4j has a dependency on Apache Lucene for indexing. Users are encouraged to use the latest version of Neo4j available.
For building the project, Apache Maven (version 2.10 and above) is strongly recommended.
NOSQL databases have come into focus only recently, even though some of them have existed for a few years by now. That's why this document will not only guide you through the relevant parts of the Spring Data Graph API, but also explain some key concepts of graph databases.
After reading this document, you should be able to integrate Spring Data Graph into your existing or future applications. If there is anything that you don't understand or that you think is explained in an overly complicated way, please report the problem or send suggestions. Your input will also benefit future readers of this documentation. For details on how to get help and provide feedback, see Section 3.2, “Need Help?”.
As explained in Chapter 1, Why Spring Data Graph?, Spring Data Graph provides integration between the Spring framework and graph databases. Familiarity with the Spring framework is assumed as stated in Part I, “Introduction”, and only minimally cross-referenced here. Graph databases and Neo4j in particular are explained in a bit more detail. The main focus of this document, however, is on explaining the steps needed to get a Spring Data Graph-backed application up and running.
Spring Data Graph makes heavy use of Spring Framework's core functionality, such as the IoC container, converter API and the AOP infrastructure. While it is less important to know the Spring APIs, understanding the concepts behind them is essential. The Spring Framework documentation home page is a good starting point for developers who want to become more familiar with Spring Framework.
The recent interest in NOSQL databases is mainly driven by the need to find the best suited storage solution for data structured in a specific way. It should fit the data, not the other way round. Another issue is the scalability of the database, especially with today's fast growing user bases. There are many NOSQL databases, and one should become familiar with the different concepts, advantages, and disadvantages before choosing a solution. A problem with the NOSQL databases is the different data access APIs that are provided. Spring Data aims at easing this burden by providing consistent abstractions over those APIs, leveraging SpringSource's experience and good reputation in this area.
Graph databases are a particularly good fit for large networks of connected information (objects). They map objects to nodes and connections to relationships. Examples of such datasets are social networks, geospatial information, network layouts, and hardware or dependency graphs. Neo4j is the first graph database that is tightly integrated with the Spring Data Graph project.
Spring Data Graph comes with a number of samples and unit test cases (if you accessed the sources via github or Maven).
For more information on the samples, see Chapter 8, Samples.
If you encounter issues or are just looking for advice, feel free to use one of the links below:
The Spring Data homepage provides all the necessary links for information, community forums and code repositories.
Professional, from-the-source support, with guaranteed response time, is available from SpringSource, the company behind Spring Data and Spring.
For information on the Spring Data source code repository, nightly builds and snapshot artifacts please see the Spring Data home page.
You can help make Spring Data best serve the needs of the Spring community by interacting with developers through the community forums.
If you encounter a bug or want to suggest an improvement, please create a ticket on the Spring Data Graph issue tracker.
To stay up to date with the latest news and announcements in the Spring ecosystem, subscribe to the Spring Community Portal.
Lastly, you can follow the SpringSource Data blog or the project team on Twitter (@SpringData).
This part of the reference documentation details the API, concepts, annotations, datastore, programming model and the cross-store approach of Spring Data Graph.
The Spring Data Graph project applies core Spring concepts to the development of solutions using a graph style data store. The basic approach is to mark simple POJO entities with Spring Data Graph annotations. That enables the AspectJ aspects that are contained with the framework to adapt the instantiation and field access to have them stored and retrieved from the graph store. Entities are mapped to nodes of the graph, references to other entities are represented by relationships. There are also special relationship entities that provide access to the properties of graph relationships.
For the developer of a Spring Data Graph backed application only the public annotations are relevant. Basic knowledge of graph stores is needed to access advanced functionality like traversals. Traversal results can also be mapped to fields of entities.
Neo4j is a graph database, a fully transactional database that stores data structured as graphs. A graph is a flexible data structure that allows for a more agile and rapid style of development.
Neo4j has been in commercial development for 10 years and in production for over 7 years. It is a mature and robust graph database that provides:
In addition, Neo4j includes the usual database features: ACID transactions, durable persistence, concurrency control, transaction recovery, high availability and everything else you’d expect from an enterprise-strength database. Neo4j is released under a dual free software/commercial license model.
A graph database is a storage engine that is specialized in storing and retrieving vast networks of data. It efficiently stores nodes and relationships and allows high performance traversal of those structures. With property graphs it is possible to add an arbitrary number of properties to nodes and relationships which can be used directly or during traversals.
The interface org.neo4j.graphdb.GraphDatabaseService provides access to the storage engine. Its features include creating and retrieving Nodes and Relationships, managing indexes via an IndexManager, database lifecycle callbacks, transaction management and more.
The EmbeddedGraphDatabase is an implementation of GraphDatabaseService that is used to embed Neo4j in a Java application. This implementation is used because it provides the tightest integration. There are other, remote implementations that provide access to Neo4j stores via REST.
Using the API of GraphDatabaseService it is easy to create nodes and relate them to each other. Relationships are named. Both nodes and relationships can have properties. Property values can be primitive Java types and Strings, byte arrays for binary data, or arrays of other Java primitives or Strings. Node creation and modification has to happen within a transaction, while reading from the graph store can be done with or without a transaction.
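import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.EmbeddedGraphDatabase;

// "helloworld" is the directory of the embedded store
GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "helloworld" );
Transaction tx = graphDb.beginTx();
try {
    Node firstNode = graphDb.createNode();
    Node secondNode = graphDb.createNode();
    firstNode.setProperty( "message", "Hello, " );
    secondNode.setProperty( "message", "world!" );
    // relationships are named by their relationship type
    Relationship relationship = firstNode.createRelationshipTo( secondNode,
        DynamicRelationshipType.withName( "KNOWS" ) );
    relationship.setProperty( "message", "brave Neo4j " );
    tx.success(); // mark the transaction as successful
} finally {
    tx.finish(); // commits if marked successful, otherwise rolls back
}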
Getting a single node or relationship and examining it is not the main use case of a graph database. Fast graph traversal and application of graph algorithms are. Neo4j provides means via a concise DSL to define TraversalDescriptions that can then be applied to a start node and will produce a stream of nodes and/or relationships as a lazy result using an Iterable.
// KNOWS and LIKES are RelationshipType constants defined by the application
TraversalDescription traversalDescription = Traversal.description()
    .depthFirst()
    .relationships( KNOWS )
    .relationships( LIKES, Direction.INCOMING )
    .prune( Traversal.pruneAfterDepth( 5 ) );
for ( Path position : traversalDescription.traverse( myStartNode )) {
    System.out.println( "Path from start node to current position is " + position );
}
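To make the idea of a pruned, depth-first traversal more concrete, here is a minimal sketch that uses only the Java standard library. It is not the Neo4j API: the class DepthLimitedTraversal, its adjacency map, and the method names are assumptions made for illustration. Unlike a Neo4j traversal it collects the paths eagerly instead of lazily, and it skips nodes already on the current path as a simple stand-in for uniqueness handling.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model, not the Neo4j API: an adjacency map stands in for the graph.
final class DepthLimitedTraversal {

    private final Map<String, List<String>> outgoing = new HashMap<>();

    public void relate(String from, String to) {
        List<String> targets = outgoing.get(from);
        if (targets == null) {
            targets = new ArrayList<>();
            outgoing.put(from, targets);
        }
        targets.add(to);
    }

    // Returns every path found depth first from start, including the start-only path.
    // Paths are not expanded beyond maxDepth relationships (the "prune" step).
    public List<List<String>> traverse(String start, int maxDepth) {
        List<List<String>> paths = new ArrayList<>();
        List<String> current = new ArrayList<>();
        current.add(start);
        visit(current, maxDepth, paths);
        return paths;
    }

    private void visit(List<String> current, int maxDepth, List<List<String>> paths) {
        paths.add(new ArrayList<>(current));
        if (current.size() - 1 >= maxDepth) {
            return; // pruned: do not follow relationships past this depth
        }
        List<String> targets = outgoing.get(current.get(current.size() - 1));
        if (targets == null) {
            return;
        }
        for (String next : targets) {
            if (current.contains(next)) {
                continue; // avoid cycles on the current path
            }
            current.add(next);
            visit(current, maxDepth, paths);
            current.remove(current.size() - 1); // backtrack
        }
    }

    public static void main(String[] args) {
        DepthLimitedTraversal graph = new DepthLimitedTraversal();
        graph.relate("a", "b");
        graph.relate("b", "c");
        graph.relate("c", "d");
        for (List<String> path : graph.traverse("a", 2)) {
            System.out.println("Path from start node to current position is " + path);
        }
    }
}
```

With a depth limit of 2, node "d" is never reached, which mirrors what pruning after a given depth does in the traversal description above.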
The best way for retrieving start nodes for traversals is using Neo4j's index facilities. The GraphDatabaseService provides access to the IndexManager which in turn retrieves named indexes for nodes and relationships. Both can be indexed with property names and values. Retrieval is done by query methods on Index to return an IndexHits iterator.
IndexManager indexManager = graphDb.index();
Index<Node> nodeIndex = indexManager.forNodes("a-node-index");
// node is an existing Node whose property "property" has the value "value"
nodeIndex.add(node, "property", "value");
for (Node foundNode : nodeIndex.get("property", "value")) {
    assert foundNode.getProperty("property").equals("value");
}
Note: Spring Data Graph provides auto-indexing via the @Indexed annotation, while this still is a manual process when using the Neo4j API.
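The difference between manual and automatic indexing can be illustrated with a small standard-library sketch. ExactIndexSketch and its methods are hypothetical and unrelated to the Neo4j Index API; the point is only that with manual indexing the application itself has to remove the stale entry and add the new one whenever an indexed property changes, which is the bookkeeping that @Indexed is meant to take over.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a named exact-match index; not the Neo4j Index API.
final class ExactIndexSketch {

    // "key=value" -> ids of the nodes indexed under that pair
    private final Map<String, Set<Long>> entries = new HashMap<>();

    public void add(long nodeId, String key, Object value) {
        String entry = key + "=" + value;
        Set<Long> ids = entries.get(entry);
        if (ids == null) {
            ids = new LinkedHashSet<>();
            entries.put(entry, ids);
        }
        ids.add(nodeId);
    }

    public void remove(long nodeId, String key, Object value) {
        Set<Long> ids = entries.get(key + "=" + value);
        if (ids != null) {
            ids.remove(nodeId);
        }
    }

    public Set<Long> get(String key, Object value) {
        Set<Long> ids = entries.get(key + "=" + value);
        return ids == null ? Collections.<Long>emptySet() : Collections.unmodifiableSet(ids);
    }

    // Manual index maintenance on a property change: drop the old entry, add the new one.
    public void update(long nodeId, String key, Object oldValue, Object newValue) {
        remove(nodeId, key, oldValue);
        add(nodeId, key, newValue);
    }

    public static void main(String[] args) {
        ExactIndexSketch index = new ExactIndexSketch();
        index.add(1L, "name", "mark");
        index.update(1L, "name", "mark", "marc");
        System.out.println("mark -> " + index.get("name", "mark"));
        System.out.println("marc -> " + index.get("name", "marc"));
    }
}
```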
This chapter covers the fundamentals of the programming model behind Spring Data Graph. It discusses the AspectJ features used and the annotations provided by Spring Data Graph and how to use them. Examples for this section are taken from the imdb project of Spring Data Graph examples.
Behind the scenes Spring Data Graph leverages AspectJ aspects to modify the behavior of simple POJO entities to be able to be backed by a graph store. Each entity is backed by a node that holds its properties and relationships to other entities. AspectJ is used to intercept field access and to reroute it to the backing state (either its properties or relationships). For relationship entities the fields are similarly mapped to properties. There are two specially annotated fields for the start and the end node of the relationship.
The aspect introduces some internal fields and some public methods to the entities for accessing the backing state via getUnderlyingState() and creating relationships with relateTo and retrieving relationship entities via getRelationshipTo. It also introduces finder methods like find(Class<? extends NodeEntity>, TraversalDescription) and equals and hashCode delegation.
Spring Data Graph internally uses an abstraction called EntityStateAccessors that the field access and instantiation advices of the aspect delegate to, keeping the aspect code very small and focused on the pointcuts and delegation code. The EntityStateAccessors then use a number of FieldAccessor factories to create a FieldAccessor instance per field that does the specific handling needed for the concrete field.
Entities are declared using the @NodeEntity annotation. Relationship entities use the @RelationshipEntity annotation.
The @NodeEntity annotation is used to declare a POJO entity to be backed by a node in the graph store. Simple fields on the entity
are mapped by default to properties of the node. Object references to other NodeEntities (whether single
or Collection) are mapped via relationships. If the annotation parameter useShortNames
is set to false, the properties and relationship names used will be prepended with the class name of the
entity. If the parameter fullIndex is set to true, all fields of the entity will be indexed. If the
partial parameter is set to true, this entity takes part in a cross-store setting where only
the parts of the entity not handled by JPA will be mapped to the graph store.
Entity fields can be annotated with @GraphProperty, @RelatedTo, @RelatedToVia, @Indexed and @GraphId.
@NodeEntity
public class Movie {
    String title;
}
To access the rich data model of graph relationships, POJOs can also be annotated with
@RelationshipEntity. Relationship entities can't be instantiated directly but are rather accessed via
node entities, either by @RelatedToVia fields or by the relateTo
or getRelationshipTo methods.
Relationship entities may contain fields that are mapped to properties and two special fields that are
annotated with @StartNode and @EndNode which point to the start and end node entities respectively. These fields are treated as read only fields.
@RelationshipEntity
public class Role {
    @StartNode
    private Actor actor;
    @EndNode
    private Movie movie;
}
It is not necessary to annotate fields with @GraphProperty, as they are persisted by default; all fields that contain primitive values are persisted directly to the graph. All fields convertible to String using the Spring conversion services will be stored as a string. Transient fields are not persisted. The @GraphProperty annotation is mainly used for cross-store persistence.
Relationships to other NodeEntities are mapped to graph relationships. Those can either be single
relationships (1:1) or multiple relationships (1:n). In most cases single relationships to other
node entities don't have to be annotated as Spring Data Graph can extract all necessary information from the field
using reflection. In the case of
multiple relationships, the elementClass parameter of @RelatedTo must be specified because of type erasure.
The direction (default OUTGOING) and type
(inferred from field name) parameters of the annotation are optional.
Relationships to single node entities are created when setting the field and deleted when setting it to null. For multi-relationships the field provides a managed collection (Set) that handles addition and removal of node entities and reflects those in the graph relationships.
@NodeEntity
public class Movie {
    private Actor topActor;
}
@NodeEntity
public class Person {
    @RelatedTo(type = "topActor", direction = Direction.INCOMING)
    private Movie wasTopActorIn;
}
@NodeEntity
public class Actor {
    @RelatedTo(type = "ACTS_IN", elementClass = Movie.class)
    private Set<Movie> movies;
}
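The idea of a managed collection, a Set whose additions and removals are reflected in graph relationships, can be sketched with the standard library alone. The class below is purely illustrative: ManagedRelationshipSetSketch, its string-based "store", and the relationship notation are assumptions, and the framework's actual managed set is internal to Spring Data Graph.

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not the framework's managed collection.
final class ManagedRelationshipSetSketch extends AbstractSet<String> {

    private final String owner;
    private final String type;
    private final Set<String> store; // stands in for the relationships in the graph store
    private final Set<String> targets = new LinkedHashSet<>();

    public ManagedRelationshipSetSketch(String owner, String type, Set<String> store) {
        this.owner = owner;
        this.type = type;
        this.store = store;
    }

    private String relationship(Object target) {
        return owner + "-[" + type + "]->" + target;
    }

    @Override
    public boolean add(String target) {
        boolean added = targets.add(target);
        if (added) {
            store.add(relationship(target)); // adding an element creates a relationship
        }
        return added;
    }

    @Override
    public boolean remove(Object target) {
        boolean removed = targets.remove(target);
        if (removed) {
            store.remove(relationship(target)); // removing an element deletes it again
        }
        return removed;
    }

    @Override
    public Iterator<String> iterator() {
        // Read-only iteration keeps the sketch short; removal goes through remove().
        return Collections.unmodifiableSet(targets).iterator();
    }

    @Override
    public int size() {
        return targets.size();
    }

    public static void main(String[] args) {
        Set<String> store = new LinkedHashSet<>();
        Set<String> movies = new ManagedRelationshipSetSketch("Keanu", "ACTS_IN", store);
        movies.add("The Matrix");
        System.out.println(store);
        movies.remove("The Matrix");
        System.out.println(store);
    }
}
```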
To provide easy programmatic access to the richer relationship entities of the data model a different annotation @RelatedToVia can be declared on fields of Iterables of the relationship entity type. These Iterables then provide read only access to instances of the entity that backs the relationship of this relationship type. Those instances are initialized with the properties of the relationship and the start and end node.
@NodeEntity
public class Actor {
    @RelatedToVia(type = "ACTS_IN", elementClass = Role.class)
    private Iterable<Role> roles;
}
The @Indexed annotation can be declared on fields that are intended to be indexed by the Neo4j IndexManager, triggered by value modification. The resulting index can be used to later retrieve nodes or relationships that contain a certain property value (for example a name). Often an index is used to establish the start node for a traversal. Indexes are accessed by a Finder for a particular NodeEntity or RelationshipEntity, created via a FinderFactory.
GraphDatabaseContext exposes the indexes for Nodes and Relationships. Indexes can be named, for instance to keep separate domain concepts in separate indexes. That's why it is possible to specify an index name with the @Indexed annotation. It can also be specified at the entity level; this name is then the default index name for all fields of the entity. If no index name is specified, it defaults to the one configured with Neo4j ("node" and "relationship").
The @GraphTraversal annotation leverages the delegation infrastructure used by the Spring Data Graph aspects.
It provides dynamic fields
which, when accessed, return an Iterable of NodeEntities that are the result of a traversal starting at the
current NodeEntity.
The TraversalDescription used for this is created by a TraversalDescriptionBuilder whose class is
referred to by the
traversalBuilder
attribute of the annotation. The class of the expected NodeEntities is provided with the
elementClass
attribute.
Spring Data Graph also comes with a type bound Repository-like Finder implementation that provides methods for locating nodes and relationships:
using direct access (findById),
iterating over all nodes of a node entity type (findAll),
counting the instances of a node entity type (count),
iterating over all indexed instances with a certain property value (findAllByPropertyValue),
getting a single instance with a certain property value (findByPropertyValue),
iterating over all indexed instances within a certain numerical range (inclusive) (findAllByRange),
iterating over a traversal result (findAllByTraversal).
The Finder instances are created via a FinderFactory to be bound to a concrete node or relationship entity class. The FinderFactory is created in the Spring context and can be injected.
NodeFinder<Person> finder = finderFactory.createNodeEntityFinder(Person.class);
Person dave = finder.findById(123);
int people = finder.count();
Person mark = finder.findByPropertyValue("name", "mark");
Iterable<Person> devs = finder.findAllByPropertyValue("occupation", "developer");
// KNOWS is a RelationshipType constant defined by the application
Iterable<Person> davesFriends = finder.findAllByTraversal(dave,
    Traversal.description().prune(Traversal.pruneAfterDepth(1))
        .relationships(KNOWS).filter(returnAllButStartNode()));
There are several ways to represent the Java type hierarchy of the data model in the graph. In general for all node and relationship
entities type information is needed to perform certain repository operations. That's why the hierarchy up to java.lang.Object of all
these classes will be persisted in the graph. Implementations of NodeTypeStrategy take care of persisting this information on entity instance
creation. They also provide the repository methods that use this type information to perform their operations like findAll, count etc.
The current implementation uses nodes to represent the Java type hierarchy which are connected via SUBCLASS_OF relationships to their superclass nodes and via INSTANCE_OF relationships to the concrete node entity instance node.
An alternative approach could use indexing operations to perform the same functionality. Or one could skip the NodeTypeStrategy altogether if no strict checks on type conformity are needed, which would allow for a much more flexible data model.
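The node-based type representation described above can be pictured with a small sketch that uses plain maps in place of graph nodes and relationships. Everything here is hypothetical (the class TypeHierarchySketch, its maps, and the sample Person and Developer classes); it only shows how SUBCLASS_OF and INSTANCE_OF links are enough to answer findAll and count for a type and its subtypes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: maps stand in for type nodes and their relationships.
final class TypeHierarchySketch {

    public static class Person { }
    public static class Developer extends Person { }

    // SUBCLASS_OF: type name -> name of its direct superclass
    private final Map<String, String> subclassOf = new LinkedHashMap<>();
    // INSTANCE_OF: entity node id -> name of its concrete type
    private final Map<Long, String> instanceOf = new LinkedHashMap<>();

    // Called on entity creation: records the instance and the hierarchy up to Object.
    public void register(long nodeId, Class<?> type) {
        instanceOf.put(nodeId, type.getName());
        for (Class<?> c = type; c.getSuperclass() != null; c = c.getSuperclass()) {
            subclassOf.put(c.getName(), c.getSuperclass().getName());
        }
    }

    private boolean isSameOrSubtype(String typeName, String ancestor) {
        for (String t = typeName; t != null; t = subclassOf.get(t)) {
            if (t.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // All instances whose concrete type is the given type or one of its subclasses.
    public List<Long> findAll(Class<?> type) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, String> entry : instanceOf.entrySet()) {
            if (isSameOrSubtype(entry.getValue(), type.getName())) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public int count(Class<?> type) {
        return findAll(type).size();
    }

    public static void main(String[] args) {
        TypeHierarchySketch types = new TypeHierarchySketch();
        types.register(1L, Person.class);
        types.register(2L, Developer.class);
        System.out.println("persons: " + types.findAll(Person.class));
        System.out.println("developers: " + types.count(Developer.class));
    }
}
```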
The node and relationship aspects introduce (via ITD, inter-type declaration) several methods to the entities that make common tasks easier. Unfortunately these methods are not generified yet, so the results have to be cast to the correct return type.
nodeEntity.getNodeId() and relationshipEntity.getRelationshipId()
entity.getUnderlyingState()
entity.equals() and entity.hashCode()
nodeEntity.relateTo(targetEntity, relationshipClass, relationshipType)
nodeEntity.getRelationshipTo(targetEntity, relationshipClass, relationshipType)
nodeEntity.removeRelationshipTo(targetEntity, relationshipType)
entity.remove()
entity.projectTo(targetClass)
nodeEntity.findAllByTraversal(targetType, traversalDescription)
As the underlying data model of a graph database doesn't imply and enforce strict type constraints like a relational model does, it offers much more flexibility on how to model your domain classes and which of those to use in different contexts.
For instance an order can be used in these contexts: customer, procurement, logistics, billing, fulfillment and many more. Each of those contexts requires its own distinct set of attributes and operations. As Java doesn't support mixins, one would have to put the sum of all of those into the entity class, making it very big, brittle and hard to understand. Being able to take a basic order and project it to a different order type (one that is not related in the inheritance hierarchy or even by an interface) that is valid in the current context and only offers the attributes and methods needed there would be very beneficial.
Spring Data Graph offers initial support for projecting node and relationship entities to different target types. All instances of this projected entity share the same backing node or relationship, so data changes are reflected immediately.
This could for instance also be used to handle nodes of a traversal with a unified (simpler) type (e.g. for reporting or auditing) and only project them to a concrete, more functional target type when the business logic requires it.
// not related to Person at all
@NodeEntity
class Trainee {
    String name;
    @RelatedTo(elementClass = Training.class)
    Set<Training> trainings;
}
for (Person person : finder.findAllByPropertyValue("occupation", "developer")) {
    Developer developer = person.projectTo(Developer.class);
    if (developer.isJavaDeveloper()) {
        trainInSpringData(developer.projectTo(Trainee.class));
    }
}
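Why changes are visible immediately across projections follows from all projected instances sharing one backing node. The sketch below models that with a shared map; ProjectionSketch, PersonView, and TraineeView are hypothetical names, and a plain Map stands in for the backing node's properties. It is not how the framework implements projectTo.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: two unrelated view types over the same backing state.
final class ProjectionSketch {

    public static final class PersonView {
        private final Map<String, Object> state; // stands in for the backing node

        public PersonView(Map<String, Object> state) {
            this.state = state;
        }

        public String getName() {
            return (String) state.get("name");
        }

        public void setName(String name) {
            state.put("name", name);
        }

        // Projection hands the same backing state to a different type.
        public TraineeView projectToTrainee() {
            return new TraineeView(state);
        }
    }

    public static final class TraineeView {
        private final Map<String, Object> state;

        public TraineeView(Map<String, Object> state) {
            this.state = state;
        }

        public String getName() {
            return (String) state.get("name");
        }
    }

    public static void main(String[] args) {
        PersonView person = new PersonView(new HashMap<String, Object>());
        person.setName("Mark");
        TraineeView trainee = person.projectToTrainee();
        person.setName("Marc"); // change made through one projection ...
        System.out.println(trainee.getName()); // ... is seen through the other
    }
}
```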
The Neo4jTemplate offers the convenient API of Spring templates for the Neo4j graph database.
There are methods for creating nodes and relationships that automatically set provided properties and optionally
index certain fields. Other methods (index, autoIndex) will index them.
For the querying operations Neo4jTemplate unifies the result with the Path abstraction that comes from Neo4j.
Much like a resultset a path contains nodes() and relationships()
starting at a startNode() and
ending with an endNode(). The lastRelationship() is also available separately.
The Path abstraction also wraps results that contain just nodes or relationships.
Using implementations of PathMapper<T> and PathMapper.WithoutResult (comparable with RowMapper and
RowCallbackHandler) the paths can be converted to Java objects.
Query methods either take a field / value combination to look for exact matches in the index or a lucene query object or string to handle more complex queries.
Traversal methods are the bread and butter of graph operations. As such, they are fully supported in the Neo4jTemplate.
The traverseNext method traverses to the direct neighbours of the start node filtering the relationships according
to its parameters.
The traverse method covers the full fledged traversal operation that takes a powerful TraversalDescription
(most probably built from the Traversal.description() DSL) and runs it from the start node. Each path that is returned
via the traversal is passed to the PathMapper to be processed accordingly.
The Neo4jTemplate provides configurable implicit transactions for all its methods. By default it creates a transaction
for each call (which is a no-op if there is already a transaction running). If you call the constructor
with the useExplicitTransactions parameter set to true, it won't create any transactions so you have to
provide them using @Transactional or the TransactionTemplate.
Neo4jOperations neo = new Neo4jTemplate(graphDatabase);
Node michael = neo.createNode(_("name","Michael"),"name");
Node mark = neo.createNode(_("name","Mark"));
Node thomas = neo.createNode(_("name","Thomas"));
neo.createRelationship(mark,thomas, WORKS_WITH, _("project","spring-data"));
neo.index("devs",thomas, "name","Thomas");
neo.autoIndex("devs",mark, "name");
assert "Mark".equals(neo.query("devs","name","Mark",new NodeNamePathMapper()));
The Neo4j graph database can use different index providers for exact lookups and fulltext searches. Lucene is used as an index provider implementation. There is support for distinct indexes for nodes and relationships which can be configured to be of fulltext or exact types.
Using the standard Neo4j API, Nodes and Relationships and their indexed field-value combinations
have to be added manually to the appropriate index. When using Spring Data Graph, this task is simplified by applying an @Indexed annotation on entity fields. This will result in updates to the index on
every change. Numerical fields are indexed numerically so that they are available for range queries. All other
fields are indexed with their string representation. The @Indexed annotation can also set the index-name to be used.
If @Indexed annotates the entity class, the index-name for the whole entity is preset to that value. Not providing
index names defaults them to "node" and "relationship" respectively.
Query access to the index happens with the Node- and RelationshipFinders that are created via an instance of org.springframework.data.graph.neo4j.finder.FinderFactory.
The methods findByPropertyValue and findAllByPropertyValue work on the exact indexes and return the first or all
matches. To do range queries, use findAllByRange (please note that currently both values are inclusive).
@NodeEntity
class Person {
    @Indexed(indexName = "people")
    String name;
    // automatically indexed numerically
    @Indexed
    int age;
}
@NodeEntity
@Indexed(indexName = "groups")
class Group {
    @Indexed
    String name;
    @RelatedTo(elementClass = Person.class, type = "people")
    Set<Person> people;
}
NodeFinder<Person> finder = finderFactory.createNodeEntityFinder(Person.class);
// exact finder
Person mark = finder.findByPropertyValue("people", "name", "mark");
// numeric range queries (both bounds are inclusive)
for (Person middleAgedDeveloper : finder.findAllByRange(null, "age", 20, 40)) {
    Developer developer = middleAgedDeveloper.projectTo(Developer.class);
}
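The inclusive semantics of a numeric range query can be sketched with a sorted map from the standard library. NumericRangeIndexSketch and its methods are assumptions for illustration and have nothing to do with the Lucene-backed implementation; the sketch only shows that both bounds belong to the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: a sorted map stands in for a numerically indexed field.
final class NumericRangeIndexSketch {

    private final NavigableMap<Integer, List<String>> byValue = new TreeMap<>();

    public void add(String entity, int value) {
        List<String> entities = byValue.get(value);
        if (entities == null) {
            entities = new ArrayList<>();
            byValue.put(value, entities);
        }
        entities.add(entity);
    }

    // Returns all entities with from <= value <= to, ordered by value.
    public List<String> inclusiveRange(int from, int to) {
        List<String> result = new ArrayList<>();
        for (List<String> hits : byValue.subMap(from, true, to, true).values()) {
            result.addAll(hits);
        }
        return result;
    }

    public static void main(String[] args) {
        NumericRangeIndexSketch ages = new NumericRangeIndexSketch();
        ages.add("anna", 19);
        ages.add("ben", 20);
        ages.add("carl", 40);
        ages.add("dora", 41);
        System.out.println(ages.inclusiveRange(20, 40));
    }
}
```

In this sketch the entities aged exactly 20 and exactly 40 are part of the result, while 19 and 41 are not.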
Neo4jTemplate also offers index support, providing auto-indexing for fields at creation time of nodes and relationships.
There is an autoIndex method that can also add indexes for a set of fields in one go.
For querying the index, the template offers query-methods that take either the exact match parameters or a query object /
query expression and push the results wrapped uniformly as Paths to the supplied PathMapper to be converted or collected.
Neo4j is a transactional datastore which only allows modifications within transaction boundaries and fulfills the ACID properties. Reading from the store is also possible outside of transactions.
Spring Data Graph integrates with transaction managers configured using Spring. The simplest scenario of
just running the graph database uses a SpringTransactionManager provided by the Neo4j kernel to be used
with Spring's JtaTransactionManager.
Note: The explicit XML configuration given below is encoded in the Neo4jConfiguration configuration bean that uses Spring's @Configuration functionality. This simplifies the configuration. An example is shown further below.
<bean id="transactionManager" class="org.springframework.transaction.jta.JtaTransactionManager">
    <property name="transactionManager">
        <bean class="org.neo4j.kernel.impl.transaction.SpringTransactionManager">
            <constructor-arg ref="graphDatabaseService"/>
        </bean>
    </property>
    <property name="userTransaction">
        <bean class="org.neo4j.kernel.impl.transaction.UserTransactionImpl">
            <constructor-arg ref="graphDatabaseService"/>
        </bean>
    </property>
</bean>
<tx:annotation-driven mode="aspectj" transaction-manager="transactionManager"/>
For scenarios running multiple transactional resources there are two options. First of all, you can have Neo4j participate in an externally configured transaction manager by using the new SpringProvider. It is enabled through a configuration parameter of your graph database, set either in the Spring configuration or in the configuration file (neo4j.properties).
<context:annotation-config />
<context:spring-configured/>
<bean id="transactionManager" class="org.springframework.transaction.jta.JtaTransactionManager">
    <property name="transactionManager">
        <bean id="jotm" class="org.springframework.data.graph.neo4j.transaction.JotmFactoryBean"/>
    </property>
</bean>
<bean class="org.neo4j.kernel.EmbeddedGraphDatabase" destroy-method="shutdown">
    <constructor-arg value="target/test-db"/>
    <constructor-arg>
        <map>
            <entry key="tx_manager_impl" value="spring-jta"/>
        </map>
    </constructor-arg>
</bean>
<tx:annotation-driven mode="aspectj" transaction-manager="transactionManager"/>
You can configure a stock XA transaction manager (e.g. Atomikos, JOTM, or the transaction manager of an application server) to be used with Neo4j and the other resources. For a faster but somewhat less safe best-effort one-phase commit, use the ChainedTransactionManager that comes with Spring Data Graph.
It takes a list of transaction managers as constructor arguments and handles them in the given order on transaction start and in reverse order on commit or rollback.
<bean id="transactionManager" class="org.springframework.data.graph.neo4j.transaction.ChainedTransactionManager">
    <constructor-arg>
        <list>
            <bean class="org.springframework.orm.jpa.JpaTransactionManager" id="jpaTransactionManager">
                <property name="entityManagerFactory" ref="entityManagerFactory"/>
            </bean>
            <bean class="org.springframework.transaction.jta.JtaTransactionManager">
                <property name="transactionManager">
                    <bean class="org.neo4j.kernel.impl.transaction.SpringTransactionManager">
                        <constructor-arg ref="graphDatabaseService" />
                    </bean>
                </property>
                <property name="userTransaction">
                    <bean class="org.neo4j.kernel.impl.transaction.UserTransactionImpl">
                        <constructor-arg ref="graphDatabaseService" />
                    </bean>
                </property>
            </bean>
        </list>
    </constructor-arg>
</bean>
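The ordering rule of a chained, best-effort commit (start in list order, commit or roll back in reverse order) can be sketched as follows. This is not the ChainedTransactionManager implementation: the Resource interface, LoggingResource, and runInChain are hypothetical names, and a shared log records the call order.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: shows the call order of a chained one-phase commit.
final class ChainedCommitSketch {

    public interface Resource {
        void begin();
        void commit();
        void rollback();
    }

    public static final class LoggingResource implements Resource {
        private final String name;
        private final List<String> log;

        public LoggingResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        public void begin() { log.add("begin " + name); }
        public void commit() { log.add("commit " + name); }
        public void rollback() { log.add("rollback " + name); }
    }

    // Starts all resources in order, runs the work, then finishes them in reverse order.
    public static void runInChain(List<? extends Resource> resources, Runnable work) {
        List<Resource> started = new ArrayList<>();
        try {
            for (Resource resource : resources) {
                resource.begin();
                started.add(resource);
            }
            work.run();
        } catch (RuntimeException e) {
            for (int i = started.size() - 1; i >= 0; i--) {
                started.get(i).rollback();
            }
            throw e;
        }
        for (int i = started.size() - 1; i >= 0; i--) {
            started.get(i).commit();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Resource> chain = new ArrayList<>();
        chain.add(new LoggingResource("jpa", log));
        chain.add(new LoggingResource("neo4j", log));
        runInChain(chain, new Runnable() {
            public void run() { /* business logic would go here */ }
        });
        System.out.println(log);
    }
}
```

The sketch also hints at why this approach is only best effort: commits happen one after another, so if a later commit fails, the resources committed before it can no longer be rolled back.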
Spring Data Graph supports property-based validation. Whenever a property is changed, it is checked against the annotated constraints (e.g. @Min, @Max, @Size). Validation errors throw a ValidationException. The constraints are evaluated using the validation support that comes with Spring. To use it, a validator has to be registered with the GraphDatabaseContext (any registered Validator or (Local)ValidatorFactoryBean will be used); if there is none, no validation is performed.
@NodeEntity
class Person {
    @Size(min = 3, max = 20)
    String name;
    @Min(0)
    @Max(100)
    int age;
}
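What "checked whenever a property is changed" means in practice can be shown with a hand-written stand-in that needs no validation library. ValidatedPersonSketch is hypothetical, the checks are written out manually instead of being driven by annotations, and IllegalArgumentException takes the place of the ValidationException mentioned above. The relevant behaviour is that a rejected value never replaces the stored one.

```java
// Hand-written stand-in for what @Size(min = 3, max = 20), @Min(0) and @Max(100) express.
final class ValidatedPersonSketch {

    private String name;
    private int age;

    public void setName(String name) {
        // As with @Size in Bean Validation, null is treated as valid here.
        if (name != null && (name.length() < 3 || name.length() > 20)) {
            throw new IllegalArgumentException("name must be 3 to 20 characters long");
        }
        this.name = name;
    }

    public void setAge(int age) {
        if (age < 0 || age > 100) {
            throw new IllegalArgumentException("age must be between 0 and 100");
        }
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public static void main(String[] args) {
        ValidatedPersonSketch person = new ValidatedPersonSketch();
        person.setName("Mark");
        person.setAge(30);
        try {
            person.setAge(101); // violates the upper bound
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(person.getName() + " is still " + person.getAge());
    }
}
```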
By default node entities that are created in a transaction are immediately attached to the graph database. All subsequent write operations will be instantly reflected on the node and its relationships. This is also the default behaviour for entities created outside of transactions. This is achieved by running implicit transactions for modifying operations. Those transactions participate in existing transactions with no additional cost but create a transactional context otherwise (like the "required" propagation mode).
For certain contexts (e.g. the web layer) entities have to be created in a detached mode (it is possible that those entities
are never persisted at all). This is achieved by setting the autoAttach attribute of the @NodeEntity annotation to false.
All node entities are equipped with an additional attach() method that attaches the entity to the graph
database (if that has not already happened) and also flushes the state changes to the graph. The flush operation checks for
concurrent modifications of the data and fails if a conflict is detected.
Please keep in mind that the session handling behaviour is still under heavy development. The defaults and other aspects of the behaviour are likely to change in subsequent releases. At the moment, relationships cannot be created outside of transactions, and more complex operations such as creating whole subgraphs are not supported either.
@NodeEntity(autoAttach = false)
class Person {
String name;
}
Person p = new Person().attach();
To use Spring Data Graph in your application, some setup is required. For building the application the necessary Maven dependencies must be included and for the AspectJ weaving some extensions of the compile goal are necessary. This chapter also discusses the Spring configuration needed to set up Spring Data Graph. Examples for this setup can be found in the Spring Data Graph examples.
As stated in the requirements chapter, Spring Data Graph projects are easiest to build with Apache Maven. The main dependencies are Spring Data Graph itself, Spring Data Commons, some parts of the Spring Framework and of course the Neo4j graph database.
The milestone releases of Spring Data Graph are available from the dedicated milestone repository. Neo4j releases and milestones are available from Maven Central.
<repository>
<id>spring-maven-milestone</id>
<name>Springframework Maven Repository</name>
<url>http://maven.springframework.org/milestone</url>
</repository>
The dependency on spring-data-neo4j
should transitively pull in Spring Framework (core, context, aop,
aspects, tx), AspectJ, Neo4j and Spring Data Commons. If you already use these (or different versions of
these) in your project, then declare those dependencies explicitly.
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-neo4j</artifactId>
<version>1.0.0.M3</version>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjrt</artifactId>
<version>1.6.11.M2</version>
</dependency>
As Spring Data Graph uses AspectJ for build-time aspect weaving of your entities, it is necessary to add the aspectj-maven-plugin to the build phases. The plugin has its own dependencies. You also need to explicitly specify the libraries containing aspects (spring-aspects and spring-data-neo4j).
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>aspectj-maven-plugin</artifactId>
<version>1.0</version>
<dependencies>
<!-- NB: You must use Maven 2.0.9 or above or these are ignored (see MNG-2972) -->
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjrt</artifactId>
<version>1.6.11.M2</version>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjtools</artifactId>
<version>1.6.11.M2</version>
</dependency>
</dependencies>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>test-compile</goal>
</goals>
</execution>
</executions>
<configuration>
<outxml>true</outxml>
<aspectLibraries>
<aspectLibrary>
<groupId>org.springframework</groupId>
<artifactId>spring-aspects</artifactId>
</aspectLibrary>
<aspectLibrary>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-neo4j</artifactId>
</aspectLibrary>
</aspectLibraries>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
The concrete configuration for Spring Data Graph is quite verbose as there is no autowiring involved. It sets up the following parts.
GraphDatabaseService, IndexManager for the embedded Neo4j storage engine
Spring transaction manager, Neo4j transaction manager
aspects and instantiators for node and relationship entities
EntityStateAccessors and FieldAccessFactories needed for the different field handling
Conversion services
Finder factory
an appropriate NodeTypeStrategy
To simplify the configuration we provide an XML namespace, datagraph, that allows configuration of any
Spring Data Graph project with a single line of XML. There are three possible parameters. You can either set storeDirectory
or pass a reference to an existing graphDatabaseService bean. For cross-store configuration, additionally refer
to an entityManagerFactory.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:datagraph="http://www.springframework.org/schema/data/graph"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-3.0.xsd
http://www.springframework.org/schema/data/graph
http://www.springframework.org/schema/data/graph/datagraph-1.0.xsd
">
<context:annotation-config/>
<datagraph:config storeDirectory="target/config-test"/>
</beans>
<context:annotation-config/>
<bean id="graphDatabaseService" class="org.neo4j.kernel.EmbeddedGraphDatabase"
destroy-method="shutdown">
<constructor-arg index="0" value="target/config-test" />
</bean>
<datagraph:config graphDatabaseService="graphDatabaseService"/>
<context:annotation-config/>
<datagraph:config storeDirectory="target/config-test"
entityManagerFactory="entityManagerFactory"/>
<bean class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean"
id="entityManagerFactory">
<property name="dataSource" ref="dataSource"/>
<property name="persistenceXmlLocation" value="classpath:META-INF/persistence.xml"/>
</bean>
You can also configure Spring Data Graph using Java based bean metadata.
If you are not familiar with configuring the Spring container using Java based bean metadata instead of XML based metadata, see the high-level introduction and the detailed documentation in the Spring Framework reference documentation.
To help configure Spring Data Graph using Java based bean metadata, the class Neo4jConfiguration is registered with the context either explicitly in the XML config or via classpath scanning for classes that have the @Configuration annotation. The only thing that must be provided in addition is the GraphDatabaseService configured with a datastore directory. The example below uses XML to register the Neo4jConfiguration @Configuration class as well as Spring's ConfigurationClassPostProcessor, which transforms the @Configuration class into bean definitions.
<beans>
...
<tx:annotation-driven mode="aspectj" transaction-manager="transactionManager"/>
<bean class="org.springframework.data.graph.neo4j.config.Neo4jConfiguration"/>
<bean class="org.springframework.context.annotation.ConfigurationClassPostProcessor"/>
<bean id="graphDatabaseService" class="org.neo4j.kernel.EmbeddedGraphDatabase"
destroy-method="shutdown" scope="singleton">
<constructor-arg index="0" value="target/config-test"/>
</bean>
...
</beans>
The Spring Data Graph project supports cross-store persistence, which allows parts of the data model to be stored in a traditional JPA datastore (RDBMS) and other parts of the data model (even partial entities, that is, some properties or relationships) in a graph store.
This allows existing JPA-based applications to embrace NOSQL data stores to evolve certain parts of their model. Possible use cases are adding social network or geospatial information to existing applications.
Partial graph persistence is achieved by restricting the Spring Data Graph aspects to explicitly annotated parts of the entity. Those fields have to be made transient so that JPA ignores them and won't try to persist those attributes.
A backing node in the graph store is created when the entity has been assigned a JPA id. Only then will the connection between the two stores be kept. Until the entity has been persisted, its state is just kept inside the POJO and flushed to the backing graph store afterwards.
The connection between the two entities is kept via a FOREIGN_ID field in the node that contains the JPA id (currently only single value ids are supported). The entity class can be resolved via the NodeTypeStrategy that preserves the Java type hierarchy within the graph. With the id and class, you can then retrieve the appropriate JPA entity for a given node.
The other direction is handled by indexing the node with the FOREIGN_ID index, which contains a concatenation of the fully qualified class name of the JPA entity and the id. So when a JPA entity is instantiated via the entity manager (or by some other means, like creating the POJO and setting its id manually), the matching node can be found using the index facilities and the two can be reconnected.
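The index lookup can be sketched in plain Java. This is an illustration only: the ":" separator, the in-memory map standing in for the FOREIGN_ID index, and all class and method names are assumptions. The text above only states that the class name and the id are concatenated.

```java
import java.util.HashMap;
import java.util.Map;

final class ForeignIdIndexSketch {

    // Placeholder for a JPA entity class; only its class name is used here.
    static class UserAccount { }

    // In-memory stand-in for the FOREIGN_ID index: key -> node id.
    private final Map<String, Long> index = new HashMap<>();

    // Builds the index key from the fully qualified class name and the JPA id.
    // Assumption: the two parts are joined with ":"; the real format may differ.
    static String foreignIdKey(Class<?> entityClass, Object jpaId) {
        return entityClass.getName() + ":" + jpaId;
    }

    // Registers the node that backs the given JPA entity.
    void indexNode(Class<?> entityClass, Object jpaId, long nodeId) {
        index.put(foreignIdKey(entityClass, jpaId), nodeId);
    }

    // Returns the backing node id for a JPA entity, or null if none is indexed.
    Long findNodeId(Class<?> entityClass, Object jpaId) {
        return index.get(foreignIdKey(entityClass, jpaId));
    }

    public static void main(String[] args) {
        ForeignIdIndexSketch foreignIds = new ForeignIdIndexSketch();
        foreignIds.indexNode(UserAccount.class, 42L, 7L);
        System.out.println(foreignIds.findNodeId(UserAccount.class, 42L));
        System.out.println(foreignIds.findNodeId(UserAccount.class, 43L));
    }
}
```

Including the class name in the key keeps entities of different types with the same numeric id apart.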
Using those mechanisms and the Spring Data Graph aspects a single POJO can contain fields that are handled by JPA and other fields (which might be relationships as well) that are handled by Spring Data Graph.
When an entity is annotated with @NodeEntity(partial = true), Spring Data Graph assumes that it is a cross-store entity and is only responsible for the fields annotated with Spring Data Graph annotations. JPA should not take care of these fields (they should be annotated with @Transient). In this mode of operation Spring Data Graph also handles the cross-store connection via the content of the JPA id field.
Fields containing primitive or convertible values would not need an annotation when Spring Data Graph is used exclusively. In a cross-store entity, however, they must be annotated explicitly to declare that they are intended to be stored in the graph. These fields should also be made transient so that JPA doesn't try to take care of them as well.
The following example is taken from the Spring Data Graph examples; it is contained in the myrestaurant-social project.
@Entity
@Table(name = "user_account")
@NodeEntity(partial = true)
public class UserAccount {
private String userName;
private String firstName;
private String lastName;
@GraphProperty
@Transient
String nickname;
@RelatedTo(type = "friends", elementClass = UserAccount.class)
@Transient
Set<UserAccount> friends;
@RelatedToVia(type = "recommends", elementClass = Recommendation.class)
@Transient
Iterable<Recommendation> recommendations;
@Temporal(TemporalType.TIMESTAMP)
@DateTimeFormat(style = "S-")
private Date birthDate;
@ManyToMany(cascade = CascadeType.ALL)
private Set<Restaurant> favorites;
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
@Column(name = "id")
private Long id;
@Transactional
public void knows(UserAccount friend) {
relateTo(friend, DynamicRelationshipType.withName("friends"));
}
@Transactional
public Recommendation rate(Restaurant restaurant, int stars, String comment) {
Recommendation recommendation = (Recommendation) relateTo(restaurant,
Recommendation.class, "recommends");
recommendation.rate(stars, comment);
return recommendation;
}
public Iterable<Recommendation> getRecommendations() {
return recommendations;
}
}
Cross-store persistence is configured in much the same way as the default Spring Data Graph operations. As soon as you refer
to an entityManagerFactory in the XML namespace configuration, Spring Data Graph is set up for cross-store persistence.
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:datagraph="http://www.springframework.org/schema/data/graph"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-3.0.xsd
http://www.springframework.org/schema/data/graph
http://www.springframework.org/schema/data/graph/datagraph-1.0.xsd
">
<context:annotation-config/>
<datagraph:config storeDirectory="target/config-test"
entityManagerFactory="entityManagerFactory"/>
<bean class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean"
id="entityManagerFactory">
<property name="dataSource" ref="dataSource"/>
<property name="persistenceXmlLocation" value="classpath:META-INF/persistence.xml"/>
</bean>
</beans>
Spring Data Graph comes with a number of samples. The source code of the samples is found on GitHub. The different sample projects are introduced below.
The Hello Worlds sample application is a simple console application with unit tests that creates some Worlds (entities / nodes) and Rocket Routes (relationships) in a Galaxy (graph) and then reads them back and prints them out.
The unit tests demonstrate some other features of Spring Data Graph. The sample comes with a minimal configuration for Maven and Spring to get up and running quickly.
Executing the application creates the following graph in the Graph Database:

A web application that imports datasets from the Internet Movie Database (IMDB) into the graph database. It allows listings of movies with their actors and actors with their roles in different movies. It also uses graph traversal operations to calculate the Kevin Bacon number (distance to an actor that has acted with Kevin Bacon). This sample application shows the basic usage of Spring Data Graph in a more complex setting with several annotated entities and relationships as well as usage of indices and graph traversal.
See the readme file for instructions on how to compile and run the application.
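The idea behind the Kevin Bacon number can be sketched without a graph database as a breadth-first search over an in-memory adjacency map. This is not the code of the sample application: the class, the method names and the placeholder actor names are invented for illustration, and movies are collapsed into direct "acted with" edges.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BaconNumberSketch {

    // Breadth-first search over an undirected "acted with" adjacency map.
    // Returns the number of hops from start to target, or -1 if unreachable.
    static int distance(Map<String, List<String>> actedWith, String start, String target) {
        Map<String, Integer> hops = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        hops.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                return hops.get(current);
            }
            for (String next : actedWith.getOrDefault(current, Collections.<String>emptyList())) {
                if (!hops.containsKey(next)) {
                    // First visit is the shortest path because BFS expands level by level.
                    hops.put(next, hops.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    // Invented sample data; placeholder names, not IMDB records.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("Kevin Bacon", Arrays.asList("Actor A"));
        graph.put("Actor A", Arrays.asList("Kevin Bacon", "Actor B"));
        graph.put("Actor B", Arrays.asList("Actor A"));
        graph.put("Actor C", Collections.<String>emptyList());
        return graph;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = sampleGraph();
        System.out.println(distance(graph, "Actor B", "Kevin Bacon")); // prints 2
        System.out.println(distance(graph, "Actor C", "Kevin Bacon")); // prints -1
    }
}
```

In a graph database the same question is answered by a traversal or shortest-path operation over the stored nodes and relationships instead of a hand-written search.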
An excerpt of the data stored in the Graph Database after executing the application:
