Cypher is Neo4j’s declarative and GQL-conformant query language.
Available as open source via The openCypher project, Cypher is similar to SQL, but optimized for graphs.
Intuitive and close to natural language, Cypher provides a visual way of matching patterns and relationships through an ASCII-art style of syntax:
(:nodes)-[:ARE_CONNECTED_TO]->(:otherNodes)
Round brackets are used to represent (:Nodes), and -[:ARROWS]-> to represent a relationship between the (:Nodes).
With this query syntax, you can perform create, read, update, or delete (CRUD) operations on your graph.
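As a minimal sketch of the four CRUD operations, the statements below create, read, update, and delete a single node. The `Person` label and `name` property follow the modeling used later on this page; the `city` property and its value are hypothetical and only serve as something to update.

```cypher
// Create: add a node with a label and a property.
CREATE (:Person {name: 'Sally'});

// Read: find the node and return it.
MATCH (p:Person {name: 'Sally'})
RETURN p;

// Update: set a (hypothetical) property on the matched node.
MATCH (p:Person {name: 'Sally'})
SET p.city = 'London';

// Delete: remove the node together with any relationships attached to it.
MATCH (p:Person {name: 'Sally'})
DETACH DELETE p;
```

Each statement is meant to be run on its own; the semicolons separate statements in tools such as Cypher Shell.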
For a quick look with no installation required, get a free Aura instance.
Use the graduation cap icon in the top right section to access the interactive guides.
The "Query fundamentals" guide gives you a hands-on introduction to Cypher.
Neo4j’s graph model is composed of nodes and relationships, which may also have assigned properties.
With nodes and relationships, you can build patterns that range from simple to complex.
Pattern recognition is a fundamental cognitive process, which makes Cypher, with its reliance on pattern matching, intuitive and easy to learn.
If you were to represent the data in this graph in English, it might read as something like:
"Sally likes Graphs. Sally is friends with John. Sally works for Neo4j."
Now, if you were to write this same information in Cypher, then it would look like this:
(:Sally)-[:LIKES]->(:Graphs)
(:Sally)-[:IS_FRIENDS_WITH]->(:John)
(:Sally)-[:WORKS_FOR]->(:Neo4j)
However, in order to have this information in the graph, first you need to represent it as nodes and relationships.
In a property graph model, the main components are nodes and relationships.
Nodes are often used to represent nouns or objects in your data model.
In the previous example, Sally, John, Graphs, and Neo4j are the nodes.
In Cypher, you can depict a node by surrounding it with parentheses, e.g. (node).
The parentheses are a representation of the circles that compose the nodes in the visualization.
Nodes can be grouped together through a label.
Labels work like tags and allow you to specify certain types of entities to look for or to create.
Labels also help Cypher distinguish between entities and optimize execution for your queries.
In the example, both Sally and John can be grouped under a Person label, Graphs can receive a Technology label, and Neo4j can be labeled as Company:
Figure 4. Nodes grouped in labels. Note that Sally, John, Graphs, and Neo4j are now properties instead.
In a relational database context, this would be the same as telling SQL which table to look in for a particular row.
In the same way that you can tell SQL to query a person’s information from a Person table, you can also tell Cypher to only check the Person label for that information.
If you do not specify a label for Cypher to filter out non-matching node categories, the query will check all of the nodes in the database.
This can affect performance in very large graphs.
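The difference can be sketched with two queries. The first has no label, so every node is a candidate; the second restricts matching to `Person` nodes. The variables `n` and `p` are illustrative names.

```cypher
// No label: Cypher has to consider every node in the database.
MATCH (n)
RETURN n;

// With a label: only nodes labeled Person are considered.
MATCH (p:Person)
RETURN p;
```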
Though not mandatory, variables are particularly useful when querying a database, as they allow referencing specified nodes in subsequent clauses without writing their label in full.
Variables can be single letters or words, and should be written in lower-case.
For example, if you want to bind all nodes labeled Person to the variable p, you write (p:Person).
Likewise, if you want to use a full word, then you can write (person:Person).
In a MATCH query to retrieve all nodes labeled Person, this is what it looks like:
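A sketch of the two forms, assuming a graph that contains `Person` nodes. The first query binds the matched nodes to the variable `p` so that `RETURN` can refer to them. The second uses no variable, so there is nothing to return by name and it returns a count instead (the `count(*)` is an illustrative choice).

```cypher
// With a variable: p refers to each matched Person node.
MATCH (p:Person)
RETURN p;

// Without a variable: the label is still preceded by a colon.
MATCH (:Person)
RETURN count(*);
```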
Note that in the example without a variable, the label Person is preceded by a colon (:).
This is how you prevent a type or label from becoming a variable.
In case you forget to add a colon and write the query like this:
MATCH (Person)
RETURN Person
Then Person would be a variable, not a type or label.
One of the benefits of graph databases is that you can store information about how elements (nodes) are related to each other in the form of relationships.
In Cypher, relationships are represented as square brackets and an arrow connecting two nodes (e.g. (Node1)-[]->(Node2)).
In the example, the lines containing :LIKES, :IS_FRIENDS_WITH, and :WORKS_FOR represent the relationships between the nodes.
Relationships always have a direction, which is indicated by an arrow.
They can go from left to right:
(p:Person)-[:LIKES]->(t:Technology)
From right to left:
(p:Person)<-[:LIKES]-(t:Technology)
Or be undirected (where the direction is not specified):
MATCH (p:Person)-[:LIKES]-(t:Technology)
An undirected relationship does not mean that it doesn’t have a direction, but that it can be traversed in either direction.
While you can’t create relationships without a direction, you can query them undirected (in the example, using the MATCH clause).
Using undirected relationships in queries is particularly useful when you don’t know the direction, since Cypher won’t return anything if you write a query with the wrong direction.
With an undirected pattern, Cypher retrieves all nodes connected by the specified relationship type, regardless of direction.
Because undirected relationships in queries are traversed in both directions, the same relationship can be returned twice when both of its nodes fit either end of the pattern.
This may impact the performance of the query.
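To illustrate, assume a hypothetical graph that holds exactly one `IS_FRIENDS_WITH` relationship, pointing from a `Person` named Sally to a `Person` named John. The sketch below shows how the direction written in the pattern changes what is matched.

```cypher
// Correct direction: one row (Sally, John).
MATCH (a:Person)-[:IS_FRIENDS_WITH]->(b:Person)
WHERE a.name = 'Sally'
RETURN a.name, b.name;

// Wrong direction: no rows, because no relationship points from John to Sally.
MATCH (a:Person)<-[:IS_FRIENDS_WITH]-(b:Person)
WHERE a.name = 'Sally'
RETURN a.name, b.name;

// Undirected and unfiltered: the single relationship is matched once per
// direction, giving two rows (Sally, John) and (John, Sally).
MATCH (a:Person)-[:IS_FRIENDS_WITH]-(b:Person)
RETURN a.name, b.name;
```

The order of the two rows in the last query is not guaranteed unless an `ORDER BY` is added.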
Relationship types categorize and add meaning to a relationship, similar to how labels group nodes together.
It is considered best practice to use verbs or derivatives for the relationship type.
The type describes how the nodes relate to each other.
This way, Cypher is almost like natural language, where nodes are the subjects and objects (nouns), and the relationships (verbs) are the action words that relate them.
In the previous example, the relationship types are LIKES, IS_FRIENDS_WITH, and WORKS_FOR.
Variables can be used for relationships in the same way as for nodes.
Once you specify a variable, you can use it later in the query to reference the relationship.
Take this example:
MATCH (p:Person)-[r:LIKES]->(t:Technology)
RETURN p,r,t
This query specifies variables for both the node labels (p for Person and t for Technology) and the relationship type (r for :LIKES).
In the return clause, you can then use the variables (i.e. p, r, and t) to return the bound entities.
This would be your result:
Properties can be added to both nodes and relationships, and their values can be of a variety of data types.
For a full list of values and types, see Cypher manual → Values and types.
Another way to organize the data in the previous example would be to add a property, name, and Sally and John as property values on Person-labeled nodes:
Properties are enclosed by curly brackets ({}), the key is followed by a colon, and string values are enclosed by single or double quotation marks.
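As a sketch of this syntax, the statement below creates two nodes and a relationship, each with properties. The `name` property comes from the example above; the `since` property and its value are hypothetical and only show that relationships can hold properties and that non-string values are written without quotation marks.

```cypher
// Two Person nodes with a name property, and a relationship
// with a hypothetical numeric property.
CREATE (sally:Person {name: 'Sally'})-[:IS_FRIENDS_WITH {since: 2015}]->(john:Person {name: "John"})
```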
In case you have already added Sally and John as node labels, but want to change them into node properties, you need to refactor your graph.
Refactoring is a strategy in data modeling that you can learn more about in this tutorial.
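A minimal sketch of such a refactoring, assuming the graph contains a node with the label `Sally` as in the first example on this page. It adds the `Person` label and a `name` property, then removes the old label. Larger refactorings usually need more care than this.

```cypher
// Turn the Sally label into a name property on a Person node.
MATCH (n:Sally)
SET n:Person, n.name = 'Sally'
REMOVE n:Sally
RETURN n
```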
Graph pattern matching sits at the very core of Cypher.
It is the mechanism used to navigate, describe, and extract data from a graph by applying a declarative pattern.
Consider this example:
(p:Person {name: "Sally"})-[r:LIKES]->(g:Technology {type: "Graphs"})
This bit of Cypher represents a pattern, but it is not a query.
It only expresses that a Person node with Sally as its name property has a LIKES relationship to the Technology node with Graphs as its type property.
In order to do something with this pattern, such as adding it to or retrieving it from the graph, you need to query the database.
For example, you can add this information to the database using the CREATE clause:
CREATE (p:Person {name: "Sally"})-[r:LIKES]->(t:Technology {type: "Graphs"})
And once this data is written to the database, you can retrieve it with this pattern:
MATCH (p:Person {name: "Sally"})-[r:LIKES]->(t:Technology {type: "Graphs"})
RETURN p,r,t
In the same way as nodes and relationships, you can also use variables for patterns.
For more information, refer to Cypher manual → Patterns → Syntax and Semantics.
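For instance, a whole pattern can be bound to a variable and returned as a path. In this sketch the variable name `path` is an arbitrary choice, and `length()` returns the number of relationships in the path.

```cypher
// Bind the entire matched pattern to a path variable.
MATCH path = (p:Person {name: "Sally"})-[:LIKES]->(t:Technology {type: "Graphs"})
RETURN path, length(path)
```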
Now that the basic Cypher concepts have been introduced, you can take the tutorial on how to Get started with Cypher to learn how to write your own queries.
In the Cypher manual, you can find more information on:
How patterns work and how you can use them to navigate, describe and extract data from a graph.
What values and types, and functions are available in Cypher.
For more suggestions on how to expand your knowledge about Cypher, refer to Resources.
Glossary
label
Marks a node as a member of a named and indexed subset. A node may be assigned zero or more labels.
node
A node represents an entity or discrete object in your graph data model. Nodes can be connected by relationships, hold data in properties, and are classified by labels.
relationship
A relationship represents a connection between nodes in your graph data model. Relationships connect a source node to a target node, hold data in properties, and are classified by type.
property
Properties are key-value pairs that are used for storing data on nodes and relationships.
cluster
A Neo4j DBMS that spans multiple servers working together to increase fault tolerance and/or read scalability. Databases on a cluster may be configured to replicate across servers in the cluster thus achieving read scalability or high availability.
graph
A logical representation of a set of nodes where some pairs are connected by relationships.
schema
The prescribed property existence and datatypes for nodes and relationships.
indexes
Data structure that improves read performance of a database. Read more about supported categories of indexes.
constraints
Constraints are sets of data modeling rules that ensure the data is consistent and reliable. See what constraints are available in Cypher.
data model
A data model defines how information is organized in a database. A good data model will make querying and understanding your data easier. In Neo4j, the data models have a graph structure.