YouTube Transcript:
Intro to Graph Databases Episode #5 - Cypher, the Graph Query Language

Summary

Core Theme

This content introduces Cypher, a declarative graph query language for Neo4j, explaining its origins, core concepts like pattern matching, and basic operations for creating, querying, and managing data within a property graph.

Hey everyone. Welcome back to the Intro to Graph Databases Series. This is episode five,

and my name is Ryan Boyd from the Neo4j developer relations team. I'll be your guide today to

teach you an introduction to Cypher, the graph query language. In order to understand the

value in Cypher as a graph query language, it's important for you to understand why we

created Cypher. And that has to do with the history of the Neo4j developer surface. When Neo4j

first started, around the year 2000, we had an embeddable Java API. This Java API allowed

you to imperatively traverse a graph, create new relationships and nodes, and that sort

of thing, but was only accessible within Java. Come to around the 1.x series of Neo4j in

2010 and we wanted to extend these capabilities from Java into other clients - other clients

that are acting outside of the database server and they're on the network. And for that,

we created a REST-based API. And this REST-based API was still pretty low level. We had some

different requirements, though, as we expanded Neo4j's adoption and Neo4j became more popular with

different communities. So we really wanted a declarative query language that's readable

and expressive. Very similar to how many developers use SQL when interacting with tables, we wanted

a similar declarative language for graphs. We wanted it to be able to do all CRUD operations.

Create, read, update, delete, all the main operations. Not just be a querying language.

We also wanted to base it on patterns. The core patterns that you're looking for in a

graph. And we wanted to make it really powerful so that you could actually convert people

from using the more imperative languages over to this declarative language. And we wanted

to allow it to be opened up and adopted by other graph technology.

So in order to do this, we invented the next set of developer surface for Neo4j and that

was Cypher over HTTP, the Cypher query language that we're going to review today. And that

Cypher over HTTP library also allowed you to access Neo4j remotely from outside of the

Java environment, but provided this declarative language in order to access it. And then as

we advanced on and did a 3.0, 3.1, 3.2 series of Neo4j, we also launched the Bolt Protocol,

a binary protocol that makes it much more type safe to interact with Neo4j and provides

a series of official language drivers - such as Java, .NET, Python, and JavaScript - and

user-defined procedures and functions so that you could still do those interactions with

the Java API if you want to, but call them from within Cypher. All of the procedures

and functions can be called from within Cypher. And that was very important to us. From an

openness perspective, we did release Cypher under the openCypher Project. openCypher aims

to deliver a full and open specification of the industry's most widely adopted graph database

query language Cypher. You can visit opencypher.org to find out more about the openCypher Project

and other databases such as SAP HANA, which have adopted the Cypher technology.

Now, as I review Cypher, I first want to give you a reminder, a little bit of a recap, on what

property graphs are. A property graph has a concept of a node and a relationship and

properties on both those nodes and relationships. So in the case here, we have Ann, who loves

Dan, Dan who loves Ann back. Ann lives with Dan and Ann drives a car which is owned by

Dan. Dan also drives that same car. So you can see here that it's very easy to read this

graph even without having a full understanding of graph databases, and Cypher, and property

graphs. It's really easy to read what's happening here. And this is an important characteristic

of Neo4j and graph databases: the whiteboard model is the physical model. We try to reduce

the number of translations between the business owner, and the developer, and the underlying

system, which is executing and storing data. So we create the nodes and relationships in

the underlying data store as sort of the nouns and the verbs and create the properties on

the nodes as sort of the adjectives and the properties on the relationships as sort of

the adverbs. So that's the overview or recap of the Property Graph Model.

Cypher as a query language is based off of patterns. It's about creating patterns in

a graph, patterns of nodes and relationships, and then it's about finding those patterns

when you're doing your queries. So here is an example of one pattern that you might specify.

It's a fairly complex pattern, but very easy to read, and understand, and even code in

Cypher. So we want to find out who drives a car that is owned by a lover. And in this

case, we just write it out. Match a person who drives a car which is owned by another

person and the original person loves that other person. It's a really simple query to

read and write, even though the question is a little bit more complex. Now, patterns in

Cypher use ASCII-Art. And for those of you who weren't around back in the day, ASCII-Art

is basically using the keys on the keyboard in order to generate graphics. And in this

case, ASCII-Art for nodes means that you're using parentheses to surround nodes. And you

can either just use a blank set of parentheses if you'll never need to refer to that node

again after that part in the query, or you can specify an alias inside the node, such

as P here, in order to refer to that node later in the query. There are also labels

or tags on nodes. These allow you to group nodes together by roles and types.
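
As a rough sketch of the question posed above, "who drives a car that is owned by a lover," the query might look like the following. The relationship type names DRIVES, OWNED_BY, and LOVES, the Car label, and the aliases p1, c, and p2 are assumptions for illustration; the transcript does not spell them out.

```cypher
// Find a person who drives a car owned by another person,
// where the first person also loves that other person.
// Type names, the Car label, and aliases are illustrative assumptions.
MATCH (p1:Person)-[:DRIVES]->(c:Car)-[:OWNED_BY]->(p2:Person),
      (p1)-[:LOVES]->(p2)
RETURN p1;
```

Reusing the alias p1 in the second part of the pattern is what ties the driver and the lover together as the same node.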

So in the case here of a person, a person is also a mammal, so this person has a second

label that is a mammal, and we're going to still refer to that person as P as the alias.

Now, nodes can also have properties. So, for instance, you might want to set the name on

a person. In this case here, we're setting the name on a person as the string value Veronica.

These properties can have a wide variety of different types of values, including a lot

of the basic Java types and arrays of the basic types. Now, ASCII-Art for relationships.

Well, relationships are wrapped with hyphens or square brackets. So you can see

here. Let's say you were trying to talk about the hired relationships. So Joe hired John.

You can see here that we could either specify the relationship: hired with an alias H to

refer to that relationship later in the query, or we can just say that there is a relationship

without specifying the type and without specifying an alias. The direction of the relationship

is specified with less than and greater than symbols. So you could see here person one.

Let's say, Kate, hired person two, let's say, John, or the vice versa. And in the vice versa

case, we have Kate was hired by John or John hired Kate. But it's just showing the opposite

direction with the less than symbol instead of the greater than symbol.
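
The node and relationship notation described above can be sketched as a handful of small queries. Each pattern is wrapped in a MATCH and RETURN only so that it forms a complete statement; the labels Person and Mammal, the type HIRED, and the aliases p, p1, p2, and h follow the spoken description and are otherwise assumptions.

```cypher
// A node with the alias p and two labels.
MATCH (p:Person:Mammal) RETURN p;

// A node with a property, written in the map syntax.
MATCH (p:Person {name: 'Veronica'}) RETURN p;

// A relationship with the alias h and the type HIRED: p1 hired p2.
MATCH (p1:Person)-[h:HIRED]->(p2:Person) RETURN p1, h, p2;

// The opposite direction: p1 was hired by p2.
MATCH (p1:Person)<-[h:HIRED]-(p2:Person) RETURN p1, h, p2;

// Any outgoing relationship, with no type and no alias.
MATCH (p1:Person)-->(p2:Person) RETURN p1, p2;
```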

Relationships can also have properties which can be specified using the JSON-like

syntax here. In this case, specifying that that person was hired as a type full-time employee.

Now, I've mentioned the word aliases over time here, and I want to reemphasize this.

So the H in hired here, the P-one, the P-two, these all represent references. So you're

defining references or aliases such that, later in the query, you can access those references.

So in this case here, if these were in a MATCH statement, we might want to return the person

back to the-- as a response to the query. Or if these were in a MATCH statement, maybe

we want to use that person that we found, and we want to add additional properties

or delete the person, or something along those lines. So these are simply aliases that make

it easier to access the matched nodes or relationships later in the query.
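
A minimal sketch of how aliases get used later in a query, including a property on the relationship. The property key type, its value 'fulltime', and the name 'John' are placeholders based on the spoken example, not values confirmed by the slides.

```cypher
// Match on a relationship property and return the aliased items.
MATCH (p1:Person)-[h:HIRED {type: 'fulltime'}]->(p2:Person)
RETURN p1, h.type, p2;

// Use an alias to act on what was matched, here by deleting the node.
// DETACH DELETE also removes the relationships attached to the node.
MATCH (p:Person {name: 'John'})
DETACH DELETE p;
```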

Now let's give you a basic create statement as well as a basic query statement. These

are very simple examples here. Later on, we'll get more and more complex, and we'll give

you more complex query operations as well. To create data inside the graph, you simply

specify it very much the same way as you would if you were trying to query for that data.

So if we wanted to create two nodes and a relationship between the nodes, in this case,

we wanted to create two nodes with Person as the label, and have name properties on each of

them, and the LOVES relationship in between, we simply use the CREATE statement in Cypher

and we say, "CREATE a person, brace, and that JSON-like syntax, name Ann, loves another

person, brace, and the JSON-like syntax, name Dan." And that's all there really is to it. This

is all that it takes to create these two nodes and the corresponding relationship and the

properties on the node in the graph. Now let's say we wanted to run a pretty basic

query and say, "Okay, we know Ann loves someone. Who does Ann love? Or whom does Ann love?"

And that query is actually quite simple here. We just say, "Find me a person named Ann who

loves another person." And then we're-- you can see how we're using the alias here called

OP to return that other person back as a result of the query. And in this case, we're returning

the node, and we get Dan, of course, so Ann loves Dan. And this is a really simple

example here, but it does show you the power of Cypher.
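
The create and query statements read out above might look like this when written down. The alias op is taken from the transcript as spoken; the exact alias on the slide may differ.

```cypher
// Create two Person nodes and a LOVES relationship between them.
CREATE (:Person {name: 'Ann'})-[:LOVES]->(:Person {name: 'Dan'});

// Whom does Ann love? Return the node at the other end of LOVES.
MATCH (:Person {name: 'Ann'})-[:LOVES]->(op:Person)
RETURN op;
```

The CREATE pattern and the MATCH pattern have the same shape, which is the point made above: you describe data the same way whether you are writing it or looking for it.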

What happens if we want to add more properties to our graph? Well, let's find Ann's car.

We're going to find a person whose name is Ann. And you can see here, I use single quotes

here when specifying Ann. You can use either single quotes or double quotes. And let's

find a person whose name is Ann who drives a car, and let's figure out the car that Ann

drives. We can actually do this in another way. The first example was using the JSON-like

syntax for specifying the name of the person that we're searching for. In this case, we're

actually just saying that the pattern that we're searching for is a person who drives

a car, and then we're restricting the traversal by specifying the name as Ann. Both of these

queries should really have the same performance, but different ones are more readable than

others by different people. So take those two different options and understand that

they mean the same thing. But in this case, we're returning the car that Ann drives.
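
A sketch of the two equivalent forms described above, assuming a DRIVES relationship type and a Car label (neither is spelled out in the transcript).

```cypher
// Form 1: the name is given inline with the map syntax.
MATCH (:Person {name: 'Ann'})-[:DRIVES]->(c:Car)
RETURN c;

// Form 2: the pattern is stated first, then restricted with WHERE.
MATCH (a:Person)-[:DRIVES]->(c:Car)
WHERE a.name = 'Ann'
RETURN c;
```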

Now, let's say we wanted to add greater description to that car and wanted to indicate the

brand and the model of the car. We can easily do that with the set operations. So the set

syntax in Cypher allows us to set additional properties on the node that we found. So it's

the same match statement. We're trying to find in the graph a person who drives a car,

where the person's name is Ann. And we're also trying to return that car. But before

we return it, we're going to set those two additional properties, the brand and the model,

and then the returned car will have those properties set. One important aspect of dealing

with graphs is dealing with the integrity of the graph, the integrity of the data, and

Neo4j really focuses on being a transactional database, an OLTP database with ACID compliance. We're

all about making sure that the data that you set and the data that you return is all predictable.

And one of the aspects of that is ensuring uniqueness in the graph. We don't want you

to have a bunch of different Anns in your graph. Let's say that your graph looked like

this. How would you know how to differentiate one Ann from another Ann? That would be very

difficult. So if we're going to assume here that the name Ann is unique amongst the population

in our graph, we can actually ensure that, that there can only be one Ann.
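
Returning briefly to the SET step described earlier in this passage, a minimal sketch could look like the following. The brand and model values are placeholders, and the DRIVES type and Car label are assumptions as before.

```cypher
// Find the car Ann drives, add two properties to it, then return it.
MATCH (a:Person)-[:DRIVES]->(c:Car)
WHERE a.name = 'Ann'
SET c.brand = 'Volvo', c.model = 'V70'
RETURN c;
```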

We can do that with constraints. In this case here, we're going to say create constraint

on the person-labeled nodes, assert that the person's name is unique. It's a very simple

constraint but it's very powerful to prevent us from adding multiple Anns. So let's say

we did try to add another Ann. If we try to add another Ann by creating another person-labeled

node with the name of Ann, we would actually get an error that looks something like this.

Constraint validation failed, indicating that there's another node in our graph already

with the same label and the same property, and that violates our constraint that we set. Of

course, this is all fine and dandy, but we don't want to actually experience this error.

We want to actually be able to ensure uniqueness at the time that we create our nodes and relationships.
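
The constraint described above might be written as follows. This uses the ON ... ASSERT form that matches the wording in the video and the Neo4j 3.x era; newer Neo4j releases use a FOR ... REQUIRE form instead, so check the documentation for the version you run.

```cypher
// Require that name is unique across all Person nodes
// (syntax of the Neo4j 3.x era).
CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE;

// Newer releases express the same constraint like this:
// CREATE CONSTRAINT FOR (p:Person) REQUIRE p.name IS UNIQUE;

// With the constraint in place and an Ann already in the graph,
// this statement is rejected with a constraint validation error.
CREATE (:Person {name: 'Ann'});
```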

So let's say that we want Ann to have a pet dog. So we want to say Ann has a pet named

Sam. If we did something like this, create a person named Ann, has a pet dog named Sam,

we'd actually experience the same error as we talked about before, a constraint validation

error, because Ann already exists in our database. So instead of using two create statements

here, we can, instead, use a MERGE statement, saying MERGE person named Ann. Basically,

what this does is it looks in the graph for a person named Ann and operates on that person

node. If it does not exist, it will create it. And that way, when you execute the next

step in terms of creating the relationships, you can create that dog attached to either

the existing person or a new person. In this case here, what we're doing is saying,

"Let me find an existing person in the graph whose name is Ann. And if I do not find that,

then when I create a new person with the name of Ann, set the Twitter property to Ann's

Twitter handle." And then you'll notice the last statement here has actually changed the

create statement for the pet to a MERGE statement. And what this will do is create a pet-- or

has pet relationship from Ann to a dog named Sam if that exact pattern doesn't already

exist in the graph, and only if that exact pattern does not already exist in the graph.

So this will not ever result in Ann having two pets, both dogs named Sam.
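
A sketch of the MERGE statements described above. The property key twitter, the handle value, the HAS_PET type, and the Dog label are assumptions based on the spoken description.

```cypher
// Find the Person named Ann, or create her if she does not exist.
// The twitter property is set only when the node is newly created.
MERGE (a:Person {name: 'Ann'})
  ON CREATE SET a.twitter = '@ann'
// Create the HAS_PET relationship and the dog named Sam only if
// this exact pattern is not already attached to Ann.
MERGE (a)-[:HAS_PET]->(:Dog {name: 'Sam'});
```

Running the statement a second time leaves the graph unchanged, because both patterns are found rather than created.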

All right. So we have created our graph, and our graph has our Ann in it, has Sam in it,

saying Ann has that pet Sam, very basic introduction to Cypher and some of the create operations,

a little bit of querying and a little bit of modifying operations. And the next video

in the series-- okay. I probably shouldn't say that again considering the delay between the

last couple of videos. But we will teach you, either through the next video or through our

documentation tutorials in the Neo4j sandbox, how you can do many more complicated queries

on your graph. You can also learn through our Neo4j online training at neo4j.com/graphacademy,

or through classroom training which is accessible there as well. So thank you very much and

I hope you have a fantastic day. And feel free to reach out if you have any questions.
