0:02 recently vector databases got a lot of
0:05 fame with companies raising hundreds of
0:07 millions of dollars to build them and
0:10 people calling it a new kind of database
0:13 for the AI era on the other hand for
0:15 many projects it might be an overkill
0:17 solution and using a traditional
0:20 database or even just a NumPy ndarray
0:22 might work just fine but there is no
0:24 doubt that vector databases are extremely
0:26 fascinating and allow many great
0:28 applications especially when you want to
0:31 give large language models like GPT-4
0:33 long-term memory so in this video I will
0:35 explain in a very beginner friendly way
0:37 what vector databases are and how they
0:40 work we will go over some use cases for
0:41 them and then I will briefly show you
0:44 some options you can use so let's get
0:46 started so let's start with the why over
0:49 80 percent of the data out there is
0:52 unstructured such as social media posts
0:54 images videos or audio data you cannot
0:56 easily fit them into a relational
0:59 database let's take an image as an
1:00 example if you want to put this
1:03 into a relational database in order to
1:05 search for similar images what ends up
1:07 happening is that often we manually
1:10 assign keywords or tags to it because
1:12 from the pixel values alone we cannot
1:14 really search for similar images and the
1:16 same holds true for unstructured text
1:19 blobs or audio and video data so we
1:21 either have to assign tags or attributes
1:24 to it often manually or we can find a
1:26 different representation to store the
1:28 data and this brings us to vector
1:30 embeddings and vector databases in short
1:33 a vector database indexes and stores
1:36 vector embeddings for fast retrieval and
1:38 similarity search so let's take a step
1:40 back and look at those two important
1:43 components first we use clever
1:45 algorithms to calculate the so-called
1:47 vector embeddings this is done by
1:49 machine learning models a vector
1:51 embedding is just a list of numbers that
1:53 represents the data in a different way
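To make "a list of numbers" concrete, here is a minimal sketch in plain Python. The three embeddings are hand-written, hypothetical values chosen only for illustration; a real embedding model would produce them and would typically use far more dimensions. The helper names `cosine_similarity` and `most_similar` are also just assumptions for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-written embeddings (not produced by any real model).
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

def most_similar(word):
    """Return the other word whose embedding points in the most similar direction."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(query, embeddings[w]))

print(most_similar("cat"))
```

With these made-up numbers, "cat" and "kitten" point in nearly the same direction while "car" does not, so comparing vectors stands in for comparing meaning.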
1:55 for example you can calculate an
1:57 embedding for a single word a whole
1:59 sentence or an image and now we have
2:02 numerical data that the computer can
2:04 understand one easy possibility we get
2:07 with vectors is to find similar vectors
2:09 by calculating the distances and doing a
2:11 nearest neighbor search so we can easily
2:14 find similar items for simplicity I
2:17 display a 2D case here but in reality of
2:19 course those vectors can have hundreds
2:21 of dimensions but just storing the data
2:23 as embeddings is not enough performing a
2:26 query across thousands of vectors based
2:27 on a distance metric would be
2:30 extremely slow and this is why those
2:32 vectors also need to be indexed so the
2:35 indexing process is the second key
2:37 element of a vector database an index is
2:40 a data structure that facilitates the
2:43 search process so the indexing step maps
2:45 the vectors to a new data structure that
2:48 will enable faster searching this is a
2:50 whole research field on its own and
2:52 different ways to calculate indexes
2:54 exist so I won't go into details here
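As one illustration only (the video does not name a specific index), the sketch below builds a toy random-hyperplane index in plain Python: each vector is hashed to a bucket by which side of a few random hyperplanes it falls on, and a query searches only its own bucket instead of every stored vector. The class name `HyperplaneIndex`, its parameters, and the random test data are all assumptions made for this example, not the API of any particular vector database.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class HyperplaneIndex:
    """Toy random-hyperplane index: vectors with the same sign pattern share a bucket."""

    def __init__(self, dim, num_planes=4, seed=0):
        rng = random.Random(seed)
        # Each random vector is the normal of one hyperplane through the origin.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]
        self.buckets = {}  # bucket key -> list of item ids
        self.vectors = {}  # item id -> stored vector

    def _key(self, vec):
        # One boolean per hyperplane: which side of it the vector lies on.
        return tuple(dot(vec, plane) >= 0 for plane in self.planes)

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        self.buckets.setdefault(self._key(vec), []).append(item_id)

    def query(self, vec, k=1):
        """Return up to k (item_id, squared_distance) pairs from the query's bucket only."""
        candidates = self.buckets.get(self._key(vec), [])
        scored = [
            (i, sum((x - y) ** 2 for x, y in zip(vec, self.vectors[i])))
            for i in candidates
        ]
        return sorted(scored, key=lambda pair: pair[1])[:k]

# Hypothetical data: 200 random 8-dimensional vectors standing in for embeddings.
rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
index = HyperplaneIndex(dim=8)
for i, v in enumerate(data):
    index.add(i, v)

nearest = index.query(data[7], k=1)
```

The trade-off is that a query inspects only a fraction of the stored vectors, so it can miss the true nearest neighbor when that neighbor landed in a different bucket. This is why such searches are called approximate. Production systems commonly rely on other index structures, such as HNSW graphs or inverted file (IVF) indexes.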
2:57 just know that indexes are needed for
2:59 efficient search so let's go over some
3:01 use cases I already mentioned that we
3:03 can use vector databases to equip large
3:06 language models with long-term memory
3:08 this is for example what you can easily
3:10 implement with LangChain we can use it
3:12 for semantic search when we need to
3:14 search not for exact string matches but
3:16 rather based on the meaning or context
3:19 of our question we can also use it for
3:21 similarity search for images audio or
3:24 video data so we can say hey find me a
3:26 similar image to this one and we don't
3:28 need to use some keywords or text to
3:30 describe the image and we can use a
3:32 vector database as a ranking and
3:34 recommendation engine for example for
3:36 online retailers it can be used to
3:39 suggest items similar to past purchases
3:41 of a customer since we can simply
3:43 identify the nearest neighbors of an
3:45 item in our database so now that you
3:47 know some use cases let's go over some
3:50 options you can use as a vector database
3:52 there are a number of vector databases
3:55 available for example we have Pinecone
3:58 Weaviate Chroma Redis also has a vector
4:02 database Qdrant Milvus or Vespa.ai
4:04 so I won't go into details here but if
4:06 you want to see a separate video with an
4:08 in-depth comparison then let me know in
4:09 the comments below alright I hope you
4:11 now have a good understanding of what
4:14 vector databases are and what you can do
4:15 with them if you want to see more
4:18 explainer videos and AI tutorials then
4:20 make sure to subscribe to our channel
4:21 and then I hope to see you in the next