Complete Detailed Roadmap To Learn AI In 2025-26 by an AI Researcher | Harkirat Singh | YouTubeToText
Core Theme
This content outlines a comprehensive roadmap for aspiring AI professionals, differentiating between core AI research and applied AI development, and emphasizing practical project-based learning and continuous adaptation to the rapidly evolving field.
All right, let me set the context for the video. In this video, we have Rishab. Rishab is an AI engineer at a US-based startup, working out of India remotely. He's around 20 years old. I think he started doing research around 1.5 years ago, and he has sprinted through this journey of AI research, probably because he's been really good at maths since his JEE days. He's one of the few people I've known in the country who actually does core AI and machine learning research, so I thought it made sense to bring him on the channel to show you how you can follow a similar path and learn about core machine learning and AI research. He's divided the video into two parts, applied AI and AI research, and you can pick whichever makes more sense for you. Generally though, unless you're decently good at math, I would not go down the AI research path. If you are good at full-stack development and want to get into machine learning or AI as an applied engineer, then the second path is for you. With that, let's get into the video.
>> Hey everyone, and welcome to the AI roadmap video. We have seen a lot of comments under our videos with people asking for an AI roadmap, so here it is. A very big disclaimer before I start: this is the path that worked for me, and it might not work for you. But the things I cover here are very general things that everyone working in AI should know. I'll be dividing this into two parts: core AI and applied AI. Core AI is where you build the models, use different architectures to make them better, and work on different optimization techniques and inference optimizations. Applied AI is more about building on top of LLM APIs like ChatGPT, Claude, and Gemini. The new stuff from the past couple of months, AI agents and MCPs, also comes under the category of applied AI. Before diving deep into the roadmap, I think we should first discuss the problems you might be facing right now. It is very common to fall into a pitfall and never actually start learning AI. There are new tools dropping every couple of weeks, so it's very common for people to get confused about what to learn, when to learn it, and where to learn it. Everyone says something different. People have their opinions, and I'll be sharing mine.
Since I've worked on both sides of the industry, research and applied AI, I'll be able to give you a fairly wide perspective on what you should follow to land a good job. The first and foremost thing you should definitely learn is Python fundamentals: conditionals, loops, classes and objects, and common libraries like NumPy, pandas, and PyTorch. A good self-test is a code snippet like the one shown in the video: if you can understand what it is doing, modify it, and add new things to it, you're good to go with Python. This is fundamental, because everything else will be built on top of Python, since it is the most widely used language in the AI industry. Moving on, the next thing is classical ML. I won't ask you to dive very deep into this, since that can take a long time; if you're short on time, just get the gist of what it is and how it is done. There's no need to go deep into implementation for now, though it will be better if you do it in the future.
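As a small taste of classical ML, here is a NumPy sketch of two staples you'll meet immediately: a train/test split and the standard binary-classification metrics. The data and labels below are made up purely for illustration; in practice you'd reach for a library like scikit-learn.

```python
import numpy as np

def train_test_split(X, y, test_frac=0.25, seed=0):
    """Shuffle indices and split features/labels into train and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    accuracy = np.mean(y_true == y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Toy labels vs. toy predictions from some imagined classifier.
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])
acc, prec, rec, f1 = binary_metrics(y_true, y_pred)
```

Once these definitions feel obvious, the library versions (and terms like cross-validation) are much easier to pick up.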
Classical ML consists of regression, classification with its different techniques, and different clustering techniques, while also introducing evaluation ideas like accuracy, precision, recall, F1 score, etc. It will give you the common machinery of machine learning: cross-validation, train/test splits, how we optimize things, and so on. This builds a very good base if you're going to go deeper, so my suggestion is that if you have enough time, you should definitely start with this first building block and go deep into classical ML. The very next step after classical ML is deep learning, which is just a subset of machine learning. Here you should understand what a neuron is, which is basically a perceptron. What is a neural network?
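To make "a neuron is basically a perceptron" concrete: a single neuron just takes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. A minimal sketch (the weights and inputs are arbitrary example values):

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: weighted sum of inputs plus bias, through a ReLU activation."""
    z = np.dot(w, x) + b   # pre-activation: w·x + b
    return max(0.0, z)     # ReLU: pass positives through, clamp negatives to 0

x = np.array([1.0, 2.0])      # inputs
w = np.array([0.5, -0.25])    # learned weights
out = neuron(x, w, b=0.1)     # 0.5*1.0 - 0.25*2.0 + 0.1 = 0.1
```

A network is just many of these stacked in layers, with backpropagation adjusting `w` and `b`.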
Neural networks are built from a number of these neurons, or perceptrons. You should understand the main algorithm, backpropagation, which is what makes the whole network learn something new by changing its weights. You should know about MLPs, activation functions, loss functions, optimization, and regularization, and again, I've already mentioned the libraries to learn. PyTorch is the holy bible of AI if you're implementing different architectures or writing GPU kernels, and it's the library you should focus on most, since you'll be writing a lot of PyTorch to implement all these architectures. Now that you have built a decent base, you understand what machine learning is, what its different concepts are, and how you can use them. You understand what a neural network is, how it works, and what activation functions and loss functions are. The very next step is to explore different paths. I'm not saying you should explore all of them at the same time; pick one, move to the next, pick another, move on. I've mentioned four paths here, but you can definitely find more: computer vision, natural language processing, reinforcement learning, and speech and audio processing. Computer vision works with image and video data. That means working with CNNs, or convolutional neural networks, which help a model understand an image, and building models that can detect objects or do segmentation of different objects.
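The core operation of a CNN — sliding a small kernel over an image and summing elementwise products — can be sketched in a few lines of NumPy. This is a toy "valid" convolution on a made-up 3×4 image; real frameworks like PyTorch provide fast, batched versions.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image,
    taking an elementwise product-and-sum at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1   # output height
    ow = image.shape[1] - kw + 1   # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge detector on a tiny image: bright left half, dark right half.
img = np.array([[1., 1., 0., 0.],
                [1., 1., 0., 0.],
                [1., 1., 0., 0.]])
edge = np.array([[1., -1.],
                 [1., -1.]])
response = conv2d(img, edge)   # strongest where brightness changes
```

The response peaks exactly at the column where the image goes from bright to dark, which is the intuition behind CNNs detecting edges, textures, and eventually objects.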
NLP is basically working with textual data. You pre-process the text, convert it into embeddings, and learn about sequence models: RNNs, LSTMs, GRUs, named entity recognition, etc. All of this comes under natural language processing. Reinforcement learning is a very old field that suddenly got popular in the world of large language models, but the reinforcement learning we use in LLMs is very different from classical reinforcement learning, and it is good to dive deeper into both. If you are very good at RL and understand the concepts deeply, the best option for you is to work at a robotics company, or maybe build something of your own. This path goes through concepts like MDPs (Markov decision processes), deep Q-networks, policy gradient methods, etc. The next path you can follow is speech and audio processing, which consists of text-to-speech, automatic speech recognition, etc. You should explore all these paths individually and give each its time. After you've given them enough time, you'll be able to figure out which is best for you: are you good with CV, with RL, or with NLP? In my own experience, I also dived into very different fields. I tried computer vision, I tried RL, I tried NLP, and what attracted me most was NLP and RL. Moving on to the holy grail which changed everything: transformers. You should understand how the whole architecture works. How does data flow inside it? What is multi-head attention?
What is the attention mechanism? What is self-attention? Follow that with the latest advancements like mixture of experts, multi-head latent attention, etc. I built a primer on this earlier on this channel, where you can learn how data flows inside transformers and how they work internally ("We incorporate the meaning of 'hello' into this one… so we need some kind of structure or pattern"). By watching that video, I think you'll be able to learn intuitively how transformers work and how they are able to process and actually learn from all this data. Moving on to the hot topic right now: large language models such as GPT, Claude, or Gemini. All of these fall under the category of large language models since they have a very large number of parameters. I've divided this into three stages, which are the training stages a model goes through: pre-training, mid-training, and post-training. In each stage, the model learns something new. Pre-training is when the model learns from scratch: we give it a massive amount of data, an architecture, and optimization techniques, and the result is a base model. That's followed by mid-training, where we give the model domain-specific knowledge like math, science, etc. This can also be called continual pre-training, since the model now learns more of this domain-specific knowledge and converges on it. The next part of the pipeline is post-training, where the model is tuned to work as a chatbot.
Before this step, the model is just an expert next-token predictor. What instruct tuning does is make it work like a chatbot, so it understands that the user asks a question and it has to answer in a certain way. RLHF, RLVR, and test-time compute all come under the category of post-training, where the model learns not to give harmful advice or harmful content, and where we also induce the capability of thinking (test-time compute) into the model. Each of these three stages needs different experts working on it. Once you understand how a transformer works and how a large language model works, you'll be able to understand what pre-training, mid-training, and post-training mean.
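At its core, pre-training is next-token prediction at enormous scale. A deliberately tiny stand-in — a bigram model that just counts which token follows which — shows the shape of the idea. The corpus below is invented for illustration; real LLMs learn the same objective with neural networks over trillions of tokens.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """'Pre-train' a toy language model: count which token follows which."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for cur, nxt in zip(tokens, tokens[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, token):
    """Predict the continuation seen most often after `token` in training."""
    return counts[token].most_common(1)[0][0]

corpus = ["the cat sat on the mat",
          "the cat ate the fish"]
model = train_bigram(corpus)
# predict_next(model, "the") picks "cat", the most frequent follower of "the".
```

Everything else — mid-training, instruct tuning, RLHF — is refinement layered on top of a model trained to do exactly this kind of prediction, just vastly better.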
Becoming an expert in any one of these stages can earn you very good pay while also being very respected. If you learn any of these fields and get into it very deeply, you'll be able to crack a job easily.
One other thing you can learn about is fine-tuning. Fine-tuning is training a model on some specific data so that it gives the best results on that data. It can be done via common libraries like Unsloth or Hugging Face TRL. If you want to understand more about how it works, there are techniques like LoRA (low-rank adaptation), QLoRA (quantized LoRA), and adapters; together these show you what fine-tuning really is: updating some subset of parameters so the model can learn from new data. I think this kind of sums up what core AI is. I've not gone very deep here, and each step of this pipeline divides into many subsets, but once you understand these, you'll be able to dive into deeper concepts yourself.

Moving on to some comparatively easy stuff: applied AI. The very first thing you need to understand is working with LLM APIs — Claude, ChatGPT, any API. You should be able to call one and make it respond to your input. What you can learn here is prompt engineering, function calling, tool use, etc. Prompt engineering is engineering a prompt that, for a given input, increases the chance of the right output. Function calling is when you give a model a specific function it can use, with the output returned as JSON so you can use it further in your pipeline. Moving on to embeddings and vector databases: the concept here is converting textual data into vectors stored in a vector database. Think of an n-dimensional space where you can place different pieces of data, and similar data gets clustered in the same region. Here you need to learn how similarity metrics work, what indexing is, and how to do retrieval well. If you learn these, you'll easily understand RAG, which stands for retrieval-augmented generation. RAG consists of loading a document, pre-processing it, using different chunking strategies, using re-ranking models, and using hybrid search (dense search plus BM25) to actually pull data from a local database, pass it into your LLM, and get a better result. Why is this done? Not all data is used to train LLMs: some data might be private, and some might be newer than the model's training cutoff. This is where RAG comes in. Intuitively, RAG provides an external database containing these new or private data sources, which the LLM can query to fetch text and use in its prompt to produce the final output.
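The retrieval step at the heart of RAG can be sketched with cosine similarity over embeddings. The 3-dimensional vectors below are made up for illustration; a real system would get them from an embedding model and store them in a vector database with proper indexing.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: angle-based closeness of two embedding vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def retrieve(query_vec, doc_vecs, docs, k=1):
    """Return the k documents whose embeddings are most similar to the query."""
    scores = [cosine_sim(query_vec, d) for d in doc_vecs]
    top = np.argsort(scores)[::-1][:k]   # indices of highest scores first
    return [docs[i] for i in top]

# Hypothetical documents with hand-made "embeddings" standing in for a model's.
docs = ["refund policy", "shipping times", "gpu pricing"]
doc_vecs = [np.array([1.0, 0.0, 0.0]),
            np.array([0.0, 1.0, 0.0]),
            np.array([0.0, 0.2, 1.0])]
query = np.array([0.9, 0.1, 0.0])   # points almost the same way as doc 0
hits = retrieve(query, doc_vecs, docs, k=1)
```

The retrieved chunks are then stuffed into the LLM's prompt, which is the whole "augmented generation" part.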
You'll have to do a lot of evaluation to make sure your RAG pipeline is working well without hallucinations, because hallucinations can break the pipelines further downstream. Moving on to the hot stuff right now: AI agents. What are AI agents should be your first question.
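As a rough sketch before the detailed answer: the toy agent below "perceives" an input, picks a tool with a hard-coded rule, and acts by calling it. The tool name and the rule are invented for illustration; in a real agent, an LLM does the reasoning and tool selection.

```python
def calculator(expr):
    """A 'tool' the agent can call: evaluate a simple arithmetic expression.
    (eval with empty builtins is for illustration only, not production use.)"""
    return eval(expr, {"__builtins__": {}})

TOOLS = {"calculator": calculator}

def agent_step(observation):
    """One perceive-reason-act cycle with a hard-coded 'reasoning' rule."""
    if any(ch.isdigit() for ch in observation):   # perceive + reason: looks like math
        return TOOLS["calculator"](observation)   # act: call the chosen tool
    return "no tool matched"

result = agent_step("2 + 3 * 4")
```

Swap the `if` for an LLM call that chooses among many tools, add memory of past steps, and you have the skeleton the frameworks below build on.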
An AI agent is basically anything that can perceive, reason, and act. You need to understand what tool use is, what function calling is, how you can give an agent a tool and how it uses it, what planning is, what memory is, how you can keep more memory and build a consistent agent that retains previous memories, and how you can build a multi-agent system. Good frameworks for this are LangGraph, LangChain, CAMEL, etc. A few months ago, Anthropic came out with the Model Context Protocol (MCP), which is like a universal USB port you can plug into an AI model: it standardizes everything. You can connect anything with it, whether it's your DB, some API, or code you want to execute. You basically just make an MCP server for it and connect it to your LLM, and your LLM will then be able to use it to fetch or do anything with that information. Before MCP, you might have needed a different API integration for everything; what MCP provides is a middleware between these different apps and your LLMs that standardizes everything. MLOps is the easiest field to enter if you're currently working as a Web2 engineer or basic backend engineer, since here you'll be working with Docker and Kubernetes to deploy trained models, doing experiment tracking, doing inference optimization via tools like vLLM or TensorRT, and monitoring your models for any breakage or changes.

Now, moving on to the most important part of this video: some general advice and takeaways. My general advice would be to stay up to date with this field. The sources you can refer to are Twitter, arXiv, Hugging Face Spaces, and Hugging Face news. The key to keep on learning is to always be curious: learn the next thing, build the next project, and do that thing. In this field, you have to be a rapid developer, adapting to things fast and executing fast. That is what will make you unique and might help you get a job soon. I think the one thing you should take away from this video is: don't just keep learning theory, watching videos, or completing a lecture series.
Instead, start building a project. You'll learn a lot more by building something than by just watching videos or reading blogs. This was a very brief AI roadmap that should give you an overall idea of how things currently work in the AI space. After this, you'll be able to dive into deeper concepts yourself, look at new tech, and understand how it works. You'll become a rapid developer in the field of AI by using AI; you'll know where to find things, what to find, and when to find it. Speaking from my experience, if you're working in core AI, what the industry currently needs is someone who understands the concepts very deeply. Working in applied AI, what I've learned is that you have to be a rapid developer: understand new tech fast, integrate it into your system, and build docs and projects on top of it while using AI, since AI has given a boost to all developers and you have to adapt to it. With this, I'd like to end the video.