Complete Detailed Roadmap To Learn AI In 2025-26 by an AI Researcher | Harkirat Singh | YouTubeToText
Core Theme
This content outlines a comprehensive roadmap for aspiring AI professionals, differentiating between core AI research and applied AI development, and emphasizing practical project-based learning and continuous adaptation to the rapidly evolving field.
All right, let me set the context for the video. In this video, we have Rishab. Rishab is an AI engineer at a US-based startup, working out of India remotely. He's around 20 years old. I think he started doing research around 1.5 years ago, and he has sprinted through this journey of AI research, probably because he's been really good at maths since his JEE days. He's one of the few people I've known in the country who actually does core AI and machine learning research, so I thought it made sense to bring him on the channel to show you how you can follow a similar path and learn about core machine learning and AI research. He's divided the video into two parts, applied AI and AI research, and you can pick whichever makes more sense for you. Generally though, unless you're decently good at math, I would not go down the AI research path. If you are good at full-stack development and want to get into machine learning or AI as an applied engineer, then the second path is for you. With that, let's get into the video.
>> Hey everyone, and welcome to the AI roadmap video. We have seen a lot of comments under our videos with people asking for an AI roadmap, so here it is. A very big disclaimer before I start: this is the path that worked for me, and it might not work for you. But the things I cover here are very general things that everyone working in AI should know. I'll be dividing this into two parts: core AI and applied AI. Core AI is where you build the models, use different architectures to make them better, and work on different optimization techniques and inference optimizations. Applied AI is more about building on top of LLM APIs like ChatGPT, Claude, and Gemini. The new stuff from the past couple of months, AI agents and MCPs, also comes under the category of applied AI. Before diving deep into the roadmap, I think we should first discuss the problems you might be facing right now. It is very common to fall into a pitfall and never actually start learning AI. There are new tools dropping every couple of weeks, so it's very common for people to get confused about what to learn, when to learn it, and where to learn it. Everyone says something different. People have their opinions, and I'll be sharing mine.
Since I've worked on both sides of the industry, research and applied AI, I'll be able to give you a fairly wide perspective on what you should follow to land a good job. The first and foremost thing you should definitely learn is Python fundamentals: conditionals, loops, classes and objects, and common libraries like NumPy, pandas, and PyTorch. A good self-test is a code snippet like the one shown in the video: if you can understand what it is doing, modify it, and add new things to it, you're good to go with Python. This is fundamental, because everything else will be built on top of Python, since it is the most widely used language in the AI industry. Moving on, the next thing is classical ML. I won't ask you to dive very deep into this, since that can take a long time; if you're short on time, just get the gist of what it is and how it is done. There's no need to go deep into implementation for now, though it will be better if you do it in the future.
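As a small taste of classical ML, here is a NumPy sketch of two staples you'll meet immediately: a train/test split and the standard binary-classification metrics. The data and labels below are made up purely for illustration; in practice you'd reach for a library like scikit-learn.

```python
import numpy as np

def train_test_split(X, y, test_frac=0.25, seed=0):
    """Shuffle indices and split features/labels into train and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    accuracy = np.mean(y_true == y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Toy labels vs. toy predictions from some imagined classifier.
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])
acc, prec, rec, f1 = binary_metrics(y_true, y_pred)
```

Once these definitions feel obvious, the library versions (and terms like cross-validation) are much easier to pick up.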
Classical ML consists of regression, classification with its different techniques, and different clustering techniques, while also introducing evaluation ideas like accuracy, precision, recall, F1 score, etc. It will give you the common machinery of machine learning: cross-validation, train/test splits, how we optimize things, and so on. This builds a very good base if you're going to go deeper, so my suggestion is that if you have enough time, you should definitely start with this first building block and go deep into classical ML. The very next step after classical ML is deep learning, which is just a subset of machine learning. Here you should understand what a neuron is, which is basically a perceptron. What is a neural network?
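To make "a neuron is basically a perceptron" concrete: a single neuron just takes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. A minimal sketch (the weights and inputs are arbitrary example values):

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: weighted sum of inputs plus bias, through a ReLU activation."""
    z = np.dot(w, x) + b   # pre-activation: w·x + b
    return max(0.0, z)     # ReLU: pass positives through, clamp negatives to 0

x = np.array([1.0, 2.0])      # inputs
w = np.array([0.5, -0.25])    # learned weights
out = neuron(x, w, b=0.1)     # 0.5*1.0 - 0.25*2.0 + 0.1 = 0.1
```

A network is just many of these stacked in layers, with backpropagation adjusting `w` and `b`.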
Neural networks are built from a number of these neurons, or perceptrons. You should understand the main algorithm, backpropagation, which is what makes the whole network learn something new by changing its weights. You should know about MLPs, activation functions, loss functions, optimization, and regularization, and again, I've already mentioned the libraries to learn. PyTorch is the holy bible of AI if you're implementing different architectures or writing GPU kernels, and it's the library you should focus on most, since you'll be writing a lot of PyTorch to implement all these architectures. Now that you have built a decent base, you understand what machine learning is, what its different concepts are, and how you can use them. You understand what a neural network is, how it works, and what activation functions and loss functions are. The very next step is to explore different paths. I'm not saying you should explore all of them at the same time; pick one, move to the next, pick another, move on. I've mentioned four paths here, but you can definitely find more: computer vision, natural language processing, reinforcement learning, and speech and audio processing. Computer vision works with image and video data. That means working with CNNs, or convolutional neural networks, which help a model understand an image, and building models that can detect objects or do segmentation of different objects.
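The core operation of a CNN — sliding a small kernel over an image and summing elementwise products — can be sketched in a few lines of NumPy. This is a toy "valid" convolution on a made-up 3×4 image; real frameworks like PyTorch provide fast, batched versions.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image,
    taking an elementwise product-and-sum at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1   # output height
    ow = image.shape[1] - kw + 1   # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge detector on a tiny image: bright left half, dark right half.
img = np.array([[1., 1., 0., 0.],
                [1., 1., 0., 0.],
                [1., 1., 0., 0.]])
edge = np.array([[1., -1.],
                 [1., -1.]])
response = conv2d(img, edge)   # strongest where brightness changes
```

The response peaks exactly at the column where the image goes from bright to dark, which is the intuition behind CNNs detecting edges, textures, and eventually objects.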
NLP is basically working with textual data. You pre-process the text, convert it into embeddings, and learn about sequence models: RNNs, LSTMs, GRUs, named entity recognition, etc. All of this comes under natural language processing. Reinforcement learning is a very old field that suddenly got popular in the world of large language models, but the reinforcement learning we use in LLMs is very different from classical reinforcement learning, and it is good to dive deeper into both. If you are very good at RL and understand the concepts deeply, the best option for you is to work at a robotics company, or maybe build something of your own. This path goes through concepts like MDPs (Markov decision processes), deep Q-networks, policy gradient methods, etc. The next path you can follow is speech and audio processing, which consists of text-to-speech, automatic speech recognition, etc. You should explore all these paths individually and give each its time. After you've given them enough time, you'll be able to figure out which is best for you: are you good with CV, with RL, or with NLP? In my own experience, I also dived into very different fields. I tried computer vision, I tried RL, I tried NLP, and what attracted me most was NLP and RL. Moving on to the holy grail which changed everything: transformers. You should understand how the whole architecture works. How does data flow inside it? What is multi-head attention?
What is the attention mechanism? What is self-attention? Follow that with the latest advancements like mixture of experts, multi-head latent attention, etc. I built a primer on this earlier on this channel, where you can learn how data flows inside transformers and how they work internally ("We incorporate the meaning of 'hello' into this one… so we need some kind of structure or pattern"). By watching that video, I think you'll be able to learn intuitively how transformers work and how they are able to process and actually learn from all this data. Moving on to the hot topic right now: large language models such as GPT, Claude, or Gemini. All of these fall under the category of large language models since they have a very large number of parameters. I've divided this into three stages, which are the training stages a model goes through: pre-training, mid-training, and post-training. In each stage, the model learns something new. Pre-training is when the model learns from scratch: we give it a massive amount of data, an architecture, and optimization techniques, and the result is a base model. That's followed by mid-training, where we give the model domain-specific knowledge like math, science, etc. This can also be called continual pre-training, since the model now learns more of this domain-specific knowledge and converges on it. The next part of the pipeline is post-training, where the model is tuned to work as a chatbot.
Before this step, the model is just an expert next-token predictor. What instruct tuning does is make it work like a chatbot, so it understands that the user asks a question and it has to answer in a certain way. RLHF, RLVR, and test-time compute all come under the category of post-training, where the model learns not to give harmful advice or harmful content, and where we also induce the capability of thinking (test-time compute) into the model. Each of these three stages needs different experts working on it. Once you understand how a transformer works and how a large language model works, you'll be able to understand what pre-training, mid-training, and post-training mean.
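At its core, pre-training is next-token prediction at enormous scale. A deliberately tiny stand-in — a bigram model that just counts which token follows which — shows the shape of the idea. The corpus below is invented for illustration; real LLMs learn the same objective with neural networks over trillions of tokens.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """'Pre-train' a toy language model: count which token follows which."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for cur, nxt in zip(tokens, tokens[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, token):
    """Predict the continuation seen most often after `token` in training."""
    return counts[token].most_common(1)[0][0]

corpus = ["the cat sat on the mat",
          "the cat ate the fish"]
model = train_bigram(corpus)
# predict_next(model, "the") picks "cat", the most frequent follower of "the".
```

Everything else — mid-training, instruct tuning, RLHF — is refinement layered on top of a model trained to do exactly this kind of prediction, just vastly better.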
Becoming an expert in any one of these stages can earn you very good pay while also being very respected. If you learn any of these fields and get into it very deeply, you'll be able to crack a job easily.
One other thing you can learn about is fine-tuning. Fine-tuning is training a model on some specific data so that it gives the best results on that data. It can be done via common libraries like Unsloth or Hugging Face TRL. If you want to understand more about how it works, there are techniques like LoRA (low-rank adaptation), QLoRA (quantized LoRA), and adapters; together these show you what fine-tuning really is: updating some subset of parameters so the model can learn from new data. I think this kind of sums up what core AI is. I've not gone very deep here, and each step of this pipeline divides into many subsets, but once you understand these, you'll be able to dive into deeper concepts yourself.

Moving on to some comparatively easy stuff: applied AI. The very first thing you need to understand is working with LLM APIs — Claude, ChatGPT, any API. You should be able to call one and make it respond to your input. What you can learn here is prompt engineering, function calling, tool use, etc. Prompt engineering is engineering a prompt that, for a given input, increases the chance of the right output. Function calling is when you give a model a specific function it can use, with the output returned as JSON so you can use it further in your pipeline. Moving on to embeddings and vector databases: the concept here is converting textual data into vectors stored in a vector database. Think of an n-dimensional space where you can place different pieces of data, and similar data gets clustered in the same region. Here you need to learn how similarity metrics work, what indexing is, and how to do retrieval well. If you learn these, you'll easily understand RAG, which stands for retrieval-augmented generation. RAG consists of loading a document, pre-processing it, using different chunking strategies, using re-ranking models, and using hybrid search (dense search plus BM25) to actually pull data from a local database, pass it into your LLM, and get a better result. Why is this done? Not all data is used to train LLMs: some data might be private, and some might be newer than the model's training cutoff. This is where RAG comes in. Intuitively, RAG provides an external database containing these new or private data sources, which the LLM can query to fetch text and use in its prompt to produce the final output.
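The retrieval step at the heart of RAG can be sketched with cosine similarity over embeddings. The 3-dimensional vectors below are made up for illustration; a real system would get them from an embedding model and store them in a vector database with proper indexing.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity: angle-based closeness of two embedding vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def retrieve(query_vec, doc_vecs, docs, k=1):
    """Return the k documents whose embeddings are most similar to the query."""
    scores = [cosine_sim(query_vec, d) for d in doc_vecs]
    top = np.argsort(scores)[::-1][:k]   # indices of highest scores first
    return [docs[i] for i in top]

# Hypothetical documents with hand-made "embeddings" standing in for a model's.
docs = ["refund policy", "shipping times", "gpu pricing"]
doc_vecs = [np.array([1.0, 0.0, 0.0]),
            np.array([0.0, 1.0, 0.0]),
            np.array([0.0, 0.2, 1.0])]
query = np.array([0.9, 0.1, 0.0])   # points almost the same way as doc 0
hits = retrieve(query, doc_vecs, docs, k=1)
```

The retrieved chunks are then stuffed into the LLM's prompt, which is the whole "augmented generation" part.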
You'll have to do a lot of evaluation to make sure your RAG pipeline is working well without hallucinations, because hallucinations can break the pipelines further downstream. Moving on to the hot stuff right now: AI agents. What are AI agents should be your first question.
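As a rough sketch before the detailed answer: the toy agent below "perceives" an input, picks a tool with a hard-coded rule, and acts by calling it. The tool name and the rule are invented for illustration; in a real agent, an LLM does the reasoning and tool selection.

```python
def calculator(expr):
    """A 'tool' the agent can call: evaluate a simple arithmetic expression.
    (eval with empty builtins is for illustration only, not production use.)"""
    return eval(expr, {"__builtins__": {}})

TOOLS = {"calculator": calculator}

def agent_step(observation):
    """One perceive-reason-act cycle with a hard-coded 'reasoning' rule."""
    if any(ch.isdigit() for ch in observation):   # perceive + reason: looks like math
        return TOOLS["calculator"](observation)   # act: call the chosen tool
    return "no tool matched"

result = agent_step("2 + 3 * 4")
```

Swap the `if` for an LLM call that chooses among many tools, add memory of past steps, and you have the skeleton the frameworks below build on.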
An AI agent is basically anything that can perceive, reason, and act. You need to understand what tool use is, what function calling is, how you can give an agent a tool and how it uses it, what planning is, what memory is, how you can keep more memory and build a consistent agent that retains previous memories, and how you can build a multi-agent system. Good frameworks for this are LangGraph, LangChain, CAMEL, etc. A few months ago, Anthropic came out with the Model Context Protocol (MCP), which is like a universal USB port you can plug into an AI model: it standardizes everything. You can connect anything with it, whether it's your DB, some API, or code you want to execute. You basically just make an MCP server for it and connect it to your LLM, and your LLM will then be able to use it to fetch or do anything with that information. Before MCP, you might have needed a different API integration for everything; what MCP provides is a middleware between these different apps and your LLMs that standardizes everything. MLOps is the easiest field to enter if you're currently working as a Web2 engineer or basic backend engineer, since here you'll be working with Docker and Kubernetes to deploy trained models, doing experiment tracking, doing inference optimization via tools like vLLM or TensorRT, and monitoring your models for any breakage or changes.

Now, moving on to the most important part of this video: some general advice and takeaways. My general advice would be to stay up to date with this field. The sources you can refer to are Twitter, arXiv, Hugging Face Spaces, and Hugging Face news. The key to keep on learning is to always be curious: learn the next thing, build the next project, and do that thing. In this field, you have to be a rapid developer, adapting to things fast and executing fast. That is what will make you unique and might help you get a job soon. I think the one thing you should take away from this video is: don't just keep learning theory, watching videos, or completing a lecture series.
Instead, start building a project. You'll learn a lot more by building something than by just watching videos or reading blogs. This was a very brief AI roadmap that should give you an overall idea of how things currently work in the AI space. After this, you'll be able to dive into deeper concepts yourself, look at new tech, and understand how it works. You'll become a rapid developer in the field of AI by using AI; you'll know where to find things, what to find, and when to find it. Speaking from my experience, if you're working in core AI, what the industry currently needs is someone who understands the concepts very deeply. Working in applied AI, what I've learned is that you have to be a rapid developer: understand new tech fast, integrate it into your system, and build docs and projects on top of it while using AI, since AI has given a boost to all developers and you have to adapt to it. With this, I'd like to end the video.