YouTube Transcript: Deep learning project end to end | Potato Disease Classification - 4 : FastAPI/tf serving Backend
Summary
Core Theme
This content details the practical implementation of deploying a machine learning model for potato disease detection. It guides viewers through building a FastAPI server to serve a pre-trained TensorFlow model, first directly and then by integrating with TensorFlow Serving for more robust deployment.
What do you want to learn, baby, today? Well, I understand baby Yoda's language: what he's saying is that he wants to learn TF Serving and FastAPI today, and he wants to go through some MLOps concepts.
In the first two videos we trained a model and exported it to disk. In this video we will write a FastAPI server around that model and end up with a working HTTP server that we'll use for deployment in production, and this is how the big companies do deployment. You'll learn a lot of practical, useful tips today, so make sure you watch till the end.
A couple of prerequisites for this video. You need to know about FastAPI; I made a very simple FastAPI tutorial that even a high school student can easily understand, so you need to watch that. The second prerequisite: I made a simple video on TF Serving, covering what the purpose of TF Serving is and how it is useful in MLOps. You need to watch that as well. And obviously this is part three of this project series, so I'm assuming you have watched part one and part two. So let's get started.
In our last video we exported models to a models directory, which is this particular directory, and I'm going to rename it to saved_models because I feel that is a more appropriate name. I'm also going to create an api folder here, and in this api folder we are going to write our FastAPI-based server.
Now let's install some prerequisites, some modules, to write this API server. If you go to my GitHub page for the potato disease classification project, go to the api directory and the requirements.txt file. You can either git clone it, or you can right click here, create requirements.txt, and once you have that file, copy paste the contents into it. requirements.txt is used to list all your Python dependencies. Then you can go to the command prompt and go to the api directory; in the api directory I see the requirements.txt file. Actually, I need to save it first, so I have saved it now, and it has all these requirements. I can simply run pip install -r requirements.txt and it will install all these modules. You could install them individually, but this approach is a little better. So now all our modules are installed.
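The exact contents of requirements.txt aren't shown in the transcript; a plausible minimal sketch for this server (versions unpinned here; the authoritative list is in the codebasics GitHub repo):

    fastapi
    uvicorn
    python-multipart    # required by FastAPI to parse UploadFile form data
    pillow
    tensorflow
    numpy
    requests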
I'm in PyCharm here, in the api folder. I right click and create a file called main.py, and the first thing I'll do is go into zen mode; I will start meditating now. From fastapi I import FastAPI; if you've seen my FastAPI tutorial, these are the bare bones that you need. Here you create an app, which is an instance of FastAPI, and let's write a very simple ping type of routine: you say async def ping, and you can just return "Hello, I am alive". I am writing this routine just to make sure that my server is alive; we can call this ping method and make sure the server is not crashed or stopped. Then you say @app.get, which is how you specify an endpoint, and I will say /ping; you can say hello, hi, whatever. So the barebone FastAPI ping server is ready; let's test it now.
To run this server you can use the uvicorn command from my FastAPI tutorial: you say main, because that's the name of the file (main.py), so main:app, where app is this variable here, plus --reload. You can run it that way, but I'm going to run it a little differently: I will import uvicorn as a module, and in the standard Python way, under if __name__ == "__main__", call uvicorn.run, where you specify your app, your port (let's say 8000), and your host (let's say localhost). Then I can right click and run it.
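For reference, a minimal sketch of main.py at this point, assuming the host and port mentioned (localhost, 8000):

    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/ping")
    async def ping():
        # Liveness check: if this responds, the server isn't crashed or stopped
        return "Hello, I am alive"

    if __name__ == "__main__":
        # Run programmatically instead of via the "uvicorn main:app --reload" CLI
        uvicorn.run(app, host="localhost", port=8000)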
Oh, there is a syntax error; once I fix it I can run it, and the server is ready. Now I can go to my browser and type localhost:8000/ping, and I get this: "Hello, I am alive", which means my server is ready. You can also look at /docs, which gives you all the documentation. I have the ping method there; you can click try it out, then execute, and that is another way of testing your server. See, I just got the same response, "Hello, I am alive", so my basic bare bones are set up.
Now I am going to write my actual predict method, so let me just copy paste here. Instead of GET this will be a POST: you are not trying to get a record, you are doing model prediction, and POST is the appropriate method for that. You will call this endpoint /predict, and this function happens to be called predict as well; by the way, these two names can be different (I could call the function predict_foo), they don't have to be the same.
Now, what will be the argument of this? Let's think about that. This will be a file sent by your mobile application or website; it will be an image of a potato plant leaf. FastAPI provides inbuilt validation if you use UploadFile as the data type, so let's do a quick Google search on what UploadFile really is. If you search "fastapi upload file" you find the documentation, and you can read through it. The idea is that when you use something like this, FastAPI will make sure that whoever calls this predict method has to send an image or a file as the input; if you send, let's say, an integer or a string instead, it's not going to work. So I'm just going to copy paste this here.
Usually in Python you have seen syntax where you just give a variable name; but when you add a colon, that's a type hint, a data type basically. You are saying UploadFile is my data type, and when you write = File(...) it means that is the default value.
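A sketch of the endpoint signature being described, assuming the parameter name file used in the video:

    from fastapi import File, UploadFile

    @app.post("/predict")
    async def predict(
        file: UploadFile = File(...)   # type hint UploadFile; File(...) is the default value
    ):
        ...  # prediction logic is filled in below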
So now, right click, run it, and I will show you what happens: "File is not defined", because I need to import File and UploadFile here. Once I do, I can run it, and now it is running; it is ready.
Let me just stop it here. Actually, I will put a breakpoint at each stage and show you in the debugger how it looks. This is how you put a breakpoint: right click, debug. Now at localhost:8000/docs my documentation shows this, and I can use this web UI to post an actual request; see, FastAPI is telling me that my input has to be a file, so you need to choose a file here. That is one way of testing your API. The other, better way, which I like, is using Postman. Go to Google and search "download postman"; you'll find the website. Download the app for whatever OS you have, and once it is installed you can run Postman.
I have Postman running here, and I'm going to use a POST method. Select POST and type in http://localhost:8000/predict; localhost:8000 because the server is running on port 8000, and I'm calling the predict method. It is expecting file as a parameter, so I will go to Body and select form-data, because when the UI sends data it will send form data, and the key will be file. This key is file because the parameter in the code is also file; if that were, let's say, xyz, then this key would also be xyz. You get the point, right? That's why this is file. Then here you select the file, so I'm attaching, let's say, a late blight image.
Okay, and when I hit Send, since I have the server running in debugger mode it stops at the breakpoint, and you can see the data type of file: in the variables pane it says it's an UploadFile. Beautiful!
FastAPI is good in this sense; if you were writing a Flask server you would have to do a lot of manual validation, so FastAPI is much better. Okay, now I have this UploadFile and I need to convert it to a numpy array or a tensor so that my model can do prediction. So my next step, naturally, is: how do I convert this file into a numpy array? Well, first (let me just stop this here) I need to call file.read(), because that way I get my bytes back. As simple as that. You can read the documentation of this on the FastAPI site.
Now, this is an async routine, so I will call await. The benefit of await is this: let's say 100 callers are calling your server. You have one server running and there are 100 mobile applications, 100 farmers using your mobile app, all sending predict requests and attaching a huge file. Say it takes two seconds to read a file. If I don't have async here, and I don't have await here, then while my first request spends two seconds reading, my other requests will be waiting. But if I use async and await, then while the first request takes two seconds to read the file, this function is put into a suspended mode and my second request can be served. You can go to YouTube and watch some tutorials on asyncio to get an understanding of why we are using async and await. I will put a breakpoint here again; I always run in debug mode. Once it is running I go to Postman, send the request, and you see bytes. I love breakpoints and debugging, because that way I can evolve my program step by step. So now I have all the content of the file read as bytes, and I need to convert these bytes into a numpy array.
Let me write a simple function called read_file_as_image that returns an image; basically, the image is nothing but a numpy array. This is a regular Python function which takes data, which is bytes, as input and returns np.ndarray as output. Here I need to import numpy as np, and then a couple of other things. The data coming in is bytes, and you can use the BytesIO class from Python's io module; let me import that here. When you wrap the bytes in BytesIO like this, you can supply the result to PIL, sorry, the Pillow module, which is the module used to read images in Python. From the Pillow module I have imported the Image class, and when you call Image.open it will read those bytes as a Pillow image. To convert a Pillow image into a numpy array you just say np.array, like this, and that numpy array is what you return as the image.
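A sketch of that helper with the names used in the video:

    from io import BytesIO
    import numpy as np
    from PIL import Image

    def read_file_as_image(data: bytes) -> np.ndarray:
        # Wrap the raw bytes in a file-like object, decode with Pillow,
        # then convert the PIL image into a numpy array
        image = Image.open(BytesIO(data))
        return np.array(image)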
And that's it. So now let's put a breakpoint again: I set it here, click on the bug icon, and send another request. Now let's see, do I have an image? Hooray, party! An ndarray, 256 by 256 (that's the x and y of the image) by 3 for RGB, and each of these values is in the 0 to 255 range. So my ndarray is ready.
Now I need to load my saved TensorFlow model. To load the TensorFlow model, first I need to import tensorflow, and then I will create a global variable, let's call it model, equal to tf.keras.models.load_model. That will load the model; now let's give it the directory path of the model. If you look at my directory structure, I'm here, in api; I have to go to the parent directory, then go to saved_models, and I will load any one model, because these three models are the same. In a practical situation you might have different versions of the model, with different accuracy or different metadata, but I will use this one model. To go to the parent directory you do dot dot, then saved_models, then 1; I'm just using version one. We will also look into using TensorFlow Serving, by which we can dynamically load versions 1, 2, and so on, but right now I'm just keeping things simple.
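A sketch of the model loading just described, together with the class_names list introduced next (the relative path ../saved_models/1 assumes the server runs from inside the api folder, as in the video):

    import tensorflow as tf

    # Load version 1 of the exported model from the sibling saved_models directory
    model = tf.keras.models.load_model("../saved_models/1")

    # Must match the class order used when training the model in the notebook
    class_names = ["Early Blight", "Late Blight", "Healthy"]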
I'm also going to create a variable called class_names, which holds all three class names. These need to be consistent with what you had in your notebook when you were training the model. Now here I will do model.predict(image), but predict doesn't accept a single image; it has to be a batch of images. See, I have this image, 256 by 256 by 3; the predict function doesn't take a single image as input, it takes multiple images, so it has to be a batch. The way you do that is np.expand_dims, giving it image and 0. If you read the documentation of np.expand_dims, what is it doing? It's a simple API, friends, nothing confusing: say you have a one-dimensional array; when you expand it, it just adds one more dimension, that's it, making it two-dimensional, and if the axis is 1 it will add that dimension at the column level. So read the documentation; it is a super duper easy API.
So I will call the result image_batch, and that image_batch goes into predict as the input, and what you get back is predictions.
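In code, that step might look like this:

    # predict() expects a batch: shape (256, 256, 3) becomes (1, 256, 256, 3)
    image_batch = np.expand_dims(image, 0)
    predictions = model.predict(image_batch)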
Okay, I'll stop here, run it again, and see what happens. I clicked on the debug icon; now the model is loading, so it will take some time, have some patience. You need to see a prompt here which says Uvicorn running; only then can you send from Postman. You send it, and see, this also takes time, because prediction is a little bit time consuming. My predictions are 1 by 3, so it's an array, because my batch has only one image, and the second dimension is my actual prediction. If you look at this prediction, it has three values. Let me just show you: the first is 0.29e-09, a very, very small number, 0.00-something; the second is 0.99-something, so almost 1; and the third is again e-06, tiny. I have three classes: the first number corresponds to early blight, the second number corresponds to late blight, and the third is healthy. So it looks like this is late blight, because the second number is the highest value.
So what you have to do here is this: I call the result predictions, and from predictions you need to take the zeroth entry, because it's a batch, and in this batch you have given only one image; so take predictions[0]. The class name will then be whichever class has the maximum value. What np.argmax does is this: see, I had three values, 0.00-something, then 0.99-something, then again 0.00-something. np.argmax looks for the maximum value; the maximum value is this one, and what is its index? The indices are 0, 1, 2, so the index is 1, and it returns 1. If the maximum value were instead over here, then it would return 2. So that's the np.argmax function, and when you give this index to the class_names array it tells you the actual class name. So here I say predicted_class equals class_names indexed by that index. Then the confidence is np.max of predictions[0]; np.max, again, I'll just show you: you have values like this, and what is the maximum value? 0.99. So this function returns 0.99, it is as simple as that, and I store that in a variable called confidence. Then you can just return these two in a simple dictionary. So let's run this and see what happens.
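Putting the pieces together, a minimal sketch of the whole predict endpoint at this stage, using the names built up so far (the dictionary keys "class" and "confidence" are illustrative, not necessarily the exact keys in the repo):

    @app.post("/predict")
    async def predict(file: UploadFile = File(...)):
        # Read the uploaded bytes and decode them into a numpy array
        image = read_file_as_image(await file.read())
        # predict() wants a batch, so add a leading dimension
        image_batch = np.expand_dims(image, 0)
        predictions = model.predict(image_batch)
        predicted_class = class_names[np.argmax(predictions[0])]
        confidence = np.max(predictions[0])
        return {"class": predicted_class, "confidence": float(confidence)}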
I run the server, and whenever it's ready I send from Postman. See: late blight. Now you can easily test it; this one is late blight, okay, now let's test healthy. I attach a healthy leaf, and it says healthy with 98 percent confidence. You can try different images, like early blight for example; let's say this one. See the score: the confidence is very high, and my model is perfectly predicting the accurate class for each of these images.
Now, loading the model from one specific saved version might work okay for demo applications, but in real life, in big corporate companies, you have new versions of these models being built all the time. Let's say version one is the production model and it is stable, but then you come up with version two, a beta model, which you want to try out on your beta users. Now you are running kind of an A/B testing scenario, where you compare the performance of version one with version two and you want to route the traffic dynamically: let's say 10 percent of traffic to beta users and the remaining traffic to your production users. You could do all of that in this code; you could maybe call one the prod model and call, say, version two the beta model, and dynamically use one or the other based on the user type. But there is a better way to handle this situation, and that better way is TF Serving. If you go to YouTube and search "codebasics tf serving" you'll find my video; I highly recommend you watch it, because I'm not going to cover all the concepts behind TF Serving here. Just to give you the gist: in your UI code or client code you can have this type of URL or endpoint for TF Serving, using labels. You don't need to reference a specific version; you can say my TF Serving endpoint label is prod versus beta, and the corresponding config file might look like this (it says "broad" on screen, but it should be "prod", because this one is production). That way prod points to a specific version, versions get loaded dynamically, and version management becomes very, very easy.
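A sketch of the kind of label-based config being described, using TF Serving's version_labels; the model name, base path, and version numbers here are illustrative:

    model_config_list {
      config {
        name: "potatoes_model"
        base_path: "/potato-disease/saved_models"
        model_platform: "tensorflow"
        model_version_policy { specific { versions: 1 versions: 2 } }
        version_labels { key: "prod" value: 1 }
        version_labels { key: "beta" value: 2 }
      }
    }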
If you build a new version, let's say version 3, and want to point beta at it, all you need to do is change this config file. You don't need to change your code; when you change code you have to do a lot of testing, it might break things, it's risky, but if you change a config file it's a little safer. Again, friends, you have to watch that video to get a proper understanding of TF Serving. Now we are going to change our architecture a little bit, and we are going to do this.
What we are going to do is this: our UI (a website, or right now Postman, since we have not built the website yet) will call the FastAPI server. FastAPI will do the numpy conversion and all those things, but for the actual prediction it will call a TF Serving server running on localhost:8501.
To run TF Serving I will start Windows PowerShell, because I like to run it that way, and I will use this command; let me explain it. By the way, if you've seen my TF Serving video you'll have an idea already: you need to install Docker, and you need to get a Docker image for TF Serving. That's again why you should watch the "codebasics tf serving" video; you'll get the full picture of what to install and how the different concepts work. So we are running docker run; we are saying port 8501 on my host system maps to 8501 in my Docker container. My directory, c:\code\potato-disease, is mapped to the /potato-disease directory inside the Docker container, and this is the name of the image, so I'm running tensorflow/serving. My REST API port is 8501 (if you look here, 8501 is where TF Serving is running), and my model config file is models.config, so /potato-disease/models.config.
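The full command scrolls by quickly in the video; a sketch consistent with the narration (host path and config file name are assumptions based on what's described):

    docker run -t --rm -p 8501:8501 `
      -v C:/code/potato-disease:/potato-disease `
      tensorflow/serving `
      --rest_api_port=8501 `
      --model_config_file=/potato-disease/models.config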
Here is how that file looks: I'm saying, okay, serve all the versions which are available. When you do a prediction, by default it will use the latest version, which is version 3, but you can explicitly specify a version as well. This way, if you build a new version 4, it will automatically start using that version. I'm using a very simple models.config file, but you can use the fancier kind of file shown earlier, where you have different version labels for production, beta, and so on.
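A sketch of that simple models.config (model name and base path assumed, matching the Docker volume mount above):

    model_config_list {
      config {
        name: "potatoes_model"
        base_path: "/potato-disease/saved_models"
        model_platform: "tensorflow"
        # Serve every version found under base_path; requests hit the
        # latest (highest-numbered) version by default
        model_version_policy { all {} }
      }
    }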
All right. So now, here I will define the endpoint; let me go into meditation, I love meditation, okay. The endpoint: what is my endpoint? Do you have any guess? See, this is how you would target version two, but I don't really want version two; I want the latest version, so I write it this way, and it will use the latest version. If I build a version 4 it will use version 4; if I build a version 5 it will use version 5. It's all loaded dynamically.
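A sketch of the endpoint variable, following TF Serving's REST predict URL format (the model name potatoes_model is an assumption; 8501 is the REST port from the docker command):

    # Pinned to a specific version (what we do NOT want here):
    # endpoint = "http://localhost:8501/v1/models/potatoes_model/versions/2:predict"

    # No version in the URL: TF Serving routes to the latest loaded version
    endpoint = "http://localhost:8501/v1/models/potatoes_model:predict"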
All right, what did I do? Okay, let me make a new file, actually. I will keep the old one so that you have a reference; that was the direct-model version. I copy it with Ctrl+C, Ctrl+V and call it main-tf-serving.py, and in here I will make those changes: my endpoint is this, and of course I'm not using any specific version.
Now we'll use the requests module, so I import requests to make that HTTP call. Instead of model.predict you use requests.post, and in that post you specify your endpoint, and in json you specify the JSON data. What is my JSON data, by the way? This is how the actual request has to look: if you have seen my TF Serving video, you know that the way TF Serving works is that it expects "instances" as a dictionary key, and there you supply your image batch. You say image_batch.tolist() to convert it to a list; this is the format it expects, basically.
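A sketch of that call, using the endpoint variable from above:

    import requests

    # numpy arrays are not JSON-serializable, so convert the batch to a plain list
    json_data = {"instances": image_batch.tolist()}
    response = requests.post(endpoint, json=json_data)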
And what you get as a result is your response. So let's do just this much: set a breakpoint and once again test everything from Postman. I set a breakpoint here, stop everything, and start debugging. Once the prompt comes up saying localhost is ready, I go to Postman and send a request, and in the response I see 200; an HTTP 200 response means everything is fine, and it has predictions in it.
So you need to use response.json() here; okay, predictions look okay, so our request worked fine. Let's further enhance this code and kind of complete it. response.json() contains the actual response, and in it there is an element called "predictions". I want to look at the zeroth entry: we are supplying a batch of images, but our batch actually has only one image, so the zeroth location is your first result. I will call it prediction, not predictions; singular, because it's the zeroth image. Now, we already saw previously what the two things are that you get if you do np.argmax on the prediction and np.max on the prediction. You have already seen it earlier in this video, right? One gives you the confidence, and the other gives you the predicted class, and that is what you return back as a dictionary. So predicted_class is this one here, and confidence is this guy here.
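A sketch of parsing the TF Serving response (it returns a JSON body like {"predictions": [[p0, p1, p2]]}; wrapping the row in np.array lets np.argmax and np.max work on it, which the video runs into next):

    # Take the first (only) row of the batch and make it a numpy array
    prediction = np.array(response.json()["predictions"][0])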
That's pretty much it; I can just run the server now. My server is running, I send a request, and I get an internal server error. I think I'm realizing the problem: the predictions here. I need to wrap this in np.array, because otherwise I can't call the np.argmax function on it. So I convert it, run it again (stop and run), and set a breakpoint, trying to see where exactly the error occurs. Up to this point it's okay: my response looks good. Let me evaluate this step by step: response.json() also looks good. response.json() is a dictionary, and in that dictionary you take "predictions" (my spelling is most likely okay), then the zeroth prediction, and you do np.argmax on it; I think that should work. Oh, so that worked, okay. Now np.argmax, let's see if that works. Okay, that works, and confidence also worked, okay. So yes, we are giving these two back.
What had happened was that I previously had a session running on port 8501 for my Docker TensorFlow Serving container, so I'm now running it on 8502. I change 8501 to 8502 here, start it, and see, it says it is now running, okay. And here in the endpoint I change my port to 8502 as well. I'm not sure if this is what caused the error, but I'll run it anyway. Okay, that wasn't the cause. Oh, I'm not supplying the file; okay, let me just attach the file.
Now I get the same error; let me try a different file. Okay, I'm realizing the error here: I have to use the class name, actually, because the value was a numpy number, and numpy data types are not supported by FastAPI's response encoding; that's why. So this is the actual class name, and now when I run this, hopefully, let's see if it works... correct! See, it worked: early blight. You can attach an image of a healthy potato plant, send it, and now it says healthy. So things are working just perfectly right now.
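The fix, as a sketch: index into class_names to get a plain Python string, and cast the numpy scalar to a built-in float before returning, since FastAPI's JSON encoder doesn't handle numpy types:

    predicted_class = class_names[np.argmax(prediction)]   # plain str, JSON-serializable
    confidence = float(np.max(prediction))                 # numpy float32 -> built-in float
    return {"class": predicted_class, "confidence": confidence}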
So, just to go over it again: we sent a request using the requests module, it went to localhost:8502 for TF Serving, and that returned us a response, which we return back to the UI. That's all we have for today. In the next video we will build a website in React.js where you can drag and drop a potato plant leaf image, and it will call the FastAPI server that we wrote today for the prediction.
Check the video description below; I have provided all the useful links, including the link to this playlist. By the way, I have the same series available in Hindi, so if you have issues with English you can watch this series in Hindi as well. Thank you.