YouTube Transcript: Deep learning project end to end | Potato Disease Classification - 4 : FastAPI/tf serving Backend
Summary
Core Theme
This content details the practical implementation of deploying a machine learning model for potato disease detection. It guides viewers through building a FastAPI server to serve a pre-trained TensorFlow model, first directly and then by integrating with TensorFlow Serving for more robust deployment.
What do you want to learn, baby, today? Well, I understand baby Yoda's language: what he's saying is that he wants to learn TF Serving and FastAPI today, and he wants to go through some MLOps concepts.
In the first two videos we trained a model and exported it to disk. In this video we will write a FastAPI server around that model and end up with a working HTTP server that we'll use for deployment in production, and this is how the big companies do deployment. You'll learn a lot of practical, useful tips today, so make sure you watch till the end.
A couple of prerequisites for this video. You need to know about FastAPI; I made a very simple FastAPI tutorial that even a high school student can easily understand, so you need to watch that. The second prerequisite: I made a simple video on TF Serving, covering what the purpose of TF Serving is and how it is useful in MLOps. You need to watch that as well. And obviously this is part three of this project series, so I'm assuming you have watched part one and part two. So let's get started.
In our last video we exported models to a models directory, which is this particular directory, and I'm going to rename it to saved_models because I feel that is a more appropriate name. I'm also going to create an api folder here, and in this api folder we are going to write our FastAPI-based server.
Now let's install some prerequisites, some modules, to write this API server. If you go to my GitHub page for the potato disease classification project, go to the api directory and the requirements.txt file. You can either git clone it, or you can right click here, create requirements.txt, and once you have that file, copy paste the contents into it. requirements.txt is used to list all your Python dependencies. Then you can go to the command prompt and go to the api directory; in the api directory I see the requirements.txt file. Actually, I need to save it first, so I have saved it now, and it has all these requirements. I can simply run pip install -r requirements.txt and it will install all these modules. You could install them individually, but this approach is a little better. So now all our modules are installed.
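The exact contents of requirements.txt aren't shown in the transcript; a plausible minimal sketch for this server (versions unpinned here; the authoritative list is in the codebasics GitHub repo):

    fastapi
    uvicorn
    python-multipart    # required by FastAPI to parse UploadFile form data
    pillow
    tensorflow
    numpy
    requests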
I'm in PyCharm here, in the api folder. I right click and create a file called main.py, and the first thing I'll do is go into zen mode; I will start meditating now. From fastapi I import FastAPI; if you've seen my FastAPI tutorial, these are the bare bones that you need. Here you create an app, which is an instance of FastAPI, and let's write a very simple ping type of routine: you say async def ping, and you can just return "Hello, I am alive". I am writing this routine just to make sure that my server is alive; we can call this ping method and make sure the server is not crashed or stopped. Then you say @app.get, which is how you specify an endpoint, and I will say /ping; you can say hello, hi, whatever. So the barebone FastAPI ping server is ready; let's test it now.
To run this server you can use the uvicorn command from my FastAPI tutorial: you say main, because that's the name of the file (main.py), so main:app, where app is this variable here, plus --reload. You can run it that way, but I'm going to run it a little differently: I will import uvicorn as a module, and in the standard Python way, under if __name__ == "__main__", call uvicorn.run, where you specify your app, your port (let's say 8000), and your host (let's say localhost). Then I can right click and run it.
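For reference, a minimal sketch of main.py at this point, assuming the host and port mentioned (localhost, 8000):

    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/ping")
    async def ping():
        # Liveness check: if this responds, the server isn't crashed or stopped
        return "Hello, I am alive"

    if __name__ == "__main__":
        # Run programmatically instead of via the "uvicorn main:app --reload" CLI
        uvicorn.run(app, host="localhost", port=8000)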
Oh, there is a syntax error; once I fix it I can run it, and the server is ready. Now I can go to my browser and type localhost:8000/ping, and I get this: "Hello, I am alive", which means my server is ready. You can also look at /docs, which gives you all the documentation. I have the ping method there; you can click try it out, then execute, and that is another way of testing your server. See, I just got the same response, "Hello, I am alive", so my basic bare bones are set up.
Now I am going to write my actual predict method, so let me just copy paste here. Instead of GET this will be a POST: you are not trying to get a record, you are doing model prediction, and POST is the appropriate method for that. You will call this endpoint /predict, and this function happens to be called predict as well; by the way, these two names can be different (I could call the function predict_foo), they don't have to be the same.
Now, what will be the argument of this? Let's think about that. This will be a file sent by your mobile application or website; it will be an image of a potato plant leaf. FastAPI provides inbuilt validation if you use UploadFile as the data type, so let's do a quick Google search on what UploadFile really is. If you search "fastapi upload file" you find the documentation, and you can read through it. The idea is that when you use something like this, FastAPI will make sure that whoever calls this predict method has to send an image or a file as the input; if you send, let's say, an integer or a string instead, it's not going to work. So I'm just going to copy paste this here.
Usually in Python you have seen syntax where you just give a variable name; but when you add a colon, that's a type hint, a data type basically. You are saying UploadFile is my data type, and when you write = File(...) it means that is the default value.
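A sketch of the endpoint signature being described, assuming the parameter name file used in the video:

    from fastapi import File, UploadFile

    @app.post("/predict")
    async def predict(
        file: UploadFile = File(...)   # type hint UploadFile; File(...) is the default value
    ):
        ...  # prediction logic is filled in below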
So now, right click, run it, and I will show you what happens: "File is not defined", because I need to import File and UploadFile here. Once I do, I can run it, and now it is running; it is ready.
Let me just stop it here. Actually, I will put a breakpoint at each stage and show you in the debugger how it looks. This is how you put a breakpoint: right click, debug. Now at localhost:8000/docs my documentation shows this, and I can use this web UI to post an actual request; see, FastAPI is telling me that my input has to be a file, so you need to choose a file here. That is one way of testing your API. The other, better way, which I like, is using Postman. Go to Google and search "download postman"; you'll find the website. Download the app for whatever OS you have, and once it is installed you can run Postman.
I have Postman running here, and I'm going to use a POST method. Select POST and type in http://localhost:8000/predict; localhost:8000 because the server is running on port 8000, and I'm calling the predict method. It is expecting file as a parameter, so I will go to Body and select form-data, because when the UI sends data it will send form data, and the key will be file. This key is file because the parameter in the code is also file; if that were, let's say, xyz, then this key would also be xyz. You get the point, right? That's why this is file. Then here you select the file, so I'm attaching, let's say, a late blight image.
Okay, and when I hit Send, since I have the server running in debugger mode it stops at the breakpoint, and you can see the data type of file: in the variables pane it says it's an UploadFile. Beautiful!
FastAPI is good in this sense; if you were writing a Flask server you would have to do a lot of manual validation, so FastAPI is much better. Okay, now I have this UploadFile and I need to convert it to a numpy array or a tensor so that my model can do prediction. So my next step, naturally, is: how do I convert this file into a numpy array? Well, first (let me just stop this here) I need to call file.read(), because that way I get my bytes back. As simple as that. You can read the documentation of this on the FastAPI site.
Now, this is an async routine, so I will call await. The benefit of await is this: let's say 100 callers are calling your server. You have one server running and there are 100 mobile applications, 100 farmers using your mobile app, all sending predict requests and attaching a huge file. Say it takes two seconds to read a file. If I don't have async here, and I don't have await here, then while my first request spends two seconds reading, my other requests will be waiting. But if I use async and await, then while the first request takes two seconds to read the file, this function is put into a suspended mode and my second request can be served. You can go to YouTube and watch some tutorials on asyncio to get an understanding of why we are using async and await. I will put a breakpoint here again; I always run in debug mode. Once it is running I go to Postman, send the request, and you see bytes. I love breakpoints and debugging, because that way I can evolve my program step by step. So now I have all the content of the file read as bytes, and I need to convert these bytes into a numpy array.
Let me write a simple function called read_file_as_image that returns an image; basically, the image is nothing but a numpy array. This is a regular Python function which takes data, which is bytes, as input and returns np.ndarray as output. Here I need to import numpy as np, and then a couple of other things. The data coming in is bytes, and you can use the BytesIO class from Python's io module; let me import that here. When you wrap the bytes in BytesIO like this, you can supply the result to PIL, sorry, the Pillow module, which is the module used to read images in Python. From the Pillow module I have imported the Image class, and when you call Image.open it will read those bytes as a Pillow image. To convert a Pillow image into a numpy array you just say np.array, like this, and that numpy array is what you return as the image.
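A sketch of that helper with the names used in the video:

    from io import BytesIO
    import numpy as np
    from PIL import Image

    def read_file_as_image(data: bytes) -> np.ndarray:
        # Wrap the raw bytes in a file-like object, decode with Pillow,
        # then convert the PIL image into a numpy array
        image = Image.open(BytesIO(data))
        return np.array(image)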
And that's it. So now let's put a breakpoint again: I set it here, click on the bug icon, and send another request. Now let's see, do I have an image? Hooray, party! An ndarray, 256 by 256 (that's the x and y of the image) by 3 for RGB, and each of these values is in the 0 to 255 range. So my ndarray is ready.
Now I need to load my saved TensorFlow model. To load the TensorFlow model, first I need to import tensorflow, and then I will create a global variable, let's call it model, equal to tf.keras.models.load_model. That will load the model; now let's give it the directory path of the model. If you look at my directory structure, I'm here, in api; I have to go to the parent directory, then go to saved_models, and I will load any one model, because these three models are the same. In a practical situation you might have different versions of the model, with different accuracy or different metadata, but I will use this one model. To go to the parent directory you do dot dot, then saved_models, then 1; I'm just using version one. We will also look into using TensorFlow Serving, by which we can dynamically load versions 1, 2, and so on, but right now I'm just keeping things simple.
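A sketch of the model loading just described, together with the class_names list introduced next (the relative path ../saved_models/1 assumes the server runs from inside the api folder, as in the video):

    import tensorflow as tf

    # Load version 1 of the exported model from the sibling saved_models directory
    model = tf.keras.models.load_model("../saved_models/1")

    # Must match the class order used when training the model in the notebook
    class_names = ["Early Blight", "Late Blight", "Healthy"]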
I'm also going to create a variable called class_names, which holds all three class names. These need to be consistent with what you had in your notebook when you were training the model. Now here I will do model.predict(image), but predict doesn't accept a single image; it has to be a batch of images. See, I have this image, 256 by 256 by 3; the predict function doesn't take a single image as input, it takes multiple images, so it has to be a batch. The way you do that is np.expand_dims, giving it image and 0. If you read the documentation of np.expand_dims, what is it doing? It's a simple API, friends, nothing confusing: say you have a one-dimensional array; when you expand it, it just adds one more dimension, that's it, making it two-dimensional, and if the axis is 1 it will add that dimension at the column level. So read the documentation; it is a super duper easy API.
So I will call the result image_batch, and that image_batch goes into predict as the input, and what you get back is predictions.
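In code, that step might look like this:

    # predict() expects a batch: shape (256, 256, 3) becomes (1, 256, 256, 3)
    image_batch = np.expand_dims(image, 0)
    predictions = model.predict(image_batch)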
Okay, I'll stop here, run it again, and see what happens. I clicked on the debug icon; now the model is loading, so it will take some time, have some patience. You need to see a prompt here which says Uvicorn running; only then can you send from Postman. You send it, and see, this also takes time, because prediction is a little bit time consuming. My predictions are 1 by 3, so it's an array, because my batch has only one image, and the second dimension is my actual prediction. If you look at this prediction, it has three values. Let me just show you: the first is 0.29e-09, a very, very small number, 0.00-something; the second is 0.99-something, so almost 1; and the third is again e-06, tiny. I have three classes: the first number corresponds to early blight, the second number corresponds to late blight, and the third is healthy. So it looks like this is late blight, because the second number is the highest value.
So what you have to do here is this: I call the result predictions, and from predictions you need to take the zeroth entry, because it's a batch, and in this batch you have given only one image; so take predictions[0]. The class name will then be whichever class has the maximum value. What np.argmax does is this: see, I had three values, 0.00-something, then 0.99-something, then again 0.00-something. np.argmax looks for the maximum value; the maximum value is this one, and what is its index? The indices are 0, 1, 2, so the index is 1, and it returns 1. If the maximum value were instead over here, then it would return 2. So that's the np.argmax function, and when you give this index to the class_names array it tells you the actual class name. So here I say predicted_class equals class_names indexed by that index. Then the confidence is np.max of predictions[0]; np.max, again, I'll just show you: you have values like this, and what is the maximum value? 0.99. So this function returns 0.99, it is as simple as that, and I store that in a variable called confidence. Then you can just return these two in a simple dictionary. So let's run this and see what happens.
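Putting the pieces together, a minimal sketch of the whole predict endpoint at this stage, using the names built up so far (the dictionary keys "class" and "confidence" are illustrative, not necessarily the exact keys in the repo):

    @app.post("/predict")
    async def predict(file: UploadFile = File(...)):
        # Read the uploaded bytes and decode them into a numpy array
        image = read_file_as_image(await file.read())
        # predict() wants a batch, so add a leading dimension
        image_batch = np.expand_dims(image, 0)
        predictions = model.predict(image_batch)
        predicted_class = class_names[np.argmax(predictions[0])]
        confidence = np.max(predictions[0])
        return {"class": predicted_class, "confidence": float(confidence)}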
I run the server, and whenever it's ready I send from Postman. See: late blight. Now you can easily test it; this one is late blight, okay, now let's test healthy. I attach a healthy leaf, and it says healthy with 98 percent confidence. You can try different images, like early blight for example; let's say this one. See the score: the confidence is very high, and my model is perfectly predicting the accurate class for each of these images.
Now, loading the model from one specific saved version might work okay for demo applications, but in real life, in big corporate companies, you have new versions of these models being built all the time. Let's say version one is the production model and it is stable, but then you come up with version two, a beta model, which you want to try out on your beta users. Now you are running kind of an A/B testing scenario, where you compare the performance of version one with version two and you want to route the traffic dynamically: let's say 10 percent of traffic to beta users and the remaining traffic to your production users. You could do all of that in this code; you could maybe call one the prod model and call, say, version two the beta model, and dynamically use one or the other based on the user type. But there is a better way to handle this situation, and that better way is TF Serving. If you go to YouTube and search "codebasics tf serving" you'll find my video; I highly recommend you watch it, because I'm not going to cover all the concepts behind TF Serving here. Just to give you the gist: in your UI code or client code you can have this type of URL or endpoint for TF Serving, using labels. You don't need to reference a specific version; you can say my TF Serving endpoint label is prod versus beta, and the corresponding config file might look like this (it says "broad" on screen, but it should be "prod", because this one is production). That way prod points to a specific version, versions get loaded dynamically, and version management becomes very, very easy.
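A sketch of the kind of label-based config being described, using TF Serving's version_labels; the model name, base path, and version numbers here are illustrative:

    model_config_list {
      config {
        name: "potatoes_model"
        base_path: "/potato-disease/saved_models"
        model_platform: "tensorflow"
        model_version_policy { specific { versions: 1 versions: 2 } }
        version_labels { key: "prod" value: 1 }
        version_labels { key: "beta" value: 2 }
      }
    }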
If you build a new version, let's say version 3, and want to point beta at it, all you need to do is change this config file. You don't need to change your code; when you change code you have to do a lot of testing, it might break things, it's risky, but if you change a config file it's a little safer. Again, friends, you have to watch that video to get a proper understanding of TF Serving. Now we are going to change our architecture a little bit, and we are going to do this.
What we are going to do is this: our UI (a website, or right now Postman, since we have not built the website yet) will call the FastAPI server. FastAPI will do the numpy conversion and all those things, but for the actual prediction it will call a TF Serving server running on localhost:8501.
To run TF Serving I will start Windows PowerShell, because I like to run it that way, and I will use this command; let me explain it. By the way, if you've seen my TF Serving video you'll have an idea already: you need to install Docker, and you need to get a Docker image for TF Serving. That's again why you should watch the "codebasics tf serving" video; you'll get the full picture of what to install and how the different concepts work. So we are running docker run; we are saying port 8501 on my host system maps to 8501 in my Docker container. My directory, c:\code\potato-disease, is mapped to the /potato-disease directory inside the Docker container, and this is the name of the image, so I'm running tensorflow/serving. My REST API port is 8501 (if you look here, 8501 is where TF Serving is running), and my model config file is models.config, so /potato-disease/models.config.
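The full command scrolls by quickly in the video; a sketch consistent with the narration (host path and config file name are assumptions based on what's described):

    docker run -t --rm -p 8501:8501 `
      -v C:/code/potato-disease:/potato-disease `
      tensorflow/serving `
      --rest_api_port=8501 `
      --model_config_file=/potato-disease/models.config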
Here is how that file looks: I'm saying, okay, serve all the versions which are available. When you do a prediction, by default it will use the latest version, which is version 3, but you can explicitly specify a version as well. This way, if you build a new version 4, it will automatically start using that version. I'm using a very simple models.config file, but you can use the fancier kind of file shown earlier, where you have different version labels for production, beta, and so on.
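A sketch of that simple models.config (model name and base path assumed, matching the Docker volume mount above):

    model_config_list {
      config {
        name: "potatoes_model"
        base_path: "/potato-disease/saved_models"
        model_platform: "tensorflow"
        # Serve every version found under base_path; requests hit the
        # latest (highest-numbered) version by default
        model_version_policy { all {} }
      }
    }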
All right. So now, here I will define the endpoint; let me go into meditation, I love meditation, okay. The endpoint: what is my endpoint? Do you have any guess? See, this is how you would target version two, but I don't really want version two; I want the latest version, so I write it this way, and it will use the latest version. If I build a version 4 it will use version 4; if I build a version 5 it will use version 5. It's all loaded dynamically.
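A sketch of the endpoint variable, following TF Serving's REST predict URL format (the model name potatoes_model is an assumption; 8501 is the REST port from the docker command):

    # Pinned to a specific version (what we do NOT want here):
    # endpoint = "http://localhost:8501/v1/models/potatoes_model/versions/2:predict"

    # No version in the URL: TF Serving routes to the latest loaded version
    endpoint = "http://localhost:8501/v1/models/potatoes_model:predict"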
All right, what did I do? Okay, let me make a new file, actually. I will keep the old one so that you have a reference; that was the direct-model version. I copy it with Ctrl+C, Ctrl+V and call it main-tf-serving.py, and in here I will make those changes: my endpoint is this, and of course I'm not using any specific version.
Now we'll use the requests module, so I import requests to make that HTTP call. Instead of model.predict you use requests.post, and in that post you specify your endpoint, and in json you specify the JSON data. What is my JSON data, by the way? This is how the actual request has to look: if you have seen my TF Serving video, you know that the way TF Serving works is that it expects "instances" as a dictionary key, and there you supply your image batch. You say image_batch.tolist() to convert it to a list; this is the format it expects, basically.
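A sketch of that call, using the endpoint variable from above:

    import requests

    # numpy arrays are not JSON-serializable, so convert the batch to a plain list
    json_data = {"instances": image_batch.tolist()}
    response = requests.post(endpoint, json=json_data)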
And what you get as a result is your response. So let's do just this much: set a breakpoint and once again test everything from Postman. I set a breakpoint here, stop everything, and start debugging. Once the prompt comes up saying localhost is ready, I go to Postman and send a request, and in the response I see 200; an HTTP 200 response means everything is fine, and it has predictions in it.
So you need to use response.json() here; okay, predictions look okay, so our request worked fine. Let's further enhance this code and kind of complete it. response.json() contains the actual response, and in it there is an element called "predictions". I want to look at the zeroth entry: we are supplying a batch of images, but our batch actually has only one image, so the zeroth location is your first result. I will call it prediction, not predictions; singular, because it's the zeroth image. Now, we already saw previously what the two things are that you get if you do np.argmax on the prediction and np.max on the prediction. You have already seen it earlier in this video, right? One gives you the confidence, and the other gives you the predicted class, and that is what you return back as a dictionary. So predicted_class is this one here, and confidence is this guy here.
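A sketch of parsing the TF Serving response (it returns a JSON body like {"predictions": [[p0, p1, p2]]}; wrapping the row in np.array lets np.argmax and np.max work on it, which the video runs into next):

    # Take the first (only) row of the batch and make it a numpy array
    prediction = np.array(response.json()["predictions"][0])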
That's pretty much it; I can just run the server now. My server is running, I send a request, and I get an internal server error. I think I'm realizing the problem: the predictions here. I need to wrap this in np.array, because otherwise I can't call the np.argmax function on it. So I convert it, run it again (stop and run), and set a breakpoint, trying to see where exactly the error occurs. Up to this point it's okay: my response looks good. Let me evaluate this step by step: response.json() also looks good. response.json() is a dictionary, and in that dictionary you take "predictions" (my spelling is most likely okay), then the zeroth prediction, and you do np.argmax on it; I think that should work. Oh, so that worked, okay. Now np.argmax, let's see if that works. Okay, that works, and confidence also worked, okay. So yes, we are giving these two back.
What had happened was that I previously had a session running on port 8501 for my Docker TensorFlow Serving container, so I'm now running it on 8502. I change 8501 to 8502 here, start it, and see, it says it is now running, okay. And here in the endpoint I change my port to 8502 as well. I'm not sure if this is what caused the error, but I'll run it anyway. Okay, that wasn't the cause. Oh, I'm not supplying the file; okay, let me just attach the file.
Now I get the same error; let me try a different file. Okay, I'm realizing the error here: I have to use the class name, actually, because the value was a numpy number, and numpy data types are not supported by FastAPI's response encoding; that's why. So this is the actual class name, and now when I run this, hopefully, let's see if it works... correct! See, it worked: early blight. You can attach an image of a healthy potato plant, send it, and now it says healthy. So things are working just perfectly right now.
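The fix, as a sketch: index into class_names to get a plain Python string, and cast the numpy scalar to a built-in float before returning, since FastAPI's JSON encoder doesn't handle numpy types:

    predicted_class = class_names[np.argmax(prediction)]   # plain str, JSON-serializable
    confidence = float(np.max(prediction))                 # numpy float32 -> built-in float
    return {"class": predicted_class, "confidence": confidence}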
So, just to go over it again: we sent a request using the requests module, it went to localhost:8502 for TF Serving, and that returned us a response, which we return back to the UI. That's all we have for today. In the next video we will build a website in React.js where you can drag and drop a potato plant leaf image, and it will call the FastAPI server that we wrote today for the prediction.
Check the video description below; I have provided all the useful links, including the link to this playlist. By the way, I have the same series available in Hindi, so if you have issues with English you can watch this series in Hindi as well. Thank you.