YouTube Transcript:
6.4210 Fall 2023 Lecture 6: Geometric Perception (Pt. 1)

Summary

Core Theme

This content introduces the fundamental concepts of geometric perception for robotics, transitioning from an assumed "perception oracle" to using real-world sensor data (cameras) to locate objects. It focuses on the geometric problem of aligning 3D point clouds to determine an object's pose, laying the groundwork for more advanced perception techniques.

all right welcome back

everybody if anybody's out there so I'm

I'm I could use the ethernet cable if

anybody's back there in the

booth but uh somehow my Wi-Fi is not working

again okay so today we're transitioning

into perception so let me even just as a

quick setup

right last time we built an almost

complete manipulation system if you will

right we

we had our Hardware abstraction our Hardware

simulation and we built on top of

that our differential inverse

kinematics block

which well after some

integration sent iiwa

commands iiwa positions

um the diff IK block needed to know the

current iiwa state so we had an important line

here okay but the whole thing was predicated

on something over here which

just had a gripper velocity trajectory coming

in okay and although

that's that's a complete closed loop

system hidden inside when we designed

the gripper trajectories and the and

then differentiated them to get the

gripper velocities we made a big

assumption which is that we knew where

the red brick was

okay so everything was sort of based on

this someone told me exactly where in

the world I needed to reach to and

that's just not good enough right at

some point we have to now uh use our

sensors to figure out where the brick is

to do the work so so far we assumed a

perception Oracle right someone you

could just query and tell us where the

the object was in the world

frame and we were using some of the the

the cheat ports like we could tell uh

the body pose directly from of any

object in the world right in our

Hardware station and so the goal today

is to stop using those cheat ports and

start using cameras instead and close

the simplest loop we can around that whole

system that could actually be run

on a robot

so I'm going to download

a ridiculously big MeshCat animation

over my over my phone's uh 5G and burn

my data plan for the month but hey here we

go it's ridiculously big just because

the mesh the

the the YCB objects if you know what

these are these this is a mustard bottle

from the YCB objects okay

database that's a little funny here

okay the meshes or the sorry the

material files the texture maps for

these files are like 50

megabytes so it takes a minute to

download okay this is roughly what we're

doing almost the same as what we did

last time the big difference is that

we're going to have cameras now in the

scene these are

cameras I brought some I don't always

have cameras in my pocket

but today I do okay yeah they're like

this okay this is the D415 this is the

D435 RealSense cameras okay super

you just USB plug them in um you're good

to go they're amazing depth cameras I'm

going to tell you a little bit about

them today okay what what's going on

here but that's what you see in the

scene are these depth cameras sprinkled

throughout a bunch of them actually we

put three on each bin you'll understand

why okay and then mostly this is just

the same kind of thing um we did before we figure

out where the mustard bottle is that's

the the role of perception we then we do

our standard gripper trajectory and make

our move from point A to point B now what

the heck is this thing that's getting

left behind Okay that thing is a point

Cloud which is computed by reading the

cameras at Time Zero okay and doing some

initial processing on the data that you

get back from those cameras in order to

make the decision about where to grasp

okay so it's I at the moment of

perception I went ahead and um you know

put those in okay I have observations

like this and then I have a model of

what those I expect those observations

to be and they end up matching okay and

we we'll do all that in by the end of

the lecture okay so we going to think in

again all right so like I said today is

the first day of perception we're going

to do a lot of perception throughout the

course today um is kind of the more

geometric view of perception we're not

just learning a deep Network that goes

from image to um to whatever

representation we want we're going to

start by doing the geometry version

now even though going from image with

through a deep Network directly is I

think by all accounts the best way to do

things today um I there's still a lot of

the a lot of perception tools that have

baked in them whether they neural

network or not some of the fundamentals of

geometry it focus it it builds

beautifully on what we did last time I

think on in the kinematics okay and if

you're if you're interested in uh ideas

like neural Radiance fields or um

putting baking 3D priors GE geometric

priors into your neural networks and

everything like this that this is going

to be the foundation you need to do that

kind of

work so today will do it's a slightly

old school version of perception but

it's the foundation okay so we'll do

yes we will we'll tell talk about where

NeRF comes into the stack later

sense okay so um everybody knows about

the Deep learning Revolution I think a

few less people realize how much of a

geometry Revolution we've had at the

same time okay and it's it was powered

partly by Deep neural networks but even

before that it was powered by I don't

know autonomous driving companies really

caring about where the pedestrians were

in space people building a lot better

sensors I mean virtual reality and

augmented reality were a big driver too

okay and we started getting pretty

incredible things that had no neural

networks involved okay this is even uh I

don't know eight years ago or something

like this Dynamic Fusion where we could

we started having algorithms that could

run in real time and build instantaneous

3D reconstructions of the perceptions

they were seeing and track you know

people moving around building sort of

these beautiful 3D models and that was a

culmination of algorithmic work of new

sensors um and in particular the sensors

were not only higher quality but they were

faster faster to the point where if the

world doesn't change too much between

each frame then you can write a simpler

algorithm to do tracking and

reconstruction okay so there was just

this massive revolution in Geometry

that's happened sort of alongside uh the

Deep learning Revolution and

interestingly they've come together so

now you know there's people baking in

geometric priors into neural networks and the

like okay so it started with um sensors

that we're thinking you know maybe

autonomous driving related so you see a

Velodyne actually Spot has a Velodyne you

can stick on top of it it's still you

know relevant sensor today these are

lidars lasers being shot out and bouncing

back and estimating the distance from

each point of light to the to the Target

okay and some of these reconstructions

are just absolutely amazing you'll see

an autonomous car driving through you

know City street and hundreds of yards

into the future you can see like a cat

walking around a bush or something like

that just incredible what kind of

resolution and range that Those sensors

um have that's like the Luminar in

particular 500 meter range crazy okay

indoors though you'll tend to see a

different type of of laser scanner these

Hokuyos were very very popular are

still very very popular for uh sort of

indoor navigation where the lighting

conditions are not as severe the

payloads can be a lot smaller the energy

budget is maybe smaller okay uh but lidar

is definitely one of the tools that sort

of enabled this sort of

Revolution but alongside that were

cameras that um didn't only return depth

you know lists of numbers that are just

the depth but coupled RGB red green blue

you know color images with depth images

and you you get those in a handful of

different ways some of them are um

actually just using stereo processing of

the of a of the depth of the RGB images

so if you take an image with your right

eye and an image with your left eye and

you know the relative position of your

eyes you can do some quick stereo

matching that says well this block over

here looks a lot like the blocks on over

here and therefore the depth of that

those pixels must be at a certain range

okay um people do that now on FPGAs for

instance on Specialized Hardware so that

you can package it nicely into a into a

block that basically is just putting

both an image and a depth this is the

the Carnegie MultiSense is actually the

head that we carried around in Atlas the

entire time uh you'll see Bumblebee Point

Grey's Bumblebee there's a lot of

systems that are doing like that okay

let me go through the the first round

first okay um another line of of the of

tools like this the Kinect sensor was a

big deal when that came out not only

because it worked very well but because

it was so inexpensive right so so

somehow the home entertainment uh gaming

World kind of revolutionized robotics by

building a sensor that we needed okay

that was fantastic uh the the first

version of Kinect worked with um

structured light so it's actually about

as simple I remember I mean this was an

idea for for decades and decades okay

but it became practical with the

Microsoft Kinect where you actually

just project patterns the patterns will

deform on the object you can back out

from the geometry something about the

depth exgon there's a bunch of different

Hardware manufacturers that that built

uh structured light based depth cameras

the ones that I showed you here like the

415 we'll talk the most about is

actually uh projected texture stereo

okay so the problem with just taking two

camera images is if you're looking at a

white wall for instance there's nothing

to compare and contrast between the two

images okay so but if if you put if you

project in infrared for instance that

you know something that will just puts a

little bit of a pattern then you can

even in low contrast situations you can

get depth to come back out okay one of

the reasons that's nice compared to some

of the other cam options like the time

of flight I'll show you next um is that

these these uh cameras can work even if

they're looking at the same scene and

they won't interfere with each other

right so sometimes some of these cameras

if they're sending out pings or

something like this they can actually

actively interfere with each other

unless you synchronize them very very

carefully uh one of the projected

texture you can just point them and forget

okay um and then these days uh there

really is a a massive movement that was

can just go straight from RGB okay so a

lot of times even if you only have a

cell phone camera and you don't have it

actually the cell phone cameras have

depth sensors a lot of them TrueDepth

on the iPhone for instance okay a

lot of times you can actually build

beautiful 3D models even from a single

camera you don't even need the stereo

pair there's there's enough cues of

course from movement and other things

but there's also uh there's additional

information that can be available if

you've you know read everything on the

internet that's not what's happening in

this particular one this is a neural

Radiance field we'll talk about later

but just to say this was one of the

first videos of the neural radiance field

showing that just from taking camera

images you could stitch together into

something that could project and

generate new images from new uh viewing

angles and there's also lines of work in

monocular depth that for instance will

just have uh you know a very clever

thing to do for instance is to drive a

car around with two cameras on it and

then just use the second camera to to

you know use stereo matching from there

but just learn a model from the single

camera to the depth

and then you know when it's time to make

a 100 million of them because you're

going to make it to a product you just

take away that extra camera and just use

the Learned mapping from image to depth

right so monocular depth is a a

surprisingly effective technology

now okay now is a good time if you have

I don't actually know what's in the iPad

so the I know that there's a TrueDepth

sensor here if you look you know at the

back of these there's they're projecting

something here and they have the ability

to do uh you you do have a depth camera

on your iPhone I don't I don't have an

okay so the question was the second

question was about the lidar on Spot

so typically the Velodynes on a car

or on Spot I didn't the Spot I brought

did not have a Velodyne on it it just had

the surround cameras but yes typically

those are scanning lidars and you have to

do some careful work to think about

blurring from Fast motions and timing it

true I think I think spot's pretty good

yes so um so the question is um you know

I mentioned FPGA and the Carnegie head

uh yes I I think that basically um if

you're doing block matching stereo

that's the simplest algorithm and it's

they're doing more than that in those

heads but that's I think the the essence

of the of the hardness in the

computation is that you're doing a

relatively simple computation taking

like an 8 by 8 block of pixels comparing

it to another 8 by 8 block of pixels but

you have to do that for all possible

pairs in a row for instance like this so

it's a it's a very trivially parallelized

operation and in order to get operating

at full frame rates I mean computers

have gotten faster and faster but it's

just a beautiful solution for for

specialized Hardware yes
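The block matching computation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not what any particular sensor runs: it assumes rectified grayscale images stored as NumPy arrays, scores candidates with a sum of absolute differences, and uses made-up names (`block_match_row`, `disparity_to_depth`) and parameter defaults.

```python
import numpy as np

def block_match_row(left, right, row, col, block=8, max_disparity=32):
    """Return the disparity (in pixels) that best matches one block.

    Compares the block x block patch at (row, col) in the left image
    against patches shifted left by 0..max_disparity in the right image,
    scoring each candidate with the sum of absolute differences (SAD).
    """
    patch = left[row:row + block, col:col + block].astype(np.float64)
    best_disparity, best_cost = 0, np.inf
    for d in range(max_disparity + 1):
        c = col - d
        if c < 0:
            break  # candidate patch would start left of the image edge
        candidate = right[row:row + block, c:c + block].astype(np.float64)
        cost = np.abs(patch - candidate).sum()
        if cost < best_cost:
            best_disparity, best_cost = d, cost
    return best_disparity

def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    return focal_length_px * baseline_m / disparity
```

Every block can be scored independently of every other block, which is why this maps so naturally onto specialized parallel hardware. On a textureless surface such as a white wall, many candidates have nearly the same cost, which is the failure that projected texture is meant to address.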

yeah depends on the technology so um so

like NeRF out of the box the neural

Radiance Fields uh don't have scale

unless you do some work a priori to to

tell them about the relative poses so

you can run a a different geometry

processing algorithm first most of the

projected um texture or or anything

that's projecting light actively does

have absolute scale they do have maximum

range and minimum range so a lot of them

um are actually it's kind of it can be

frustrating to use them you know you put

a beautiful sensor on your wrist and

then you realize that the minimum range

of like the D415 is 0.3 m or something like

that so you know the last when you're

when you're getting close to the object

that camera becomes blind uh to the in

okay alongside this you know many

sources of of camera input we also have

a um a lot of different ways to store

that data when it comes in lots of

representations

a bit like we talked about

representations of

rotation there's a there's just a

handful of of different formats for

instance and um some of them are good

for some kind type of computation and

some are good for the other and you can

convert you should expect to convert

back and forth between them and figure

out which one's best for particular sources

so the image the the image that you get

directly out of a depth camera that has

RGB you think of that as an RGB plus D

image so it'd have four channels the

first three are color values the last

one is just the depth and at every pixel

that's kind of the output of these

cameras by default okay so that's you

know one depth per pixel

right we're going to uh take those RGB-D

images which is a perfectly good

representation and convert them into point clouds

okay so while this is a you know 4

by uh size of the image representation a point cloud is a list

of points in

3D possibly annotated with color values or other

attributes you compare that to some of

the ones you might have seen from a

graphics software graphics course you'll see triangle

meshes or there's volumetric meshes

triangle meshes are surface

meshes and you can you can have

volumetric meshes in addition to surface

meshes you can think about signed distance

functions as a representation and increasingly now people

are choosing to store those

NeRF the neural radiance

fields I was mentioning before

is almost a signed distance function we'll

get to the nuances of that when we get

closer okay but this would be I mean

we're going to we're going to go into

each of these when it becomes most

relevant but I want to just sort of get

the the landscape out here first okay

you can also

grids okay there's lots of different

ways to represent

3D uh data like this okay some of the

algorithms you know will will really

more Naturally Fit with one versus the

other and most of the time you can go

oh no I I think by default the depth

channel the depth always will give you

some will have some minimum range some

maximum range and some resolution in

inside that range of course but you

should think of it as an image that has

for every pixel a depth specified so in

yeah yep yep you should think of every

yes um so so there's a lot of algorithms

in fact the one we we'll talk about

today doesn't use color to start

okay but you can potentially do better

values some algorithms will only use the

D part in fact I would say many of the

algorithms before deep learning really

came in would would have only very

limited use of the RGB values because

RGB I mean computer vision is hard let

me let just let's just take a second to

think to remember why computer vision is

hard right so if I take two slightly

different images okay of of of of a

similar scene if the lighting changes at

all right the color values are going to

go are going to be wildly different for

pixels that correspond to the same point

in in real space or if I take the same

object and put it in two different rooms

right the same point on the same object

is going to come up with has very

different color values okay but having

said that if you were to um I remember

the the time when I was I was with my

students and we were you know enjoying

how well RGB techniques were starting to

work and I said okay today if you were

to pick if if someone could only give

you depth or only give you RGB what

would you pick and nowadays it's RGB all

the way RGB is so much more informative

there's so many things that you can't

see through a depth camera that you can

see in RGB and humans of course are very

very good at

that so so I think if you have a method

that's only limited using depth it's

it's probably limited it's probably not

the state of the art

okay all right so um let's dig in to

sort of this part of the pipeline first

go from RGB to to point cloud and start

seeing the connections between the

geometry of of these camera

representations and the geometry of

spatial transforms and the like and how

do we write optimization problems over

so this

but I could you could argue that

perception is just a hard kinematics

problem at least the problem the first

version of perception we're going to do

today okay it's certainly a controls

problem but uh but even uh even before

that we're going to think of perception

today as a kinematics

problem what do I mean by that okay so

let me say I've got an

object in space okay so I'll do 2D

objects here

because my artistic abilities are

limiting in that way okay so let's say I

have an object in space and it's got

some coordinate system some canonical

my coordinate frame O and this will be the

X and this will be the y axis

okay and I'm going to

say the first thing I want to do is um

represent this geometry I could

represent it as a series of of bases for

instance that would be most similar to

the triangle mesh in 2D it would just be

line segments okay but but instead I'm

going to use a Point Cloud

representation of that object okay so I

want to represent this object with a

boundary they're going to be points in

in now for my example here a 2d

space and they're going to be written in

the object frame okay so I'll call those points

sort of my model of the object

and I'll write them as

um points right p for point here

m_i if I have model point i here and I'll

say that my model exists in the object's

frame okay position of the

frame okay now I have a camera that's

kicking out some other points hopefully

they're relatively similar okay similar

the space but maybe I have something

that is coming

out like this

okay by the

way you rarely get all the points from a

camera but we'll assume that for just a

points s_i for scene

and what I get out of my camera is points in the camera

frame and let's say I took great care

when I mounted my camera so maybe we can

say that the location of the camera is

known if it's bolted to your hand or

something like that that could be also

just a forward kinematics problem to

figure out where the location of the

yeah it can yeah absolutely if it's if

it's bolted to the robot then you would

you would definitely as a function of

the joint positions right so far this is

just a pose not a rotation or no not a velocity

out the transform of the object in the

world right you have a lot of the pieces

we have points in the world we have

camera points in the world model points

and scen points and then we have the the

yeah great great yes so um so the point

Cloud res so that's a really good

question so the question was you know

the camera I think of a camera image as

giving me points in a 2d picture right

but if I have a depth Channel inside

that then it's the first step that I

didn't I should have said is I'm going

to take those 2D points in in the the

camera and I'm going to project them

into a 3D point right by just applying

my my in in my camera frame that's easy

I just say that it's at at some depth in

the in the immediate frame now there's a

couple steps that go invol that are

involved in that so first of all there's

like intrinsics in the camera you have

to take out any Distortion from the lens

or something like this but this is

something we know a lot about it's still

a pain but it's uh but it's something we

know a lot about and then the other

thing is the extrinsics which is to take

those points in the camera frame and

bring it into the world frame and that

would be this X_WC that's the camera extrinsics
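The two steps just described (use the intrinsics to lift each depth pixel into the camera frame, then use the extrinsics to bring it into the world frame) might look like the sketch below. It assumes an ideal pinhole camera with distortion already removed, depth measured along the camera z-axis in meters, and the camera pose given as a rotation `R_WC` and position `p_WC`; the function name and argument layout are illustrative, not the course's API.

```python
import numpy as np

def depth_to_world_points(depth, fx, fy, cx, cy, R_WC, p_WC):
    """Back-project a depth image into 3D points in the world frame.

    depth: (H, W) array of depths along the camera z-axis, in meters;
           non-finite or non-positive entries are treated as missing.
    fx, fy, cx, cy: pinhole intrinsics in pixels (lens distortion is
           assumed to have been removed already).
    R_WC, p_WC: rotation (3x3) and position (3,) of camera frame C in W.
    Returns an (N, 3) array of points expressed in W.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel column, row
    valid = np.isfinite(depth) & (depth > 0)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    p_C = np.stack([x, y, z], axis=1)   # points in the camera frame
    return p_C @ R_WC.T + p_WC          # apply the extrinsics X_WC
```

If the camera is bolted to the robot, `R_WC` and `p_WC` would come from forward kinematics at the current joint positions rather than from a fixed calibration.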

okay good but all those are are quite

doable to go from the 2D picture with a

ahead so um let's say this is exactly

the right the same object right my model

my model was perfect right so um and

that my sensor had zero noise okay

there's still things that can get in the way

right which is that like those points

might have been sampled in different

places along the you know there's

there's reasons why that's almost never

going to be perfect but as a toy problem

to start I'm actually going to say let's

consider the case where we've just taken

the model points and translated it

someone translated them through an

unknown transform and we're going to try

to get that back okay that's the easy

case and we'll look at the harder case

where there's noise and and outliers and

other things in between great

question okay there's another thing that

we have to assume to to well that we

will assume to get started

okay these are all just yellow dots okay

and these are all just yellow dots and

your incredible brain knows how to map

the you know knows that this yellow dot

probably corresponds to that yellow

dot okay but if it's just a list of

numbers on the computer that mapping the

correspondence it's called between those

points and this points is not given and

in general it has to be acquired by some

sort of logic right you have to figure

out which of these if you just have a

pile of points over here and a pile of

points over here figuring out those

correspondences is a massive part of the

problem okay but let's just start by

assuming that someone said that the i-th

point here matches the i-th point over here

and we'll solve the second part of that

yeah so so if let's say I had a CAD

model I could take the these points

directly from the CAD model so if you if

that's a helpful that's why I'm using

the word model sometimes the way you get

it is you you know put your your object

down in a nice situation and you get one

scan and then you use that as your model

for finding it in other things okay but

but think of this as like you've got a

CAD model and then this is the real

object out in the world that I got returns

scene
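To make the toy setup concrete, here is a minimal sketch that builds model points in the object frame and matching scene points in the world frame for a 2D object. Everything specific in it is an assumption for illustration: the outline shape, the pose values, and the name `make_toy_scene` are made up. Scene point i is model point i moved by one rigid transform, so the correspondences are known by construction and there is no noise.

```python
import numpy as np

def make_toy_scene(num_points=20, theta=0.6, p_O=(1.0, 2.0), seed=0):
    """Build model points in frame O and matching scene points in frame W.

    Model points are sampled on the boundary of a made-up 2D shape.
    Scene point i is model point i moved by a rotation of `theta`
    radians and a translation `p_O`, so correspondences are the identity.
    """
    rng = np.random.default_rng(seed)
    angles = np.sort(rng.uniform(0.0, 2.0 * np.pi, num_points))
    radius = 1.0 + 0.3 * np.cos(3.0 * angles)        # non-circular outline
    p_Om = np.stack([radius * np.cos(angles),
                     radius * np.sin(angles)], axis=1)   # (N, 2), in O
    R_WO = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
    p_Ws = p_Om @ R_WO.T + np.asarray(p_O)           # (N, 2), in W
    return p_Om, p_Ws, R_WO
```

The perception problem is then to recover `theta` and `p_O` from `p_Om` and `p_Ws` alone. Shuffling the rows of `p_Ws` would remove the known correspondences and turn this into the harder version of the problem.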

okay so step one we

will assume known

correspondences sometimes the word I

feel like I forget to Define it but just

the mapping I would have defined it in

symbols in a minute but the mapping from

those points to these points are the correspondences

right

okay so in that case you can sort of

see that we have a nice little

kinematics problem an optimization

that the i-th point um of the model should

correspond with um

the oops

X^O applied to ^O p^{m_i} that's the model in object frame

mapped to the world frame should correspond

to ^W p^{s_i} in the simple exactly one to one

correspondence problem these are the

obvious kinematic equations okay and the pose is

unknown so the question becomes how do I

extract the pose given a list of you know

correspondences now if we dig in just a

little bit to the pose representation

you remember from our spatial

transforms that uh

doing this is equivalent to both the

okay I'm leaving off the W for my shorthand

but the W is

everywhere okay so really if the

unknowns are both the translation inside

there and the rotation the rotation

you'll remember we have lots of choices

about how to represent it this this is

always three numbers basically in 3D in

2D would be the two numbers here we have

many different ways we could possibly

represent the 3D

rotations but somehow we need to search over

these okay and that's the the

question so solving this is is actually

an inverse kinematics problem right

we're trying to figure out given the um

given the uh the data the points in

space we're trying to back out the

positions and orientations so this is an inverse kinematics

problem now let's stop and think for a

second here so last time we didn't

actually do inverse kinematics we did

differential inverse kinematics I said

inverse kinematics is

harder we're going to defer that to

later but this time we're going to go

directly I haven't written any

differential kinematics yet I've just

written a kinematics problem so this is

like you know the the positions and

orientations inside here is like our

generalized coordinates that we're

trying to back

out so what is the difference why am I

why did I advocate immediately diff IK

for moving the arm around but I'm saying

we're going to have to solve the full

yeah I I think that's right so he says

you don't have the ground truth right so

effectively the the magic that happens

in the in the robot case is that you

know the initial conditions and you want

to change those initial conditions so

you have an initial q and you're making

small changes to that so we have a

place to linearize around okay in

perception you know at least once you

have to wake up and figure out where the

objects are right you don't have an

initial unless someone gives you a good

initial guess then you could be in the

land of differential kinematics inverse

kinematics okay but we're saying you

know at least once you have to figure

out the hard problem find the needle in

the haystack potentially okay now once

you solve that once I actually would

Advocate differential inverse kinematics

if you wanted to track for instance if

you want to do real-time tracking then

by all means you should be thinking

about gradients and and the like okay

but the onetime problem is a little bit

it must be solved I guess in the

perception case fortunately this is not

some complicated chain of of equations

that can lead to lots of uh of

nonlinearities and local Minima and

stuff this is about the simplest inverse

kinematics problem we could have to

solve and we're going to see it has

beautiful structure and good solutions okay
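Written out, the inverse kinematics problem described in this part of the lecture looks roughly like the following. The notation is an assumption chosen to match the spoken description: the frame a quantity is expressed in sits at the upper left, O is the object frame, W is the world frame, and W is dropped as shorthand in the second line.

```latex
% Model point i (given in object frame O), mapped into the world frame W,
% should land on scene point i (already expressed in W):
\[
  {}^{W}X^{O}\; {}^{O}p^{m_i} = {}^{W}p^{s_i}, \qquad i = 1, \dots, N.
\]
% Splitting the unknown pose into a translation and a rotation
% (W omitted as shorthand):
\[
  p^{O} + R^{O}\; {}^{O}p^{m_i} = p^{s_i}.
\]
```

The unknowns are the translation and the rotation of the object. The model points and scene points are data.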

okay so let's start um we get to pick a

rotation representation now the the

derivation I'll do would go through fine

for at least for

quaternions and we're going to do rotation

matrices here just because I think it's

a little bit it's obviously a linear

equation in a 3X3 Matrix here so it's

just a little easier on the board um so

so let's let's say even though this is

my spatial algebra rotation we're going

to represent this with 3x3 matrices for

for the purpose of

question so um you know I've got a a mug

here at least once I'm going to say you

know I'm going to build a CAD model or

whatever I'm going to pick an origin of

my cad system and I'm going to just

Define an object relevant model of this

so I think that is the natural like if

you think about what you would have to

do to to build a model and write some

coordinate system the natural thing

would be to attach your coordinate

system to the object

itself okay and then the question of

then is of where it is in space becomes

what's where is that object in space

that's the second transform I think

yep yep so

uh pick the bottom corner you know it's

it's just like if you were in SolidWorks

and you were you know you have to

just start drawing lines you got to pick

a zero somewhere okay and all of your

points are sort of relative to that zero

it doesn't matter if you put it in the

middle it doesn't matter if you put it

in the bottom corner but it's just the

frame of reference that you're going to

define those points relative to each

other thank you that's a good

question okay so so now to solve this

problem we want to back out those 9 + 3

numbers 12 numbers in order to make

those equations match okay little

Annoying that I picked the divider right

up okay that's the

all do you see even though it's it

doesn't look quite in the normal way but

do you see that's just a

linear equation right so this is you

should see that this is like Ax is

approximately equal to b right except in

this case the

X would be

my three you know three

positions and then the the

nine rotation coordinates all stacked in

one line I just flatten those into one

vector the A has a bunch of stuff about

the p^{m_i} in here flipped around a little

bit and shifted okay and this has got

the p^{s_i} over

here inside that okay but it really is

just you could just rewrite that you

know if I flatten that out into a big A

matrix that's your data matrix big b

matrix okay and you're trying to solve a

least squares problem to back that

yes um so this is the scene points you

know in the I guess in the camera it

also is going to have inside it the

X_WC which is I guess

given right in this bright pink thing on the

top that's the that's the right hand

side yeah and then the rest of it is

decision variables which are just

PM so if I just tried to solve ax equal

to B if I just tried to take an a

inverse what's wrong with that

right if you have any number of so you

need a some number of points we'll ask

you on the problem set exactly how many points

you need for that to be well defined

okay you need some number of points for

this to just to have a unique solution

okay but most of the time you're going

to be in a situation where you have many

points many more than 12 points let's

say well you know if each point

contributes three things so you know

many more than four points hopefully if

you're at the fourpoint room you know

find another camera or something right

okay so solving that with exact

equality would be very brittle for all

the reasons we brought up a minute ago

right if there's any noise if you

sampled slightly different points on

the surface just because of

the position of your camera or something

like that that's not going to be a good

way to go so we're going to solve this

that wow it used to be like I had

options one through 32 but now it

actually just says chalkboard and Center

light turn on that's good they just

upgraded that this year okay that was

way too easy I should have done that before

sorry I'm using this

as an abstraction so

when I see this I think of a you know a

standard linear algebra problem where

typically in linear algebra when

people write this they will use an A and b

matrix and what I was trying to convey

is that the problem we have with

different variables that are you know

rooted in geometry could be interpreted

as our standard Ax b where the A matrix

is populated with the data from $p^m$ okay

and the b matrix is populated from

yeah that's exactly the yes so you got

it right so the reason I said almost

he says is this going to give you a

valid rotation matrix right so what I

really want to write in an optimization

world is I want to

minimize the difference between the

right and the left hand side this is

basically my Ax minus b okay but I'll

say

$\min_{p, R} \sum_i \| p + R\,p^{m_i} - p^{s_i} \|^2$

where $p^{s_i}$ is already

rotated into the world frame

okay sum over i I'm going to minimize

this over $p$ and $R$

that's almost what I want to say okay

that's like the least squares version of

that fitting but there's one important

detail which is that

you can't pick arbitrary nine

numbers and get a valid rotation matrix

okay so what I'll do is I'll write in

here $R$ has to be part of

$SO(3)$ as a constraint this is the special

orthogonal group 3 okay that's

the set of valid rotation

matrices one way to write that is with

additional constraints on the optimization

problem so I could write this you

know in the shorthand but in

order to implement that what I would

actually do is

write the things that

define a valid rotation matrix first of

all $R^T$ has to be $R^{-1}$

that's one constraint that makes

a valid rotation matrix and the other

one is that the determinant of $R$ is plus

one so this is the optimization problem

yes that's a good question

so it turns out that this constraint

by itself

ensures that the determinant is either plus

one or minus one but the minus one case

would be called

an improper rotation which is a rotation

plus a reflection so if you want to stay

with the proper rotations you need the

extra constraint that's a good question

in practice we often will drop this

and then check for reflections

afterwards and flip it back because this

general in fact if you think about

this so this we said in the

decision variables the inside of this is a

linear

function which means the square is a

quadratic function so you should be

thinking I've got you know a quadratic

bowl like this that's a good case what

are these in terms of the constraints I

told you quadratic programming is

beautiful but it's defined

when you have linear constraints are

these constraints linear right this is

not a linear

constraint it's also a

quadratic constraint the way you could

see that is you multiply both sides

you say $R R^T = I$

would be an

equivalent writing of that and so the

entries of that product

have to be zero or one to make

all of the nine elements match okay but

each of those entries in

this matrix is quadratic in the

original decision variables okay and this

other constraint the determinant turns out to

be cubic for 3x3 matrices for 2x2

matrices it's actually

quadratic again okay but

that can be in general

cubic so those are less good but it

turns out that you know we have

really really good solutions to this one

this is like one potentially

ugly hard class of optimization problems

where this one we just nail it okay
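
The two conditions can be checked numerically. This is a small sketch assuming NumPy; the helper names are hypothetical, and negating one column is only one simple way to "flip back" a reflection, not necessarily the flip used in the course code.

```python
import numpy as np

def is_orthogonal(R, tol=1e-9):
    """R^T R = I, the quadratic constraint (equivalent to R^T = R^{-1})."""
    return np.allclose(R.T @ R, np.eye(R.shape[0]), atol=tol)

def is_proper_rotation(R, tol=1e-9):
    """Orthogonal and det(R) = +1 (no reflection)."""
    return is_orthogonal(R, tol) and abs(np.linalg.det(R) - 1.0) < tol

def flip_if_reflection(R):
    """If det(R) is negative, negate the last column so det becomes positive.

    Assumption: R is already orthogonal, so this keeps R^T R = I.
    """
    if np.linalg.det(R) < 0:
        R = R.copy()
        R[:, -1] *= -1.0
    return R

theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
improper = Rz @ np.diag([1.0, 1.0, -1.0])   # rotation plus a reflection
```

Here `improper` passes the orthogonality check but has determinant minus one, which is the case the extra determinant constraint rules out.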

let's actually do the 2D

example I think it's

useful to understand the optimization

landscape right what are we trying to

solve okay so let's say we're going to

do the 2x2 version of it so in that

case a 2x2 rotation matrix you'll

often see it as

$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

we could try to search directly for

$\theta$ but to keep our analogy to the 3D

case what I'm going to instead do is I'm

going to just name a

variable I'll call it

$a$ for $\cos\theta$

and $b$ for $\sin\theta$

and I'll parameterize my

matrix as $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$

and I'm going to search over $a$ and $b$

this is a trick that you can't quite do

in 3D I would have had

to I could have done a b c d for

instance like

this it just happens that in 2D I know

there's not enough degrees of freedom I

know that I can solve away c and d so I

haven't done that but in 3D you don't

it okay we'll come back to it when

it makes sense then okay all right

so let's just multiply this out so

what does

$R R^T = I$ look like well

that looks

like

$\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \begin{bmatrix} a & b \\ -b & a \end{bmatrix} = \begin{bmatrix} a^2 + b^2 & ab - ba \\ ba - ab & a^2 + b^2 \end{bmatrix}$

and in order for this to

equal $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

that implies $a^2 + b^2 =$

1 and it says that $ab - ba$ has to equal

zero okay so that's two quadratic

constraints that define the rotation

matrix being a valid

rotation okay and the same thing

is happening in 3D you know same

type of thing but you just have more

equations flying

around it happens if you wanted to say

that the determinant of this equals

plus one in this case it's the same as this

right $a^2 + b^2 = 1$ is also the first

one and that's because I took out the

improper rotations by my parameterization

yes yep sorry that was just

me spelling it out but you're right

true yep also because I did the change of

variables okay

so I made an animation a visualization

of this okay so what I want you to think

of is an objective that's a quadratic

bowl and a constraint that is this

quadratic constraint

it went out of order but here it is

okay this is what it looks like okay so

I took a few points and I

made you know my quadratic objective

and I just took my points

and I just rotated them I'm just trying

to bring it back so I can

actually dynamically rotate what those

points are

okay and it's moving around and my

goal is to find the bottom of the

quadratic bowl but it has to be on this

constraint so if you look down from the

top this is the unit circle constraint

which looks like a cylinder I mean it's

projected up okay so I have to find

the lowest point on the parabola

that intersects with the red constraint

and it turns out in the case where

there's no noise the minimum is always

going to be on the manifold right

because the best rotation that you could

find is going to be a

valid rotation okay so this case is

actually the good case now as soon as

you add noise the bottom of the parabola

could move off the unit circle and

you're going to need to project it back

onto the unit circle that's the

fundamental geometry of this problem
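
A rough sketch of that picture for the rotation-only 2D case, assuming NumPy and the $[a, -b; b, a]$ parameterization above. The function names are hypothetical. For this particular parameterization the quadratic bowl is round, since each point contributes $(x^2 + y^2)$ times the identity to the Hessian, so rescaling the unconstrained minimum to unit length lands on the lowest point of the circle.

```python
import numpy as np

def fit_ab(p_m, p_s):
    """Unconstrained least squares for (a, b) in R = [[a, -b], [b, a]]."""
    x, y = p_m[:, 0], p_m[:, 1]
    A = np.zeros((2 * len(p_m), 2))
    A[0::2] = np.column_stack([x, -y])   # first row of R p_m:  a x - b y
    A[1::2] = np.column_stack([y, x])    # second row of R p_m: b x + a y
    ab, *_ = np.linalg.lstsq(A, p_s.reshape(-1), rcond=None)
    return ab

def project_to_circle(ab):
    """Rescale (a, b) so that a^2 + b^2 = 1."""
    return ab / np.linalg.norm(ab)

def rotation_from_ab(ab):
    a, b = ab
    return np.array([[a, -b], [b, a]])
```

Without noise, `fit_ab` already returns a point on the unit circle; with noise it generally does not, and `project_to_circle` pulls it back.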

it's also a very famous

problem it's the point correspondence

problem the Wahba problem it

comes up in all kinds of fields it's a

famous problem of this kind solve a quadratic

objective on the unit circle go

ahead it's only a little bit hard so it

is harder to do basically

I don't have the simple

relationship to just know that this is

negative sine theta so I would have to

use nine numbers instead of the two I just used

in 3x3 I'll you know have a b c d e f and so on

right and then the $R^T R$

is still quadratic the determinant is

cubic for a 3x3 matrix

yeah okay but the geometry is roughly

that's this constraint and the

yep okay interestingly let me just get

one more thing in and then

interestingly we could have

parameterized the whole thing directly

with theta okay that's only one variable

in 2D of course in 3D we'd pick quaternions or

we'd pick one of the other

representations in this case it's maybe

illustrative to see that if we just did

it in terms of theta what does that cost

function look like I threw that on a plot

okay and it is similarly beautiful and

good okay so now there's no constraints

if I were to just parameterize

it with theta then I would always get a

valid rotation out I don't need this so

I could just write my

objective but the objective is no longer

quadratic it's a non-convex

objective it's got sines and cosines in

the middle of it and the sines and

cosines multiplied out you've got

cosine squared whatever you know to the

second power gives you a cost landscape

that looks like this and if I you know

move my theta that rotated the two

relative to each other then the minimum

moves

correspondingly luckily you know all of

the minima are good in this case

they're all just 2 pi off so

similarly this is a good

optimization if I start with an initial

guess and I go down then I'm

always going to find a good answer in

the no noise case these things are going

to behave differently not only when you

have noise but also when you have

additional constraints like for instance

if you don't want your object to penetrate the

world or something like this that

becomes a harder constraint and

we'll see the differences between

those representations more yeah
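
A small sketch of the theta-only version, assuming NumPy, a rotation-only 2D problem, and no noise. The names `rot2`, `cost`, `grad`, and `descend` are made up here, and the fixed step size is an assumption chosen for this noise-free case rather than anything from the lecture.

```python
import numpy as np

def rot2(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def cost(theta, p_m, p_s):
    """Sum of squared residuals for a pure rotation by theta."""
    return float(np.sum((p_m @ rot2(theta).T - p_s) ** 2))

def grad(theta, p_m, p_s):
    """Analytic derivative of cost with respect to theta."""
    c, s = np.cos(theta), np.sin(theta)
    d_rot = np.array([[-s, -c], [c, -s]])          # d(rot2)/d(theta)
    resid = p_m @ rot2(theta).T - p_s
    return float(2.0 * np.sum(resid * (p_m @ d_rot.T)))

def descend(theta0, p_m, p_s, iters=50):
    """Plain gradient descent from an initial guess theta0."""
    step = 1.0 / (2.0 * np.sum(p_m ** 2))          # assumed safe step size
    theta = theta0
    for _ in range(iters):
        theta -= step * grad(theta, p_m, p_s)
    return theta
```

Starting on either side of a peak, the descent slides into the nearest minimum; those minima differ by multiples of 2 pi and all describe the same rotation.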

in the 2x2 case it is quadratic in

the 3x3 it's not quadratic but what

we're going to do is ignore

it the orthogonality constraint is sufficient to

get determinant plus or minus one we're

going to solve and if our determinant

was minus one then we're

going to multiply by negative one in one

of the yeah so basically

it's whether you have a right-handed

rule or a left-handed rule if you end up

with a left-handed rule you flip it back

to a right-handed rule and you call it a

day yeah so that's how we get around the

cubic okay so that was just searching

for rotations in

this simple example I left out the

positions but one of the most important

insights I want you to take away today

is that actually you can separate

solving for rotations from solving for

the translations

okay why is that so registering the

points the key insight and we

already had it in the first lecture

about spatial transforms remember

I did a check-yourself kind of

thing saying the

position of B relative to

A only depends on the rotation between

those two frames not on the position

because it's already a

vector the base of that vector is you

know rooted in the coordinate system

so the relative positions of two points only depend on

rotations so the trick is if you just try to fit every point by

itself then you have to solve for the translation and rotation together

but if you just take the difference between two points that quantity does

not depend on the translation of the object you could take an object you know

I don't know in building 32 or an object here okay and the absolute

translation does not affect the relative points only the

orientation so the trick is you subtract out some nominal point in the middle of

your point cloud from all your points you solve for the rotations and then at

the end now that you know the rotations you can easily figure out the

positions okay you can solve for the rotations

separately you guys didn't look like you got that I mean I'm not trying to but

you didn't look as happy as you know on average let's just say yeah

yes

true so you said it almost right and I have to re-say it for the

people on the video so we're going to take the model point cloud

we'll find the centroid of the model point cloud and write all of the

model points relative to the model centroid and I'll

take all of those scene points I'll take the scene centroid and write all of

theirs relative to the scene centroid and then I'm going to try to take those

relative coordinates and rotate them until they match and now I have an easy

problem to just snap the positions into alignment so it's a two-step process
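
The two-step idea can be checked with a few lines, assuming NumPy and known correspondences (row i of the scene array matches row i of the model array). The helper names are hypothetical.

```python
import numpy as np

def center(points):
    """Return (points relative to their centroid, the centroid)."""
    centroid = points.mean(axis=0)
    return points - centroid, centroid

def translation_from_rotation(R, p_m, p_s):
    """Second step: once R is known, p = scene centroid - R @ model centroid."""
    return p_s.mean(axis=0) - R @ p_m.mean(axis=0)

# Same rotation, two very different translations.
theta = 0.6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
model = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
scene_near = model @ R.T + np.array([0.1, 0.2])
scene_far = model @ R.T + np.array([100.0, -50.0])

model_rel, _ = center(model)
near_rel, _ = center(scene_near)
far_rel, _ = center(scene_far)
# near_rel and far_rel agree, and both equal model_rel rotated by R.
```

The centered scene points do not change with the translation, so the rotation can be fit from `model_rel` and the centered scene points alone, and the translation follows afterwards from the two centroids.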

yes so the question is why an arbitrary point so the key

insight is that it's the relative points that match it turns out there is a

kind of natural point to pick which is the centroid because then it

actually just drops right out of the equations in a beautiful way yeah

yeah it's the average of the points it's literally the average of the data points

yeah

yes in 3D also it's only rotations that affect the relative points if you

have you know this point relative to this point then the only thing that

changes that number you know changing the coordinate

system doesn't you know the location of the coordinate system does not change

the relative number it's only the

rotations in 3D it works fine yeah okay so quick quiz

what happens if you have a symmetric object right so I drew one sort of

intentionally that had you know asymmetry there what if it was a

box

so she says it's impossible to know you know if it was

four-way symmetric an actual box then it's impossible to know so you're

right of course but the thing I just want to make sure is clear so far I've

assumed known correspondences so in the case of known

correspondences there is no symmetry it cannot be right if someone told me that

this point corresponds to that face then there is always a unique solution right

and the reason that's sort of maybe puzzling is I'm showing you these plots

that look like they have a unique solution even in the case of something

that has symmetries and the reason for that it's not a trick that

function doesn't change if your object suddenly becomes symmetric it's because

it's the known correspondences case okay let me keep moving a little bit

so I want to get through a couple things all right and we'll ask you a couple

questions about uniqueness and the like on the

problems okay is that basic idea clear if someone gives me the

correspondences then I have a very good algorithm which is just solving this

that can find the optimal solution in fact even

solving the quadratic thing projecting onto the unit circle

you don't have to just you know solve and then project you actually can solve

it beautifully and it turns out the solution is given by the

singular value decomposition okay so the Wahba problem is famously solved by

the singular value decomposition if you have extra constraints then you're

going to use extra machinery typically a very powerful way to write that is

as a semidefinite program which we'll get to later okay but these kind of

quadratic constraints are a particularly nice case of a semidefinite

program okay but in the unconstrained case or you know with only this

constraint this is like the SVD if you know the basic picture of the

geometry of SVD right it's about finding the coordinate you warp it to the circle

you rotate it and then you warp it back okay well that warping to the circle is

exactly the warping that happens when projecting onto the unit circle okay so

it turns out to be exactly related to the SVD
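
A minimal sketch of the SVD-based fit with known correspondences, assuming NumPy. This is the standard textbook construction (center both clouds, take the SVD of the cross-covariance, correct the sign so the determinant is plus one); the function name `fit_pose` is hypothetical and this is not claimed to be the course's implementation.

```python
import numpy as np

def fit_pose(p_m, p_s):
    """Least squares rigid fit: find R, p with p_s[i] ~ R @ p_m[i] + p.

    p_m, p_s: (N, d) arrays with d = 2 or 3; row i of p_s matches row i of p_m.
    """
    m_bar = p_m.mean(axis=0)
    s_bar = p_s.mean(axis=0)
    # Cross-covariance of the centered points: sum_i s_i m_i^T.
    W = (p_s - s_bar).T @ (p_m - m_bar)
    U, _, Vt = np.linalg.svd(W)
    # Guard against an improper rotation (reflection) with determinant -1.
    D = np.eye(p_m.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ D @ Vt
    p = s_bar - R @ m_bar
    return R, p
```

The translation comes out of the centroids after the rotation is found, which is the same two-step split described for the centroid trick.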

okay so equipped with that we now have the most

important algorithm for sort of geometric perception which is the iterative closest

point

the biggest assumption we made so far was this known

correspondence right someone told me the

correspondences if I instead have to solve for the

correspondences then I have extra work to do and just to give my

notation let's define a correspondence vector

okay so I'll use a correspondence vector $c$

it's of length num points by one

okay so a num points by one vector and the $i$th element takes an integer value and

I'll say the $i$th element is an integer $c_i = j$

if point $s_i$

corresponds to model

$m_j$ and now I was careful to choose that so it doesn't have to be a one

to one mapping right I'm going to try to power through a little bit more so

it doesn't have to be a one to one mapping there could be model points

that don't have a corresponding scene point which is important because often

times if you have a camera you're just looking at one side you're not going to

have scene points all the way around the object okay but we're saying that

every scene point corresponds to a model point there's other choices people sometimes

make where you assume that every model point goes to a scene point and all of

them have implications but in this we're going to choose it like this

okay I could then just write my objective

$\sum_i \| X\,p^{m_{c_i}} - p^{s_i} \|^2$ using that notation but now I have to search I have

to find both $c_i$ which is this discrete thing right

it's a function onto the elements of one to num model

points that's the set that it lives in so it's kind of a weird thing to

optimize over and $X$ in

$SE(3)$ so how am I going to optimize that right it looks like kind of a

quadratic objective but with a combinatorial aspect to it of trying to

figure out all these correspondences simultaneously and you can do that we've

had a paper that does that kind of thing where we're trying to do the

combinatorial search at the same time as the continuous search but it's a

very expensive optimization

so the ICP algorithm famously does it by splitting it into

two

in many optimizations kind of like this it's often the case that if you fix one

set of variables then the optimization is easy fix another set of variables the

other optimization is easy and then you end up with natural algorithms that

alternate between the two optimization problems and that's exactly what we'll

do here because if the correspondences are known then the optimization is

exactly what we did a minute ago that's the point registration with known

correspondences and it has a beautiful solution and then the other side of

it is if the transform is known then finding the corresponding points is just

a nearest neighbor problem okay so if we have an initial

guess for $X$ then we'll take step one

solve the nearest neighbor

so our new $c_i$ is going to just be we can just try all the possible

correspondences basically I'm going to say

$c_i = \arg\min_j \| X\,p^{m_j} - p^{s_i} \|^2$

so for a small point cloud you can just literally try

all the possible correspondences for this when $X$ is known you can just

measure the distance and take the smallest

one okay when it gets bigger you start using efficient nearest neighbor data

structures like k-d trees and stuff like that okay but this is a fast

nearest neighbor query and then the second step is

given the correspondences

solve for $X$ and then you just repeat until

convergence
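
The alternation can be sketched as follows, assuming NumPy, a brute-force nearest neighbor search (fine only for small clouds), and an SVD-based fit for the known-correspondence step. All function names are hypothetical, and stopping when the correspondences stop changing is one simple convergence test among several.

```python
import numpy as np

def fit_pose(p_m, p_s):
    """Known-correspondence step: R, p minimizing sum ||R p_m[i] + p - p_s[i]||^2."""
    m_bar, s_bar = p_m.mean(axis=0), p_s.mean(axis=0)
    U, _, Vt = np.linalg.svd((p_s - s_bar).T @ (p_m - m_bar))
    D = np.eye(p_m.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))   # keep det(R) = +1
    R = U @ D @ Vt
    return R, s_bar - R @ m_bar

def nearest_model_indices(R, p, p_m, p_s):
    """Step one: for each scene point, index of the closest transformed model point."""
    transformed = p_m @ R.T + p                                   # (Nm, d)
    d2 = ((p_s[:, None, :] - transformed[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)                                      # (Ns,)

def icp(p_m, p_s, max_iters=50):
    """Alternate correspondences and pose fit, starting from the identity guess."""
    d = p_m.shape[1]
    R, p = np.eye(d), np.zeros(d)
    c_prev = None
    for _ in range(max_iters):
        c = nearest_model_indices(R, p, p_m, p_s)
        if c_prev is not None and np.array_equal(c, c_prev):
            break                                                 # nothing changed
        R, p = fit_pose(p_m[c], p_s)                              # step two
        c_prev = c
    return R, p, c
```

Several scene points may map to the same model index and some model points may be unused, which matches the choice that every scene point corresponds to a model point but not the other way around. For larger clouds the brute-force distance table would normally be replaced by a k-d tree query.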

okay so let's see what that looks like I'll try to not stand in the middle right

here okay so this is the kind of plot I'm going to show you here so this was

the known correspondence one I picked a lovely salmon color for the random

object with random points in 2D and known points and then I rotated it and

translated it by some random quantities and got my blue scene points in the

first step we have a known correspondence problem and the

registration just works exactly okay that's the known correspondence version

now if I take another example here with my model

points and my scene points if I start the

ICP algorithm then I think I can just step through here we go okay the first

thing I do is I solve that given that initial guess which was a bad initial

guess you know this is the initial guess here I just compute the nearest

neighbors for every scene point I find the nearest

point okay and then given those correspondences I try to solve

the new optimization and it doesn't do very well because my correspondences were all

wrong but the hope is that it gets you closer okay and then I get a new chance

at my correspondences and many times this converges

beautifully in a small number of alternations because you know at some

point your correspondences are correct and you snap right into

place

yeah yes so just like you can separate translation and rotation

scaling can be separated too and the trick is so just like the

observation that the relative positions only depend on rotations it

turns out the difference of distances only depends on

scale so if you play that trick one more time you get something that

only depends on scale so you can fit scale

first and then fit rotations and then fit translations I actually cite that in

the notes because I think that's part of the

story
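
One way to see the scale remark in code, assuming NumPy, known correspondences, and scene points generated as scale times rotation times model plus translation. This sketch uses distances to the centroid, which are unaffected by rotation and translation; it is an illustration of the idea and not necessarily the method cited in the notes. The name `estimate_scale` is hypothetical.

```python
import numpy as np

def estimate_scale(p_m, p_s):
    """Ratio of spreads about the centroid; rotation and translation cancel out."""
    m_rel = p_m - p_m.mean(axis=0)
    s_rel = p_s - p_s.mean(axis=0)
    return float(np.sqrt(np.sum(s_rel ** 2) / np.sum(m_rel ** 2)))

theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
model = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
scene = 2.5 * (model @ R.T) + np.array([4.0, -7.0])
scale = estimate_scale(model, scene)
```

After dividing the centered scene points by the estimated scale, the rotation and then the translation can be fit as before.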

Yeah? Yes, good. So this algorithm can absolutely get stuck in local minima. In the code you can play with, it's just random, so the fact that it's mostly translation is probably because I wanted one that fits on the slide. I didn't think of it that way; now I feel like I picked a bad example. But it absolutely can get stuck in local minima, yes. If you pick the wrong correspondences and you don't make enough of a change, you could get the same wrong correspondences back, and then it will never leave, right? So there are many ways, and we're going to talk about those more next time. You can try random initializations, of course, but there are also more robust methods that try to avoid some of those local minima.
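The random-initialization idea can be sketched as a wrapper around any ICP routine. Everything here is an assumption made for illustration: `run_icp(model, scene, R0, t0)` is a hypothetical callable that returns a refined `(R, t)`, the initial translation simply lines up the centroids, and the score is the mean squared nearest-neighbor distance.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0  # flip one axis so that det(Q) = +1
    return Q

def alignment_cost(model, scene, R, t):
    """Mean squared distance from each scene point to its nearest posed model point."""
    posed = model @ R.T + t
    d2 = ((scene[:, None, :] - posed[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).mean())

def best_of_restarts(run_icp, model, scene, num_restarts=20, seed=0):
    """Run ICP from several initial rotations and keep the lowest-cost result."""
    rng = np.random.default_rng(seed)
    best = None
    for k in range(num_restarts):
        # First try the identity, then random rotations.
        R0 = np.eye(3) if k == 0 else random_rotation(rng)
        t0 = scene.mean(axis=0) - R0 @ model.mean(axis=0)  # line up centroids
        R, t = run_icp(model, scene, R0, t0)
        cost = alignment_cost(model, scene, R, t)
        if best is None or cost < best[0]:
            best = (cost, R, t)
    return best  # (cost, R, t)
```

Restarts only make a bad local minimum less likely; they do not rule one out.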

Yeah? So we're going to talk about noise next time, too. It's actually a fairly subtle question. I had a slide I blew past real quick, but just to show you: real-world noise is extremely structured. If you think about noise as adding Gaussian values to all of those values independently, that's not the way cameras have noise. This is the actual depth image, and this is the depth image you get out of a camera. They have dropouts, like pixels that are just missing in the middle. They will have swaths where, say, a shiny material might be very noisy and a flat material might not. So the answer to your question requires thinking a little bit about the types of noise.
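To make the contrast with independent Gaussian noise concrete, here is a toy sketch and not a model of any real sensor. The noise level depends on the material under each pixel, and some pixels return nothing at all. The function name, the noise levels, and the convention that `0.0` marks a missing pixel are all assumptions.

```python
import numpy as np

def corrupt_depth(depth, shiny_mask, rng, dropout_prob=0.05,
                  sigma_shiny=0.02, sigma_other=0.001):
    """Toy structured depth noise: material-dependent jitter plus missing pixels.

    depth: (H, W) array of depths in meters.
    shiny_mask: (H, W) boolean array, True where the surface is shiny.
    """
    # Shiny regions get much noisier returns than the rest of the image.
    sigma = np.where(shiny_mask, sigma_shiny, sigma_other)
    noisy = depth + rng.normal(size=depth.shape) * sigma
    # Dropouts: pixels with no return at all (assumption: encoded as 0.0).
    dropped = rng.random(depth.shape) < dropout_prob
    noisy[dropped] = 0.0
    return noisy
```

Real dropouts tend to cluster around edges and reflective surfaces, so the per-pixel dropout here is still a simplification.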

Yeah? Yep, you just alternate back and forth. When X is fixed, the problem is easy: it's nearest neighbors. When the correspondences are fixed, the problem is easy: it's this W.

That's true, you're solving many optimization problems in the loop. This one is so easy that it becomes an SVD. It's a call to SVD, so I wouldn't even call it an optimization problem in the implementation. It's very fast, even for very big point clouds. But yes, it is alternating between those.
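The alternation can be sketched in a few lines of NumPy. This is a minimal illustration and not the course's implementation: it uses brute-force nearest neighbors, assumes every scene point has a match on the model, and the names `fit_pose`, `nearest_neighbors`, and `icp` are hypothetical.

```python
import numpy as np

def fit_pose(model, scene):
    """Correspondences fixed: closed-form (R, t) minimizing sum ||R m_i + t - s_i||^2."""
    mu_m, mu_s = model.mean(axis=0), scene.mean(axis=0)
    W = (scene - mu_s).T @ (model - mu_m)            # 3x3 data matrix
    U, _, Vt = np.linalg.svd(W)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])   # guard against reflections
    R = U @ D @ Vt
    return R, mu_s - R @ mu_m

def nearest_neighbors(posed_model, scene):
    """Pose fixed: for each scene point, the index of the closest posed model point."""
    d2 = ((scene[:, None, :] - posed_model[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def icp(model, scene, R=None, t=None, max_iters=50):
    """Alternate the two easy subproblems until the correspondences stop changing."""
    R = np.eye(3) if R is None else R
    t = np.zeros(3) if t is None else t
    prev = None
    for _ in range(max_iters):
        c = nearest_neighbors(model @ R.T + t, scene)
        if prev is not None and np.array_equal(c, prev):
            break  # same correspondences give the same pose: a fixed point
        R, t = fit_pose(model[c], scene)
        prev = c
    return R, t, c
```

The fixed point it stops at may be a local minimum. For large point clouds, a k-d tree would normally replace the brute-force distance matrix.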

And let me just take it home with one more example here. So yeah, I've got lots of examples of real noisy things, but you're going to play with the bunny, because everybody who ever does ICP makes the Stanford bunny snap into alignment with another Stanford bunny. Early in computer graphics, the Stanford bunny sort of did a winner-take-all thing and it just won out, and everybody uses the Stanford bunny. Okay, so you'll do that on your problem set.

But just to show you: even in the examples I showed you, like loading a dishwasher, for instance, if you watch carefully what happens. There was a perception system that tried to figure out where the mug was to begin with. But as the robot moves, even in this sort of state-of-the-art perception system (okay, state of the art a few years ago, I guess), watch this. That was running ICP. I mean, that wasn't the ICP updates, but when it goes there, it has a model of the mug, back in the day, and it actually tried to align the model of the mug before going in, to close the difference between the far-away camera's rough estimate of where the mugs were and actually making the pick.

And people still do that today. We were in a meeting with Leslie and Tomás the other day, and they were like, "We're going to do ICP for this," and the young students were like, "Okay, that's kind of old school." But it still works. It still works really well.

Yeah? Yep, so this is part one. Part two is how you do more robust versions of this with partial views and outliers and noise. So we're going to talk about that next time. Good. I'll answer; you can come up afterwards. Yeah, thank you.
