This presentation discusses the algorithmic challenges in modern electronic financial markets from the perspective of a quantitative trading group, focusing on optimizing trade execution and managing market impact through machine learning and control theory.
Okay.
[MICHAEL KEARNS] Thanks, Kostas, and thanks for having me.
And I want to thank Eric for his excellent talk because it also
helps set the context for a lot of what I'll be discussing here.
You can think of much of what I'm gonna talk about today as being about what it's like to live on the other side of these exchanges and to deal with HFTs and other counterparties as well.
Um, I also should say that there's really gonna be no explicit game-theoretic content in this talk, which Kostas promised me was okay, but there will be quite a bit of practice.
Um, as many of you know, for the last 12 years or so I've been working with a quantitative trading group on Wall Street in my sort of shadow life. And I wanna give special thanks to my trading partner, Yuriy Nevmyvaka, who's a co-author on all this work.
All this work kind of came out of some proprietary commercial
context in our trading group.
And the parts that I'm talking about are sort of
the non-proprietary parts that seemed scientifically
interesting to us over time.
Uh, and so, you know, this is kind of just a context slide for a lot of the things that Eric said.
We certainly live in interesting times on Wall Street, both from
a technological perspective and also more generally
from a social perspective.
So here's the same image that Eric showed of the
flash crash a few years ago.
Here's the famous book by Michael Lewis,
which I highly recommend, on high-frequency trading.
We have Bernie Madoff, other insider traders, and the like.
And so without passing moral judgment on any of
this activity, including high-frequency traders, what
I wanna do in this talk is just lay out what some of the
algorithmic challenges are in trading in modern electronic
financial markets from the perspective of somebody working
in a quantitative trading group, essentially something like
a traditional stat arb equities trading group.
What we do in my group is trade equities long and short.
We trade pretty much every liquid instrument
in many, many markets, both domestically
and internationally.
And we hedge exclusively with futures, so we're
not doing any complicated derivatives or the like.
So we're in kind of the simplest markets or the
simplest instruments.
But as Eric has sort of already outlined today,
a lot of the complexity of trading those environments
today comes from the automation of those markets and the data
that's available and what you do with it and what
your counterparties do with it as well.
And needless to say, because of the rising automation and
availability of data on Wall Street, there are many,
many trading problems that one encounters that invite a
learning approach, a machine learning approach, right?
Because you have the data, it's at a temporal and
spatial scale that you can't understand as a human being.
And so one has to develop algorithms to trade sensibly,
and those algorithms have to be adaptive and trained or should
be trained on historical data.
Sometimes a lot of historical data, including very,
very recent data.
And so what I'm gonna do is just try to give a high-level talk where I quickly outline three little vignettes or case studies of specific problems that arise in algorithmic trading in modern electronic markets, what challenges those problems present for traditional machine learning, and some hints about how you can address those problems with new techniques.
And so the first two problems I'm gonna talk about, which are quite closely related to the world that Eric was describing, are both about problems of optimized execution, where a person or a model or an algorithm has made a decision to execute some trade, to, like, buy, you know, 100,000 shares of Google or short 100,000 shares of Yahoo, what have you.
And now the problem is actually executing that trade,
because a directive like that is underspecified.
In particular, it's underspecified in
its temporal aspects, right?
So you know, I can say I wanna buy 100,000 shares of something,
but how quickly do I want to do that, okay?
And so as we'll see, there's this trade-off in all electronic markets.
And by the way, I don't think any of this would
go away with something like the very sensible
alternative mechanisms that Eric is proposing.
You have this trade-off between immediacy and price.
You can be in a hurry and impact prices more
and get worse prices, but you might wanna do that
because let's say you think that the informational advantage you
have is fleeting and you wanna execute the trade quickly.
Or you can try to do things in a more leisurely way and try
to let the market come to your preferred price, but then it'll
take longer in general, okay?
And so the first two problems I'm gonna talk about are sort
of very specific instances of that kind of immediacy versus
price trade-off that you see in electronic markets.
The last one is a little bit more on kind of algorithmic
versions of kind of classical portfolio optimization,
mean-variance optimization.
The kinds of problems where you wanna hold a diversified
portfolio that, you know, kind of maximizes your
return subject to some risk considerations or
volatility considerations, okay?
And I may not finish all of this, but I'll hopefully at
least say a few words about all of these topics.
But I think I did save some time because Eric very
nicely laid out kind of what the modern continuous double
limit order auction looks like.
This is what's sometimes referred to in financial circles
as market microstructure.
And so let me start off by just describing
a canonical trading problem.
We already mentioned it a bit.
Let me make it a little bit more formal.
Suppose that we have a particular trading goal,
like we wanna sell these shares of some stocks.
Buy or sell is symmetric, but just to be concrete,
let's say that we wanna sell a certain number of shares, some
volume that I'll call capital V.
And I also have a specified time horizon in which I
wanna execute that trade.
Um, so I wanna sell
V shares in T time steps,
and of course if I'm selling,
I wanna maximize the revenue from
those shares.
And if I'm buying, I wanna minimize
the expenditure for those shares.
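Written as an optimization, the selling version looks like this (a minimal formalization consistent with the setup described in the talk, not necessarily the exact objective used in the group's work):

```latex
% Sell V shares over T discrete steps; x_t >= 0 shares are sold at step t
% at realized per-share price p_t, which depends on our own actions
% through market impact.
\max_{x_1,\ldots,x_T} \; \mathbb{E}\left[\sum_{t=1}^{T} p_t\, x_t\right]
\quad \text{subject to} \quad \sum_{t=1}^{T} x_t = V
% (and symmetrically, minimize expected expenditure for a buy).
```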
And as you might or might not imagine, there's sort
of a very large number of benchmarks that have been
proposed for how to measure the quality of such a trade.
So if I actually go out in the market and execute an order to, let's say, buy a certain number of shares of Google in a specified time period, we have to ask, well, you know, "How well did I perform," right?
And almost all of the sensible measures kind of measure your
performance relative to the activity that was going on
in the market at the time that you executed your trade.
So a very typical thing to do would be to say,
let's say I started this trade at 10 o'clock in the morning.
I executed it over the course of an hour.
How well did the actual average share price that I got in my
execution compare to, let's say, the volume-weighted average
price of the stock during that same period?
That's called the VWAP.
There's something called the TWAP.
I'm, in this particular case, going to use something that's called the implementation shortfall, where you look at the midpoint between the bid and the ask at the time you started your trade, which of course, by the way, is itself a fictitious benchmark because by definition, there's no liquidity at the midpoint between the bid and the ask. But it sort of seems like a fair peg point, and so I compare my performance to that and ask, you know, how much worse I did than that.
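As a concrete illustration, here is a minimal sketch of how the VWAP and implementation-shortfall benchmarks just described might be computed; the data layout is hypothetical, but the two formulas themselves are standard:

```python
def vwap(trades):
    """Volume-weighted average price over a list of (price, volume) pairs."""
    total_volume = sum(v for _, v in trades)
    return sum(p * v for p, v in trades) / total_volume

def implementation_shortfall(fills, bid_at_start, ask_at_start, side="buy"):
    """Per-share slippage of our own fills versus the bid-ask midpoint at
    the moment the order started (the 'peg point' described above)."""
    mid = 0.5 * (bid_at_start + ask_at_start)
    avg_fill = vwap(fills)  # same formula, applied to our own fills
    # For a buy, paying above the mid is a cost; for a sell, receiving below it is.
    return (avg_fill - mid) if side == "buy" else (mid - avg_fill)

# Example: bought in two fills; the market was 10.00 bid, 10.02 ask at the start.
fills = [(10.02, 600), (10.03, 400)]
print(implementation_shortfall(fills, 10.00, 10.02))  # ~0.014 per share
```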
It'll almost always be worse on average than the midpoint of the bid-ask because, you know, in all of
these problems, your trading activity
itself is going to push the prices in the
direction that disfavors you.
Okay.
If I'm buying a large volume of shares, it's gonna push the price up.
If I'm selling, it's gonna push
the price down.
And this is entirely due, well, it's at least largely due to
just the mechanical impact I have on the order book, right?
As I cross the spread and eat liquidity from the other side,
it makes the successive prices worse and worse.
Okay?
So whatever metric you look at, if any of you thought about this problem seriously for just a little while, you'd come to the view that a very natural framework for such problems is state-based control, okay,
where there are at least two very, very obvious
state variables you wanna consider: how much inventory
you have remaining, i.e., how many of the V shares do
you still have left to buy in the given time period,
and how much time is left.
Okay?
And you might also want to think about other features
or state variables that capture other kinds of
market activity that might have informational advantage as well.
Okay?
So this is just a snapshot, sort of showing what a typical
order book looks like.
This is from many, many years ago, in Microsoft on Island at the time, which has now been rolled up into some larger exchange.
But you see the buy orders arranged in order of decreasing price, the sell orders arranged in ascending price.
By definition, there's always
gonna be a gap between these two,
which is the bid-ask spread that Eric was discussing.
There are volumes at each of these, and just to remind you, the way these markets work is, you know, if I'm trying to, let's say, buy shares, I can place a limit order. And that limit order will either sort of sit somewhere in this price ladder, or it might actually cross the spread and execute against the existing liquidity on the sell book, okay?
And it might partially match.
So if I put in an order for, I don't know, let's say a thousand shares at, you know, 23.79, I would execute 900 of those shares, the first hundred at this price, the next 800 at this price, and then my remaining 100 shares would become the new top of the buy book.
Okay?
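A minimal sketch of that mechanical matching step (prices and sizes are made up; real matching engines also handle time priority within a level, cancels, revisions, and so on):

```python
def match_limit_buy(ask_book, limit_price, qty):
    """Match a limit buy against an ask ladder sorted by ascending price.
    ask_book: list of [price, volume] levels, mutated in place.
    Returns (fills, unfilled_qty); an unfilled remainder would rest as
    the new top of the buy book at limit_price."""
    fills = []
    for level in ask_book:
        price, available = level
        if qty == 0 or price > limit_price:
            break  # remaining ask levels are above our limit price
        take = min(qty, available)
        fills.append((price, take))
        level[1] -= take
        qty -= take
    ask_book[:] = [lvl for lvl in ask_book if lvl[1] > 0]  # drop empty levels
    return fills, qty

# The example from the talk, with invented numbers: a 1,000-share buy that
# eats 100 shares at the best ask and 800 at the next level, leaving 100.
asks = [[23.75, 100], [23.78, 800], [23.90, 500]]
print(match_limit_buy(asks, 23.80, 1000))
# -> ([(23.75, 100), (23.78, 800)], 100)
```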
So it's a very mechanical process.
And as Eric pointed out, it's extremely volatile these
days, it's extremely dynamic.
The sort of timescale at which orders arrive and executions
take place is, you know, often at the sub-millisecond
level in highly liquid stocks or even in microseconds.
And as he mentioned also, right, you know, everything's possible.
I can cancel.
I can revise my order.
Partial executions may occur.
And so, you know, one very interesting,
very difficult question or type of research to ask is,
how do individual, you know, sort of microscopic orders
influence aggregate macro behavior, you know, of the
overall aggregate market?
But from the purposes of a trading group like ours,
one of the things we care about most is this trade-off
between immediacy and price.
So, you know, I won't explain this chart in any detail, but what it's showing you on the X axis is sort of what my limit order price is, let's say if I'm buying, right?
And at the left extreme I'm sort of considering prices
that are sitting way down in the book, so they're very,
very far from where the market is currently trading.
And so what I'm showing you here is the performance of a
very simple class of strategies for this problem of execute
V shares in T time steps, which is what I call kind of
submit-and-leave strategies.
Okay?
And in a submit-and-leave strategy, what you do is
you just pick a fixed limit order price,
put all these shares at that limit order price,
hope that they get executed, but any volume that's not
executed after the T time steps, you have to cross the spread
and buy the shares at whatever prices they're offered on the
other side of the book, okay?
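A minimal sketch of evaluating one such strategy on replayed historical data; the replay interface, and in particular how passive fills get attributed to our resting order, are modeling assumptions:

```python
def submit_and_leave_cost(book_replay, limit_price, target_qty):
    """Rest target_qty to buy at limit_price for the whole horizon, then
    cross the spread at the deadline for whatever remains unfilled.
    book_replay yields (passive_fill_qty, ask_ladder) per time step, where
    passive_fill_qty is how much of our resting order we assume got filled."""
    filled, cost, final_asks = 0, 0.0, []
    for passive_qty, ask_ladder in book_replay:
        take = min(passive_qty, target_qty - filled)
        filled += take
        cost += take * limit_price
        final_asks = ask_ladder  # remember the last observed ladder
    remaining = target_qty - filled
    for price, available in final_asks:  # deadline: sweep the ask side
        take = min(remaining, available)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    return cost
```

Sweeping limit_price over a grid on historical data is what traces out the unimodal curve with the sharp peak in the figure.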
Because in all of these, you know, as a practical matter, it's very much the case that whenever
we execute a trade, we always set a time limit
on it because we essentially have some model or empirical
measurement historically of how long our informational
advantage lasts, right?
So if an analyst upgrades a stock, for instance, there's
some informational advantage in trading on that activity, on
that upgrade very, very quickly.
But, you know, after five minutes,
it's gone because the market's essentially
already incorporated that news.
Okay?
So what you see here is that there's this very distinct peak.
If I put my orders very deep down in the book,
then nothing gets executed.
I have to cross the spread at the end and pay the prices on offer.
At the other extreme, I could cross the spread
at the beginning, right, and pay the prices on
offer at the beginning.
These two things are roughly equal,
but you can see that there's a great deal of optimization.
There's sort of a magic price at which I should
put my limit order if all I'm gonna do is leave it there
and then cross the spread with the remaining inventory at the
end of the time interval, okay?
So this already shows you that even within this kind of brain-dead class of strategies, there's actually quite a bit of optimization to do, right?
There's a very, very sharp peak, and this is quite typical.
I don't remember what stock this is in.
Yeah, Ashish.
[ASHISH] Does this include the market reaction?
[MICHAEL KEARNS] No.
So in everything I'm gonna describe here,
I'm sort of only accounting for the mechanical impact.
And the mechanical impact is the only thing you can
kind of optimize for using historical data, right?
Because by definition, you can't ask the counterfactual question
of what would the psychological, or if you like, you know,
the sort of non-mechanical impact be.
I mean, you can have models for it, and there are many models
proposed in the literature.
But from the viewpoint of a group like ours, I mean,
we don't even bother looking at it in those models because we
know that at the end of the day, the only way you're gonna
measure that is by actually starting to trade, right?
And so we, of course, do that in small volumes at the beginning
when we're trying something new and gradually ramp it up as we
get some empirical sense of what that impact is like.
And as you can imagine, right, I mean, all these things
have to do with liquidity.
It's much easier to hide your activity and have it have lower
mechanical impact and kind of counterfactual impact in a
highly liquid name than one-- than something that trades
very infrequently.
Okay.
[AUDIENCE MEMBER] This graph is meant to be general, or--
[MICHAEL KEARNS] This is for some specific stock for some specific period, but this is extremely generic, right? If you go to any reasonably liquid stock, you would see the same shape; only the exact location of the peak differs.
Like, the zero here means kind of putting your order
at the current bid or the ask.
So, you know, kind of putting your order in, not crossing the spread, but not sitting down too far, right?
Because, you know, in a highly illiquid
stock, if you sit down too far, on average,
you're just never gonna get executed because, you know,
even if a little bit of liquidity gets eaten away,
other counterparties are just filling that vacuum behind it.
Okay?
So it's always unimodal, and it always has a relatively sharp peak.
Exactly where that peak is might depend on the stock in question,
but it's a very generic empirical figure.
Okay.
So you know, reinforcement learning, which is essentially the AI term for, you know, discrete state-based control, is sort of a perfect fit for solving this kind of
problem, where as a first cut, what you might do is just
say, all right, I'm gonna keep track of two things,
the amount of inventory I have remaining and how much time I
have remaining in whatever period I've specified over
which I want to do the trade.
And I won't go into it, but we did, you know, many years ago,
this sort of large-scale empirical study where we
applied kind of fairly generic reinforcement learning using
historical data to learn kind of optimized policies using
those two state variables.
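To give a flavor of that, here is a minimal sketch of generic tabular Q-learning over the two discretized state variables; the episode simulator, reward convention, and discretization are illustrative assumptions, not the exact setup of the study:

```python
import random
from collections import defaultdict

def q_learn_execution(episodes, n_time, n_inv, n_actions, alpha=0.1, eps=0.1):
    """Each element of `episodes` is a hypothetical simulator replaying one
    historical interval: step(t, i, a) -> (reward, executed_buckets), where
    the reward is execution cost relative to the arrival-time midpoint."""
    Q = defaultdict(float)

    def greedy(t, i):
        return max(range(n_actions), key=lambda a: Q[(t, i, a)])

    for step in episodes:
        t, i = n_time - 1, n_inv - 1          # full time and inventory left
        while t > 0 and i > 0:
            a = random.randrange(n_actions) if random.random() < eps else greedy(t, i)
            reward, executed = step(t, i, a)  # reprice our order, observe fills
            t2, i2 = t - 1, max(i - executed, 0)
            best_next = max(Q[(t2, i2, b)] for b in range(n_actions))
            Q[(t, i, a)] += alpha * (reward + best_next - Q[(t, i, a)])
            t, i = t2, i2
    return Q  # learned policy: argmax over a of Q[(time_left, inventory_left, a)]
```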
And, you know, the kind of takeaway
message here is you learn very sensible
policies, and this is what you want, right?
I mean, in general, we're kind of the opposite of black-box quants.
Anytime we use machine learning in a trading problem or in
some kind of strategy model, if we can't back out in our
heads some reasonable kind of economic rationale for
why this makes sense, we're extremely uncomfortable
deploying it in actual trading.
And so what you want to kind of see are pictures like this.
I'm showing you the actual policies learned over three
relatively liquid NASDAQ stocks where the state space is sort of
on the XY plane, where you have a certain amount of time left.
I've discretized the variables here.
So this is sort of decreasing amounts of time left in your
interval and decreasing amounts of inventory left
in the size of the trade that you're trying to execute.
And basically, the z-axis here is measuring, you know, whether you're kind of sitting further back in the book versus crossing the spread. So a higher value means you're crossing more, the higher the price you're buying at, right? You're trying to cross the spread.
And you can see here, you know, very sensible behavior.
As you're running out of time and you have a
lot of inventory left, then you better get going
and start being more aggressive in your trades.
If you've got a lot of time left and very little inventory,
so this might have happened because, you know,
at the beginning, some other counterparty
kind of crossed the spread and executed a lot of your shares
very early in the interval.
Well, now you can afford to be patient, right?
Now you can afford to drop down in the book and hope
for price improvement and get a better price.
Okay?
So, you know, if you do
this across many, many different stocks and
over many, many different historical periods, you
will always get kind of a shape that looks like this.
But the details really matter here, right?
The numerical details really matter because different stocks
have different bid-ask spreads.
The bid-ask spread determines, you know, how far above the bid your price has to be to actually reach the buyers or the sellers on the other side, et cetera.
So this is all kind of good news-- Question.
Yeah.
[AUDIENCE MEMBER] So these deadlines, an example would be like what you said: the analyst came, made an upgrade, and now you just know you have--
[MICHAEL KEARNS] Yeah, yeah.
And, you know, to talk about the practice of it a little bit more, these volumes and these time intervals, we would kind of self-impose them.
Now, we might optimize them, right?
Because everything here is a moving part that interacts
with everything else.
So for instance, and this is the kind of thing we're always trading off, we might know that we can on average get better prices by taking a longer time interval, but then we'll also know there's essentially a decay of what you would call our alpha on Wall Street.
Like our informational advantage, right?
And so we'll always be trading those two things off
and testing them on historical data to the extent that we can.
But then at some point, you know, whether we're using kind of direct market access algorithms or trading through a prime brokerage, we have to have a directive, right?
You kind of can't say to the brokerage, or you definitely don't wanna say to the brokerage, "Oh, you know, why don't you buy some Google shares over whatever amount of time you feel like."
Right?
We have to actually give them a very specific directive.
[AUDIENCE MEMBER] So it seems that something more complex would be some kind of learning model where, as this is happening, you might detect things that will make you want to change your time limit, right?
[MICHAEL KEARNS] Potentially, yeah.
Um--
[AUDIENCE MEMBER] A more complex learning problem.
[MICHAEL KEARNS] Yeah.
And by the way, one form of that, for instance, is that it's quite often the case that when one analyst upgrades their view on a stock, other analysts upgrade their view on that same stock in quick succession.
And so one of the things you need to decide there is like,
is that actually fresh news or is this just kind of an artifact
of some other more basic news that's getting into the market?
And then you know, in that case, you might not even wanna
trade on it at all.
Yeah.
[AUDIENCE MEMBER] I don't understand why at the end of
your time period you want to buy the rest of the volume.
Why not just give up at that point?
You bought what you bought.
[MICHAEL KEARNS] I mean, the high-level answer -- so there are two answers to this.
One is if you're an actual brokerage like Bank of
America and you have an algorithmic trading desk,
you're performing these actions on behalf of a
client that's given you the directive, right?
And so you don't ask, "Well, you know, hey, I didn't get all your shares of Google. Are you sure you still wanna buy them?" The language in which they communicate with you is
a directive to buy this many shares in this amount of time.
From our standpoint, we do have that option,
but the reason we don't is because at some point we've
kind of optimized all of these things together and realized
that this is sort of the right amount of volume to buy based
on this informational signal.
And by the way, that volume is sort
of usually the maximum that we think we can get
away with without pushing prices around too much.
So you could do something like what you're suggesting, but I
think, as a practical matter, there's sort of so many
things to think about in this world that the fewer
moving free parameters you can have, the better.
Yeah.
[AUDIENCE MEMBER] The counterfactual market impact being ignored seems to be a big problem in my mind. When you do the testing on the six months following, are you doing that live, or is that--
[MICHAEL KEARNS] Still using the model. So, you know, you can think about this study as limiting the scope to the improvements you can get on that particular kind of market impact, okay?
And in general, you know, as a practical matter,
the way we deal with the second kind of impact is
by, you know, I mean, first of all, we have many rules.
In fact, we even have like contractual regulations on
how much volume, you know, sort of how much volume we can
trade in any particular stock.
Okay, so for instance, my specific group, you know,
we will never be more than, I mean, we have a contractual
limit that says we will never constitute more than 1% of
the average daily volume in any single stock.
But we don't even ever wanna get close to that.
We will sometimes be a half a percent.
And by the way, we're not an especially
large group, right?
So there are a lot of groups with this sort of behavior, willing to be, you know, like a quarter or half a percent of the average daily volume.
Yeah, Alan.
[ALAN] So I'm very happy to see that you're talking about glass-box rather than black-box machine learning, and I guess the point is for human beings to be able to understand what is going on. I wonder, given the complexity of what's being learned, do you have special tools to explain these policies to people, or is it more done by eyeballing? What is the--
[MICHAEL KEARNS] I mean, in the case of a problem like this,
it's very clear what sort of sensible behavior means, right?
If somehow the learning process told me, "Oh, you're running out of time, you have a huge amount of inventory left, and so, you know, lower your buy price," I'd know something was wrong. This is kind of obvious.
The harder thing is when you have strategies that work well historically, maybe are even working well in live trading, and you don't have a clear handle on why. And so here, I mean, I think just the nature of the problem makes what's sensible clearer.
Other times, we admittedly find ourselves thinking about it almost from an anthropological perspective, which is like, you know, if this trade is consistently making money, there's gotta be some other group of counterparties out there that are taking the other side of this trade.
And what is their reason for persistently doing this, you know, other than stupidity, right?
And actually, you're worried if the reason is stupidity, right? Because, you know, people often realize that they're doing something stupid and stop doing it, and then you can lose a lot of money.
But sometimes there's a good reason. I mean, to give a concrete example, you can very consistently make small amounts of money by, in the period before the market opens, laying down a ladder of limit orders on both sides of the book in every liquid stock.
But spaced far enough apart that if your first order gets hit, it represents a significant deviation for that stock in, you know, the opening 10 minutes, okay? And if your second order gets hit, it's an even larger deviation.
So basically, just laying down
these orders on both sides of the book,
seeing if anything gets hit, and then liquidating it over,
let's say, the first two hours after the market open very
consistently makes money.
It's not easy to implement this in practice -- I should say that the engineering details of this are non-trivial.
Well, the reason is that many parties that sort of
trade on a slower time scale, like mutual fund managers,
first of all, they like to trade at the open or the close
to try to benchmark to opening and closing prices, because
that's how their performance is benchmark, A. and B, right,
they're often reacting to overnight news, right?
And so they're all coming in the morning, but it's sort of
worthwhile for them to do so, because they wanna get it close
to the open or to the close.
The same thing works on the close as well.
So there, it's not like there's an obvious optimization criterion and a sensible behavior to it. There's more of a sociological explanation to it.
Okay.
So now, again, just on this aspect of mechanical market impact: if you compare, you know, how well you can do by keeping track of these two simple state variables and taking kind of a control-theoretic approach, versus, let's say, the best you could have done by optimizing within this class of submit-and-leave strategies, right, which don't know about time and volume remaining, they just keep an order fixed and then, when time runs out, they cross the spread --
You're already getting 35% improvement on
average using this kind of approach compared to that.
Um, and this is a meaningful improvement in this world because, you know, I think as Eric mentioned, on this particular problem, meaningful improvements in performance are measured in basis points, which are hundredths of a percentage point, right?
So if you're sort of getting 35% improvement over something
reasonable, it's non-trivial.
Yeah.
[AUDIENCE MEMBER] So the submit-and-leave here, in comparison, is measured with submit-and-leave prices optimized with hindsight?
[MICHAEL KEARNS] Yes.
So, it's like a learning-to-learning experiment.
So either do reinforcement learning on the historical
data or use the historical data to find the best single
limit order price within this restricted class of strategies.
You already get a 35% improvement.
[AUDIENCE MEMBER] But the reinforcement learning,
is that also given hindsight or not?
[MICHAEL KEARNS] No.
So it's all train-test methodology.
We take a period of historical data.
We train: reinforcement learning learns some policy
on that historical period.
Over the same historical period, we find the optimized submit
and leave strategy.
We compare both of them on this successive out-of-sample
time period, and you're getting a 35% improvement.
[AUDIENCE MEMBER] So do you compare only when there's some sort of information change in the market, or is the training data just any arbitrary interval?
[MICHAEL KEARNS] Okay.
So I mean, these experiments we did over a long historical
period with many, many kind of sliding windows of
training and testing data.
To get into the details a little bit, one of the things that's hard in applying machine learning to financial or market data is that, if you're not careful, you can unintentionally learn all kinds of, you know, just sort of drift, okay?
So if you happen to train over a period in which the market is just rising overall, you'll learn a policy that basically just says, "You should buy," right? Because, at a high level, the market is rising.
And you don't want to learn this kind of drift.
You can also learn other sorts of seasonalities, by which I don't mean, you know, fall and winter, but other kinds of things: different trading days have different volumes, different periods of the day have more or less trading activity.
And so there's a lot of engineering that goes into creating your models in a way that makes sure you're not just learning these seasonalities or these kind of directional drifts.
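For instance, the sliding-window train-and-test pattern mentioned above looks roughly like this (window lengths here are arbitrary, an assumption for illustration):

```python
def walk_forward_windows(n_days, train_len=60, test_len=20):
    """Yield (train, test) day-index ranges as sliding windows, always
    testing strictly after the period the model was trained on."""
    start = 0
    while start + train_len + test_len <= n_days:
        yield (range(start, start + train_len),
               range(start + train_len, start + train_len + test_len))
        start += test_len

for train_days, test_days in walk_forward_windows(250):
    pass  # fit the policy on train_days, evaluate out-of-sample on test_days
```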
[ALAN] Okay.
[MICHAEL KEARNS] So, you know, once you see that this sort of simple control-theoretic, state-based approach works reasonably well, it's of course natural to ask, "Well, maybe I should add other state variables, which don't just have to do with how much time and volume are remaining, but that are actually looking at, you know, what's going on in the market."
And since we're in kind of a double limit order book
microstructure setting, it's quite natural to
look at features of the order book itself, right?
Or the recent temporal activity in the order book.
And by the way, I am of the belief, or kind of semi-knowledge, that a lot of HFT firms are not trying to solve execution problems per se; they're actually trying to predict directional movement in the market, right?
But I think they're using a lot of the same kind of
machine learning techniques.
I think they're spending a lot of time sitting around
inventing features of the order books that might have
some slight informational advantage in, let's say,
predicting whether the midpoint of the spread
is gonna go up or down in the next 500 milliseconds,
for instance, okay?
It's not easy work and it requires sort of technology
that's not really in our domain, so we don't do
that kind of thing.
But I think the methods are very much the same.
And I won't go into this.
[AUDIENCE MEMBER] I mean, they'd probably be a lot
more open to black box, right?
Because I--
[MICHAEL KEARNS] I think so, yeah.
I mean, I think especially at the microstructural level,
you kind of have to be, right?
Because at some level, you know, you can't really understand it, right? It really has to do with volumes and speeds of data that are beyond normal human comprehension.
And so, I mean, you can be very careful about your train-and-test methodology and sort of know that there's some consistent predictive power to some state variable.
But knowing why is very, very hard at that granularity, right?
This is just showing you a list of a bunch of kind of
features of the order book that we invented and added
to this basic state space of time and volume remaining,
and the percentage improvement that you get by adding them.
Um, I'm not gonna go into all of them, but, you know, a lot of them are quite straightforward, like: what is the bid-ask spread, right?
So knowing what the bid-ask spread is at a given moment
is a very valuable thing to incorporate into
the state space.
Because in particular, you know, implicitly,
now the policy can learn exactly how high a price
it needs to put in to just get to the bid and
not, let's say, overshoot the bid by a lot, right?
Because if I basically say, like, "Well,
I'm gonna raise my current price by 15
cents," like, well, is that 15 cents just
reaching the other side and eating a little bit of volume,
or is it actually crossing very deep into the book and giving
me terrible, terrible prices?
Things like signing trades are very useful. Signing trades is basically going through the order books and saying whether each trade was initiated by a buyer crossing the spread to the sell book or a seller crossing the spread to the buy book.
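A minimal sketch of one standard way to sign trades from quote data (a quote-rule heuristic with a tick-rule fallback, in the spirit of Lee-Ready; the talk doesn't specify the exact rule used):

```python
def sign_trade(trade_price, bid, ask, prev_trade_price=None):
    """+1 if the trade looks buyer-initiated (above the mid, toward the ask),
    -1 if seller-initiated (below the mid); for trades at the midpoint,
    fall back to the direction of the last price change."""
    mid = 0.5 * (bid + ask)
    if trade_price > mid:
        return +1
    if trade_price < mid:
        return -1
    if prev_trade_price is not None and trade_price != prev_trade_price:
        return +1 if trade_price > prev_trade_price else -1
    return 0  # genuinely ambiguous
```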
So, you know, and some of these things, right,
actually don't help you that much.
So we thought it would be helpful to know exactly
how much volume there is at the bid or the ask.
It turns out it's not -- I mean, it looks slightly harmful, and this could just be for overfitting reasons, right? Like, if I add some variable which doesn't have any informational value, then my model in training will fit to the behavior of that variable, but it's not useful out of sample.
But, you know, it turns out that if you take the best of these features, the ones that did add informational value, and combine them, you get another 13% improvement over and above the 35% improvement that I described on the last slide.
Okay.
[MICHAEL KEARNS] Good.
So, you know, just again to kind of riff or free-ride on Eric's talk, one of the things that's very interesting about modern financial markets and their evolution over the last 20 years or so is that, as he kind of intimated, sometimes in going to automation and transparency, some human aspect of prior trading that served a very useful purpose got eradicated. And, you know, that made certain types of trade much more difficult, and then some new electronic mechanism was introduced to try to replicate the prior functionality.
And so an example of that is, you know, again, this picture of a continuous double limit order book; this has been around since sort of the dawn of financial markets.
It's just that prior to automation on Wall Street, for instance, it would actually be maintained in a paper ledger by the market maker in that stock, who would sort of have these little chits of paper and order them by increasing or decreasing price, whether on the buy or the sell book.
And so there wasn't this kind of transparency. But now, with this kind of transparency, where this data is freely available, anybody who wants to can do real-time reconstruction of limit order books and see kind of the trades of everybody else, not earmarked by identity, right?
You're not, you're not being told, "Oh,
this is a Credit Suisse order, and this is from SAC Capital,"
or something like that.
But you can definitely algorithmically detect,
for instance, when there's a large trade in the market,
when somebody's trying to, let's say, unload a lot of
some stock in some small amount of time, you know,
the way they have to do it is through a market like this.
And they're going to leave an algorithmic trace of their
activity in the data, okay?
So in the old days, let's say, suppose there's some mutual fund manager whose, you know, life goal is to track the S&P 500, okay?
And, you know, because of volatility in the market, let's say, for instance, Johnson & Johnson has done extraordinarily well.
And now because Johnson & Johnson has done
extraordinarily well, they're overweight
Johnson & Johnson, right?
They're out of alignment with the S&P 500.
So they need to unload some Johnson & Johnson,
and they're like a Fidelity or Vanguard fund manager.
So actually, their portfolio is so big that, let's say, they have 1% of the outstanding shares of Johnson & Johnson, a huge stake in a public company, and they need to reduce it to, you know, like 0.8% instead, okay?
This is a massive, massive trade.
And the way the mutual fund manager would have done
this prior to automation is they would have --it
would have been sort of an off-the-floor or what they
would call an upstairs trade.
They would actually call, you know, somebody at a brokerage.
They would call one of their buddies at a brokerage or
multiple brokerages, and they would say,
"Look, I've got this rather difficult trade.
Um, I don't wanna execute it on the floor of the exchange
because it'll just be too obvious what I'm trying
to do, and, you know, prices will move against me.
You know, could you possibly shop around and see if
there's a counterparty?
Maybe there's somebody else out there that happens to want to buy, you know, 0.2 of a percent of all Johnson & Johnson outstanding shares."
There would still be a record of it eventually, at the end of the day or the end of the trade.
It just wouldn't have taken place on the
floor of the exchange.
It would have taken place by people making phone
calls and saying, "Okay, I found a counterparty for
you at this hedge fund or this other investment firm."
Okay?
That kind of got eradicated when the world went like this,
and all of the activity had to be executed through these books
in electronic fashion in order to get competitive prices.
And this was sort of the initial impetus to what
are called dark pools, okay?
So what are dark pools?
Dark pools are another very recent kind of
electronic exchange.
I think the first ones kind of probably came
out about 10 years ago, but they became quite
widespread about five years ago.
Um, they were essentially introduced to deal with this thing that went away that I described, to allow large counterparties to trade with each other at minimal market impact, okay?
The interesting thing about dark pools is that, unlike the double limit order book, you don't specify price and volume the way you do in a limit order. You just specify volume, and whether you're a buyer or a seller; you're not specifying the price.
And so now buyers and sellers, instead of being ordered by price, are instead ordered by time of arrival.
So you just have these two queues of arriving
buyers and sellers.
You just line them up in the order that they arrive.
As long as there are parties on both sides,
you just keep matching them together, okay?
And now the question is, well, if they're just ordered by arrival time and there's no price specified, when I match a buyer and a seller, what price does that transaction take place at? It basically takes place at the midpoint of the so-called NBBO, the National Best Bid and Offer, which is essentially the midpoint of the lit double limit order book market in the US.
Okay?
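A minimal sketch of that matching logic, assuming plain FIFO queues and ignoring real-world features like minimum-fill constraints:

```python
from collections import deque

def cross_dark_pool(buys, sells, nbbo_bid, nbbo_ask):
    """Match FIFO deques of (order_id, qty) buyers and sellers at the NBBO
    midpoint, the price pegged to the lit market as described."""
    mid = 0.5 * (nbbo_bid + nbbo_ask)
    fills = []
    while buys and sells:
        buy_id, bqty = buys[0]
        sell_id, sqty = sells[0]
        qty = min(bqty, sqty)
        fills.append((buy_id, sell_id, qty, mid))
        # Pop or shrink each side; partial fills keep their place in line.
        if bqty == qty: buys.popleft()
        else: buys[0] = (buy_id, bqty - qty)
        if sqty == qty: sells.popleft()
        else: sells[0] = (sell_id, sqty - qty)
    return fills

buys, sells = deque([("B1", 500), ("B2", 300)]), deque([("S1", 600)])
print(cross_dark_pool(buys, sells, 10.00, 10.02))
# -> [('B1', 'S1', 500, 10.01), ('B2', 'S1', 100, 10.01)]
```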
So this is the whole point here, right?
You don't go to a dark pool looking for a price improvement.
You don't go there with a thought like, "Oh,
I'm gonna actually get a better price here than
I could get by trading on the open limit order book."
You go there because you're perfectly happy with the prices
offered on the lit market.
You're just worried that you would never be able to
get them because you'd so quickly push the price in
the wrong direction because of the volume of your trade.
So you sort of have this exchange now that's pegged to an existing exchange, the so-called lit market.
These are called dark, right, because you don't see any of the activity.
you're just told at each second, you know, nothing yet,
nothing yet, nothing yet.
Okay, your trade's executed and here's the price that you got.
And that price should be pegged to the lit market, okay?
This was such a popular idea that there are now many dozens of dark pools in the United States.
And essentially, you go there to compete for liquidity, to get liquidity, not to compete on price.
And so this, you know, generates kind of another interesting kind
of optimized execution problem where instead of trying to
optimize the performance of your trade over time as in
the problem I described before, you're essentially trying to
break it up over space or more precisely over exchanges.
Okay?
So in other words, the problem that's known as smart order routing on Wall Street is that you have these multiple exchanges.
Let's say you have K different dark pools and you have a
certain volume of let's say Johnson & Johnson shares
that you wanna sell, okay?
And at each time step, you need to decide, well,
of my remaining inventory, how should I distribute it
among these multiple exchanges to try to maximize the fraction
of my remaining inventory that's executed at each step, right?
So it's again a problem where you have a well-specified trade, I wanna buy or sell V shares of something, but now, instead of trying to get the best possible prices, I'm just trying to unload this inventory as quickly as possible by dispersing it, or routing it, to these multiple exchanges.
Okay?
So we thought about this problem from a theoretical
standpoint and that led to kind of an interesting
algorithmic solution that we evaluated experimentally.
I don't have time to go into the details, but, you know,
here's one simple cartoon model of liquidity, right?
So, maybe I should state this explicitly.
Different dark pools will have different sort of liquidity
properties at different times and for different stocks, okay?
So for instance, it's quite common when a new exchange starts, whether it's a limit order book or a dark pool, that they'll try to actually get a foothold in the market by offering preferential treatment or rebates or fees to some particular class of stocks.
So you know, you will actually see marketing materials that
say, "Well, you know, if you're trading mid
cap consumer durable stocks, you know,
we are the dark pool you wanna come to.
That's where people who wanna trade consumer
durable stocks are coming
And so this is where your counterparties are.
Don't go to other dark pools because you'll just be sitting
there waiting for your trade to get executed, whereas,
you know, in these stocks, the action is at our dark pool."
Okay?
So, you know, one kind of cartoon model of this is to imagine that, for a given stock, let's say Johnson & Johnson, each dark pool has some underlying stationary probability distribution, so the different dark pools will have sort of different liquidity profiles. And what I mean by this is: let's imagine that on the X axis is the number of shares, okay, and on the Y axis is the probability that that number of shares is available for execution on the other side at each given time step.
So more formally, what I have in mind is that if we submit V shares of our trade to, let's say, this particular dark pool, what happens is that we draw a number S according to this distribution, okay? And that's the volume of the counterparty on the other side in this dark pool at that particular discretized time step.
And so then what gets executed, of course, is actually the minimum of V and S, right?
Because you're never gonna execute more than you asked for.
You might execute less.
So if I submit an order to this dark pool to buy
100,000 shares and S gets drawn and it's a million,
then I'll just be told your 100,000 shares
were executed, okay?
But if I submitted 100,000 shares and the S that was drawn was 50,000, I'll be told half of your order was executed and half is still waiting in line.
Okay?
Okay, so
Yeah.
[AUDIENCE MEMBER] So here, you mentioned you get
to partial execution, but how can the curve
then be non-decreasing?
[MICHAEL KEARNS] This curve is just modeling the idea that, you know, volumes in this range are quite likely to be available, okay, whereas larger volumes are not.
[AUDIENCE MEMBER] Right.
But if you do partial execution, shouldn't it
be a diminishing curve?
[MICHAEL KEARNS] Don't--
So, the partial execution is sort of on my side, right?
So, I submit a volume.
A number is drawn, which is the volume
available on the other side.
The minimum of the two quantities is
what's executed, and then I still have
that remaining inventory that at the next time step
I need to try to execute.
[AUDIENCE MEMBER] It's a density function?
Okay.
[MICHAEL KEARNS] Yeah.
I thought you were asking more of a mechanism question
than a math question.
It's a density. It's a, you know, PDF, not a CDF.
Yeah, Kevin?
[KEVIN] Why is it stable for there to be so many dark pools?
Isn't it in everyone's interest to just go to one?
[MICHAEL KEARNS] These are good questions.
Um, I mean, I'm sure Eric has thoughts on this. And I think Ramesh isn't here, but I know he's thought about these kinds of questions.
Why is there a-- I mean, it's the same thing that happened with the deregulation of financial markets. There's this famous regulatory ruling called Reg NMS, for Regulation National Market System, and I'm not sure it's an equilibrium, but there does seem to be this proliferation, and they compete with each other in various ways.
I mean, it's a good question whether or not at some
point this will shake out and, you know,
there will only be one dark pool or only a couple.
This hasn't really happened in sort of the continuous double auction world either.
I mean, by the way, it used to be the case that, regulatorily, your stock could only trade on the exchange where it was listed, and this was sort of the golden era of specialists and the like.
So if your company listed on the New York Stock Exchange,
the only place to buy or sell shares was on the floor of
the New York Stock Exchange.
So when this got deregulated and basically
any startup, if you like, with the know-how and the
ability to get the regulatory approvals could open up shop
and say, "Hey, we're running a double--" You know.
It's --I mean, I think to Eric's point,
it's a very interesting time because the regulatory landscape
is such that, you know, startup companies can
s-- you know, propose brand new mechanisms, right,
like the one Eric's proposing.
[AUDIENCE MEMBER] To be more pointed about it, if everyone was doing what you're doing here, wouldn't the end point be only one dark pool with trades in it?
[MICHAEL KEARNS] You know from a conceptual standpoint,
I want to agree with you, but so many things in the
actual world of financial markets are obviously out
of equilibrium in some sense.
And so, you know, how to explain that? It could be that not everybody's doing what I'm proposing here, or that they're doing it in different ways.
And by the way, on this question of what game theory has to say about financial markets -- I mean, I think it's very difficult, but at a high level, one of the things we're worried about all the time is how many other people or groups are doing something like what we're doing, right?
Because any given trade has a certain capacity, right?
You can't sort of execute arbitrary volumes, so if
we're all doing the same thing, then our respective piece of
that trade or its profitability is definitely going to decline.
So
[MODERATOR] We're running out of time.
We have run out of time, so let's take further
questions online.
[MICHAEL KEARNS] All right.
So how much time do I have, or is it actually zero? Um, let me just finish this one vignette and I'll skip the last one, and then we can-- Okay.
So it turns out that, you know, in this simple formalism that I just described, if you specify it fully, there's actually a very simple and nice algorithm that you can use.
So first of all, the sort of real
machine learning challenge that comes up in
this model is the fact that your observations are censored
by your own actions, right?
So, there's this issue that if you submit 1,000 shares and all 1,000 get executed, you don't actually know: might there have been more available, you know, a million? And you want to know that, because you're deciding how to distribute your order over these multiple exchanges.
And if you just ask the question like, "Well,
suppose I wanted to estimate these probability distributions
from my censored observations," there's a well-known maximum
likelihood estimator for this, which is called the Kaplan-Meier
estimator or sometimes called the product limit estimator.
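A minimal sketch of that estimator specialized to this setting, where a fully filled submission only tells you the available liquidity was at least your order size:

```python
from collections import Counter

def kaplan_meier_tail(observations):
    """Product-limit estimate of the liquidity tail from censored fills.
    observations: list of (filled, censored). censored=True means the whole
    order executed, so we only learned S >= filled; censored=False means we
    out-sized the liquidity and saw S exactly. Returns {s: est. Pr[S > s]}."""
    events = Counter(f for f, c in observations if not c)  # exactly observed S
    tail, surv = {}, 1.0
    for s in sorted(events):
        at_risk = sum(1 for f, _ in observations if f >= s)
        surv *= 1.0 - events[s] / at_risk
        tail[s] = surv
    return tail

# Three 500-share submissions: two filled completely (censored), one saw the
# available liquidity exhausted at 200 shares (uncensored).
print(kaplan_meier_tail([(500, True), (200, False), (500, True)]))
# -> {200: 0.666...}: Pr[S > 200] is about 2/3; no information above 500.
```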
But we have an exploration problem here as well, right?
We basically wanna learn just enough about each of these pools
to execute our trade optimally.
So, if I'm never executing orders of more than 100,000 shares, I don't need to know the probability of a 10-million-share order getting executed.
I sort of only need to know it up to a certain prefix.
And so there's a very simple and appealing algorithm that has a bit of an EM flavor to it. You basically just start trading, let's say at the beginning, distributing your orders uniformly over all of the different dark pools. Then you start getting information about them, and you can, for each one of them, form the Kaplan-Meier estimate of the distribution of liquidity behind that dark pool.
It's wrong, of course, okay?
But then you could just pretend that it's right and do optimal
allocation with respect to those estimated distributions.
When you do that, of course, you will then get new
information based on the orders that you submitted, and
then you can just re-estimate.
So it's a simple loop: do greedy allocation under your current distributional estimates, then re-estimate them with the results of the next round of executions.
And one can prove that this converges quickly, you know,
with a polynomial rate of convergence, to the optimal
allocations that you would have made if you knew the
distributions from the start.
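Putting the pieces together, a minimal sketch of that loop; the greedy step allocates each marginal share to the pool with the highest estimated probability of filling it, and the venue and estimator interfaces are hypothetical stand-ins:

```python
def greedy_allocation(tail_fns, V):
    """Allocate V shares one unit at a time; each marginal share goes to the
    pool with the largest estimated Pr[S_k >= its current allocation + 1]."""
    alloc = [0] * len(tail_fns)
    for _ in range(V):
        k = max(range(len(tail_fns)), key=lambda j: tail_fns[j](alloc[j] + 1))
        alloc[k] += 1
    return alloc

def allocate_and_relearn(pools, V, n_rounds, fit_tail):
    """Greedy allocation under the current Kaplan-Meier estimates, then
    re-estimation from the newly observed (censored) fills.
    pools[k](v) -> filled shares (i.e. min(v, S_k)); fit_tail(history) -> a
    tail-probability function. Both interfaces are assumptions."""
    history = [[] for _ in pools]
    for _ in range(n_rounds):
        tails = [fit_tail(h) for h in history]
        for k, v in enumerate(greedy_allocation(tails, V)):
            filled = pools[k](v)
            history[k].append((filled, filled == v))  # censored iff fully filled
```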
Um, and for those of you that know a little bit of the reinforcement learning literature, the analysis of this is actually quite reminiscent of the analysis of things like E3 and Rmax, which are sort of provably convergent algorithms for learning in Markov decision processes.
Here, we don't have a Markov decision process,
but the state is sort of our informational state about
the different liquidities of the different exchanges
rather than an actual state.
Okay?
So, I had another vignette that I won't talk about, but you can go look on the slides for the pointer to the paper at least.
And let me stop there and take any questions if there's time
or we can just morph into lunch.
Yeah, Andrew.
[ANDREW] So, Michael, there's a really interesting
connection between your paper and Eric's, and I think a way
to put it together to make it relevant for algorithmic --So
obviously, you're not the only one doing this on the Street.
But there are a finite number, a relatively small number, of HFTs out there that are doing this.
So that means that you can actually run a very simple
simulation where suppose that you're not the only
one using your algorithm.
What if there are another two or three other players that
are using similar algorithms but perhaps have different,
slightly different parts of the data?
Can you figure out what that interaction might produce
in terms of the outcome of prices, not mathematically,
but computationally?
And then, to connect it to Eric's paper, you can ask the question: what market structures with what kinds of algorithms are gonna be ideally suited for market stability versus fragmentation, all the issues that regulators care about?
And then start asking more sophisticated questions
about how market structure and algorithms interact.
So it almost seems like you can combine what you're doing with
what Eric is doing in a way that could be really interesting.
[MICHAEL KEARNS] Yeah.
I mean, certainly one could do simulations.
The problem is, you know, kind of saying what it means
to be doing something like what we're doing, right?
I mean, this isn't quite an answer to your question, but think about things like the flash crash or even, like, the broader crash in '08.
I mean, there's a sense in which the more automation we have and
the more data that's available, the more inevitable it is
that we're going to have incidents like that simply
because when you have many, many parties that are all using
something like machine learning on the same historical data,
that's going to correlate our behavior automatically, without anybody intentionally trying to copy anybody else.
Like, just take this execution problem I described.
If any of you thought about this problem,
you would have come to kind of a reinforcement
learning approach.
If you then said, "Oh, let's go optimize that," you would
have ended up with, you know, things that were probably
numerically very similar to what we ended up with
in our policies, okay?
So, you know, in some sense, right, I mean,
people talk about, like, the smarts on Wall Street and
the quants and, and, you know --
But, at the end of the day, right, the first-order effect here is that our behavior is being correlated through shared information, and we're all kind of optimizing with respect to that shared information.
And this was just not possible 30 years ago, right?
The data wasn't there.
I mean, I think that just forced there to be greater diversity in kind of the approaches people took to the same problem.
And so, I mean, the reason I relate this to the '08 crash is, you know, maybe not the source of the crash, but the final steps, which was this liquidity crunch.
You know, like lo and behold, all these hedge funds suddenly
find when they try to go start unwinding their portfolios that
we all have the same portfolios.
So you know, we're all short Johnson
& Johnson and so we're all trying to buy back
those shares to cover our short positions at the same
time, and we all probably got to those portfolios through
this kind of group optimization.
[ANDREW] So if you put together these kind of algorithms and
the potential for correlation, that actually has some very
significant implications for whether you should
use continuous versus batch auctions,
which gets back to Eric's paper.
So I think that this could actually be a really
interesting combination of demonstrating for various
[MICHAEL KEARNS] --yeah.
I agree.
I think there's great research opportunities.
You know, you can see I'm coming from the other side of the fence, which is kind of deeply mired in the details of the way the exchanges currently work.
And so I think the hard thing in formulating research questions here is which details you can safely throw out and remain relevant to actual practice, and which are essential. And, you know, that's a hard question in general, but very hard in financial markets.
Yeah, Ashish.
[ASHISH] So are you ever able to detect the signature of another trading firm? Like, um--
[MICHAEL KEARNS] We don't really spend time looking at that, because we're actually not engaged in HFT per se.
I mean, we need to deal with HFTs as counterparties.
I'm sure we're trading with them all of the time every day.
I mean, anybody who's buying or selling stocks at some point is indirectly, you know, trading with them as a counterparty, because they're a very large fraction of the trading volume.
But it is definitely possible to do this.
I mean, you can definitely--
[AUDIENCE MEMBER] --perhaps even after the fact, like,
tomorrow you will see, well, this must have been a trade
executed by that hedge fund.
[MICHAEL KEARNS] Yeah.
I mean, it would be statistical knowledge, right? You wouldn't be able to go and say, like, "Well, you know, look at this sequence of--"
I mean, because people are breaking up their orders
over time and venues, right?
So you wouldn't be able to say, "Okay, you know,
I'm gonna mark in red each one of the trades that was a child
order of some large block trade that somebody was trying to do."
But you'd definitely be able to say, "You know,
just given the statistics historically and in this
period, it's quite likely that there was, you know,
some large party trying to execute a high volume trade
in a short period of time."
Those kinds of signatures are definitely in the data, and you don't need anything super sophisticated -- I mean, it would be visible even from just the execution data, like trade-and-quote data.
You wouldn't need to do, like, limit order book
reconstruction to see those kinds of signatures.
Maybe we should let people go to lunch if we're gonna have
any chance of starting on time.
Thanks.
(applause)