This presentation discusses the algorithmic challenges in modern electronic financial markets from the perspective of a quantitative trading group, focusing on optimizing trade execution and managing market impact through machine learning and control theory.
Okay.
[MICHAEL KEARNS] Thanks, Kostas, and thanks for having me.
And I want to thank Eric for his excellent talk because it also
helps set the context for a lot of what I'll be discussing here.
You can think of much of what I'm gonna talk about today as being about what it's like to live on the other side of these exchanges and to deal with HFTs and other counterparties as well.
Um, I also should say that there's really gonna be no explicit game-theoretic content in this talk, which Kostas promised me was okay, but there will be quite a bit of practice.
Um, as many of you know, for the last 12 years or so I've been working with a quantitative trading group on Wall Street in my sort of shadow life. And I wanna give special thanks to my trading partner, Yuriy Nevmyvaka, who's a co-author on all this work.
All this work kind of came out of some proprietary commercial
context in our trading group.
And the parts that I'm talking about are sort of
the non-proprietary parts that seemed scientifically
interesting to us over time.
Uh, and so, you know, this is kind of just a context slide for a lot of the things that Eric said.
We certainly live in interesting times on Wall Street, both from
a technological perspective and also more generally
from a social perspective.
So here's the same image that Eric showed of the
flash crash a few years ago.
Here's the famous book by Michael Lewis,
which I highly recommend, on high-frequency trading.
We have Bernie Madoff, other insider traders, and the like.
And so without passing moral judgment on any of
this activity, including high-frequency traders, what
I wanna do in this talk is just lay out what some of the
algorithmic challenges are in trading in modern electronic
financial markets from the perspective of somebody working
in a quantitative trading group, essentially something like
a traditional stat arb equities trading group.
What we do in my group is trade equities long and short.
We trade pretty much every liquid instrument
in many, many markets, both domestically
and internationally.
And we hedge exclusively with futures, so we're
not doing any complicated derivatives or the like.
So we're in kind of the simplest markets or the
simplest instruments.
But as Eric has sort of already outlined today,
a lot of the complexity of trading those environments
today comes from the automation of those markets and the data
that's available and what you do with it and what
your counterparties do with it as well.
And needless to say, because of the rising automation and
availability of data on Wall Street, there are many,
many trading problems that one encounters that invite a
learning approach, a machine learning approach, right?
Because you have the data, it's at a temporal and
spatial scale that you can't understand as a human being.
And so one has to develop algorithms to trade sensibly,
and those algorithms have to be adaptive and trained or should
be trained on historical data.
Sometimes a lot of historical data, including very,
very recent data.
And so what I'm gonna do is just try to give a high-level talk where I quickly outline three little vignettes or case studies of specific problems that arise in algorithmic trading in modern electronic markets, what challenges those problems present for traditional machine learning, and some hints about how you can address those problems with new techniques.
And so the first two problems I'm gonna talk about, which are quite closely related to the world that Eric was describing, are both about problems of optimized execution, where a person or a model or an algorithm has made a decision to execute some trade, to, like, buy, you know, 100,000 shares of Google or short 100,000 shares of Yahoo, what have you.
And now the problem is actually executing that trade,
because a directive like that is underspecified.
In particular, it's underspecified in
its temporal aspects, right?
So you know, I can say I wanna buy 100,000 shares of something,
but how quickly do I want to do that, okay?
And so as we'll see, there's this trade-off in all electronic markets.
And by the way, I don't think any of this would
go away with something like the very sensible
alternative mechanisms that Eric is proposing.
You have this trade-off between immediacy and price.
You can be in a hurry and impact prices more
and get worse prices, but you might wanna do that
because let's say you think that the informational advantage you
have is fleeting and you wanna execute the trade quickly.
Or you can try to do things in a more leisurely way and try
to let the market come to your preferred price, but then it'll
take longer in general, okay?
And so the first two problems I'm gonna talk about are sort
of very specific instances of that kind of immediacy versus
price trade-off that you see in electronic markets.
The last one is a little bit more on kind of algorithmic
versions of kind of classical portfolio optimization,
mean-variance optimization.
The kinds of problems where you wanna hold a diversified
portfolio that, you know, kind of maximizes your
return subject to some risk considerations or
volatility considerations, okay?
And I may not finish all of this, but I'll hopefully at
least say a few words about all of these topics.
But I think I did save some time because Eric very
nicely laid out kind of what the modern continuous double
limit order auction looks like.
This is what's sometimes referred to in financial circles
as market microstructure.
And so let me start off by just describing
a canonical trading problem.
We already mentioned it a bit.
Let me make it a little bit more formal.
Suppose that we have a particular trading goal,
like we wanna sell these shares of some stocks.
Buy or sell is symmetric, but just to be concrete,
let's say that we wanna sell a certain number of shares, some
volume that I'll call capital V.
And I also have a specified time horizon in which I
wanna execute that trade.
Um, so I wanna sell
V shares in T time steps,
and of course if I'm selling,
I wanna maximize the revenue from
those shares.
And if I'm buying, I wanna minimize
the expenditure for those shares.
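Written as an optimization, the selling version looks like this (a minimal formalization consistent with the setup described in the talk, not necessarily the exact objective used in the group's work):

```latex
% Sell V shares over T discrete steps; x_t >= 0 shares are sold at step t
% at realized per-share price p_t, which depends on our own actions
% through market impact.
\max_{x_1,\ldots,x_T} \; \mathbb{E}\left[\sum_{t=1}^{T} p_t\, x_t\right]
\quad \text{subject to} \quad \sum_{t=1}^{T} x_t = V
% (and symmetrically, minimize expected expenditure for a buy).
```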
And as you might or might not imagine, there's sort
of a very large number of benchmarks that have been
proposed for how to measure the quality of such a trade.
So if I actually go out in the market and execute an order to, let's say, buy a certain number of shares of Google in a specified time period, we have to ask, well, you know, "How well did I perform," right?
And almost all of the sensible measures kind of measure your
performance relative to the activity that was going on
in the market at the time that you executed your trade.
So a very typical thing to do would be to say,
let's say I started this trade at 10 o'clock in the morning.
I executed it over the course of an hour.
How well did the actual average share price that I got in my
execution compare to, let's say, the volume-weighted average
price of the stock during that same period?
That's called the VWAP.
There's something called the TWAP.
I'm, in this particular case, going to use something that's called the implementation shortfall, where you look at the midpoint between the bid and the ask at the time you started your trade, which of course, by the way, is itself a fictitious benchmark because by definition, there's no liquidity at the midpoint between the bid and the ask. But it sort of seems like a fair peg point, and so I compare my performance to that and ask, you know, how much worse I did than that.
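As a concrete illustration, here is a minimal sketch of how the VWAP and implementation-shortfall benchmarks just described might be computed; the data layout is hypothetical, but the two formulas themselves are standard:

```python
def vwap(trades):
    """Volume-weighted average price over a list of (price, volume) pairs."""
    total_volume = sum(v for _, v in trades)
    return sum(p * v for p, v in trades) / total_volume

def implementation_shortfall(fills, bid_at_start, ask_at_start, side="buy"):
    """Per-share slippage of our own fills versus the bid-ask midpoint at
    the moment the order started (the 'peg point' described above)."""
    mid = 0.5 * (bid_at_start + ask_at_start)
    avg_fill = vwap(fills)  # same formula, applied to our own fills
    # For a buy, paying above the mid is a cost; for a sell, receiving below it is.
    return (avg_fill - mid) if side == "buy" else (mid - avg_fill)

# Example: bought in two fills; the market was 10.00 bid, 10.02 ask at the start.
fills = [(10.02, 600), (10.03, 400)]
print(implementation_shortfall(fills, 10.00, 10.02))  # ~0.014 per share
```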
It'll almost always be worse on average than the midpoint of the bid-ask because, you know, in all of
these problems, your trading activity
itself is going to push the prices in the
direction that disfavors you.
Okay.
If I'm buying a large volume of shares, it's gonna push the price up.
If I'm selling, it's gonna push
the price down.
And this is entirely due, well, it's at least largely due to
just the mechanical impact I have on the order book, right?
As I cross the spread and eat liquidity from the other side,
it makes the successive prices worse and worse.
Okay?
So whatever metric you look at, if any of you thought about this problem seriously for just a little while, you'd come to the view that a very natural framework for such problems is state-based control, okay,
where there are at least two very, very obvious
state variables you wanna consider: how much inventory
you have remaining, i.e., how many of the V shares do
you still have left to buy in the given time period,
and how much time is left.
Okay?
And you might also want to think about other features
or state variables that capture other kinds of
market activity that might have informational advantage as well.
Okay?
So this is just a snapshot, sort of showing what a typical
order book looks like.
This is from many, many years ago, in Microsoft on Island at the time, which has now been rolled up into some larger exchange.
But you see the buy orders arranged in order of decreasing price, the sell orders arranged in ascending price.
By definition, there's always
gonna be a gap between these two,
which is the bid-ask spread that Eric was discussing.
There are volumes at each of these, and just to remind you, the way these markets work is, you know, if I'm trying to, let's say, buy shares, I can place a limit order. And that limit order will either sort of sit somewhere in this price ladder, or it might actually cross the spread and execute against the existing liquidity on the sell book, okay?
And it might partially match.
So if I put in an order for, I don't know, let's say a thousand shares at, you know, 23.79, I would execute 900 of those shares, the first hundred at this price, the next 800 at this price, and then my remaining 100 shares would become the new top of the buy book.
Okay?
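A minimal sketch of that mechanical matching step (prices and sizes are made up; real matching engines also handle time priority within a level, cancels, revisions, and so on):

```python
def match_limit_buy(ask_book, limit_price, qty):
    """Match a limit buy against an ask ladder sorted by ascending price.
    ask_book: list of [price, volume] levels, mutated in place.
    Returns (fills, unfilled_qty); an unfilled remainder would rest as
    the new top of the buy book at limit_price."""
    fills = []
    for level in ask_book:
        price, available = level
        if qty == 0 or price > limit_price:
            break  # remaining ask levels are above our limit price
        take = min(qty, available)
        fills.append((price, take))
        level[1] -= take
        qty -= take
    ask_book[:] = [lvl for lvl in ask_book if lvl[1] > 0]  # drop empty levels
    return fills, qty

# The example from the talk, with invented numbers: a 1,000-share buy that
# eats 100 shares at the best ask and 800 at the next level, leaving 100.
asks = [[23.75, 100], [23.78, 800], [23.90, 500]]
print(match_limit_buy(asks, 23.80, 1000))
# -> ([(23.75, 100), (23.78, 800)], 100)
```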
So it's a very mechanical process.
And as Eric pointed out, it's extremely volatile these
days, it's extremely dynamic.
The sort of timescale at which orders arrive and executions
take place is, you know, often at the sub-millisecond
level in highly liquid stocks or even in microseconds.
And as he mentioned also, right, you know, everything's possible.
I can cancel.
I can revise my order.
Partial executions may occur.
And so, you know, one very interesting,
very difficult question or type of research to ask is,
how do individual, you know, sort of microscopic orders
influence aggregate macro behavior, you know, of the
overall aggregate market?
But from the purposes of a trading group like ours,
one of the things we care about most is this trade-off
between immediacy and price.
So, you know, I won't explain this chart in any detail, but what it's showing you on the X axis is sort of what my limit order price is, let's say if I'm buying, right?
And at the left extreme I'm sort of considering prices
that are sitting way down in the book, so they're very,
very far from where the market is currently trading.
And so what I'm showing you here is the performance of a
very simple class of strategies for this problem of execute
V shares in T time steps, which is what I call kind of
submit-and-leave strategies.
Okay?
And in a submit-and-leave strategy, what you do is
you just pick a fixed limit order price,
put all these shares at that limit order price,
hope that they get executed, but any volume that's not
executed after the T time steps, you have to cross the spread
and buy the shares at whatever prices they're offered on the
other side of the book, okay?
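A minimal sketch of evaluating one such strategy on replayed historical data; the replay interface, and in particular how passive fills get attributed to our resting order, are modeling assumptions:

```python
def submit_and_leave_cost(book_replay, limit_price, target_qty):
    """Rest target_qty to buy at limit_price for the whole horizon, then
    cross the spread at the deadline for whatever remains unfilled.
    book_replay yields (passive_fill_qty, ask_ladder) per time step, where
    passive_fill_qty is how much of our resting order we assume got filled."""
    filled, cost, final_asks = 0, 0.0, []
    for passive_qty, ask_ladder in book_replay:
        take = min(passive_qty, target_qty - filled)
        filled += take
        cost += take * limit_price
        final_asks = ask_ladder  # remember the last observed ladder
    remaining = target_qty - filled
    for price, available in final_asks:  # deadline: sweep the ask side
        take = min(remaining, available)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    return cost
```

Sweeping limit_price over a grid on historical data is what traces out the unimodal curve with the sharp peak in the figure.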
Because in all of these, you know, as a practical matter, it's very much the case that whenever
we execute a trade, we always set a time limit
on it because we essentially have some model or empirical
measurement historically of how long our informational
advantage lasts, right?
So if an analyst upgrades a stock, for instance, there's
some informational advantage in trading on that activity, on
that upgrade very, very quickly.
But, you know, after five minutes,
it's gone because the market's essentially
already incorporated that news.
Okay?
So what you see here is that there's this very distinct peak.
If I put my orders very deep down in the book,
then nothing gets executed.
I have to cross the spread at the end and pay the prices on offer.
At the other extreme, I could cross the spread
at the beginning, right, and pay the prices on
offer at the beginning.
These two things are roughly equal,
but you can see that there's a great deal of optimization.
There's sort of a magic price at which I should
put my limit order if all I'm gonna do is leave it there
and then cross the spread with the remaining inventory at the
end of the time interval, okay?
So this already shows you that even within this kind of brain-dead class of strategies, there's actually quite a bit of optimization to do, right?
There's a very, very sharp peak, and this is quite typical.
I don't remember what stock this is in.
Yeah, Ashish.
[ASHISH] Does this include the market reaction?
[MICHAEL KEARNS] No.
So in everything I'm gonna describe here,
I'm sort of only accounting for the mechanical impact.
And the mechanical impact is the only thing you can
kind of optimize for using historical data, right?
Because by definition, you can't ask the counterfactual question
of what would the psychological, or if you like, you know,
the sort of non-mechanical impact be.
I mean, you can have models for it, and there are many models
proposed in the literature.
But from the viewpoint of a group like ours, I mean,
we don't even bother looking at it in those models because we
know that at the end of the day, the only way you're gonna
measure that is by actually starting to trade, right?
And so we, of course, do that in small volumes at the beginning
when we're trying something new and gradually ramp it up as we
get some empirical sense of what that impact is like.
And as you can imagine, right, I mean, all these things
have to do with liquidity.
It's much easier to hide your activity and have it have lower
mechanical impact and kind of counterfactual impact in a
highly liquid name than one-- than something that trades
very infrequently.
Okay.
[AUDIENCE MEMBER] This graph is meant to be general, or--
[MICHAEL KEARNS] This is for some specific stock for some specific period, but this is extremely generic, right? If you go to any reasonably liquid stock, you would see the same shape; only the exact location of the peak differs.
Like, the zero here means kind of putting your order
at the current bid or the ask.
So, you know, kind of putting your order in, not crossing the spread, but not sitting down too far, right?
Because, you know, in a highly illiquid
stock, if you sit down too far, on average,
you're just never gonna get executed because, you know,
even if a little bit of liquidity gets eaten away,
other counterparties are just filling that vacuum behind it.
Okay?
So it's always unimodal, and it always has a relatively sharp peak.
Exactly where that peak is might depend on the stock in question,
but it's a very generic empirical figure.
Okay.
So you know, reinforcement learning, which is essentially the AI term for, you know, discrete state-based control, is sort of a perfect fit for solving this kind of
problem, where as a first cut, what you might do is just
say, all right, I'm gonna keep track of two things,
the amount of inventory I have remaining and how much time I
have remaining in whatever period I've specified over
which I want to do the trade.
And I won't go into it, but we did, you know, many years ago,
this sort of large-scale empirical study where we
applied kind of fairly generic reinforcement learning using
historical data to learn kind of optimized policies using
those two state variables.
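To give a flavor of that, here is a minimal sketch of generic tabular Q-learning over the two discretized state variables; the episode simulator, reward convention, and discretization are illustrative assumptions, not the exact setup of the study:

```python
import random
from collections import defaultdict

def q_learn_execution(episodes, n_time, n_inv, n_actions, alpha=0.1, eps=0.1):
    """Each element of `episodes` is a hypothetical simulator replaying one
    historical interval: step(t, i, a) -> (reward, executed_buckets), where
    the reward is execution cost relative to the arrival-time midpoint."""
    Q = defaultdict(float)

    def greedy(t, i):
        return max(range(n_actions), key=lambda a: Q[(t, i, a)])

    for step in episodes:
        t, i = n_time - 1, n_inv - 1          # full time and inventory left
        while t > 0 and i > 0:
            a = random.randrange(n_actions) if random.random() < eps else greedy(t, i)
            reward, executed = step(t, i, a)  # reprice our order, observe fills
            t2, i2 = t - 1, max(i - executed, 0)
            best_next = max(Q[(t2, i2, b)] for b in range(n_actions))
            Q[(t, i, a)] += alpha * (reward + best_next - Q[(t, i, a)])
            t, i = t2, i2
    return Q  # learned policy: argmax over a of Q[(time_left, inventory_left, a)]
```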
And, you know, the kind of takeaway
message here is you learn very sensible
policies, and this is what you want, right?
I mean, in general, we're kind of the opposite of black-box quants.
Anytime we use machine learning in a trading problem or in
some kind of strategy model, if we can't back out in our
heads some reasonable kind of economic rationale for
why this makes sense, we're extremely uncomfortable
deploying it in actual trading.
And so what you want to kind of see are pictures like this.
I'm showing you the actual policies learned over three
relatively liquid NASDAQ stocks where the state space is sort of
on the XY plane, where you have a certain amount of time left.
I've discretized the variables here.
So this is sort of decreasing amounts of time left in your
interval and decreasing amounts of inventory left
in the size of the trade that you're trying to execute.
And basically, the z-axis here is measuring, you know, whether you're kind of sitting further back in the book versus crossing the spread. So a higher value means you're crossing more, the higher the price you're buying at, right? You're trying to cross the spread.
And you can see here, you know, very sensible behavior.
As you're running out of time and you have a
lot of inventory left, then you better get going
and start being more aggressive in your trades.
If you've got a lot of time left and very little inventory,
so this might have happened because, you know,
at the beginning, some other counterparty
kind of crossed the spread and executed a lot of your shares
very early in the interval.
Well, now you can afford to be patient, right?
Now you can afford to drop down in the book and hope
for price improvement and get a better price.
Okay?
So, you know, if you do
this across many, many different stocks and
over many, many different historical periods, you
will always get kind of a shape that looks like this.
But the details really matter here, right?
The numerical details really matter because different stocks
have different bid-ask spreads.
The bid-ask spread determines, you know, how far above the bid your price has to be to actually reach the buyers or the sellers on the other side, et cetera.
So this is all kind of good news-- Question.
Yeah.
[AUDIENCE MEMBER] So these deadlines, an example would be like what you said: the analyst came, made an upgrade, and now you just know you have--
[MICHAEL KEARNS] Yeah, yeah.
And, you know, to talk about the practice of it a little bit more, these volumes and these time intervals, we would kind of self-impose them.
Now, we might optimize them, right?
Because everything here is a moving part that interacts
with everything else.
So for instance, and this is the kind of thing we're always trading off, we might know that we can on average get better prices by taking a longer time interval, but then we'll also know there's essentially a decay of what you would call our alpha on Wall Street.
Like our informational advantage, right?
And so we'll always be trading those two things off
and testing them on historical data to the extent that we can.
But then at some point, you know, whether we're using kind of direct market access algorithms or trading through a prime brokerage, we have to have a directive, right?
You kind of can't say to the brokerage, or you definitely don't wanna say to the brokerage, "Oh, you know, why don't you buy some Google shares over whatever amount of time you feel like."
Right?
We have to actually give them a very specific directive.
[AUDIENCE MEMBER] So it seems that something more complex would be some kind of learning model where, as this is happening, you might detect things that will make you want to change your time limit, right?
[MICHAEL KEARNS] Potentially, yeah.
Um--
[AUDIENCE MEMBER] A more complex learning problem.
[MICHAEL KEARNS] Yeah.
And by the way, one form of that, for instance, is that it's quite often the case that when one analyst upgrades their view on a stock, other analysts upgrade their view on that same stock in quick succession.
And so one of the things you need to decide there is like,
is that actually fresh news or is this just kind of an artifact
of some other more basic news that's getting into the market?
And then you know, in that case, you might not even wanna
trade on it at all.
Yeah.
[AUDIENCE MEMBER] I don't understand why at the end of
your time period you want to buy the rest of the volume.
Why not just give up at that point?
You bought what you bought.
[MICHAEL KEARNS] I mean, the high-level answer -- so there are two answers to this.
One is if you're an actual brokerage like Bank of
America and you have an algorithmic trading desk,
you're performing these actions on behalf of a
client that's given you the directive, right?
And so you don't ask, "Well, you know, hey, I didn't get all your shares of Google. Are you sure you still wanna buy them?" The language in which they communicate with you is
a directive to buy this many shares in this amount of time.
From our standpoint, we do have that option,
but the reason we don't is because at some point we've
kind of optimized all of these things together and realized
that this is sort of the right amount of volume to buy based
on this informational signal.
And by the way, that volume is sort
of usually the maximum that we think we can get
away with without pushing prices around too much.
So you could do something like what you're suggesting, but I
think, as a practical matter, there's sort of so many
things to think about in this world that the fewer
moving free parameters you can have, the better.
Yeah.
[AUDIENCE MEMBER] The counterfactual market impact being ignored seems to be a big problem in my mind. When you do the testing on the six months following, are you doing that live, or is that--
[MICHAEL KEARNS] Still using the model. So, you know, you can think about this study as limiting the scope to the improvements you can get on that particular kind of market impact, okay?
And in general, you know, as a practical matter,
the way we deal with the second kind of impact is
by, you know, I mean, first of all, we have many rules.
In fact, we even have like contractual regulations on
how much volume, you know, sort of how much volume we can
trade in any particular stock.
Okay, so for instance, my specific group, you know,
we will never be more than, I mean, we have a contractual
limit that says we will never constitute more than 1% of
the average daily volume in any single stock.
But we don't even ever wanna get close to that.
We will sometimes be a half a percent.
And by the way, we're not an especially
large group, right?
So there are a lot of groups with this sort of behavior, willing to be, you know, like a quarter or half a percent of the average daily volume.
Yeah, Alan.
[ALAN] So I'm very happy to see that you're talking about glass-box rather than black-box machine learning, and I guess the point is for human beings to be able to understand what is going on. I wonder, given the complexity of what's being learned, do you have special tools to explain these policies to people, or is it more done by eyeballing? What is the--
[MICHAEL KEARNS] I mean, in the case of a problem like this,
it's very clear what sort of sensible behavior means, right?
If somehow the learning process told me, "Oh, you're running out of time, you have a huge amount of inventory left, and so, you know, lower your buy price," I'd know something was wrong. This is kind of obvious.
The harder thing is when you have strategies that work well historically, maybe are even working well in live trading, and you don't have a clear handle on why. And so here, I mean, I think just the nature of the problem makes what's sensible clearer.
Other times, we admittedly find ourselves thinking about it almost from an anthropological perspective, which is like, you know, if this trade is consistently making money, there's gotta be some other group of counterparties out there that are taking the other side of this trade.
And what is their reason for persistently doing this, you know, other than stupidity, right?
And actually, you're worried if the reason is stupidity, right? Because, you know, people often realize that they're doing something stupid and stop doing it, and then you can lose a lot of money.
But sometimes there's a good reason. I mean, to give a concrete example, you can very consistently make small amounts of money by, in the period before the market opens, laying down a ladder of limit orders on both sides of the book in every liquid stock.
But spaced far enough apart that if your first order gets hit, it represents a significant deviation for that stock in, you know, the opening 10 minutes, okay? And if your second order gets hit, it's an even larger deviation.
So basically, just laying down
these orders on both sides of the book,
seeing if anything gets hit, and then liquidating it over,
let's say, the first two hours after the market open very
consistently makes money.
It's not easy to implement this in practice -- I should say that the engineering details of this are non-trivial.
Well, the reason is that many parties that sort of
trade on a slower time scale, like mutual fund managers,
first of all, they like to trade at the open or the close
to try to benchmark to opening and closing prices, because
that's how their performance is benchmark, A. and B, right,
they're often reacting to overnight news, right?
And so they're all coming in the morning, but it's sort of
worthwhile for them to do so, because they wanna get it close
to the open or to the close.
The same thing works on the close as well.
So there, it's not like there's an obvious optimization criterion and a sensible behavior to it. There's more of a sociological explanation to it.
Okay.
So now, again, just on this aspect of mechanical market impact: if you compare, you know, how well you can do by keeping track of these two simple state variables and taking kind of a control-theoretic approach, versus, let's say, the best you could have done by optimizing within this class of submit-and-leave strategies, right, which don't know about time and volume remaining, they just keep an order fixed and then, when time runs out, they cross the spread --
You're already getting 35% improvement on
average using this kind of approach compared to that.
Um, and this is a meaningful improvement in this world because, you know, I think as Eric mentioned, on this particular problem, meaningful improvements in performance are measured in basis points, which are hundredths of a percentage point, right?
So if you're sort of getting 35% improvement over something
reasonable, it's non-trivial.
Yeah.
[AUDIENCE MEMBER] So the submit-and-leave here, in comparison, is measured with submit-and-leave prices optimized with hindsight?
[MICHAEL KEARNS] Yes.
So, it's like a learning-to-learning experiment.
So either do reinforcement learning on the historical
data or use the historical data to find the best single
limit order price within this restricted class of strategies.
You already get a 35% improvement.
[AUDIENCE MEMBER] But the reinforcement learning,
is that also given hindsight or not?
[MICHAEL KEARNS] No.
So it's all train-test methodology.
We take a period of historical data.
We train: reinforcement learning learns some policy
on that historical period.
Over the same historical period, we find the optimized submit
and leave strategy.
We compare both of them on this successive out-of-sample
time period, and you're getting a 35% improvement.
[AUDIENCE MEMBER] So do you compare only when there's some sort of information change in the market, or is the training data just any arbitrary interval?
[MICHAEL KEARNS] Okay.
So I mean, these experiments we did over a long historical
period with many, many kind of sliding windows of
training and testing data.
To get into the details a little bit, one of the things that's hard in applying machine learning to financial or market data is that, if you're not careful, you can unintentionally learn all kinds of, you know, just sort of drift, okay?
So if you happen to train over a period in which the market is just rising overall, you'll learn a policy that basically just says, "You should buy," right? Because, at a high level, the market is rising.
And you don't want to learn this kind of drift.
You can also learn other sorts of seasonalities, by which I don't mean, you know, fall and winter, but other kinds of things: different trading days have different volumes, different periods of the day have more or less trading activity.
And so there's a lot of engineering that goes into creating your models in a way that makes sure you're not just learning these seasonalities or these kind of directional drifts.
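For instance, the sliding-window train-and-test pattern mentioned above looks roughly like this (window lengths here are arbitrary, an assumption for illustration):

```python
def walk_forward_windows(n_days, train_len=60, test_len=20):
    """Yield (train, test) day-index ranges as sliding windows, always
    testing strictly after the period the model was trained on."""
    start = 0
    while start + train_len + test_len <= n_days:
        yield (range(start, start + train_len),
               range(start + train_len, start + train_len + test_len))
        start += test_len

for train_days, test_days in walk_forward_windows(250):
    pass  # fit the policy on train_days, evaluate out-of-sample on test_days
```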
[ALAN] Okay.
[MICHAEL KEARNS] So, you know, once you see that this sort of simple control-theoretic, state-based approach works reasonably well, it's of course natural to ask, "Well, maybe I should add other state variables, which don't just have to do with how much time and volume are remaining, but that are actually looking at, you know, what's going on in the market."
And since we're in kind of a double limit order book
microstructure setting, it's quite natural to
look at features of the order book itself, right?
Or the recent temporal activity in the order book.
And by the way, I am of the belief, or kind of semi-knowledge, that a lot of HFT firms are not trying to solve execution problems per se; they're actually trying to predict directional movement in the market, right?
But I think they're using a lot of the same kind of
machine learning techniques.
I think they're spending a lot of time sitting around
inventing features of the order books that might have
some slight informational advantage in, let's say,
predicting whether the midpoint of the spread
is gonna go up or down in the next 500 milliseconds,
for instance, okay?
It's not easy work and it requires sort of technology
that's not really in our domain, so we don't do
that kind of thing.
But I think the methods are very much the same.
And I won't go into this.
[AUDIENCE MEMBER] I mean, they'd probably be a lot
more open to black box, right?
Because I--
[MICHAEL KEARNS] I think so, yeah.
I mean, I think especially at the microstructural level,
you kind of have to be, right?
Because at some level, you know, you can't really understand it, right? It really has to do with volumes and speeds of data that are beyond normal human comprehension.
And so, I mean, you can be very careful about your train-and-test methodology and sort of know that there's some consistent predictive power to some state variable.
But knowing why is very, very hard at that granularity, right?
This is just showing you a list of a bunch of kind of
features of the order book that we invented and added
to this basic state space of time and volume remaining,
and the percentage improvement that you get by adding them.
Um, I'm not gonna go into all of them, but, you know, a lot of them are quite straightforward, like: what is the bid-ask spread, right?
So knowing what the bid-ask spread is at a given moment
is a very valuable thing to incorporate into
the state space.
Because in particular, you know, implicitly,
now the policy can learn exactly how high a price
it needs to put in to just get to the bid and
not, let's say, overshoot the bid by a lot, right?
Because if I basically say, like, "Well,
I'm gonna raise my current price by 15
cents," like, well, is that 15 cents just
reaching the other side and eating a little bit of volume,
or is it actually crossing very deep into the book and giving
me terrible, terrible prices?
Things like signing trades are very useful. Signing trades is basically going through the order books and saying whether each trade was initiated by a buyer crossing the spread to the sell book or a seller crossing the spread to the buy book.
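A minimal sketch of one standard way to sign trades from quote data (a quote-rule heuristic with a tick-rule fallback, in the spirit of Lee-Ready; the talk doesn't specify the exact rule used):

```python
def sign_trade(trade_price, bid, ask, prev_trade_price=None):
    """+1 if the trade looks buyer-initiated (above the mid, toward the ask),
    -1 if seller-initiated (below the mid); for trades at the midpoint,
    fall back to the direction of the last price change."""
    mid = 0.5 * (bid + ask)
    if trade_price > mid:
        return +1
    if trade_price < mid:
        return -1
    if prev_trade_price is not None and trade_price != prev_trade_price:
        return +1 if trade_price > prev_trade_price else -1
    return 0  # genuinely ambiguous
```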
So, you know, and some of these things, right,
actually don't help you that much.
So we thought it would be helpful to know exactly
how much volume there is at the bid or the ask.
It turns out it's not -- I mean, it looks slightly harmful, and this could just be for overfitting reasons, right? Like, if I add some variable which doesn't have any informational value, then my model in training will fit to the behavior of that variable, but it's not useful out of sample.
But, you know, it turns out that if you take the best of these features, the ones that did add informational value, and combine them, you get another 13% improvement over and above the 35% improvement that I described on the last slide.
Okay.
[MICHAEL KEARNS] Good.
So, you know, just again to kind of riff or free-ride on Eric's talk, one of the things that's very interesting about modern financial markets and their evolution over the last 20 years or so is that, as he kind of intimated, sometimes in going to automation and transparency, some human aspect of prior trading that served a very useful purpose got eradicated. And, you know, that made certain types of trade much more difficult, and then some new electronic mechanism was introduced to try to replicate the prior functionality.
And so an example of that is, you know, again, this picture of a continuous double limit order book; this has been around since sort of the dawn of financial markets.
It's just that prior to automation on Wall Street, for instance, it would actually be maintained in a paper ledger by the market maker in that stock, who would sort of have these little chits of paper and order them by increasing or decreasing price, whether on the buy or the sell book.
And so there wasn't this kind of transparency. But now, with this kind of transparency, where this data is freely available, anybody who wants to can do real-time reconstruction of limit order books and see kind of the trades of everybody else, not earmarked by identity, right?
You're not, you're not being told, "Oh,
this is a Credit Suisse order, and this is from SAC Capital,"
or something like that.
But you can definitely algorithmically detect,
for instance, when there's a large trade in the market,
when somebody's trying to, let's say, unload a lot of
some stock in some small amount of time, you know,
the way they have to do it is through a market like this.
And they're going to leave an algorithmic trace of their
activity in the data, okay?
So in the old days, let's say, suppose there's some mutual fund manager whose, you know, life goal is to track the S&P 500, okay?
And, you know, because of volatility in the market, let's say, for instance, Johnson & Johnson has done extraordinarily well.
And now because Johnson & Johnson has done
extraordinarily well, they're overweight
Johnson & Johnson, right?
They're out of alignment with the S&P 500.
So they need to unload some Johnson & Johnson,
and they're like a Fidelity or Vanguard fund manager.
So actually, their portfolio is so big that, let's say, they have 1% of the outstanding shares of Johnson & Johnson, a huge stake in a public company, and they need to reduce it to, you know, like 0.8% instead, okay?
This is a massive, massive trade.
And the way the mutual fund manager would have done
this prior to automation is they would have --it
would have been sort of an off-the-floor or what they
would call an upstairs trade.
They would actually call, you know, somebody at a brokerage.
They would call one of their buddies at a brokerage or
multiple brokerages, and they would say,
"Look, I've got this rather difficult trade.
Um, I don't wanna execute it on the floor of the exchange
because it'll just be too obvious what I'm trying
to do, and, you know, prices will move against me.
You know, could you possibly shop around and see if
there's a counterparty?
Maybe there's somebody else out there that happens to want to buy, you know, 0.2 of a percent of all Johnson & Johnson outstanding shares."
There would still be a record of it eventually, at the end of the day or the end of the trade.
It just wouldn't have taken place on the
floor of the exchange.
It would have taken place by people making phone
calls and saying, "Okay, I found a counterparty for
you at this hedge fund or this other investment firm."
Okay?
That kind of got eradicated when the world went like this,
and all of the activity had to be executed through these books
in electronic fashion in order to get competitive prices.
And this was sort of the initial impetus to what
are called dark pools, okay?
So what are dark pools?
Dark pools are another very recent kind of
electronic exchange.
I think the first ones kind of probably came
out about 10 years ago, but they became quite
widespread about five years ago.
Um, they were essentially introduced to deal with this thing that went away that I described, to allow large counterparties to trade with each other at minimal market impact, okay?
The interesting thing about dark pools is that, unlike the double limit order book, you don't specify price and volume the way you do in a limit order. You just specify volume, and whether you're a buyer or a seller; you're not specifying the price.
And so now buyers and sellers, instead of being ordered by price, are instead ordered by time of arrival.
So you just have these two queues of arriving
buyers and sellers.
You just line them up in the order that they arrive.
As long as there are parties on both sides,
you just keep matching them together, okay?
And now the question is, well, if they're just ordered by arrival time and there's no price specified, when I match a buyer and a seller, what price does that transaction take place at? It basically takes place at the midpoint of the so-called NBBO, the National Best Bid and Offer, which is essentially the midpoint of the lit double limit order book market in the US.
Okay?
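A minimal sketch of that matching logic, assuming plain FIFO queues and ignoring real-world features like minimum-fill constraints:

```python
from collections import deque

def cross_dark_pool(buys, sells, nbbo_bid, nbbo_ask):
    """Match FIFO deques of (order_id, qty) buyers and sellers at the NBBO
    midpoint, the price pegged to the lit market as described."""
    mid = 0.5 * (nbbo_bid + nbbo_ask)
    fills = []
    while buys and sells:
        buy_id, bqty = buys[0]
        sell_id, sqty = sells[0]
        qty = min(bqty, sqty)
        fills.append((buy_id, sell_id, qty, mid))
        # Pop or shrink each side; partial fills keep their place in line.
        if bqty == qty: buys.popleft()
        else: buys[0] = (buy_id, bqty - qty)
        if sqty == qty: sells.popleft()
        else: sells[0] = (sell_id, sqty - qty)
    return fills

buys, sells = deque([("B1", 500), ("B2", 300)]), deque([("S1", 600)])
print(cross_dark_pool(buys, sells, 10.00, 10.02))
# -> [('B1', 'S1', 500, 10.01), ('B2', 'S1', 100, 10.01)]
```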
So this is the whole point here, right?
You don't go to a dark pool looking for a price improvement.
You don't go there with a thought like, "Oh,
I'm gonna actually get a better price here than
I could get by trading on the open limit order book."
You go there because you're perfectly happy with the prices
offered on the lit market.
You're just worried that you would never be able to
get them because you'd so quickly push the price in
the wrong direction because of the volume of your trade.
So you sort of have this exchange now that's pegged to an existing exchange, the so-called lit market.
These are called dark, right, because you don't see any of the activity.
you're just told at each second, you know, nothing yet,
nothing yet, nothing yet.
Okay, your trade's executed and here's the price that you got.
And that price should be pegged to the lit market, okay?
This was such a popular idea that there are now many dozens of dark pools in the United States.
And essentially, you go there to compete for liquidity, to get liquidity, not to compete on price.
And so this, you know, generates kind of another interesting kind
of optimized execution problem where instead of trying to
optimize the performance of your trade over time as in
the problem I described before, you're essentially trying to
break it up over space or more precisely over exchanges.
Okay?
So in other words, the problem that's known as smart order routing on Wall Street is that you have these multiple exchanges.
Let's say you have K different dark pools and you have a
certain volume of let's say Johnson & Johnson shares
that you wanna sell, okay?
And at each time step, you need to decide, well,
of my remaining inventory, how should I distribute it
among these multiple exchanges to try to maximize the fraction
of my remaining inventory that's executed at each step, right?
So it's again a problem where you have a well-specified trade, I wanna buy or sell V shares of something, but now, instead of trying to get the best possible prices, I'm just trying to unload this inventory as quickly as possible by dispersing it, or routing it, to these multiple exchanges.
Okay?
So we thought about this problem from a theoretical
standpoint and that led to kind of an interesting
algorithmic solution that we evaluated experimentally.
I don't have time to go into the details, but, you know,
here's one simple cartoon model of liquidity, right?
So, maybe I should state this explicitly.
Different dark pools will have different sort of liquidity
properties at different times and for different stocks, okay?
So for instance, it's quite common when a new exchange starts, whether it's a limit order book or a dark pool, that they'll try to actually get a foothold in the market by offering preferential treatment or rebates or fees to some particular class of stocks.
So you know, you will actually see marketing materials that
say, "Well, you know, if you're trading mid
cap consumer durable stocks, you know,
we are the dark pool you wanna come to.
That's where people who wanna trade consumer
durable stocks are coming
And so this is where your counterparties are.
Don't go to other dark pools because you'll just be sitting
there waiting for your trade to get executed, whereas,
you know, in these stocks, the action is at our dark pool."
Okay?
So, you know, one kind of cartoon model of this is to imagine that, for a given stock, let's say Johnson & Johnson, each dark pool has some underlying stationary probability distribution, so the different dark pools will have sort of different liquidity profiles. And what I mean by this is: let's imagine that on the X axis is the number of shares, okay, and on the Y axis is the probability that that number of shares is available for execution on the other side at each given time step.
So more formally, what I have in mind is that if we submit V shares of our trade to, let's say, this particular dark pool, what happens is that we draw a number S according to this distribution, okay? And that's the volume of the counterparty on the other side in this dark pool at that particular discretized time step.
And so then what gets executed, of course, is actually the minimum of V and S, right?
Because you're never gonna execute more than you asked for.
You might execute less.
So if I submit an order to this dark pool to buy
100,000 shares and S gets drawn and it's a million,
then I'll just be told your 100,000 shares
were executed, okay?
But if I submitted 100,000 shares and the S that was drawn was 50,000, I'll be told half of your order was executed and half is still waiting in line.
Okay?
Okay, so
Yeah.
[AUDIENCE MEMBER] So here, you mentioned you get
to partial execution, but how can the curve
then be non-decreasing?
[MICHAEL KEARNS] This curve is just modeling the idea that, you know, volumes in this range are quite likely to be available, okay, whereas larger volumes are not.
[AUDIENCE MEMBER] Right.
But if you do partial execution, shouldn't it
be a diminishing curve?
[MICHAEL KEARNS] Don't--
So, the partial execution is sort of on my side, right?
So, I submit a volume.
A number is drawn, which is the volume
available on the other side.
The minimum of the two quantities is
what's executed, and then I still have
that remaining inventory that at the next time step
I need to try to execute.
[AUDIENCE MEMBER] It's a density function?
Okay.
[MICHAEL KEARNS] Yeah.
I thought you were asking more of a mechanism question
than a math question.
It's a density. It's a, you know, PDF, not a CDF.
Yeah, Kevin?
[KEVIN] Why is it stable for there to be so many dark pools?
Isn't it in everyone's interest to just go to one?
[MICHAEL KEARNS] These are good questions.
Um, I mean, I'm sure Eric has thoughts on this. And I think Ramesh isn't here, but I know he's thought about these kinds of questions.
Why is there a-- I mean, it's the same thing that happened with the deregulation of financial markets. There's this famous regulatory ruling called Reg NMS, for Regulation National Market System, and I'm not sure it's an equilibrium, but there does seem to be this proliferation, and they compete with each other in various ways.
I mean, it's a good question whether or not at some
point this will shake out and, you know,
there will only be one dark pool or only a couple.
This hasn't really happened in sort of the continuous double auction world either.
I mean, by the way, it used to be the case that, regulatorily, your stock could only trade on the exchange where it was listed, and this was sort of the golden era of specialists and the like.
So if your company listed on the New York Stock Exchange,
the only place to buy or sell shares was on the floor of
the New York Stock Exchange.
So when this got deregulated and basically
any startup, if you like, with the know-how and the
ability to get the regulatory approvals could open up shop
and say, "Hey, we're running a double--" You know.
It's --I mean, I think to Eric's point,
it's a very interesting time because the regulatory landscape
is such that, you know, startup companies can
s-- you know, propose brand new mechanisms, right,
like the one Eric's proposing.
[AUDIENCE MEMBER] To be more pointed about it, if everyone was doing what you're doing here, wouldn't the end point be only one dark pool with trades in it?
[MICHAEL KEARNS] You know from a conceptual standpoint,
I want to agree with you, but so many things in the
actual world of financial markets are obviously out
of equilibrium in some sense.
And so, you know, how to explain that? It could be that not everybody's doing what I'm proposing here, or that they're doing it in different ways.
And by the way, on this question of what game theory has to say about financial markets -- I mean, I think it's very difficult, but at a high level, one of the things we're worried about all the time is how many other people or groups are doing something like what we're doing, right?
Because any given trade has a certain capacity, right?
You can't sort of execute arbitrary volumes, so if
we're all doing the same thing, then our respective piece of
that trade or its profitability is definitely going to decline.
So
[MODERATOR] We're running out of time.
We have run out of time, so let's take further
questions online.
[MICHAEL KEARNS] All right.
So how much time do I have, or is it actually zero? Um, let me just finish this one vignette and I'll skip the last one, and then we can-- Okay.
So it turns out that, you know, in this simple formalism that I just described, if you specify it fully, there's actually a very simple and nice algorithm that you can use.
So first of all, the sort of real
machine learning challenge that comes up in
this model is the fact that your observations are censored
by your own actions, right?
So, there's this issue that if you submit 1,000 shares and all 1,000 get executed, you don't actually know: might there have been more available, you know, a million? And you want to know that, because you're deciding how to distribute your order over these multiple exchanges.
And if you just ask the question like, "Well,
suppose I wanted to estimate these probability distributions
from my censored observations," there's a well-known maximum
likelihood estimator for this, which is called the Kaplan-Meier
estimator or sometimes called the product limit estimator.
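A minimal sketch of that estimator specialized to this setting, where a fully filled submission only tells you the available liquidity was at least your order size:

```python
from collections import Counter

def kaplan_meier_tail(observations):
    """Product-limit estimate of the liquidity tail from censored fills.
    observations: list of (filled, censored). censored=True means the whole
    order executed, so we only learned S >= filled; censored=False means we
    out-sized the liquidity and saw S exactly. Returns {s: est. Pr[S > s]}."""
    events = Counter(f for f, c in observations if not c)  # exactly observed S
    tail, surv = {}, 1.0
    for s in sorted(events):
        at_risk = sum(1 for f, _ in observations if f >= s)
        surv *= 1.0 - events[s] / at_risk
        tail[s] = surv
    return tail

# Three 500-share submissions: two filled completely (censored), one saw the
# available liquidity exhausted at 200 shares (uncensored).
print(kaplan_meier_tail([(500, True), (200, False), (500, True)]))
# -> {200: 0.666...}: Pr[S > 200] is about 2/3; no information above 500.
```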
But we have an exploration problem here as well, right?
We basically wanna learn just enough about each of these pools
to execute our trade optimally.
So, if I'm never executing orders of more than 100,000 shares, I don't need to know the probability of a 10-million-share order getting executed.
I sort of only need to know it up to a certain prefix.
And so there's a very simple and appealing algorithm that has a bit of an EM flavor to it. You basically just start trading, let's say at the beginning, distributing your orders uniformly over all of the different dark pools. Then you start getting information about them, and you can, for each one of them, form the Kaplan-Meier estimate of the distribution of liquidity behind that dark pool.
It's wrong, of course, okay?
But then you could just pretend that it's right and do optimal
allocation with respect to those estimated distributions.
When you do that, of course, you will then get new
information based on the orders that you submitted, and
then you can just re-estimate.
So it's a simple loop: do greedy allocation under your current distributional estimates, then re-estimate them with the results of the next round of executions.
And one can prove that this converges quickly, you know,
with a polynomial rate of convergence, to the optimal
allocations that you would have made if you knew the
distributions from the start.
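Putting the pieces together, a minimal sketch of that loop; the greedy step allocates each marginal share to the pool with the highest estimated probability of filling it, and the venue and estimator interfaces are hypothetical stand-ins:

```python
def greedy_allocation(tail_fns, V):
    """Allocate V shares one unit at a time; each marginal share goes to the
    pool with the largest estimated Pr[S_k >= its current allocation + 1]."""
    alloc = [0] * len(tail_fns)
    for _ in range(V):
        k = max(range(len(tail_fns)), key=lambda j: tail_fns[j](alloc[j] + 1))
        alloc[k] += 1
    return alloc

def allocate_and_relearn(pools, V, n_rounds, fit_tail):
    """Greedy allocation under the current Kaplan-Meier estimates, then
    re-estimation from the newly observed (censored) fills.
    pools[k](v) -> filled shares (i.e. min(v, S_k)); fit_tail(history) -> a
    tail-probability function. Both interfaces are assumptions."""
    history = [[] for _ in pools]
    for _ in range(n_rounds):
        tails = [fit_tail(h) for h in history]
        for k, v in enumerate(greedy_allocation(tails, V)):
            filled = pools[k](v)
            history[k].append((filled, filled == v))  # censored iff fully filled
```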
Um, and for those of you that know a little bit of the reinforcement learning literature, the analysis of this is actually quite reminiscent of the analysis of things like E3 and Rmax, which are sort of provably convergent algorithms for learning in Markov decision processes.
Here, we don't have a Markov decision process,
but the state is sort of our informational state about
the different liquidities of the different exchanges
rather than an actual state.
Okay?
So, I had another vignette that I won't talk about, but you can go look on the slides for the pointer to the paper at least.
And let me stop there and take any questions if there's time
or we can just morph into lunch.
Yeah, Andrew.
[ANDREW] So, Michael, there's a really interesting
connection between your paper and Eric's, and I think a way
to put it together to make it relevant for algorithmic --So
obviously, you're not the only one doing this on the Street.
But there are a finite number, a relatively small number, of HFTs out there that are doing this.
So that means that you can actually run a very simple
simulation where suppose that you're not the only
one using your algorithm.
What if there are another two or three other players that
are using similar algorithms but perhaps have different,
slightly different parts of the data?
Can you figure out what that interaction might produce
in terms of the outcome of prices, not mathematically,
but computationally?
And then, to connect it to Eric's paper, you can ask the question: what market structures with what kinds of algorithms are gonna be ideally suited for market stability versus fragmentation, all the issues that regulators care about?
And then start asking more sophisticated questions
about how market structure and algorithms interact.
So it almost seems like you can combine what you're doing with
what Eric is doing in a way that could be really interesting.
[MICHAEL KEARNS] Yeah.
I mean, certainly one could do simulations.
The problem is, you know, kind of saying what it means
to be doing something like what we're doing, right?
I mean, this isn't quite an answer to your question, but think about things like the flash crash or even, like, the broader crash in '08.
I mean, there's a sense in which the more automation we have and
the more data that's available, the more inevitable it is
that we're going to have incidents like that simply
because when you have many, many parties that are all using
something like machine learning on the same historical data,
that's going to correlate our behavior automatically, without anybody intentionally trying to copy anybody else.
Like, just take this execution problem I described.
If any of you thought about this problem,
you would have come to kind of a reinforcement
learning approach.
If you then said, "Oh, let's go optimize that," you would
have ended up with, you know, things that were probably
numerically very similar to what we ended up with
in our policies, okay?
So, you know, in some sense, right, I mean,
people talk about, like, the smarts on Wall Street and
the quants and, and, you know --
But, at the end of the day, right, the first-order effect here is that our behavior is being correlated through shared information, and we're all kind of optimizing with respect to that shared information.
And this was just not possible 30 years ago, right?
The data wasn't there.
I mean, I think that just forced there to be greater diversity in kind of the approaches people took to the same problem.
And so, I mean, the reason I relate this to the '08 crash is, you know, maybe not the source of the crash, but the final steps, which was this liquidity crunch.
You know, like lo and behold, all these hedge funds suddenly
find when they try to go start unwinding their portfolios that
we all have the same portfolios.
So you know, we're all short Johnson
& Johnson and so we're all trying to buy back
those shares to cover our short positions at the same
time, and we all probably got to those portfolios through
this kind of group optimization.
[ANDREW] So if you put together these kind of algorithms and
the potential for correlation, that actually has some very
significant implications for whether you should
use continuous versus batch auctions,
which gets back to Eric's paper.
So I think that this could actually be a really
interesting combination of demonstrating for various
[MICHAEL KEARNS] --yeah.
I agree.
I think there's great research opportunities.
You know, you can see I'm coming from the other side of the fence, which is kind of deeply mired in the details of the way the exchanges currently work.
And so I think the hard thing in formulating research questions here is which details you can safely throw out and remain relevant to actual practice, and which are essential. And, you know, that's a hard question in general, but very hard in financial markets.
Yeah, Ashish.
[ASHISH] So are you ever able to detect the signature of another trading firm? Like, um--
[MICHAEL KEARNS] We don't really spend time looking at that, because we're actually not engaged in HFT per se.
I mean, we need to deal with HFTs as counterparties.
I'm sure we're trading with them all of the time every day.
I mean, anybody who's buying or selling stocks at some point is indirectly, you know, trading with them as a counterparty, because they're a very large fraction of the trading volume.
But it is definitely possible to do this.
I mean, you can definitely--
[AUDIENCE MEMBER] --perhaps even after the fact, like,
tomorrow you will see, well, this must have been a trade
executed by that hedge fund.
[MICHAEL KEARNS] Yeah.
I mean, it would be statistical knowledge, right? You wouldn't be able to go and say, like, "Well, you know, look at this sequence of--"
I mean, because people are breaking up their orders
over time and venues, right?
So you wouldn't be able to say, "Okay, you know,
I'm gonna mark in red each one of the trades that was a child
order of some large block trade that somebody was trying to do."
But you'd definitely be able to say, "You know,
just given the statistics historically and in this
period, it's quite likely that there was, you know,
some large party trying to execute a high volume trade
in a short period of time."
Those kinds of signatures are definitely in the data, and you don't need anything super sophisticated -- I mean, it would be visible even from just the execution data, like trade-and-quote data.
You wouldn't need to do, like, limit order book
reconstruction to see those kinds of signatures.
Maybe we should let people go to lunch if we're gonna have
any chance of starting on time.
Thanks.
(applause)