The speaker expresses a critical view of the current state of Quantum Machine Learning (QML), highlighting fundamental issues in research direction, model design, and benchmarking, and proposes a new research agenda focused on understanding quantum computing's strengths in structured problems and interference, rather than solely pursuing quantum advantage through variational circuits.
thank you very much so I'm very excited
to be here first of all because the
speaker lineup is something out of my
dreams — it's either my friends or people
that I always wanted to see again or
people I always wanted to meet so I'm
really Keen like for what's happening
this week thanks very much IAM as well
and secondly I'm excited and also
slightly nervous because in the last 3
years I've actually not given talks at
all if I could help it really the bare
minimum and the reason for that was that
I had a growing feeling of unease with
the literature in Quantum machine
learning especially in the field of
thinking about how quantum computers can
be used for machine learning and this
unease was also reflected in my own
research I always had this feeling
something was missing there was an
elephant in the room that we don't
address that there were problems and so
um this is now the first talk where I
want to present a bit of work where I'm
very confident again that this is going
into the right direction so I spent the
last couple of years besides having a
child there's another one on the way so
a bit of family stuff as well but I
spent a lot of the last years really
thinking about how to verbalize these
issues that are kind of felt and
secondly how to build a research agenda
that uh addresses them or circumvents
them or sometimes is a little bit
different and this is um so what I want
to present here are preliminary results
from two pilot studies in two Focus
areas that we're now kind of following
at Xanadu and um so the reason why I could
spend so much time in actually thinking
about a conceptual version and also you
know not publish a lot not go to talks a
lot is that um I was you know asked to
build a little Quantum machine learning
team at Xanadu and a lot of the credits go
to those folks here this is the core
team this is a bit more the larger
Circle if you like what you hear here
then consider working for us it's a very
sweet team and we have what's both an
opportunity and a curse: we are all remote
um so yeah okay cool the first thing I
realized in the last couple of years
speaking to also people like for example
Ryan or like a couple of you in the
audience is that these concerns that I
have come from the objective that I'm
trying to to optimize and you might not
share this objective but the first thing
I have to do is be open with this
objective I'm working in Industry um
Xanadu's mission is to build quantum
computers that are useful and available
to people everywhere and this hides what
the company actually does most of the
company is building a photonic quantum
computer but a large uh part of the team
is actually trying to build software
that makes them available and hopefully
a joy to program you might have heard of
PennyLane and I'm part of the
software team leading this little quantum
machine learning team that's trying to
make these computers
useful and so the derived Mission or
objective that we're working on is to
make computers useful for machine
learning and this sounds now like a bit
of industry babble but it's actually
not it's a very precisely and
consciously formulated objective and
what you do not see here is that we want
to prove a practical Quantum Advantage
because this is something completely
different what we're doing here is
envisioning a state of the world this is
something that you do in startups a lot
and I learned from it and I find it
actually quite an interesting way of
working you envision a a state of the
world maybe in 10 years time where the
typical machine learning practitioner
doesn't only have specialized knowledge
on how to program a GPU or on linear
algebra but also needs to know Quantum
Computing and the question is how do we
get there from here so this is an
industrial question it doesn't mean to
understand the world it means to change
the world but I wouldn't do it if it
wouldn't Encompass also like a lot of
understanding the world obviously and
now the question is you look at the
field and you have especially when
you're building a team and you have to
give them something to do and um the
question is now with the models that
also Nathan had this wonderful
introduction to are we actually on the
right path so basically if we
extrapolate our computers get better our
models get a little bit better we tune a
bit here and there will we get to the
state of the world and my answer to this
is, if I formulate it nicely, that a
lot of things have to change in research
for us to be on the right path obviously
I can't predict the future but um let's
put it like that if I would put my money
or my good name or something to a path
and I would say let's not choose
those let's change a couple of
things um also I have to say when I talk
about Quantum machine learning in the
back of my mind I have this like very
mainstream approach that Nathan was also
talking about we load data into a
quantum computer we kind of uh run some
kind of variational algorithms and then
it gives us like kind of the answer of a
machine learning task which is a kind of
narrow approach to Quantum machine
learning in general I'm sure we discuss
a lot more facets of it so let me try to
phrase four patterns that I see in the
literature and in my own kind of work
that I started feeling uneasy about and
I think I'll make myself very um well I
don't know if I make some enemies here
but the first one um is the pattern of
we prove an exponential speedup in QML —
and actually even Nathan, who
knows everything about speedups, was
saying maybe this is not the right
approach; I'll actually add a
couple of comments on this in this
talk. Why is this so problematic? From
an academic point of view you can play
this game but again it's problematic if
you want quantum computers to be used
for machine learning one day at a grand scale
and the point is that the language of
computational complexity is not used in
machine learning — if you go to NeurIPS
you hear very few people like talking
about exponential speedups and the
reason is that this language cannot
really formulate or phrase what we see
in current state-of-the-art machine
learning so this language is not useful
for current state-of-the-art machine
learning another way to phrase this is
machine learning is happening in
heuristics and this is a problem and we
should acknowledge that this is a
problem for the language that we have
learned and inherited from Quantum computing
so I do think that our performance
measures in this sense our theoretical
ones are to some extent not meaningful
to get good results in machine learning
and I put here mainstream machine
learning because when I discuss this
with people they always say yeah what
what about Quantum data maybe there are
tasks that machine learning hasn't tackled
yet and there again I think Nathan has
dropped a very interesting comment in
the last weeks I looked a little bit
into Transformer models and what's
actually happening in state-ofthe-art
machine learning and I'm not actually
sure that they can't solve a lot of our
problems in quantum physics because they
learn in a very interesting manner kind
of very complicated correlations physics
problems are simple there's low energy
involved and most of the things we do
they're very structured why on Earth
shouldn't they be able to learn most of
them except for a couple of exotic cases
but again my idea is not to prove that
there's an exotic case where things will
happen I want this to be mainstream so
here the second pattern is also related
to Performance so if we don't go to
Quantum Computing as our parent discipline
the answer is no — no, the answer is yes — is
it fair to say that a
proposed machine learning algorithm
that is only polynomially slower than a quantum
solution is actually quantum, because it
could be simulated in polynomial time, or
polynomially equivalent time, on a Turing
machine? Oh my gosh, this
question can you phrase it again and
like tell me the context of why this is
right now oh the context is a
talking from previous slide so the um
the basic thing is is that if we have a
black box right we can't stare inside
the black box to be able to determine
whether or not it's a uh quantum
computer or it's a d-wave quantum computer
okay so if we don't know whether or not
it's classical or a quantum computer um
then how can we tell from the input-output
pairs if it's only a polynomial
separation between them because then it
becomes uh there's no clear indication
that we couldn't just be secretly
simulating everything going on inside
okay yes can we talk about this
afterwards it leads into a very
different topic and I don't know if I
entirely understand the problem here
okay yeah do I'm not cutting you off I'm
really just like affirming that I'm I'm
not on the level this
okay lovely thank you cool second
pattern um I'm actually quite impressed
do they have cameras on the videos on
you okay second pattern I guess oh yeah
yeah you know me quite well by now so
second pattern is um you know we uh look
at performance from a machine learning
angle what we do here is okay heuristic
so we start
benchmarking so a lot of papers like run
little benchmarks, there's always this
sentence where we start testing things
on MNIST or whatever and the big
elephant in the room I think here is
that um machine learning has different
regimes and I think that small problems
in machine learning um are solved and
have a very different working reality
than big problems and the big problems
remember again this is like what we're
trying to solve here so we don't
actually know from the benchmarks that
are so small at the moment what happens
on larger scales and if you have ever
trained a new network on very or a very
performant new network on small problems
they're actually performing really badly
often so there seems to be something
going on and at this stage in time it's
very hard to change this but we could be
a lot more aware in our research asking
questions what benchmarks can we
actually design that are designed to
scale um there's another issue actually
that um is so kind of like guesswork
that I just want to share it with you
but um you know I need to see data if
this is really true I think we have a
huge positivity bias in Quantum machine
learning now I know everyone's like
claiming this but um I work on my
Fridays actually since forever I don't
work in Quantum Computing but I work
with classical machine learning um I
work with a group of social
psychologists and they shared with me
that around 2010, I think it was, in
their field they had a huge scandal
because a big collaboration of people
who started feeling really uneasy about
research sat down and tried to reproduce
studies that people did in social
psychology and found almost zero
reproducibility so people have built
their Harvard careers on something that
you couldn't reproduce and so start
wondering if someone shouldn't do this
and you will see that we might actually
like start doing something like this and
we have very interesting findings that
I'll share
just so this was more about performance
there's the third pattern this is more
about model design again these are these
are patterns that worry me I don't know
if they worry you
um and this is that in a paper that uses
variational models for Quantum machine
learning at some stage there's always
this place where you introduce the
circuit then there's always a statement
like we use a Pauli gate and some entangling gates and
so on and so forth I challenge you to
find papers where there is a logical
explanation of why they use this circuit
and not another one so sometimes you
find a paper that has some design
principles that they optimize but those
design principles are in 99% of the case
derived from quantum physics so we want
a circuit that's not you know that
incorporates a model class that's uh
classically intractable or we want a
model class that's like Universal of
some sorts or it's easy to implement in
Hardware but these design principles are
not coming from machine
learning and so like um and this is for
me like this really became apparent in
work I did with Ryan and Johannes but
also like many of you will know this you
can build a crazy onet that you plug
into your Quantum machine learning model
and it's a really useless machine
learning model in the end you can have a
crazy circuit that only gives you a sine
function in the end so yeah yes um so uh
one thing that we often hear is uh
expressibility — we want our circuit to
be able to express the whole
Hilbert space — is that a machine learning
perspective or is it
I think this is super difficult because
um so first of all Quantum
expressibility so can we express any
unitary is not something because again
you can build expressive models that are
very very limited functions if you say
that your Quantum models I think the
latest papers don't fall into this trap
anymore — they're actually talking about,
you know, which function
classes we want to express — and expressibility if you ever
looked into the theory of machine
learning is a very very subtle topic
because what classical machine learning
theory was about is actually for for
many many years was to find this balance
between very expressive models and very
simple models so that you regularize
well so it's actually the focus of a lot
of theory and definitely In classical
machine learning more expressivity is
not at all better I think this comes
from people just reading about deep
learning and these models are very
expressive — but just because they're
interpolating your data and still doing
well does not mean that all of a sudden
expressibility or expressivity is a
good thing this is very complicated to
talk about and use as a measure
yeah how do you um streamline your
ansätze when you design them, like which
parameter is important and which is not, or
maybe I can rephrase — are there
classical techniques that can be useful
here to reduce the parameters? Yeah, I don't
know — so in all these patterns
they're always based on a problem
where often we can't do better, but
I think we can analyze better we can be
more honest so for example the previous
slide also we cannot Benchmark in high
Dimensions but we can actually talk
about this and here again you I don't
know I have never found in classical
machine learning a principle that you
can now use to design your ansätze it's
very subtle but we have to like start
thinking about this in my world you will
see just now so the answer that I'm
trying to find to this is to design
models from first principles, knowing
that there's a mechanism in there that
is interesting and then see how well it
does instead of just closing your eyes
and hoping that a quantum model does. When
I started with this in 2017 I
trained the first quantum neural networks
which is really like a long time ago I
mean even parameter shift rules weren't
invented then yet and um at the
beginning I thought like maybe this is
just like magically going to happen and
this is where the frustration come came
in over the years where I realized these
are not working for me out of the box I
don't know if anyone when I speak to
students they often like agree with me
but if I try to train a quantum neural
network it doesn't work out of the box;
if I use a scikit-learn model it works
out of the box and there's a discrepancy
between what I see in the literature
this is why I think there's a positivity
bias — and what I see in my own research. So
okay, I gave a long answer to a short
question, so
yeah yes so I think I have a handle on
the first two patterns you described but
I'm not sure I understand this one would
you like to see more uniformity in the
kinds of models that people use or what
do you mean exactly I have some sense of
what you mean when you say that the
circuit ansatz is motivated by the physics
but you would like to see it more
motivated by the learning — that's
where I don't understand. Yes, so I want
to know why on Earth you are using this
ansatz and not another one; I want to know
also that you tested what another ansatz
would do for example in your benchmark
okay, like did you cherry-pick an ansatz on
which the performance is good and it's
not necessarily well motivated in and of
itself? Yes — or just forget about the
concept of an ansatz in a variational
circuit in general but try to get an
algorithm that is more handmade or where
there's a property in there that you
know I get to this actually in the
second part I well yeah thank you yes so
so is the point somehow to do with this
um so you're mentioning expressibility
in the classical machine learning
circumstance, where there is kind of a
contradiction between too much
expressibility and overfitting — mhm — so
there's some rigidity or some kind of
regularity that is required of
the approximating function, right, that
prevents it from overfitting — is that
what you're saying? Yes, and you know, I mean
a support vector machine with a
Gaussian kernel can express any function, I
mean it's cool, but it still doesn't do
deep learning at this stage.
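To make that last point concrete, here is a minimal scikit-learn sketch (not from the talk): an SVM with a Gaussian (RBF) kernel is a universal approximator, yet how much of that capacity gets used is controlled by the regularization parameters C and gamma — expressivity alone is not the goal.

```python
# Illustrative only: an RBF-kernel SVM can fit almost anything,
# but C and gamma regularize how much of that capacity is used.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C, gamma in [(0.1, 0.5), (1.0, 1.0), (1000.0, 50.0)]:
    clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_tr, y_tr)
    print(C, gamma, clf.score(X_tr, y_tr), clf.score(X_te, y_te))
# Large C / gamma -> near-perfect training accuracy (high expressivity),
# but test accuracy can drop: this is the expressivity/regularization tradeoff.
```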
yeah um cool and then the last pattern I
think this might make
you I think this attacks a bit more the
work of all of us here in the room and
this is the one that frustrated the
heart of me because I love working in
theory and there's so much beautiful
machine learning theory so for example
one thing I really loved is the spectral
analysis of kernels so I thought maybe
we can do this with Quantum kernels or
statistical physics of learning there
was actually a workshop here a couple of
years ago that was fantastic, blew my
mind; I started working with these scientists, and
at some stage they ask me okay we can
now analyze Quantum models but what
model class would you actually like to
analyze going back to the last slide I
could now take one of um you know
Nathan's feature maps — it makes a big
difference which one I take — I could just
arbitrarily take one of them and start
analyzing um but what people actually do
a lot in theory that you see is a
pattern where they start with — when
I see this in a paper I'm always like,
okay, this is the pattern — we
consider the model class tr(ρM), which means
any embedding, any quantum computation —
maybe not any quantum computation but a
very, very large class of anything I
could do in the world — and then I prove a
theorem — a proof that, you know, if I kind of
approximate this, or I have a kind
of randomly sampled model out of this
class, then I have barren plateaus; I prove,
for example — we heard from
Amir — a beautiful result that there's
no backpropagation scaling in general
happening; or a lot of cool theory on
kernel decompositions, you know. But the
point is that all of these results might
not even hold for the models we should
care about so if quantum Computing in
general can't be trainable or
can't have backpropagation scaling — and the
discussion in the end of your talk was
going in that
direction then okay so be it but there
might be a model class that is actually
good and if you think about classical
machine learning we do not prove lots of
things about any probability
distribution that I could parameterize
we start proving things about neural
networks so we have a very different um
situation than in deep learning in deep
learning we know a model that we are
interested in and we build theory about
it now we copy this Theory and analyze a
rather large or arbitrary class of
models so this was starting
to drive me insane — hence any
kind of investigation about, you know,
what parameterized circuits actually are
in terms of quantum models and this was
work I did kind of a lot
before okay so now I have destroyed a
lot and in the last years I always gave
these talks that were so negative, so let
me — yes? About the patterns, I was
wondering, could you comment briefly
about to what extent these patterns
exist in classical machine learning
because I think some of them ought to
like there is a separation between
theory and practice I think data sets
and papers aren't publicly available a
lot of the time I think only the third
one to be honest so the first one that
we use computational complexity, that we use
basically a language that
doesn't suit the reality I think people
have actually given up using that
language I mean we do a lot of
benchmarking there the second one we can
Benchmark in the big regime so that's
not a problem the third one a bit that
models are no also the third one is not
true because the models are verified by
their success so we know we should
analyze these models because they're
doing well and now the last one we're
doing actually theory about the models
that are doing well so we don't have
this arbitrariness so I don't think
actually any pattern applies there yes
question of how on Earth are you going
to um see what the blue field is right
how do we find out what are the relevant
issues, and I tried to, as we say, go
across Green Street and talk to our
experimentalists — yeah, it takes 10 years
and I'm not a lot wiser so how on Earth
are you going to find out where the blue
spot is okay um I'll try now I come to
this now actually
yeah um so this is kind of like what I'm
trying so now I'm sitting there I'm
starting a team and I want to work on
something so I could I could actually
hire any team right what specialist do I
need what field do I want to go in
especially if I'm so frustrated myself
about a lot of things so what on Earth
do we set up and so we arrived at two
Focus areas the first one is okay if I'm
the only one having a worry about our
quantum designs and finding them — or
having a bit of suspicion about whether they're doing
so well — let's do the good thing a
scientist does Let's test this
hypothesis especially because no one
else seems to really worry about it
actually except for those students are
sometimes talk to so let's try to assess
how good Quantum models really are and
this is a very big field so kind of
reassess benchmarking reassess a lot of
things that we do and there's a lot in
the pipeline but the first pilot study I
talk about here is just to
systematically Benchmark popular ideas
in Quantum machine learning trying to
make the benchmarks as objective as we
can — sounds super boring, but everyone who
went onto this project at the beginning
was like, I think this is the most
exciting piece of research I've done
ever — and for various reasons actually;
but um yeah and the second one is um
goes a bit into this direction and I
think you know I've tried to find good
models for so long and it's always like
what is my principle to optimize and I
think we should go the other way around
we look into Quantum algorithms we
identify what's the core engine what are
they good at and then we start asking
machine learning questions that have
nothing to do with deep learning and we
see what we find I'll motivate that in
the second part Okay cool so first part
and this is where I share like one
result that I'm actually really shocked
about — I hope you will be too. So a very
innocent study let's just get a team of
very senior researchers together let's
talk all the time together we have a
good software team let's implement the
best Benchmark design we can come up
with. My realization is that it's an
art to benchmark — there
are millions of questions you have to
answer and we always like ask our
students in the first year of PhD to
answer those and they're completely lost
and there's a lot of arbitrariness
coming from this model selection we had
a lot of procedures the one we ended up
with: take arXiv papers after 2018 with
certain keywords, so we cast a very
wide net of trying to get everything
that could be related to
QML; we only take the ones with more than
30 Google Scholar citations this
introduces a very very serious bias but
one we want: it introduces bias towards
earlier papers that are highly cited and
we want this because um there are often
models that people reproduce that they
talk about so influential papers but
they may not be the most tweaked ones, so just
be careful about that. Notably this
excludes your 2017
work — oh my gosh, that was really bad — no,
no, no — you mean the one we're both
on? No, no, no, that was published in 2018,
so no. Why we did this as well:
if you search arXiv for classification
you find all of these, like, state-
classification papers; you just have to
go through so many papers that's so we
literally sat for a week and just went
through thousands of papers right just
to give you an idea of what work this is
now we limit the uh topic we only want
NISQ models because we can only implement them,
only qubit models, only new models, we are
only looking at supervised
classification and we want conventional
classical data so that includes images
as well to some extent but not like
something very specific like graph
structured data we randomly select 15
papers because we felt this is what we
can Implement um and then we realized
when we read through the papers properly
that some of them unfortunately really
good ones we can't Implement because for
example they don't give us actually an
idea what feature embedding they want or
there are these kinds of gaps
these are the ones we actually had in
the final selection I'll tell you
immediately that three of them are not
yet um implemented well enough that I'm
confident to share the results because I
don't know if you ever did a lot of data
analysis very different from theoretical
work you don't just Implement an
experiment but it's almost like a friend
you get to know you have to like do it
over and over again you find a bug you
start not understanding this you change
the setting and so we've interacted with
the other models not eternally there's a
lot more work to do but quite a lot you
also see, because there's an old-paper
bias, that there's actually, um, you know —
I promise this was
accidental, but I'm happy that it's
also my own work, because you have to
criticize your own work as well —
something that comes out, you see, um, here:
this paper, the author is actually in the
audience, and it appears
twice here because it actually
proposes two types of models
that we kind of put into different
classes so now one observation here is
that um kind of the types of family of
models I think are quite representative
to what we see in this the literature of
supervised QML. There are these QNN
designs — which I know, I also feel, is a
complete misnomer — but it's the idea of
encoding data and then training a
variational circuit as a classifier; there are
quantum kernel methods, where the idea is
you embed the data, you compare the
quantum states of two embedded data
points and then you feed the result into
a classical machine learning algorithm
like a support vector machine; there are
quantum convolutional neural networks;
and there's actually one quantum
generative model
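For readers who haven't seen the "QNN design" pattern just described, here is a minimal, generic PennyLane sketch — angle-encode the data, apply a trainable layer, read out one expectation value. It is purely illustrative and is not any of the specific models in the benchmark.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def classifier(x, weights):
    # data encoding: one rotation angle per feature
    for i in range(n_qubits):
        qml.RX(x[i], wires=i)
    # trainable "ansatz": a generic layer of rotations and entanglers
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    # readout: one expectation value, thresholded into a class label
    return qml.expval(qml.PauliZ(0))

weights = np.random.uniform(size=(2, n_qubits, 3), requires_grad=True)
x = np.array([0.1, 0.5, -0.3, 0.7])
print(classifier(x, weights))   # value in [-1, 1]; its sign gives the predicted class
```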
okay cool then you have to decide on
task I also try to be quick here this is
like the hardest thing I think there's a
whole research program I want to make
about this uh we use binary
classification we optimize the accuracy
and then we use four data sets the first
one is supposed to be the vanilla
example we just sample from a hyper Cube
and use a perceptron model to create a
linear decision boundary and then we label
the data.
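A minimal sketch of how such a "vanilla" dataset could be generated — uniform samples from a hypercube labeled by a random hyperplane, i.e. a teacher perceptron. The exact construction used in the study may differ.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_features = 300, 4

# points sampled uniformly from the hypercube [-1, 1]^d
X = rng.uniform(-1, 1, size=(n_samples, n_features))

# a random "teacher" perceptron defines a linear decision boundary
w = rng.normal(size=n_features)
b = rng.normal()
y = (X @ w + b > 0).astype(int)   # labels 0/1 on either side of the hyperplane
```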
Then the one that I really don't
like being used but I think the first
QML paper actually using it, we realized
when we looked, was my own — it is like
the worst paper in the entire world; I'm
so happy it didn't make the selection
because it didn't get cited — but that's
MNIST — obviously we can't do original
MNIST, we have to pre-process it somehow —
and this is a very simple problem by the
way, because when you see MNIST results
they use a multiclass classification in
Quantum machine learning we often use
binary classification and this is a very
easy problem so you just get these two
blobs and you have to find some very
benign hyperplane or decision boundary
that is not very curvy that goes through them.
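One plausible way to turn MNIST into the easy binary task described here is to keep two digit classes and compress the pixels down to a handful of features. The study's actual pre-processing differed between model families, so this is only a sketch.

```python
from sklearn.datasets import fetch_openml
from sklearn.decomposition import PCA

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)

# binary task: keep only two digit classes, e.g. 3 vs 5
mask = (y == "3") | (y == "5")
X, y = X[mask] / 255.0, (y[mask] == "3").astype(int)

# compress 784 pixels down to as many features as we have qubits
X_small = PCA(n_components=8).fit_transform(X)
print(X_small.shape)   # two well-separated "blobs" in 8 dimensions
```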
Then what I'm currently working
on a lot is to generate some more
realistic data sets that are based on
classical machine learning models of
data. The first result is that hyperparameters
matter, and this is a problem —
I tell you now why let me tell you what
you look at look at the time oh no it's
actually good um what you look at is the
three families of models that we already
implemented we also compare them to uh
classical models — from the names, if you
know scikit-learn, you realize that
we're using a framework that's a mixture
of JAX, PennyLane and scikit-learn
to do the hyperparameter
optimization. Scikit-learn, by the way, is a
super cool framework — I know it sounds a
bit like a beginner's framework but it's
actually quite wicked what's in
there — and MNIST was pre-processed
slightly different for the three classes
of um models and what I show you is now
the range of accuracies you get from the
worst to the best hyperparameter setting
and the hyperparameter grid we use is
very, very small because, you know, it really
increases the
run times of your algorithms um so we
used for example the three most
important hyperparameters of a model we
used a grid of like two or three
points and then we computed every
combination of those and ran the model.
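A sketch of the kind of exhaustive grid search being described, assuming a scikit-learn-compatible model: two or three values for each of the most important hyperparameters, every combination trained and cross-validated. The SVC here is only a stand-in for whichever (quantum or classical) model is being benchmarked.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC   # stand-in for a scikit-learn-compatible QML model

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# two or three values per important hyperparameter -> every combination gets trained
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, search.best_score_)
# with several hyperparameters, cross-validation folds, datasets and seeds,
# the number of training runs per model multiplies very quickly
```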
So depending on what
hyperparameters you have you get really
bad performance or kind of reasonable
performance. That's the first insight. Why is
this such a problem? All of a sudden you
do not have a benchmark with a couple
of experiments to run but every model
you have to run hundreds of thousands of
experiments makes a big headache I
promise. You can see that we only ran our
results so far — this is why they're
preliminary — up to 8 qubits. Joseph, who's
leading the study, is a very well-
spoken British gentleman, but he said if
he ever reads a benchmark study up to 8
qubits only these days he would vomit
onto the paper, so we're at the moment
doing the task of going higher, and this
is actually — I mean, it opened my
eyes how hard this actually is for these
kinds of things. Here are the results of the best
models. Now I could start to try to
interpret them but we're actually really
still playing with
interpretations but I want to basically
pick out a few items here to show you
how hard it is to interpret. At
first sight you would say, okay, the one model
class that's actually performing really,
really well, as good as the classical
model, are the quantum
kernels — but now, I told you that I use
different pre-processing, so you should
be suspicious, and for this class here I
don't use MNIST with the 60,000 data
points but 250 samples subsampled from this
data so this is a much easier problem
than the one on the left but maybe still
kernel methods are good because a lot of
it is done on a classical computer so
maybe that's good. By the way, these total
drops in the models — it's also not entirely
clear to me yet what this is; I think
this model here is one of the ones I
was involved in, and
I tried to play
around a bit with it and couldn't get it
better; there could be convergence issues
coming up here so this is something we
still have to study now the second thing
that maybe just one second I'll I'll get
to you in a second the second thing that
someone would immediately like comment
about is maybe if I was a first year PhD
student and I have this horrible you
know culture of I have to push out a
paper in four months time because I do
an internship somewhere — I'd probably
now say: oh, there's a quantum model
that's better than the classical model,
and this is the dressed quantum circuit
classifier. Guess what, the dressed
quantum circuit classifier is a paper by
Andrea Mari and it uses a neural network,
then a quantum circuit, and then another
neural network — so it's really good... oh,
it's a neural network.
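For orientation, a bare sketch of the "dressed" structure being described — a classical layer, then a small quantum circuit, then another classical layer. This is an illustrative PennyLane/NumPy toy, not the original implementation.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_core(angles, weights):
    for i in range(n_qubits):
        qml.RY(angles[i], wires=i)            # encode the pre-processed features
    qml.CNOT(wires=[0, 1])
    for i in range(n_qubits):
        qml.RY(weights[i], wires=i)           # trainable quantum part
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

def dressed_classifier(x, W_in, w_q, W_out):
    angles = np.tanh(W_in @ x)                 # classical layer in
    q_out = np.array(quantum_core(angles, w_q))  # quantum circuit in the middle
    return W_out @ q_out                       # classical layer out

x = np.array([0.2, -0.1, 0.7, 0.4])
W_in = np.random.uniform(size=(n_qubits, 4))
w_q = np.random.uniform(size=n_qubits)
W_out = np.random.uniform(size=(1, n_qubits))
print(dressed_classifier(x, W_in, w_q, W_out))
```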
Now if you
consider this you could say oh so but
the quantum circuit is making it
slightly better and I read a lot of
papers where there's a small line above
and it's like: the quantum model is
better — great, let's put the result out,
let's not even try to interrogate it. Um,
but there's also a "but", and it is a bit more
complicated: if you see, the best
neural network in cross-validation is
actually much better than the quantum
models, but for some reason, if we use the
test set — and I have no clue why this is
even possible — the neural network has
some strange overfitting, it gets
worse, and I think that if we find
out what's happening here we can
actually push this
up. Okay, so far so good — and now here's
the big bomb I want to drop at least to
me it was like quite a
surprise so when you Benchmark you
should try to break a model so that's
what we tried to do we took our you know
these models that I'm talking about they
have a lot of you know it's a big
pipeline all of it it's a different loss
function it's different pre-processing
there's a lot of decisions that come
from these papers — but what we did is: we
took the quantum part and replaced
it, however interesting the
feature map is or whatever is happening,
by a separable circuit. So we encode our
data into Pauli rotations that are not
entangled, we do our variational part by
rotations that are not entangled, we
measure something that's not entangled,
and these are the
results and they're almost the same
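A sketch of what "replacing the quantum part by a separable circuit" can look like in PennyLane: every encoding rotation, every trainable rotation and every measurement acts on a single qubit, so there is no entanglement anywhere and the model factorizes into independent single-qubit pieces. Illustrative only — not the exact circuits used in the study.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def separable_model(x, weights):
    # data encoding: single-qubit Pauli rotations, no entangling gates
    for i in range(n_qubits):
        qml.RX(x[i], wires=i)
    # "variational" part: again only single-qubit rotations
    for i in range(n_qubits):
        qml.RY(weights[i], wires=i)
    # readout: local single-qubit observables only
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights = np.random.uniform(size=n_qubits, requires_grad=True)
print(separable_model(np.array([0.1, 0.2, 0.3, 0.4]), weights))
```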
and this is something I wouldn't show if
we hadn't consistently seen it over the
last months. It's possible that one or
two more models still go up and down;
there's something super strange here — the
circuit-centric model gets so much
better, and I have a feeling it's because
that model introduces a classical bias
that it adds — I think this is actually
what's happening here. Anyways, this model
here, which is this feature map
where you encode into an IQP circuit, is
the only one I think where the separable
model gets worse. What's happening here
is that the feature map doesn't only
take x1, x2, x3 and encode them — it also
encodes x1·x2, x1·x3 and so on and so
forth, so it builds higher-order features
classically and then encodes them, so
there's classical pre-processing
going on, and when we take the separable
model we switch that off, so maybe that's actually what's happening here.
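The point about the IQP-style feature map building products of features can be made without any quantum circuit at all: here is a sketch of that classical pre-processing step — collect the raw features plus all pairwise products — which is what gets switched off when the entangling part is removed.

```python
import numpy as np
from itertools import combinations

def iqp_style_features(x):
    """Classically build the angles an IQP-type embedding would encode:
    the raw features plus all pairwise products x_i * x_j."""
    x = np.asarray(x)
    pairs = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return np.concatenate([x, pairs])

print(iqp_style_features([0.1, 0.5, -0.3]))
# -> [ 0.1   0.5  -0.3   0.05 -0.03 -0.15 ]  (higher-order features built before encoding)
```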
Okay cool
so that's kind of what it is I'm
interacting with this data really on a
daily basis for a long time, and I
started having this feeling here: if you
have ever played around with the neural
network playground from TensorFlow, you
realize that you can not only put in the
raw features but you can also build
these polynomial features and
trigonometric features — and it turns out
that all the models are doing this, they
build low-order trigonometric features
or polynomial features, and this
increases the performance capacity, but I
do not know if this works for larger
scales uh one second I actually totally
wait where was the question I'm so sorry
yeah I I was just wondering uh did you
run these models on actual Hardware no
no no oh my uh completely simulated and
it's already the biggest headache just
to give you an an idea so this is why I
find this research so interesting
because part of my passion is also
software right or like and if you want
to push this so for example what we use
at the moment we use just in time
compilation with Jacks makes things
really fast but then you get a problem
at some stage you can't compile anymore
because things get too big — I also have a
memory leak that I realized on the plane —
anyways, and then we have to reroute our
entire code to use backends of PennyLane
that use GPUs or high-performance
computing, just to get access to a
cluster — oh my gosh, yeah.
anyways sorry yeah uh just a quick
question taking a look at the uh
performance ability or the performance
of these different uh models as a
function of the number of quantum and
classical parameters because they've got
different number of model parameters
yeah so it may not be the fairest
comparison as far as I know is
completely mixed so there's no
correlation between the two but these
will be plots that I'm working
on at the moment, to show you basically how
the parameters grow, because it's very
mixed, it's very different for each
model. Yeah, one of the things that I
found is that looking at
things like the AIC to distinguish between
different models that have different
numbers of parameters sometimes leads to
very
different results about
whether a Quantum or a classical model
is preferred yes but what I definitely
don't see is that more quantum
parameters, more qubits, is better — or more
layers; even in the hyperparameter
optimization we hardly ever see that
more layers is
better that's something I really find
strange to be honest yes um which
Optimizer do you use and do you have an
assumed noise model, or is it just noiseless?
It's noiseless.
The optimizer — oh, good question. So, the way we
implemented and this is you know this
was like months of discussion of how we
do it is literally to go into the paper
and pick out everything they suggest and
do so even if we know this is maybe not
the best loss function you know someone
doesn't use cross entropy but like you
know to we actually like implemented
that and wherever this wasn't suggested
we try to make
reasonable choices. So really, for
most of these models, where they didn't
say something, we used
Adam. Um, when you say no, do you also
mean no shot noise? When doing shot noise —
nothing, it's just perfectly simulated expectation
values yeah I've actually never touched
noise because I already like working
already this is like really a lot to do
yeah, with the hyperparameters, did you
randomize the hyperparameters,
like, how did you optimize them? Yeah, so
we do a grid search, a complete grid
search, and actually you could increase
the grid then you would probably have
bigger ranges of the model so this is
always like really difficult choice
where sometimes you can't increase for
example, circuit-centric is not doing
very well
here, and I do believe it could do much
better if we increased the grid, but then
our laptops just die, and as I said,
this we can only do afterwards some
models could still like increase but I
do believe also the neural network could
increase because we have a very small
grid for that one as well yes so for
this model, so what kind of number
of variational parameters do you have — in
the thousands or millions? Usually
around 50, but that depends, this is
also a hyperparameter, how many layers
you have in your circuit, and they change,
yeah. So here it is roughly around 50 —
for classical and quantum? For the
quantum models; for the classical as
well I think yeah as well but that's
actually good point yeah we need to
actually so there will be plots added
for this — it's actually quite hard, for every
hyperparameter setting it's completely
different how many parameters, how many
qubits and so on and so forth, so you
would really have to visualize this
well, so we're working on that. Just
because, obviously, I heard in classical machine
learning they have a million parameters? Yeah, but
for example for the support vector
machine the parameters are the same
because you put it into a classical
model okay cool let me move on because
I'm really also super excited about the
next topic which is complete change of
energy mentally um this is now very deep
Quantum Computing not very deep I'm
trying to stay superficial because
I wasn't, like, thoroughly trained in quantum
computing, so it's a topic that most of
you will know more about than me,
and this is how can we design models
from first principle so let's say our
Benchmark study showed us that most
models are actually crap and we really
need better ones how do we get better
ones if we don't have this Golden Rule
of how to build a good machine learning
model from classical machine
learning for this I think we have to go
back and ask how did QNNs, quantum neural
networks, come about, and I think they
come explicitly from taking two
assumptions: we interpret quantum
computing as something that has to do
with provable quantum advantage and with
circuits; we interpret machine
learning as the state of the art, which is
deep learning, you know, big models, gradient
descent — and of course then you get quantum neural
networks: they kind of inherit nicely the
speedups of quantum models because you
can make them expressive, they use Pauli gates
we can implement on our hardware, they
follow the blueprint of deep
learning let's take a different starting
point um and what I want to do in these
last slides is really convince
you — well, not convince you, but kind
of open up this way of thinking, which
was a long process for me, because at any
stage of this research I'm always like,
okay, now let's train a variational circuit —
and no, that's not what we're doing, let's
do something different — and I'm not sure
where this is leading yet. So this is
quantum Computing interpreted as solving
highly structured problems with
interference and I'll give an example um
to give you an intuition what I mean and
take machine learning as generalizing
from samples — no gradient descent, no
hardware, no, not even a trainable
parameter; think of k-nearest neighbors, it
doesn't have trainable parameters it's
still machine
learning it's actually quite a good
algorithm if you ever tried it yeah and
so I said I was doing family building so
I I read this book I don't know if you
know it I read this book to my son and
it's so beautiful — "Oh, the Places You'll Go!" —
and I start to be convinced
that using these two things as starting
points will get us somewhere where we
start understanding something that will
lead to a better model design; we're
only at the first steps of the way
yet okay cool so the example I want to
kind of use to get these two points
across a bit better is period finding
let's say you've got a couple of
integers and you've got a function let's
say it also maps to integers and this
function is periodic there are a couple
of more requirements you need and the
question is find the period um this
sounds very simple but it's actually an
example of a huge class of quantum
algorithms the hidden subgroup problems
and the most famous one is Shor's
algorithm as many of you know this is
where I'm saying like you guys probably
learned this all at University um where
I didn't — and I also only learned group
theory this year; my colleague asked me,
like, didn't you have a mathematical
education, but somehow I just jumped
over it at my university. Here and
there, you know, all these words that
sound so normal start becoming a larger
concept so the integers become a group
for example in this case Z_12; the
function hides cosets of a subgroup, so
this is why it's called the hidden
subgroup problem and I try to find a
generator of the subgroup.
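For reference, the standard textbook form of the hidden subgroup problem being paraphrased here is:

$$
f : G \to S,\qquad f(g_1) = f(g_2) \iff g_1 H = g_2 H,\qquad \text{task: find a generating set for } H \le G .
$$

Period finding is the special case $G = \mathbb{Z}_N$ with $H = \langle r \rangle = \{0, r, 2r, \dots\}$, where $r$ is the period.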
So now — this is
basically what I mean by structured
problems now let's say how do Quantum
algorithms use interference for
structured problems and I just show you
a couple of pictures that you have to
keep in the back of your mind when you
when you think about this so what we do
as a standard algorithm textbook
algorithm to solve this problem is we
put our integers or our X values into
superposition you know we interpret them
as computational basis States then we
have this magic Oracle that always comes
in that kind of knows all the function
values and takes an ancilla and
writes the function value into the ancilla
state and by the way so one of the super
cool things about this approach is that I
started realizing that it might kick us
out of the assumption that there's a
perfect oracle — it might start to, like, we
only have samples of the oracle — I'll come
to that just
now um the next thing we do is we
measure the second register and what
does it do it only takes the X values
out that have the same value and it
actually doesn't matter what we measure
here, so, you know, let's say we
measured one in this case, and I'm almost
there — what we do then is the magic thing:
if you look into Scott Aaronson's
talks, for example, he often talks about the Fourier transform
being the one thing that is actually
interesting — especially in the paper
that Nathan mentioned — you apply a
quantum Fourier transform, and what you do is
you create a superposition with these
amplitudes that you see here — and you see
it's super structured, right — and then the
magic of interference happens. I just,
you know, I
evaluated these values numerically so that
you see this better: in those darkly
highlighted states what you get is
constructive interference, because these
ones here are, you know, integer values in
the exponential function, so you get
ones here — by the way, I forgot about
normalization here — so constructive
interference; and in this one here, in this
column, you always have ones, but the rest
of the amplitudes will give you
exactly minus one, and so you get
destructive interference. And by the way, if
you know group theory, then these are
obviously like irreducible
representations and so what you get is
actually also a very simple State and
this is what I mean by the solution
comes from interference why is this a
solution. Actually, I was super surprised
that in hidden subgroup problems, how
you get the solution out of the final
state after the quantum Fourier transform is
not so simple to do, but in this case
it's actually very simple, because all of
these states have the property that they are an
integer times 12 — the size of
the group — divided by the number you want
to get, and you can get this out of a few samples.
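The whole interference story can be reproduced in a few lines of plain NumPy — a toy simulation, not a quantum implementation, assuming the Z_12 example from the talk with a period of 3 picked for concreteness.

```python
import numpy as np

N, r = 12, 3                       # the group Z_12 and a hidden period r that divides N
x = np.arange(N)
f = x % r                          # a periodic function hiding the subgroup {0, r, 2r, ...}

# uniform superposition over x with f(x) in a second register;
# measuring the second register (say we saw f = 1) keeps exactly one coset
post = (f == 1).astype(float)      # support on {1, 4, 7, 10}
post /= np.linalg.norm(post)

# the quantum Fourier transform over Z_N is just the DFT matrix on the amplitudes
qft = np.exp(2j * np.pi * np.outer(x, x) / N) / np.sqrt(N)
probs = np.abs(qft @ post) ** 2

print(np.round(probs, 3))          # nonzero only at k = 0, 4, 8, i.e. multiples of N/r
```

From a few such samples (all multiples of N/r) the period r can be read off, which is the "solution out of a few samples" the talk refers to.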
Okay cool, I'm almost ready — uh —
done um with kind of what I'm trying to
say here and by the way you're probably
expecting also like a huge preliminary
result that I'll show you now — this is
the new model class, this is the one — but the
preliminary result is literally a
question that we're asking in research
now; but I promise this took us half a
year to get to this question, and sometimes
this is not the wrong way to do things.
so interpret Quantum Computing as
solving highly structured problems with
interference now let's put in machine
learning and I think you know where this
is going machine learning generalizing
from samples so let's say we don't have
an oracle but we have something
something that we started talking about
as a "data-ized" oracle — if you know any
papers or research directions that do
this already, I'm almost sure people have
thought about this from another
direction, then please let me know — so
what happens if I only have a constant
number of actual examples of what the
Oracle does let's go very quickly
visually through this algorithm so for
example if I measure one now so I kind
of kept this state here I won't have
some of the states I will only have very
few of the states
left what happens in the interference
pattern is that some columns will be
blacked out and won't exist and what it
does to the columns — or the rows — of
constructive interference is the same
thing: they're still
constructively interfering, but the
amplitude is linearly
smaller; the destructive interference
gets destroyed. And we started off
thinking like maybe there's a certain
probability distribution of data that
doesn't destroy this — so this was
where we went at the beginning. So to give
you an example, where I plot kind of the
final distribution that you measure from
your hidden subgroup problem: if you
have a perfect oracle you get nice peaks,
and then if you have 10% of your oracle
you get kind of this interference getting, I
guess, worse and worse.
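Continuing the toy NumPy simulation above: if only some of the oracle's values are known from data (the "data-ized" oracle), the post-measurement state covers only part of the coset, so the constructive peaks shrink and the destructive interference no longer cancels completely — roughly the degradation described here. The particular set of known points below is an arbitrary illustrative choice.

```python
import numpy as np

N, r = 12, 3
x = np.arange(N)

# only part of the oracle's domain is known from data; of the coset
# {1, 4, 7, 10} the point 7 is missing
known = np.array([0, 1, 4, 5, 8, 10])
post = (np.isin(x, known) & (x % r == 1)).astype(float)
post /= np.linalg.norm(post)

qft = np.exp(2j * np.pi * np.outer(x, x) / N) / np.sqrt(N)
probs = np.abs(qft @ post) ** 2
print(np.round(probs, 3))
# the peaks at k = 0, 4, 8 are still the largest, but they shrink and a
# nonzero "noise floor" appears at the other frequencies
```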
I'm actually finished now; let me tell you two questions
that we're currently investigating they
are kind of slightly different flavors
of this and the first thing is can we
amplify the signal in Fourier sampling — this is
called Fourier sampling, basically sampling
from this distribution — so these
peaks are still very structured, so is
there a way to still get out what we
want, so to generalize from a
"data-ized" oracle,
basically and the second question that's
a bit different is can we learn to
reconstruct the Oracle from data so we
get given only a couple of states can we
kind of recover the full structure and
then just run the HSP and now who's in
the room who doesn't now immediately
think, oh, let's train a variational circuit —
and, like — but the question is, we
don't want to use arbitrary ansätze, so at
the moment we're really working on
trying to find a very clear way that
uses the inductive bias that this problem
gives us — that it is highly structured — to
really see how we can solve this