Keynote: Teaching an Old Dog New Tricks - Matt Godbolt - ACCU 2025 | ACCU Conference
Summary
Core Theme
The speaker recounts their journey of learning modern C++ by attempting to build a ZX Spectrum emulator, contrasting an "old-school" implementation with a "modern" one, and sharing the challenges and insights gained from exploring new C++ features like constexpr, modules, and coroutines.
CLion, a cross-platform IDE for C and
C++. Download now. Wow. Well, what a
great conference, eh? Yeah. Once again,
I want to say thank you to Guy and
everyone who's made this possible
because I've had an amazing time this
week. It's been really, really great.
Um, I've learned a bunch of things and
as the person who gets to go last, that
means I have to change my slides in
light of what I've learned while I've
been here, which will throw off all
my timing. So, when it takes, you know,
more than the two hours I've been
given, then... all right, I'm just trying
to josh with Guy here. So, this is
a better form, right? Everyone can see
this. So, I don't know how many of you
know what the heck a ZX
Spectrum is, by hands? That's a lot of you.
That's impressive. And so, did you spot
like the lanyard, you know, and the
Spectrum is a really appropriate thing
in so many ways. Obviously, it means
something to me. It can mean a lot of
things to other people as well. And it
kind of encompasses all of those
meanings, I think. But for me, the
Spectrum logo is all about my childhood
and my introduction to computing and
kind of the beginning of my journey to
where I am right here. Um, so I'm the
old dog in this. This is
me. Um, I'm in between jobs at the
moment. I finished my previous job and
for contractual reasons, I can't start
my new job until the end of this year,
which is boohoo, poor me. Um, I'll be
starting with HRT, one of the sponsors
of this conference. So, you know, thank
you HRT. Thank you. Um, I I don't work
for them yet. Contractually not allowed
to. All that good stuff. Anyway, so why
I wanted to do something cool for this
keynote? Um, and uh, it occurred to me
that I'm kind of getting a bit older in
my career now and I have been giving out
wisdom to other folks that may be in
need of updating. In particular, um, I'm
quite cantankerous about build times and
compile-time things like templates and
constexpr and all. I know, there's
lots of people looking at me with
tutting eyes right now. I'm sort of an
old school well I'm an assembly
programmer at heart and I've sort of
clawed my way up slowly and it's kind of
hard to let go of the old ways. So I
thought I'd try and teach myself some
new C++ like the newest I could find and
I'd do it through the medium of writing
a program that I know well and I would
enjoy writing and I'm going to do it
twice. Once the old way the way that I
would write it naturally and then the
second time with all the modern things
and then I could compare and contrast
and see what came out of it. So, you all
know me from the website, right? That is
Miracle, the Sega Master System
emulator. No, of course I'm joking. No
one really knows that website. It's jsbeeb,
the BBC Micro emulator. Yeah. No, so
this is my hobby. I love doing these
things. Emulators are a great way of
learning about how hardware really works
and also um just having a fun program at
the end of it all that has some nice
output that you can play with. Right.
Spectrum was my first computer. I got it
on my 8th birthday in
1984. Um, one of my dad's colleagues'
sons had written a happy birthday
program in Spectrum Basic and gave me a
cassette alongside the computer which I
loaded up and I had, you know, happy
birthday Matthew and it was like a
little swimming person. And I was into
swimming at the time and I was like I
was blown away by that and I'm actually
getting quite emotional now because this
is actually like how I started. This is
what really put me on this journey to be
standing in front of you all. Right? And
so I, you know, hit break, looked at the
program and started learning how to
program. But obviously most of it we
were just playing games. Right. Okay.
There's going to be a lot of game shots
in this. What game is this? Manic Miner.
Exactly. Right. So let me just quickly
go over what my old dog, cantankerous old
man, uh, view of C++ is. Contemporary C++.
I think we're beyond postmodern C++ now,
as Tony Van Eerd said several years ago.
Um this is what I don't like about all
of the things in C++ right now. Long
build times. Yeah. No. No one. You're
fine. Everyone's happy with build times.
It's just me. Okay. Terrible error
messages, right? Yeah. Modules. I've
been told that they don't work. Anyone
using modules in
anger? Let the record show: no one, maybe
one half hand went up. Uh, coroutines
are complicated. Now again, most of
these slides were written before I
attended this conference and there's a
lot of useful talks that I've been to
now that have changed my mind about some
of these. So you're going to see that.
So I'm going to challenge myself to
write and use all the things I say I
don't like, which is constexpr and
consteval. I'm pointing at Jason Turner
in the front row. I've just come out of
a talk of his where he goes into
when to use consteval. Uh, template
metaprogramming trickery. I've always hated
it. It feels like I'm programming in the
wrong language. I'm not really a
functional programming type of a person.
Uh, concepts, modules, coroutines, all
these things that, like, sound complicated
to my poor assembler programmer
brain. So I chose the Spectrum because
it was really important to me. It's part
of my childhood as I say and because I
thought it's a pretty straightforward
machine. There's hardly anything in it.
And so I, you know, chose to write this
twice, not because it's easy, but
because I thought it would be easy, and
because a friend nerd-sniped me
into it. Because when I discovered it
was difficult, I mentioned it in passing
to somebody and that person was like,
"Yeah, but I had that one growing up as
well. I'd like to see that, too." And so
I was like, "All right, I guess I will."
And then she said, "I'll help you if you
do it." And that was Hana Dusíková,
and the only reason I've got any of
the cool C++ things that you're going to
see today, if you think they're cool,
um, the only reason that they're working
is because I had a lot of help from
someone much smarter than me, and that
person was Hana. So, let's talk about
what a Spectrum 48K is. Um, it has a Z80
chip in it. This is a Z80 chip. There's
a photograph of it up there. Um, this is
actually the Z80 was in my Spectrum Plus
3, which I got a bit later on, and it's
the only part of the computer that
remains. Um, everything was socketed
back in those days. So, we've got an
8-bit CPU running at a blistering three and a
half megahertz. I know, the crowd goes
wild. 48K of RAM and 16K of
ROM, and then this custom ULA, which is
posh for, like, a sort of, uh, custom chip.
It's not that custom. It's just a pile
of logic gates but then I mean I guess
in strictly speaking all chips are just
a pile of logic gates really but you
know like it's a bit more
straightforward than that. And that
custom ULA did everything else in the
entire computer. It generated the sound,
it generated the picture. It dealt with
the loading and saving from mass
storage, which in the case of a Spectrum
was an audio cassette that where
horrible screeching noises were recorded
onto it and then played back to load up
the games. Inside that Z80, we've got a
bunch of registers. So, this is a quick
overview just so that we can explain
what's going on. We've got 8-bit
registers. The A register is kind of the
main register. It's the accumulator. Uh,
the B, the C, the D, the E, the H, the
L, and the F. Obviously, you know, the
exception proves the rule of, like, having
something straightforward. The F
register is the flags register. It sort
of holds extra special information about
whatever happened with the last
arithmetic thing that you did. If you
subtracted two numbers and the
result was negative, then the flags has a
negative bit set in it. Things like
that. One of the cool tricks the Z80 did
is it let you pair up registers in a
particular way. So BC, DE, and HL.
Those are like the B register and the C
register sort of stuck together. And you
could treat them as 16-bit values. So
you could do some limited 16-bit
arithmetic. And this is the only way you
got pointers. So HL could be used as a
pointer to point into memory. And then
you could kind of read and write. So you
know, that gives us 16-bit pointers. We
were talking about this in the previous
session. So we have
64K's worth of address space to deal with,
a massive amount. Um, interestingly,
although it could do some 16-bit
arithmetic and obviously 8-bit
arithmetic, inside it was actually only
four-bit. It was a cheaper part that
they just did twice. So, every single
arithmetic operation would take two
clock cycles to do two four-bit adds
and then kind of deal with the result.
Anyway, that's off topic. Um, it has
some strange shadow registers which you
could sort of switch in and out from.
Um, there were some genuinely 16-bit
registers, the IX, the IY, the stack pointer
and the program counter, and then some
special registers we don't need to talk
about. So, let's look at a Z80 program.
Doesn't really matter what this is
doing. What I really want you to realize
is that computers just look at numbers
in memory. That's all they do, right?
They're complicated ways of staring at
numbers in boxes in memory, deciding
what to do with them, and then moving on
to the next number. So, this program,
um, the left hand side is the bytes in
memory, the machine code. That's what the
machine sees. On the right hand side, the
sort of ASCII text is the human readable
version of it, and I use that advisedly,
human readable. And, you know,
colloquially we just talk about
assembly and assembler code all the time
and we don't really make the distinction
between the machine code and the
assembly code, which is the sort of, like,
source and binary format of it. But
anyway, so the machine is going to read
3E FF, and what that means is load
the value FF into the A register. So far, so good.
Uh, the machine reads 01 34 12, and the 01
means this is a 16-bit load of the BC
register, and so 34 12 is of course 1 2 3
4 the other way round because, you know,
nothing's straightforward. This
is little-endian, right? Which is
the correct endian. Anyone going to take
me up on this? There's little-endian and
wrong-endian, right?
Um, okay, um, a single-byte instruction
here, just 02, means load (BC) with A, and
the brackets mean sort of treat it
indirectly. So that's where I'm treating
BC as if it's a 16-bit pointer and I'm
storing the A register into it. So the
left hand side is sort of the
destination. The right hand side is the
source. Increment BC, decrement A, and
then JR NZ means, like, NZ, gosh, been in
America too long. NZ, um, means jump
relative if not zero. So that's
referring to the last sort of arithmetic
operation that happened. In this
instance that would be the decrement. So
we're saying we decrement the A register
and if it's not zero go back to the top
of the loop, and so that 20 means JR NZ
and the FB is like back five bytes'
worth. So that's our loop and then we
return. So now you all know how to read
Z80 assembly and you understand all of
the bits and pieces going on here. Um
obviously you'll be putting Compiler
Explorer into Z80 mode next time you're
using it I'm sure. Uh before we move on
um the memory map. So you're probably
used to virtual memory right? You can
page memory in and out. No such luxury
on the Z80. Uh well, on the Spectrum
specifically, the bottom 16K was just
the ROM, right? You read from it, you
get ROM, you write to it, nothing
happens because it's read only. No
faults, no traps, no nothing. It just
didn't do anything. Anything above that
was reading and writing to the RAM with
a little sort of exception that the
addresses between 4000 and 5800 in hex
actually refer to a piece of memory that
the video circuitry, and by that
I mean that ULA chip, is also reading,
and it uses it to produce a
TV picture. So if we read and write to
that we're seeing what's being displayed
on the screen. Fabulous. So we can write
an emulator now.
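The memory map and the example program described above can be sketched roughly as follows. This is only an illustration, not the speaker's code: `Memory`, `kProgram`, `load` and `jr_target` are hypothetical names, the program is placed at an arbitrary RAM address, and the opcode bytes for INC BC, DEC A and RET (03, 3D, C9) are filled in from the standard Z80 opcode table rather than read out in the talk.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names; a sketch, not the speaker's actual code.
struct Memory {
    std::array<std::uint8_t, 0x10000> bytes{};  // the whole 64K address space

    std::uint8_t read(std::uint16_t addr) const { return bytes[addr]; }

    // The bottom 16K (below 0x4000) is ROM: writes are silently dropped,
    // with no fault and no trap.
    void write(std::uint16_t addr, std::uint8_t value) {
        if (addr >= 0x4000) bytes[addr] = value;
    }

    // 16-bit values sit in memory low byte first (little-endian).
    std::uint16_t read16(std::uint16_t addr) const {
        const std::uint8_t lo = read(addr);
        const std::uint8_t hi = read(static_cast<std::uint16_t>(addr + 1));
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }
};

// The example loop as the raw bytes the machine sees.
constexpr std::array<std::uint8_t, 11> kProgram = {
    0x3E, 0xFF,        // LD A, 0xFF
    0x01, 0x34, 0x12,  // LD BC, 0x1234
    0x02,              // LD (BC), A
    0x03,              // INC BC
    0x3D,              // DEC A
    0x20, 0xFB,        // JR NZ, -5 (back to LD (BC), A)
    0xC9,              // RET
};

// Copy a program into RAM through the normal write path.
template <std::size_t N>
void load(Memory& mem, std::uint16_t base,
          const std::array<std::uint8_t, N>& program) {
    for (std::size_t i = 0; i < N; ++i)
        mem.write(static_cast<std::uint16_t>(base + i), program[i]);
}

// JR offsets are signed and relative to the address just after the
// two-byte JR instruction.
constexpr std::uint16_t jr_target(std::uint16_t jr_addr, std::uint8_t offset) {
    return static_cast<std::uint16_t>(jr_addr + 2 +
                                      static_cast<std::int8_t>(offset));
}
```

With the program loaded at 0x8000 (an arbitrary choice), the 16-bit operand at 0x8003 reads back as 0x1234, and the FB offset on the JR at 0x8008 lands on the LD (BC), A at 0x8005.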
Right. Here we go. We're going to have,
um, some chunk of 64K that I'm going to
say this is the whole memory of the
computer, the whole address space. I'm
going to, uh, take 16K of it and I'm going
to memcpy some ROM that I've got
from somewhere. You know, if I'd kept the
ROM chip, I could have dumped it, but I
found it on the internet. Um, and I could
have probably used #embed to get it
into the source code, right?
That would have been a cool use; I
only thought of that recently. Um, I'm going
to have a function to write to memory.
And all I'm going to do is I'm going to
make sure that you're not trying to
write to ROM. So I just
discard it if you're trying to write
below 4000. I'm gonna have some storage
for all my registers and I'm going to
treat the registers as all 16-bit, and
then if I'm reading the A register I'll
sort of shift it down or I'll mask out
the bits that I need. In C, you could use
nice unions for this, right, and my
assembly programmer brain wants to use
that, but I can't do that in C++ without...
Yeah, there's lots of shaking heads
about undefined behavior related things.
So we can't do that. All right, let's
write the emulator then. How hard can
this be? Right, so we're going to say
forever, switch on the first byte where
the program counter is. Right, we're
going to read the memory at the program
counter and we're going to see what
number it is. And that number will tell
us what operation we need to do. If it's
zero, that just happens to be NOP.
That's the easiest one. Fantastic. We
just do nothing and we carry on and
we'll read the next byte the next time
round. If it's one, if you remember,
that was that load BC with the thing,
the 34 12, right? So the next two
values in
memory, PC++ and PC++ again, we're going
to put those into the BC register. Number
two is that indirect write through BC, so
we're going to treat BC like a pointer,
pass it to the write function and get
the A register out. Increment BC: BC++. Hey,
this is easy. How many more of these can there
be? Yeah, well, I mean, I just sit down... well,
this is the table of all of the
opcodes. Um, there's 256 of them, but you
notice four of them are a different
color. Those different color ones are
like, actually, this is the first byte of
a multi-byte opcode. There are more. So
in total there's about 700 instructions.
Now most of them are very very similar
but still 700. That's a lot. And we
don't have illegal instructions on the
Z80. You know, like nowadays, if you
just executed any old random bit of
memory, soon enough you'd find a byte
that wasn't a valid instruction on
whatever CPU you were on, and it would throw
an exception at the hardware level. You'd
get SIGILL on Linux or whatever, or a
crash in Windows. I don't know what
Windows does actually. Um, no such
luxury here. Any sequence of bytes we
feed into the CPU has some kind of side
effect. And games programmers, and I'm
looking at games programmers here, they
will find useful sequences of bytes that
do something convenient and useful and
they will use them in their games. So,
we have to emulate everything. And it
turns out about 1,400 of them are
actually valuable and useful. And so, if
we want to emulate games, 1400
instructions. Gosh, that's going to take
a while, right? So, we've got an awful
lot to do. Okay. Which game? Jet Set Willy.
Correct. This is Maria the housekeeper
telling us off for leaving the house in a
state. All right. What about this one
then, Guy? Um, so I'm going to talk about
the first implementation. Well,
this is actually from the
Horizons tape that came with the
computer. You press play on it and
it drew a picture of all of the parts.
So I could have just showed you this, I guess.
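Before the first real implementation, the naive loop described a little earlier (read the byte at the program counter, switch on it) might look something like the sketch below. It is an assumption-laden illustration, not the code from the talk: `Machine` and its members are hypothetical names, registers are stored as 16-bit pairs with shift-and-mask accessors as described, and only the opcodes mentioned so far (00, 01, 02, 03) plus LD A, n (3E) are handled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the "how hard can this be?" emulator loop.
struct Machine {
    std::array<std::uint8_t, 0x10000> memory{};
    std::uint16_t pc = 0;
    std::uint16_t af = 0;  // A is the high byte, F (flags) the low byte
    std::uint16_t bc = 0;

    // 8-bit views of a 16-bit pair via shift and mask, no unions needed.
    std::uint8_t a() const { return static_cast<std::uint8_t>(af >> 8); }
    void set_a(std::uint8_t v) {
        af = static_cast<std::uint16_t>((v << 8) | (af & 0x00FF));
    }

    // Writes below 0x4000 hit ROM and are discarded.
    void write(std::uint16_t addr, std::uint8_t v) {
        if (addr >= 0x4000) memory[addr] = v;
    }

    std::uint8_t fetch() { return memory[pc++]; }

    // Executes one instruction; returns false for opcodes this sketch
    // does not implement (a real Z80 has no such thing as an illegal one).
    bool step() {
        switch (fetch()) {
            case 0x00:  // NOP
                return true;
            case 0x01: {  // LD BC, nn (little-endian operand)
                const std::uint8_t lo = fetch();
                const std::uint8_t hi = fetch();
                bc = static_cast<std::uint16_t>(lo | (hi << 8));
                return true;
            }
            case 0x02:  // LD (BC), A
                write(bc, a());
                return true;
            case 0x03:  // INC BC (16-bit increment, flags untouched)
                ++bc;
                return true;
            case 0x3E:  // LD A, n
                set_a(fetch());
                return true;
            default:
                return false;
        }
    }
};
```

Calling `step()` in a loop is the whole emulator at this stage; the next part of the talk explains why this stops being fun once every opcode has to be covered.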
Okay, so I've done this before a bunch
and the way that I typically write
emulators when I'm emulating a CPU is I
will grab the list of all of the
instructions as like a text file or if
you're really lucky HTML or XML or
whatever's actually, you know, built
this web page that I stole this
screenshot from. and I will parse that
out and then I will generate the code
because if I've got, you know, a
4,000-line file that says,
first of all, NOP, then it says
LD (BC), A, like literally the strings, right,
that's what you can use to write a
disassembler, and then I can write some
naff bit of Python to split it and then
work out what's happening and then I can
just emit C++ code and then I can
generate each case statement
automatically, and that's typically how I
do it. And in fact, in those emulators you
saw with the BBC Micro and the Master
System, one of them uses a Perl script
to generate all of the opcodes. The
other one, because it's written in pure
JavaScript, actually the
thing that
generates all the instructions table is
written in JavaScript inside of itself,
and it does this sort of, like,
uh, snake eating its own tail
thing where it creates text and then
evals it to turn it into a function and
then starts running it. So it's kind of
just-in-timing itself. It's
fun, but I didn't want to do it that way
this time because it feels like cheating. It
feels like I'm not doing it right. So,
I'm going to do it another way, which is
I'm going to try to decode it in code
like that switch statement, but I'm
going to try and do it a little bit more
principled, right? And the principle
that I would like to do is something
like this. I'm going to have some kind
of abstract notion of what an
instruction is. It has some string name
and op code. It has how many bytes long
this particular instruction is so I can
move the program counter on. It has some
abstract source and destination where
the source could be which register it is
or it might be, hey, no, this is a value I
need to get from memory, or it might be a
constant, right? And the reason it might
be a constant is because there are
increment and decrement instructions and
there are add instructions. How much more
convenient would it be if I decoded an
increment as just an add with the
constant one? Hey, I have reduced the
amount of code I need to write.
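As a rough illustration of that abstract instruction, here is a minimal sketch. None of these names come from the talk: `Instruction`, `Operand`, `Op` and `decode` are hypothetical, and only NOP and the 8-bit increments are decoded. The bit pattern `00rrr100` for INC r is the standard Z80 encoding; note that the talk later points out that treating INC as "add one" does not hold up once flags are involved.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical operand and operation kinds; a real decoder needs more.
enum class Operand : std::uint8_t {
    None, B, C, D, E, H, L, IndirectHL, A, Const1
};
enum class Op : std::uint8_t { Nop, Add8 };

struct Instruction {
    std::string_view name;  // for disassembly
    std::uint8_t length;    // bytes to advance the program counter by
    Op op;                  // what to do
    Operand dest;           // where the result goes
    Operand source;         // register, memory or constant
};

// Z80 8-bit register field order: B, C, D, E, H, L, (HL), A.
constexpr Operand kReg8[8] = {Operand::B, Operand::C, Operand::D,
                              Operand::E, Operand::H, Operand::L,
                              Operand::IndirectHL, Operand::A};

// Decodes a single-byte opcode, or returns nullopt if this sketch
// does not cover it.
constexpr std::optional<Instruction> decode(std::uint8_t opcode) {
    if (opcode == 0x00)
        return Instruction{"NOP", 1, Op::Nop, Operand::None, Operand::None};
    // INC r is encoded as 00rrr100: 04, 0C, 14, 1C, 24, 2C, 34, 3C.
    if ((opcode & 0xC7) == 0x04)
        return Instruction{"INC r", 1, Op::Add8, kReg8[(opcode >> 3) & 7],
                           Operand::Const1};
    return std::nullopt;
}
```

One bit-mask test covers all eight increments, which is the kind of reduction being described: the per-opcode work collapses into a small set of operand kinds and operations.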
Fabulous. So those operands can be 8-bit
registers, 16-bit registers. Uh, the
destination similarly, and then the op is
the operation the actual thing we're
going to do and by doing this what we're
hopefully doing is taking that 1,400
space of all these possible instructions
and boiling it down to maybe a dozen
types of operand and then maybe a dozen
operations and as long as we can decode
those bytes into that I don't have to
keep writing all these different
permutations and combinations so the
code might look like this we do some
kind of decode whatever the heck the
program counter is pointing at go get
the operands that it says that we need
execute them and then write back the
result, sort of looking a bit like, if
you've ever studied, uh,
you know, hardware design, the standard way
that CPUs work, which is cool because
we're emulating the CPU. And then, you
know, this is maybe what our decode
routine could look like. I can now group
together very common operations, you know,
like 04, 14, 24 and 34. Hm, pattern forming,
maybe there's a pattern in this. Um, these
are all increments, and so I'm going to
say it's an increment and it's an 8-bit
add and it's adding, uh, I decode whatever
the destination is, and then the
operand is one. Okay, this is cool. So
far so good. Um, I have to emulate the
arithmetic unit. The ALU is the
arithmetic and logic unit that's going
to do all of the actual calculations.
And these are all the operations it's
going to do. We've got 16-bit adds and 8-bit
adds. And, you know, they're as
straightforward as this, right? So
doesn't take too long to get something
up and running. But it's of course never
that simple, right? Because two things.
one, I need to update the flags, right?
These additions might cause something to
become negative or it might cause
something to overflow or it might cause
something to have a carry. Um, and also,
as I discovered rather late into this
project, inside the chip, incrementing
isn't actually the same as adding one
because you can do increments a lot
cheaper in hardware if you know it's by
a single, like, literally the value one.
It's got a clever little ripple thing
and that doesn't update all the flags in
the same way. So I had to sort of
scratch that which was a
pain. So this is what the flags look
like. We don't really need to worry
about them, you know, they're just
essentially a bit mask of interesting
things that happened in the last
instruction. And this is the awkward
code for adding two 8-bit numbers
together, right? So just one
operation on the Z80 is 15 lines of
pretty complicated and gnarly code. The
irony is, deep, deep, deep inside the x86,
right, which is derived from the 8080
ultimately, that is the sibling to the
Z80, because the person who designed the
8080 left Intel and went to form Zilog
and actually made the Z80 sort of in
the image of it. So deep inside the
x86 these flags also exist, and if I
could get access to them I might have
been able to use them, but I wanted to
write this in like straight nice C++ and
not use horrible assembly intrinsics and
things because that would kind of defeat
the purpose of this anyway um so anyway
putting it all together it looks like a
CPU we fetch we decode we execute and we
write back brilliant okay
done so I get to the point where I have
the 16K ROM of the Spectrum loaded into
memory And I write the world's worst GDB
equivalent here, which is, you know,
it's okay. I can single step through and
I can disassemble stuff and I can, you
know, run a few instructions. I could
even set break points if I had to
because, you know, debugging this is a
nightmare incidentally, right? You know,
there's one thing when you're trying to
debug a program that you have the source
code to. There's another thing where
you're debugging a program for which you
do not have the source code and you
can't trust that adding two numbers
together actually
works. So this is the world you live
in. But I had the annotated disassembly
of the Spectrum ROM, because there
is a community of folks who do this kind
of stuff and we reverse engineer and
read back over old source, sorry, old
assembly, and, like, work out where
everything is and I could sort of step
through and I could develop some
confidence that it looked like it was
booting up. It was clearing all the
memory. It was starting to write to the
screen. Oh, maybe this is cool. So, it
would be nice to see what's going on,
right? So, let's talk a little bit about video
output. That area of memory between
4000 and 5800 is interpreted by the ULA
as this sort of one bit per pixel
display. So, it's a black and white
display except each 8 by 8 tile can have
a different foreground and background
color. So, there's kind of two parts.
It's quite clever. So there's some
really neat tricks it does internally
because the hardware is so
embarrassingly simple, but by and large
it's pretty straightforward. I don't
really have time to go into it now, but
that is um the way that the screen is.
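To make the "one bit per pixel plus a foreground and background colour per tile" idea concrete, here is a small sketch of turning one bitmap byte and its tile's attribute byte into RGB. The talk does not go into this detail, so treat everything here as an assumption: the attribute layout (bit 6 bright, bits 5-3 background, bits 2-0 foreground) and the colour bit order (blue, red, green) follow the commonly documented Spectrum scheme, the 0xD7 level for non-bright colours is just one value emulators tend to pick, and the flash bit and the screen's address layout are ignored. `Rgb`, `palette` and `pixel_colour` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct Rgb {
    std::uint8_t r, g, b;
};

constexpr bool operator==(Rgb lhs, Rgb rhs) {
    return lhs.r == rhs.r && lhs.g == rhs.g && lhs.b == rhs.b;
}

// Colour index bits: bit 0 = blue, bit 1 = red, bit 2 = green.
// The non-bright level 0xD7 is an assumption, not a hardware constant.
constexpr Rgb palette(std::uint8_t index, bool bright) {
    const std::uint8_t level = bright ? 0xFF : 0xD7;
    const std::uint8_t zero = 0;
    return Rgb{(index & 2) ? level : zero,
               (index & 4) ? level : zero,
               (index & 1) ? level : zero};
}

// bitmap_byte holds 8 horizontal pixels, leftmost in the top bit.
// attribute is the colour byte for the 8 by 8 tile containing them.
constexpr Rgb pixel_colour(std::uint8_t bitmap_byte, std::uint8_t attribute,
                           unsigned x_in_byte) {
    const bool set = ((bitmap_byte >> (7 - x_in_byte)) & 1) != 0;
    const bool bright = (attribute & 0x40) != 0;
    const std::uint8_t ink = attribute & 0x07;           // foreground
    const std::uint8_t paper = (attribute >> 3) & 0x07;  // background
    return palette(set ? ink : paper, bright);
}
```

A full renderer would loop over the whole display area and look up the matching attribute byte for each tile; that addressing is left out here.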
So you write some code that interprets
the memory in that array and turns it
into RGB values and then you get on with
your life, right? Then you write like an
SDL app. That's all the rage, I heard. Um,
I don't think it is. I just asked
ChatGPT to make it for me and it did. Um,
and I kind of then fixed a few bits and
pieces. And so basically the loop is
every 50th of a second, emulate a 50th
of a second's worth of CPU instructions
and then take the bit of the memory
inside the chip that corresponds to the
video, turn it into RGB and put it on
the screen. And we get to demo time,
which is, you know, you've been very
patient so far. And now we get to
test your patience further by me seeing
if I can actually do this. Oh, look. It
came the right way. Yay. Okay, there we
are. There is a Spectrum. Now, there's
just... No, no, no. Hold your applause, you
lot. No. Go on. You're very easily pleased.
So, who among us has not been into a
department store in the
80s where Okay, there's a few lots of
hands are going up here because there's
a there's a definite divide in the crowd
and got or Dixons, right? for the sun
and broken into whatever demonstration
program they had for their computer and
Yeah. Or or local equivalent.
Right. There we go. Yay. And I now Yay.
Thank you. I pressed the right keys as
well. There we go. So, those of you who
might have looked at the spectrum that
our wonderful photographer brought in,
thank you photographer person wherever
you are. um or the Spectrum Next that
someone else has brought in as well.
You'll notice the keyboard is very odd
and I'm wearing the t-shirt that has
unlike some of you who are wearing the
conference t-shirt which has some of the
keys. This one has all of the keys. And
so you have to know which key has what
because in order to make your life
easier, all of the keywords of the basic
language were on each of the keys. And
so to type print, you press P. Great. To
type go to, you press G. That's easy to
remember. To type load, you press J. J.
Of course, everybody knows that it's
muscle memory for a whole bunch of
people in
here. So, you have to remember them and
there's symbol shift and all sorts of
things. Okay, well, that's boring and
everything. Let me just get my mouse
pointer onto the Linux terminal, the
other screen. Let's do this again. This
time, this time with Oh
gosh, over here. Sorry, it's too many.
And we won't put it in training in now.
I don't
know. This won't come out on the
recording, but some of you are being
tortured in the ears by the noises my
computer is making on stage, which are
just absolutely ghastly. But that was
it, right? I'm going to turn that off
now. Mute it. Oh gosh. So, this was like
one of the first and still probably best
platformers. Um, and I'm going to
demonstrate how excellent I am at it by
Oh, am I gonna make it? Yay. And I've
got the lag of looking at the secondary
screen here as well, which is how I'm
going to excuse how bad I am at this
game. But like this this passed for
high-end entertainment in 1982 or 1983.
So, and I'm you're going to die. Okay.
Right. Well, there you go. Anyway, so
the thing about emulators, if I can find
the control key, there we are. The thing
about emulators is that they have like a
hockey stick like reward function,
right? You got nothing, nothing, nothing,
nothing. Runs for a bit and then hits an
instruction you haven't emulated yet.
All right. Nothing. Nothing. nothing.
Okay. And then games start working and
everything works and you're like and
then of course you lose weeks of playing
old games and getting back into, you
know, that's how it goes. And you know,
you get some amusing bugs along the
way. Yeah. Turns out the bits weren't
the way round I thought they were. Yeah.
Anyone? The spacefaring game Elite,
originally from the BBC? Yeah. BBC B,
but then ported to the Spectrum. And
this one was particularly insidious.
This was just, like, some weird, um, index
register thing I'd got wrong, and that
was not straightforward. It's it's a
miracle that it booted and ran to this
point and had this error instead of just
going getting stuck in an infinite loop
somewhere. Um I wrote some tests because
that's what you're meant to do. And in
fact, it was incredibly useful. Oh my
gosh. Right. So the Catch2 tests I
wrote for this had, um, you know, simple
tests of the ALU, like adding two numbers
together gives me the right answer and the
right flags. Of course, trying to work
out what the right flags were is
difficult because they're not
necessarily that well defined. Um, and
then luckily, because there are
lots of crazy people on the internet,
somebody has written, like, a test suite
in Z80 assembly that is a test for the
Z80, and it kind of runs every
instruction possible and then it kind of
hashes together all of the registers
and then compares it against
the known good value. And it's
been run on a real Z80 and so if you get
the right value, it's a perfect
emulation of a Z80. Congratulations. If
it doesn't, you've got no idea which one
of those instructions is
wrong. I pass that now. So, this is my
first approach. I should have said that
before. This is obviously just me
getting it to work and reminding myself
how this all fits together. Overall,
about a thousand lines of code split
between the decoder and the executor.
There's some sharable code that is code
that I'm not going to try and
reimplement in the second part because
frankly, it's just too miserable to get
right. The ALU, the flags, the
registers, that kind of stuff.
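To give a flavour of that shared ALU code and the "adding two numbers gives the right answer and the right flags" style of test, here is a sketch of an 8-bit add. It is not the code from the talk: `add8`, `AluResult` and the `kFlag*` constants are hypothetical names, and the flag rules are the commonly documented ones for ADD A, r (sign, zero, half-carry, overflow, carry, with bits 3 and 5 copied from the result, which is the usual description of the undocumented bits). The talk's tests used Catch2; plain `assert` would do the same job in a quick check.

```cpp
#include <cassert>
#include <cstdint>

// Z80 flag bits in the F register (names are this sketch's own).
constexpr std::uint8_t kFlagC = 0x01;   // carry out of bit 7
constexpr std::uint8_t kFlagN = 0x02;   // last op was a subtract
constexpr std::uint8_t kFlagPV = 0x04;  // signed overflow (for adds)
constexpr std::uint8_t kFlagX = 0x08;   // copy of result bit 3
constexpr std::uint8_t kFlagH = 0x10;   // carry out of bit 3
constexpr std::uint8_t kFlagY = 0x20;   // copy of result bit 5
constexpr std::uint8_t kFlagZ = 0x40;   // result is zero
constexpr std::uint8_t kFlagS = 0x80;   // result is negative (bit 7 set)

struct AluResult {
    std::uint8_t value;
    std::uint8_t flags;
};

// 8-bit add without carry-in. N stays clear because this is an add.
constexpr AluResult add8(std::uint8_t lhs, std::uint8_t rhs) {
    const unsigned wide = unsigned{lhs} + unsigned{rhs};
    const auto value = static_cast<std::uint8_t>(wide);
    std::uint8_t flags = 0;
    if (value & 0x80) flags |= kFlagS;
    if (value == 0) flags |= kFlagZ;
    if (((lhs & 0x0F) + (rhs & 0x0F)) > 0x0F) flags |= kFlagH;
    // Overflow: both inputs share a sign and the result's sign differs.
    if ((~(lhs ^ rhs) & (lhs ^ value)) & 0x80) flags |= kFlagPV;
    if (wide > 0xFF) flags |= kFlagC;
    flags |= value & (kFlagX | kFlagY);
    return {value, flags};
}
```

For example, `add8(0xFF, 0x01)` wraps to zero and sets zero, half-carry and carry, while `add8(0x7F, 0x01)` gives 0x80 with sign, half-carry and overflow set.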
Um, build times are the thing that I
care about because I'm a TDD person and
I'm also of short attention span. So,
what I really like to be able to do is
to add a test, hit the button, and get
some reward almost immediately. Right?
If it takes more than a few seconds,
then I'm starting to reread again and
then I'm like, "Oh... oh, yeah.
That's been sat there with a compile
error for the last, you know, 20 minutes
and I haven't done anything about it."
So what I do is I edit the
implementation such as I would be if I
was testing a change and I run and I get
to the tests and I did it three times
back to back. Each time I'm making that
same change. This is no caching or
anything like that and it takes two
seconds, 3 seconds. That's my happy
place. I'm really pleased if I can just
press buttons and I'm
seeing the fruits of my
labor. This is the code generation. This
is the x86 assembly of the decoder for
the Z80. I know it's confusing, right?
Um, the only thing you should note about
this is that that's massive, right? And
again, I tell people when I do
any kind of training, you should never
judge how fast a piece of code is by how
many instructions it's running. But this
is terrible, and it's calling other
functions. So, bleh.
Anyway, that said, modern computers,
even this naff old laptop, are super fast.
We emulate, even with that nonsense code,
a 300 megahertz Spectrum, which is so
cool. That's 100 times faster than the
original. So that's my first one. I was
underwhelmed with my achievements. I'll
be honest with you. It was fun. Of
course it was. Right. I got to play my
own games. I was hoping for a much
faster piece of code out of it all. And
ultimately it was a bit unrewarding. But
I was like, "No, this is cool. That's
great. I'm going to take that and I'm
going to move on to the second version
which is going to be all the cool C++
things." And I had a secret plan that
the way that I'd written the first
version would segue naturally into the
second version. And, like, you'll
see how well it worked out as
well. So second
implementation, I'm not going to ask
what this one
is. So what did I learn from the first
version? Well, I didn't really show it,
but that decoder, the one that had all
the switch statements, that was still
pretty ugly, right? There was still lots
of cases in there and although I could
group a lot of them together and in fact
at one point I was using a GNU extension
where you can do case value dot dot dot
next value and it cases all of them
together, like between zero and
eight. Not standard C++, which kind of
defeats the point. Didn't help that much,
and obviously I hated the code
generation that's kind of my thing um so
this is where I was going from the
beginning this is what I was hoping for
my dream implementation is to have a
relatively straightforward
understandable decoder that I can run at
any instruction. Then I can build a sort of
constexpr function that does exactly what
that opcode decoded to. So if it
says it's an add and it takes this
register and that register, then I can
build the bespoke function using constexpr
trickery that does exactly that,
and then make a table of 256
functions and then just call that
function. So I just literally read the
next byte of memory and jump to the
function that does exactly the right
thing for that instruction. Fantastic.
And all of this at compile time because,
you know, const, constexpr, consteval, const
something. So ultimately that
whole loop that I had before with the
decode and all that nonsense will be
this: read a byte, dispatch to it.
Wonderful. Okay. We're going to move
over to uh not valid C++. And I will
make no apology for reinventing CSS
flashing here. Took me a while to get
that working actually. This is how the
flash would work on the uh the spectrum.
There was like a hardware flashing
system. So what I really want to do is
make an array of 256 pointers to
functions, which is the result of taking
the values 0 through 255 inclusive,
decoding them as if they were an
instruction and breaking them down into
what I need to do, somehow magically
compiling them into a function using, I
don't know, something something constexpr
something lambda, and then turning it all
into an array, right? You can see where
I'm going with this, perhaps. And some of
you are
smart enough to know where I am going to
be. So again, this is not valid C++
because there are some things the astute
among you will notice you can't
do in C++. So this is the compile
function I would like to be able to
write, right? It's a consteval
function because that means it can only
happen at compile time, right? So I
should be able to do all these things
perfectly at compile time, right? It
takes a decoded instruction, but it also
has the word constexpr up there. You
can't do this it turns out because what
I really want to do is to do all of this
work at compile time. So given a
particular instruction I want to work
out how to read the value for that
particular operand. Both the operands I
want to work out how to write the value
for that operand and then I want to work
out what the executor is. What what's
the actual operation I need to do for
it? And that can be decoded statically
at compile time. And then at runtime
this bit runs and just runs those three
things. And the compiler should see
through all that and just do the perfect
thing for that particular decoded
instruction. And then we set it back.
And so this is what, you know, again,
none of these things are valid
C++. You know, my consteval getter:
I give it an operand and I do a
switch constexpr, which was hilarious when
Daisy actually showed this yesterday in
her keynote. I was like, hey, I was
looking for that yesterday, or like in my
presentation. Um, and so, you know, for
each operand I want to say like well
this is how to access the thing that has
been decoded for that and this should
just be straightforward, right? But again,
you can't do that. Switch
constexpr isn't a thing.
Returning a little lambda that gets that
thing, I can't do this because I can't
have a constexpr
parameter. And I tried some other
tricks. I tried, you know,
trying to use templates at this point
and you know here I'm saying like well
if I can't construct something maybe I
can make a template. I was trying to
avoid templates, right? Basically,
what this boils down
to is I hate programming using
templates. Templates are brilliant for
generic code, right? If you want a
vector of T and T is an int, fabulous, go
knock yourself out. But if I have to
encode the Fibonacci sequence by
recursively instantiating templates or
whatever, count me out. And so I see
constexpr as like a way of doing a lot
of the things that I'd otherwise have to
use templates for. And I was hoping to
use it for this. Um, but you can't do
this. I can't even use templates at this
point here because, as Jason said in the
last talk we were in, you know, as much as
we want it to be true, we can't take
anything that's a parameter to a
function and build anything that's
constexpr out of it, and we can't use it
to instantiate templates or anything
like that so I can't make the values 0
to 255 in this way and call a normal
function with them and just do what I
was doing, right? It just doesn't work.
And this is the first
point of learning for me in my journey,
first point of discovery in my becoming
a better programmer: constexpr doesn't
work like that. So the parameters to
constexpr functions are not constant
expressions, even when you make them
consteval and you want them to be. In
fact, Jason was just telling us this in
the last session,
and so although the compiler absolutely
does know the value of all of these
things you can't take advantage of that
in the way that I want to.
You can't instantiate templates from
parameters that were passed to
consteval functions, and you can't do any
of the other tricks that I was trying to
do. So, I was very frustrated and
annoyed. And I was even more annoyed,
incidentally, that Thorin did not sit
down and start singing about gold in
this screenshot. I spent ages, right?
Anyone who's played this game, what is
it? The Hobbit. Yes, it's pretty
obvious, isn't it? So, at this point, I
was about to sort of give up and try a
completely different attack. Uh, so
we're going to do a brief
interlude. I went on a journey of
discovery because I thought to myself,
I've got like 3,000 lines of code and
I'm currently emulating a chip that has
8 and a half thousand transistors in it.
That's almost like one line of code per
three transistors. That doesn't seem
like the right ratio, right? Surely
modern programming techniques should
mean that I could do this in a more
succinct way. So somehow this chip is
doing something that I need to do much
more simply than I'm doing it. So this
is what a Z80 looks like. If you take
the top of it off and you take a bunch
of high resolution photographs, it's
about the size of your fingernail. So,
hiding in this big plastic package
really is a tiny little thing in the
middle of it and all of the wires are
just there because it has to be able to
plug onto a circuit board. Um, it looks
like this. And if you zoom into that bit
at the top, this here is called a PLA.
It's a it's a programmable logic array.
It is
effectively sort of like a
ROM-ish thing on a chip. And I
say sort of like a ROM because it's
not just ones and zeros and strictly
looking up in a table. There are
effectively ANDs and ORs and NOTs in
here and they're arranged in a regular
pattern and then the shape of the thing
over the top of it says well if this bit
is set and this bit is not set and this
bit is set and this bit is not set or
whatever you pick then I am active. Now
obviously this is a picture. Thankfully
there are very lovely people on the
internet who have done all the work
for me. This is Ken Shirriff, whose blog
is fantastic. If you love this stuff, go
look him up. Um, this is rotated 90
degrees from the picture you saw before.
What we've got coming in at the top are
all of the bits of the opcode. You
know, the value where I said that zero
is NOP and one is load BC.
And then these are the different rows
that get energized when these particular
patterns of bits come in. And then the
the energized row goes off to turn on, I
don't know, the loading and storing
unit, or it turns on the adder, or it
turns on the thing that turns the adder
into the subtractor. All of these
things, all of these bits that are
hidden in the opcode are actually
just relatively straightforward bit
manipulation. Okay, maybe we can use
this. So even simpler than that, someone
even cleverer than Ken has said, okay,
there's a pattern hiding
even in that 100-odd line table.
And so what we can do is we can take the
opcode coming in, that 8-bit value, treat
it as x, y and zed, where x is the top two
bits, y is the middle three bits and
zed is the bottom three bits, and
then we can build a relatively simple
table that's kind of like the
unified theory of decoding, right? There's
not as many cases. And so what we'll do
is, if x is zero and zed is zero, then
there are eight possibilities for y: zero
means NOP, one means exchange, two means
decrement and jump, three means jump,
and four through seven means jump
conditionally, and now we can just look
up in the table what the four different
cases of that jump are. Okay, that's
slightly simpler but not much of a win
so far. It's a bit like a switch
statement. Okay, how about if x is zero
and zed is one? Now we look at the q and
p bits, which are sub-bits of the y, and
there are two cases: we're either loading
a 16-bit value or we're adding a 16-bit
value to HL. Okay, right, this is
simpler.
Increments and decrements are also quite
simple. This is the best one. If x is
two, now bear in mind we've only got two
bits. So there are four possibilities
for x. So having one case for
one value of x means a quarter of all
opcodes decode as an ALU operation, which
we can look up in a table based on y. So
it's add, subtract, exclusive or, and
all that good stuff. And then the
register that we pick from zed. Okay.
So, there's a
pattern in here that I didn't spot when
I was just doing my case statements. I
kind of could see that there was
something going on, but I thought it
would be far too complicated for me to
work out, and I'm glad someone else did.
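The x, y, zed split described above can be sketched in a few lines. The struct and function names here are hypothetical, and the layout of p and q (p as the upper two bits of y, q as its lowest bit) is an assumption based on the commonly published Z80 decoding notes rather than on the talk's slides:

```cpp
#include <cstdint>

// Hypothetical holder for the decoded bit fields; field names follow the talk.
struct Opcode {
    std::uint8_t x;  // bits 7-6
    std::uint8_t y;  // bits 5-3
    std::uint8_t z;  // bits 2-0
    std::uint8_t p;  // upper two bits of y (assumed layout)
    std::uint8_t q;  // lowest bit of y (assumed layout)
};

constexpr Opcode split(std::uint8_t byte) {
    const auto y = static_cast<std::uint8_t>((byte >> 3) & 7);
    return Opcode{static_cast<std::uint8_t>(byte >> 6), y,
                  static_cast<std::uint8_t>(byte & 7),
                  static_cast<std::uint8_t>(y >> 1),
                  static_cast<std::uint8_t>(y & 1)};
}

// 0x01 loads BC with a 16-bit value: x = 0, z = 1, q = 0.
static_assert(split(0x01).x == 0 && split(0x01).z == 1 && split(0x01).q == 0);
// 0x80 sits in the x == 2 quarter that decodes as ALU operations.
static_assert(split(0x80).x == 2);
```

Because `split` is constexpr, the same function can serve both a runtime disassembler and compile-time table generation.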
So, I've got a new approach. Okay, good.
Right, we've been on our interlude. I'm
going to go back to my C++ code, right?
Because I'm losing some people here. So,
if I get some C++ code on the screen,
um, so my new approach, I'm going to
make a class. I'm going to accept my
fate. I'm going to accept that templates
are going to be part of my life. I'm
going to make a class for each category
of instruction. So 16-bit loads, one
type of class. Um,
ALU operations, another type of
class. NOP is a type of class, all that
good stuff. For the ones that have
different behavior depending on some of
the degrees of freedom, you know, the
ones that are parameterized on which
register they're acting on or
parameterized on whether they have the
carry or they don't take the carry,
things like that, the things that came
out of the table. We're going to
templatize that class on them. So I can
specialize the class for its
particular operands. And then a bit
vague now. Somehow I'm going to use the
x, y, and zed bits to select which of
these um classes is the right class to
do the work for this instruction. And
then somehow we're going to build a
compile time table just as we wanted at
the beginning to dispatch to the right
class. All right, this is the trick that
we came up with. And by we, I mean Hana
came up with it and told me I was stupid
for not thinking of it myself, but
that's Hana. Bless her. So this is
what a NOP looks like. We've got two
sort of fields, and they're two static
fields. Really these are just sort of
like namespace-ish kind of objects, but we
need to be able to address them. Um,
we've got a mnemonic class. A mnemonic
is just a string, but you can't have
constexpr std::strings because of
reasons. So this is just a fixed-size
buffer that's the size of the biggest
mnemonic that any opcode has. And we
just sort of copy the bytes in. Um, and
then we have an execute function.
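A minimal sketch of such an instruction class, together with the templated 16-bit load the talk describes next. Everything here is illustrative: the `Mnemonic`, `Cpu`, register names and table names are assumptions, not the speaker's code. The talk builds the mnemonic through a temporary std::string; this sketch instead appends the pieces straight into the fixed buffer, so it does not depend on library support for constexpr std::string.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Fixed-size text buffer usable in constant expressions (size 16 is an assumption).
struct Mnemonic {
    std::array<char, 16> text{};
    std::size_t length = 0;

    constexpr void append(std::string_view piece) {
        for (char c : piece)
            if (length < text.size()) text[length++] = c;  // sketch: truncate on overflow
    }
    constexpr std::string_view view() const { return {text.data(), length}; }
};

template <typename... Pieces>
constexpr Mnemonic make_mnemonic(Pieces... pieces) {
    Mnemonic result{};
    (result.append(pieces), ...);
    return result;
}

// Toy CPU state: eight 8-bit registers and a tiny memory.
enum Reg8 : std::size_t { RegB, RegC, RegD, RegE, RegH, RegL, RegSPH, RegSPL };
struct Cpu {
    std::array<std::uint8_t, 8> regs{};
    std::array<std::uint8_t, 16> memory{};
    std::size_t pc = 0;
    constexpr std::uint8_t fetch() { return memory[pc++ % memory.size()]; }
};

struct Nop {
    static constexpr Mnemonic mnemonic = make_mnemonic("nop");
    static constexpr void execute(Cpu&) {}  // nothing to do
};

// Lookup tables indexed by the compile-time value P.
constexpr std::array<std::string_view, 4> pair_name{"bc", "de", "hl", "sp"};
constexpr std::array<Reg8, 4> high_half{RegB, RegD, RegH, RegSPH};
constexpr std::array<Reg8, 4> low_half{RegC, RegE, RegL, RegSPL};

template <std::size_t P>
struct Load16 {
    static constexpr Mnemonic mnemonic = make_mnemonic("ld ", pair_name[P], ",nn");
    static constexpr void execute(Cpu& cpu) {
        cpu.regs[low_half[P]] = cpu.fetch();   // low byte comes first in memory
        cpu.regs[high_half[P]] = cpu.fetch();  // then the high byte
    }
};

static_assert(Nop::mnemonic.view() == "nop");
static_assert(Load16<2>::mnemonic.view() == "ld hl,nn");
```

Since `P` is a template parameter, both table lookups in `execute` resolve at compile time, which is the property the talk relies on for clean code generation.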
Obviously, NOP is again the easiest
instruction to implement. So, we do
nothing. Fantastic. What about a 16-bit
load? So, this one is a 16-bit load
into one of the four possible 16-bit
registers. And so, we need to get the
mnemonic. And the mnemonic can in
fact use std::string. So, the "LD"s there is
a std::string of a load. We're
going to add in the name of whatever the
P register is. And then we're going to
add in this comma NN, which is just
the placeholder for the number we're
going to be reading in. Um, and then the
result of that is a std::string, which
luckily we can copy into our mnemonic,
and then the std::string dies before
the end of the compilation. So we're
good on that front. We haven't let a
std::string live beyond compile
time. So far so good. Okay, I can do
some limited amount of string processing
at compile time. Who knew? And then the
execution. Well, again, we can look up
these things in a table. And this is all
constexpr. P is a compile-time constant.
These two tables are constexpr tables
of things. And we're just going to
read the two bytes that correspond to
the 16-bit value and poke them into
the relevant halves of the 16-bit
register. And critically, all of these
are known at compile time. So the
compiler will generate the exact right
code. There's no further indirection
inside the compiler. Whereas with my
previous approach with all of those operands
and things, although the
optimizer, if it really tried hard, could
sometimes constant propagate and work
out that I was actually reading a
particular register, it
would often
not. Okay, so how do we select which of
these classes is right for a particular
opcode? Right, I promised some C++
type things, right? Some new C++y things,
new to me, probably not new to most of
you. Um, what I'm going to do is I'm
going to have a templated variable,
which is something I didn't know you
could do. You can template on a
variable, and I'm going to specialize it
by instruction category. And that means
I can actually have a variable that is
different, has a different type
depending on the parameters that are
passed into its template, which blew my mind.
What do I mean? Well, let me talk about
this, right? First of all, we're going
to have a base case. This is a missing
instruction which says I haven't covered
one of the cases and I have to
absolutely have to have all the
instructions covered otherwise it's
going to blow up. And I've used a new
feature where I say you can't
construct one of these things, and if you
do try and construct it, the constructor
is deleted, but it prints out this cool
message, which I learned again from a
recent C++ Weekly, which said that you
could do this. This is
apparently then um here is our base case
for all opcodes. This variable called
instruction is an instance of the
missing instruction. So if you try to
instantiate this right now, you'd get a
compile time error saying, "Hey, I don't
know what to do. It's a missing
instruction." And now I can phrase and
specialize each of the individual
instruction types using requires, which
is a cool feature I've discovered.
Again, no surprise to most of you in
this room, but it was new, relatively
new to me, that we can say, hey, if x is
zero and y is zero and zed is zero of
this opcode, which is, you know,
just a non-type template parameter,
this opcode thing that has the x, the
y, and the zed, then if you try and
instantiate an instruction with that,
you get a NOP. Fabulous. Similarly, if
you've got x = 0, z = 1, q = 0, it's one of
those 16-bit loads. It is a 16-bit
load which is itself templatized on the
p value. And this format is exactly
that list of instructions I had on that
website that said this is how to decode
the instructions. I can literally paste
the text from the
website and put it in the requires
clause and say, hey, this is how you
decode this kind of instruction. So that was
awesome. So now we're going to work out
how to dispatch based on a dynamic value
to this compile time construction. So
what I really want to do again is have
this table and dispatch to it. How do I
write this table? And again this is
where my brain was blown because, um,
this is how we do it, apparently, and this
again sort of raises my kind of
"oh my gosh, I'm just an assembly
programmer" hackles because this looks
very complicated, but we're going to make
a parameter pack and we're going to
expand it and parameter pack expansion
has magical properties that lets me do
some clever things and let me show you
what I mean. So first of all, how do we
make a parameter pack of all of the
numbers between 0 and 255 inclusive?
Well, obviously we call std::make_index_sequence,
right? That returns a bespoke
type which is essentially a tuple of
zero and then one and then two and
then three and then four and five. So
when you get compiler error messages for
this, incidentally, it's quite a long
name as it's just got all the numbers
from. So thankfully there's only 256.
I'm ignoring those extended instructions
for this just for the purposes of making
it simpler. And then we pass an instance
of this class immediately to an
immediately invoked lambda,
and we
don't care about the value of it because
it's not the value that's important.
It's what's in that template list which
we can unpack into a parameter pack or
pack rather into a parameter pack. So
this has the side effect of turning num
into a parameter pack of 0 comma 1 comma
2 comma 3 comma and so on.
Then we create an array of lambdas. Each
lambda is an instantiation of this code
with a different op code number. Hooray.
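The pack-expansion trick can be sketched as follows. The `Cpu` struct and the `execute<Op>` stand-in are assumptions made so the example is self-contained; in the talk each entry would run the instruction class selected for that opcode.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

struct Cpu {
    std::uint8_t last_opcode = 0;
    unsigned executed = 0;
};

// Stand-in for "run the instruction class chosen for opcode Op".
template <std::size_t Op>
constexpr void execute(Cpu& cpu) {
    cpu.last_opcode = static_cast<std::uint8_t>(Op);
    ++cpu.executed;
}

using Handler = void (*)(Cpu&);

// The index_sequence argument carries 0..255 in its type. The value is ignored;
// deducing Num... from it gives a pack that is expanded into 256 lambdas,
// each converted to a plain function pointer.
constexpr std::array<Handler, 256> dispatch_table =
    []<std::size_t... Num>(std::index_sequence<Num...>) {
        return std::array<Handler, 256>{
            [](Cpu& cpu) { execute<Num>(cpu); }...};
    }(std::make_index_sequence<256>{});

// Runtime dispatch: read a byte, jump through the table.
constexpr void step(Cpu& cpu, std::uint8_t opcode) { dispatch_table[opcode](cpu); }

static_assert(dispatch_table.size() == 256);
```

The lambdas need no captures because `Num` is a template parameter, not a variable, so each one converts cleanly to a function pointer.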
Finally, we found a way to do this. I
really wanted to use that nice
constexpr. Just write regular imperative
code like, you know, normal people would
write. But no, you can't do
metaprogramming in that way. And so I just
have to accept it. Fine. This is what
this is about, right? I'm an old dog.
I've got to learn some tricks. Here's
one. And then, you know, you just make a
std::array of those and we're done. So,
it wasn't quite as bad as I thought it
was going to be. It's sort of relatively
isolated to this area. And then coming
soon, those of you who are on even newer
pieces of code, this didn't work on
clang 20, which is what I was using. Um,
you can actually do this syntax to
unpack an index sequence directly into a
num, although there's some interesting
caveats with that. So, does it work?
Well, yes, of course it does. Yes.
Hooray. Um, this one, anyone?
What was it?
Oh, no, no, no, no. Yeah. Daley
Thompson's Decathlon. Coincidentally,
a few weeks ago I
was in a friend's museum. I live in
America now, and he's got a computer
museum in America, and I found a copy of
Daley Thompson's Decathlon in a drawer
somewhere. I don't know how he's
managed to get hold of it. Okay, it
works. Hooray. Let's look at the code
generation which is really what I'm here
for. Right. This is beautiful. I know
you don't appreciate it perhaps in the
same way that I do but this is how to
dispatch one instruction. It's exactly
what you'd want. In fact, it's so small,
I was able to annotate every single line
of it. I'm not going to go through it
here, but it just does the minimum
amount of work needed to work out which
thing to dispatch to and then jumps to
it at the end.
Fabulous. Let's look at one of those op
codes to see that there's nothing
extraneous going on inside of it here.
So, this is exchange AF with AF prime.
So, it's just switching two registers
and it just reads the two and writes the
two out again. No further indirection,
no table lookups, no nothing. This is gorgeous.
How long's the Oh, sorry. I actually
deliberately Yeah, thank you. Jason
asked how long the symbol name was. And
there's a reason why I deliberately left
overflow on. Yeah, this is one of those
things. Oh, hang on. If I can get my
mouse to do the thing. Oh, I'm scrolling
on the wrong window because I'm looking
at my notes. Somewhere in here is a
mouse pointer that can come over. Ah, we
are. Okay, I'm a professional. Yeah, now
I'm going to point something out here.
This is cool, right? So embedded in that
is like a mnemonic, one of those
classes that is a compile-time class
that just holds strings. And as of
Clang 20, but not Clang 19, it kind of
makes the guess that if you have an
array, a std::array of chars, it should
print them out as that, which was super
cool because otherwise it was 16 numbers
one after another, which obviously was
not very helpful, or 10. Anyway, thank
you for the distraction. That's cool.
Look, it keeps going.
Yeah. And there's the lambda. Hey,
right. Okay.
So that's exchange AF with AF prime. 16-bit
adds are a bit more complicated. Um, but
you know like this is you know we look
it up in the table. Here's a little bit
of code. Um it's it devolves to this.
Just read the two registers, add them
together, write them back out again.
Account for seven cycles. I haven't
really talked about time passing, but
that's an important thing to measure as
well. That's cool. Um I'm going to
ignore the flag handling code because
you saw how hideous the uh the C++ code
is. Nothing to see here, but it is
there. But that was the same in the old
implementation, and I have some ideas
about improving it. So let's get on to
build times because this is the thing
that's close to my heart. If I touch the
CPP file or one of the CPP files in this
half of the project, takes me three or
four seconds. I'm happy enough with
that. But realistically speaking, all of
the actual guts of the code is in a
header file now because it needs to be
included in a couple of places. One for
the disassembler, one for the um the
actual running part of the code, and for
the tests. And now suddenly my build has
jumped up to a very variable 11 to sort
of 20 seconds and it really depended on
what else was going on on my computer,
because each of the compiles was
really big and weighty, required all
those massive instantiations of
templates, and so took up a lot of RAM.
So I noticed some swapping. Again this
is all measured on my computer so it's
not particularly representative but it's
representative of my flow. So not ideal
but also not really as bad as I thought
it was going to be and I'm sure there
are tricks to make it go down. So
maybe I should just not
keep banging on about build times. I
don't know. Performance 200 times faster
than the Spectrum. Twice as fast as the
last one. It's fabulous. A 650 MHz
computer. That's pretty good actually,
given that, you know, this is only
a, um, one and a half GHz
machine. That's not bad.
Not a bad clip. And in fact I can get it
up to about a gigahertz if I slightly
cheat. Cool. Well, okay. So, we've got an
experimental version two that we can
play with. And the first thing I want to
do is say, can I use modules because
I've heard they're a bit flaky,
but they might be the solution to all my
build woes and my build-time woes.
Right. There's some laughs in here for
the for the recording. Yeah. Um, and
also, anyone know what this game is?
It's not Knight Lore, but it is similar to
Knight Lore. Going "Head Over Heels"? No,
again, it's a similar engine. It's
World. Oh, okay. Hey, I have found
something so obscure no one else knows
what it is. That's fantastic. All right.
Um, coincidentally, I was asking Claude
for suggestions for which Spectrum game
best represented each section, and it
picked this
one, because of the sort of
modular look, I suppose, of the
isometric game. All right, so my hopes
here, cleaner separation because that's
what modules are really about, right? Is
to get rid of the sort of the
pre-processor and all the hell that
comes with it. And then obviously my
secret hope is a faster build. John
Lakos is in the front row here, and he
knows: early on, I started
a business on the basis of his book,
Large-Scale C++ Software Design, because it brings
down build times generally and that was
fantastic and ever since then I want my
build times to be low as possible and
I'm hoping modules will do this for me
but we'll see.
Okay, so this is what a module looks
like. So, we start with a big line that
says module. And the cool
thing about this is that you can't put
any preprocessor nonsense before the
word module or it doesn't work. So,
unfortunately, I can't conditionally
make something a module or not. So, I
have to either be fully module or not
module or do something else. And I chose
to do something else which I regret, but
we'll come back to
that. There's a lot of regret
in this talk. Okay, we sort of say what
we're exporting and you can export a
module. You can also export a partition
of a module. And for what it's worth, I
am not trying to represent myself as an
expert at this at all. This is me trying
to get it to work and then randomly
posting on Blue Sky and getting people
to help me until it worked. Um,
importing is so much nicer. I can just
import all of
std. No more remembering whether it's
algorithm or is it numeric or
is it, you know, whatever, ranges. Um,
you can also import local partitions
within your own module and say, "Hey,
there's another part of my little
localized area that is supporting the
mnemonic stuff. I'm going to import
that here as well." It's sort of the
equivalent of like including a local
header. And then you just put the magic
keyword export in front of things that
you want to be visible to the outside
world. Fabulous. And then the rest of
the code, you can kind of write it like
it's Java. I hate to say that at this
conference, but no, I don't think
anyone's done a talk on Java, but
anyway. Um, it's just you write it out
in the head of and I suppose again to
sort of explain myself here, most of you
all probably write stuff in headers
because you know you like header only
library type things. I abhor them,
unfortunately, and I try and hide my
terrible awful code into the CPP file so
compiler. But this was kind of nice
because I didn't have to keep making the
decision: is this something that should
go in the header or is it something that
should go in the CPP? It's just all in
there in a big blob. Brilliant.
Build support was surprisingly good. I
have a contemporary CMake and I had, you
know, Clang 19 and then I moved to Clang
20 during the middle of this project.
You just say, "Hey, this set of files is
CXX modules," and then I chose to call
them all CPPM files for C++ modules.
That seems to work. Um, the files
are there and I've got this module.cppm,
which is where I kind of import
the local parts and then re-export the
whole module as a whole to the outside
world, but with only the bits that I
want the outside world to see, outside
world for this component that is, right?
Hooray. And now unfortunately uh the the
bright and hopeful voice goes away and
we start talking about what actually
happened rather than what should have
happened. Import std didn't work. It
was lovely. There was this, um, ridiculous
SHA-256 value, some
mystical value I had to set inside CMake
to make import std work. And it did, in its
defense. It did work inasmuch as I could
do import std. But if I tried to use
any other library that itself pulled in
standard headers, I would get not the
clashes you would expect of like, hey,
strlen is defined in two places, but some
unusual ABI tags would get noted as
being different. Like it said, you know,
underscore V2 ABI tag is duplicate
defined. I'm like, I'm thinking ODR
panic. Oh my gosh, this is going to go
horribly wrong. And I asked around and
the received wisdom was, oh yeah, they
need to fix that. And I'm like, ah, I
also noticed that,
um, because I have link-time
optimization turned on for my release builds,
because I do tend to hide all of my code
away from the compiler in my CPP files,
but I do like inlining so I'm kind of
torn, um, there was a weird thing
where the import std and LTO just
fought each other and so every time I
was building a release build, it would
just rebuild everything from scratch
every time. I'm like just no, don't do
that. Don't do that. So, I couldn't get
it to work and I eventually turned it
off. And then I was kind of telling a
fib earlier. Um, this format over here
is not actually what I went with. What I
did is I had line 12 be hash include the
CPP file and then I could keep both the
CPP file and the CPPM file and have a
dual build. But then there were all sorts
of other horrible #ifdefs I had
to do to stop me from including... it
wasn't the best choice. So in
fact after I wrote all these slides I
went away and made a branch and I did
modules kind of quote properly and I
still had problems, and all the
things you're about to see in the next
slide happened in that branch.
So although if you go to look at the
project now, it's on GitHub, it has that
horrible #include hack, this is
what I saw when I did it, quote, properly.
And that is the surprise to me, right? If
you have two components that share a
header file inside your own project then
if you make a change to that header file
the two files that include it will build
in parallel right they can both include
the header file they can both go off and
do their work it means they're probably
doing the same work twice because they
both have to parse that header file and
do some work with it, right? That's a
pain, but at least I can parallelize
it. If you're using modules, and again,
hopefully someone can correct me on
this, but if you're using modules, I
can't even import the other module until
it has been built, which means that it
runs and it compiles and then it then
and only then can I import from it and
see which symbols it actually has in it
and then use them in my compile. And so
what I found was, if I dump out the
build graph, as in like what happens if
I touch this file, in the
non-module case it was like clusters of
like these four files build together
then we link them and then these four
files build together separate somewhere
else and then we link them and then they
use each other's headers and then those
get linked together you know like the
usual pattern you're expecting the
header files are like the sort of like
the layer of separation. With modules, what it meant
is essentially everything was linear.
It's like, oh, this project needs all
these things and all of them need to
build individually first before they can
they can include each other and then
they get linked together and now finally
we can start using things that include
the headers. Does that make sense?
Sorry, I I realized I went a bit gabbly
there, but it serializes the build and
maybe I'm doing it wrong. And build
times went up to 30 seconds. That was
not what I was
promised. You know, for reference,
remember it was 10 seconds before and
now it's like 40 seconds. That's no
good. and a clean build which I didn't
show on the other things is like 2
minutes versus 1 minute. It was a
disappointment. So overall I loved the
modularity. I got so many, not ODR, um,
like, flyby includes
discovered. So I was using, you know,
size_t instead of std::size_t in a
bunch of places. I was using all sorts
of headers. It was lovely for just
getting that sense of I'm definitely
importing the right things and I'm using
them from the right namespaces. Lovely.
Love it. Brilliant. Probably more
experience is needed to be able to do
this properly and certainly the
linearized build. Maybe I'm just doing
it wrong. It's quite likely. Tooling
would help. A lot of the tools didn't
work all that well with modules. Um
clang-tidy, for example, seems to not like
it. Um we need some best practices.
Luckily, I think there are people in
this room who might be able to write
some once things are settled down. But I
do think it's close, right? The fact
that I was expecting segfaulting compilers or
it just not working at all, and I got it to
work pretty quickly and it mostly did
what it said. It just needs to be
better. And I think that's a great
place. So let's talk about
coroutines. I wish to heck that I had been
to this conference before I
prepared this talk rather than
afterwards, because there were three
separate talks on coroutines two days
ago and I would have done a much better
job of it if I'd have been to those
talks first. So take this with a pinch
of salt and either speak to Phil or one
of the other presenters about
coroutines, because I think I'm doing it
wrong. In fact, I know I'm doing it
wrong now. But you can learn something
from my mistakes. That's kind of why
you're here, right? That's Head Over
Heels, by the way. Yes, it says
"sorry" in big
letters. So why would I want to use
coroutines in an emulator? What on earth
is the point of that? Right?
Coroutines are for like concurrency or IO
or threading, something something. Well,
first of all, they're not just for
threading. They're for things that are
running at the same time. And if you're
emulating something accurately, you're
not just emulating the Z80. I know I
showed a picture earlier where I said I
run the Z80 for a 50th of a second and
then I put the picture on the screen.
That isn't how it works really, because
video game developers (I'm going to stare
at the video game developer again) like
to play tricks like, hey, if I change
this color while the picture's being
physically painted, I can make it look
like there are more colors available
than there are on the real computer. Oh,
so if I emulate them separately, I won't
see these effects that you get when game
programmers use cool tricks.
So you actually want to do something
like this, where here, when I'm
reading memory, it takes three cycles,
so pass time. When I did that add, the
one 16-bit op, it takes seven cycles
to do an add. And then what does the Z80
pass time look like? It looks like this:
hey, just account for seven cycles and
then tell all of the bits of hardware, do
whatever you do in seven cycles, please.
So the ULA, the video chip, the audio
chip, the tape system when you're
emulating a tape, you know, do some
work. Do seven cycles' worth of work.
That's easy if all you are is a
counter: you just bump your number up by
that. But if you're a sort of stateful
system, I don't know, like a video chip,
then what you end up writing is sort of
a horrible state machine where you have
to kind of manually say, well, if I'm in
the bit where we're at the top border,
then I need to write out some things.
But if we get beyond some point, if
this many cycles have passed, now
I'm moving to the next state, which is
now I'm doing the left-hand border,
until eventually I do the bit where the
screen is, and it's just a pain. It's a
pain.
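To make the "horrible state machine" concrete, here is a rough sketch of the hand-written approach. It is not the speaker's code: the class names, the tiny cycle counts, and the exact pass_time signature are all assumptions chosen to keep the example short. It contrasts a pure counter, which only bumps a number, with a stateful video chip that has to remember where it is between calls.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, heavily simplified video chip: a "frame" is just a run of
// border cycles followed by a run of screen cycles (real ULA timings differ).
class Video {
public:
    static constexpr int kBorderCycles = 4;  // assumed, for illustration only
    static constexpr int kScreenCycles = 6;  // assumed, for illustration only

    std::uint8_t border_colour = 1;
    std::vector<std::uint8_t> output;  // one entry per emulated cycle

    // Called by the CPU after every operation: "do `cycles` worth of work".
    void pass_time(int cycles) {
        assert(cycles >= 0);
        while (cycles-- > 0) {
            switch (state_) {
            case State::Border:
                output.push_back(border_colour);
                if (++cycles_in_state_ == kBorderCycles) enter(State::Screen);
                break;
            case State::Screen:
                output.push_back(0);  // stand-in for a real screen pixel
                if (++cycles_in_state_ == kScreenCycles) enter(State::Border);
                break;
            }
        }
    }

private:
    enum class State { Border, Screen };
    void enter(State next) {
        state_ = next;
        cycles_in_state_ = 0;
    }

    State state_ = State::Border;
    int cycles_in_state_ = 0;
};

// A pure counter, by contrast, only has to bump a number.
struct CycleCounter {
    long total = 0;
    void pass_time(int cycles) { total += cycles; }
};
```

Even with two states, the position within the frame has to be tracked by hand. Every extra region of the display adds another case and another transition.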
And realistically the Z80 should work
this way too. It's just that I'm using
the Z80 as the thing that primarily
drives the rest of the system. You could
imagine, if I had two Z80s, if this was a
multiprocessing system, which Z80 is
in charge and which Z80 is told, can you
just run one cycle? How do you run one
cycle of a seven-cycle add? You have to
write everything as a giant state
machine. Right? It would be really
awkward. But that's what coroutines
are for, right?
How much nicer if I could just say my
video chip, like the physical video
chip, runs all the time, right? As soon
as it's got power, that thing's doing
what it does. And what it does is it
starts at the top of the screen and
every cycle, for some number of cycles,
it waits for the next clock cycle,
right? Rising edge of the clock cycle,
and it writes out one pixel of the
border color. Whatever the border color
is currently, I'm going to write it out.
And then I'm going to go around in my
loop. And then for the next 192 lines,
the whole 192 lines of the screen (yes,
the whole screen is, um, 256 by 192,
by the way, I forgot to say, which is
about the same as a capital W on an
iPhone screen right
now, how far we've
come). I was actually speaking to someone
in the pub and they were, "Oh, I was
doing a presentation on emulators and I
kept trying to find a high
resolution screenshot to put in my
presentation and then I realized, oh no,
they were all that size." You know,
like, yeah, that is actually the right
size. Um, so you know, you can write
it this way and we just want to
wait for the right amount of time and
let the other things in the system do
their thing until it's time to come back
to me, and then let the compiler
do all the hard work for me. And so
coroutines, and again, no expert, obviously
no expert, I think by now you've worked
out my MO. Um, any routine
that has the co_ statement, any of the
co_ statements, co_await, co_yield,
co_return, gets magically transformed
through a very arcane and complicated
but very powerful process. Um, we're not
going to go into all these details; go
watch one of the talks from this very
conference on it. But I like to think
of it as, it's a bit like a lambda. You
know how when you make a
lambda, you know that behind the scenes
some structure is being made with
a call operator, and any of the
captures get copied into that lambda and
held as variables, and then you
just call the
call operator and magic happens there.
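The lambda half of that analogy can be spelled out. Below is a small sketch with hypothetical names (make_adder_lambda, AdderClosure): a lambda with one capture, next to roughly the structure the compiler builds for it. The real closure type is unnamed and compiler-generated, so the struct is only an approximation.

```cpp
#include <cassert>

// What you write: a lambda capturing `base` by value.
inline auto make_adder_lambda(int base) {
    return [base](int x) { return base + x; };
}

// Roughly what is made behind the scenes: each capture becomes a member
// variable and the lambda body becomes the call operator.
struct AdderClosure {
    int base;  // the captured copy
    int operator()(int x) const { return base + x; }
};

inline AdderClosure make_adder_struct(int base) { return AdderClosure{base}; }

// Both forms are called the same way and give the same answer.
inline void check_closures_agree(int base, int x) {
    assert(make_adder_lambda(base)(x) == make_adder_struct(base)(x));
}
```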
Right? This is like that on steroids,
where what happens is, for each time a
co_ magical operation happens,
the code between that and either
the top of the function or the previous
co_ operation is kind of broken into its
own sub-function, and then we hold the
state in the lambda state. We have all
of the local variables that are shared
between everything, and then we have like
a pointer to which function is the
function we would resume at if we were
to come back to this coroutine, and we
start off with some initial state,
although that can be configured.
Everything can be configured; that's what
makes them awesome, awesome. Um, yes, that
was a Freudian, uh, slip, should
we say. Let me get my... um, and so like the
init will do like everything from the
top of it and sort of set all the
variables to their initial values and
then say, let's go to step one, and then
it returns some magical thing that you
can control when it will get scheduled
again, when it'll get resumed. Step one
looks like this. This is the top of that
bit where the border was happening. We
kind of wait that many cycles and then
when we reach the right point, we set
the Y value to zero, which is now the Y
at the top of the screen. We go to step...
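The "lambda on steroids" picture can be mimicked by hand. The sketch below is an assumption-laden illustration, not what any compiler literally emits and not the code on the slide. It uses one state object holding the shared local variables plus a pointer to the sub-function to run on the next resume. The step names and the tiny cycle counts are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hand-written stand-in for the state object a coroutine transformation
// produces for a two-phase video loop. All names here are hypothetical.
struct VideoFrameState {
    using Step = void (*)(VideoFrameState&);

    static constexpr int kBorderCycles = 4;  // assumed, for illustration only
    static constexpr int kScreenLines = 2;   // assumed, for illustration only

    // "Local variables" that must survive between suspensions.
    int cycle = 0;
    int y = 0;
    std::uint8_t border_colour = 1;
    std::vector<std::uint8_t> output;

    // Which chunk of the original function runs on the next resume.
    Step resume_point = &VideoFrameState::step_border;

    void resume() {
        assert(resume_point != nullptr);
        resume_point(*this);
    }

    // Step one: the top border. After enough cycles, set y to the top of
    // the screen and move on to the next step.
    static void step_border(VideoFrameState& s) {
        s.output.push_back(s.border_colour);
        if (++s.cycle == kBorderCycles) {
            s.y = 0;
            s.resume_point = &VideoFrameState::step_screen;
        }
    }

    // Step two: the screen lines, then back around to the border.
    static void step_screen(VideoFrameState& s) {
        s.output.push_back(0);  // stand-in for a real screen pixel
        if (++s.y == kScreenLines) {
            s.cycle = 0;
            s.resume_point = &VideoFrameState::step_border;
        }
    }
};
```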
I think you get the idea, right? Some
magic has happened behind the scenes and,
um, it's written the state
machine for us. That's what I want from
it. And that works beautifully. It works
great. Um, there is a little bit of, when
we first call the function (where are
we? Right at the beginning here), when
you call this video run and it spots the
co_await, um, some machinery comes in and
all that state that we were looking at,
that state object, gets allocated
somewhere. So there's a bit of an
allocation somewhere. Sometimes the
compiler can elide them, but in the case
of the situations that I'm using
here it won't be able to, because they're
very, very long-lived. So that was a pain.
Oh, get out of there. The Z80 now would
look something like this. You know, this
is what the main loop looks like. I
didn't show this before, but you know, we
do, uh, an op... Oops. Hang on.
Haven't done the things. Um, so what
we're going to do is we're going to say
like, reading from memory takes some
time. So I have to co_await it. And that
will effectively deschedule the Z80 at
this point. Let the video chip do three
cycles of drawing pixels. And then it'll
come back to me and go, "Hey, three
cycles have passed. Read the memory
now." And then I need to dispatch off to
my opcode. And this is where the
problem
starts, right? Because this function
here, instructions table opcode, also
will need to read memory and let time
pass. So it kind of needs to be a coroutine
as well.
But I don't want to create an allocation
here to do that. And it can't live on
the stack because I know I'm going to be
yielding to god knows how many other
processes. So there's going to be an
allocation here. And so what I really
want to be able to do is, you know, in
our execute routine, I need to be able
to do this co_await again. And I
couldn't work out how to do that without
causing an allocation pretty much every
cycle of the machine, which is, you
know, a hundred thousand, no, whatever, 100
megahertz, yeah,
million times a second, which doesn't
sound something that's very feasible.
Now, I could probably do something
clever with allocators, but there's a
lot of stuff going on there. And so,
this was a bit of a shame. And I
basically gave up at this point, I'll be
honest. And during this last week,
every evening, I've been trying to go
back to it, having like channeled what
I've learned from Phil and co. And I
haven't been able to resurrect this and
get it in the way that I want it to
work, right, and the way it should work.
And I actually cheated and I asked on a
forum, um, somebody who had a C++
coroutine emulator of the Game Boy, and
they sent me the link to their GitHub
repository and I'm like, oh, how are you
doing this? It looks like it's doing
what I'm doing. And every function call
that it was making, that was, you know,
like, hey, handle read, handle add, handle
whatever, was a macro that was
#defined, and so effectively it was
just one big coroutine. The whole loop,
every single function call, wasn't a
function call; it was an inlined macro that
then meant that like everything was
living in the same coroutine. And that
was kind of, oh, I don't, you know, I don't
want to take on coroutines but have to
use macros. That seems like the wrong
choice there. So I'm going to have to
come back to this, right? Coroutines are
awesome. There is a learning curve. Maybe
they're too complex. I don't know that
they're too... they're about as complex as
they need to be. Obviously there are
libraries on top of this that you could
probably use. I've got a lot to learn.
Okay, we're doing pretty badly on
time. I'm going to have to speed up a
little bit. So, future directions. Where
can I go with this? So, I've
done coroutines. We did
concepts and constexpr, and I've showed
you how terrible a programmer I am and
how I shouldn't be allowed to write code
anymore. What would be really lovely is
to be able to like take
games and just play them wherever I am,
right? Oops. Let me go back here.
Whoops. Oh, I've given the game away
now.
Bollocks. There we go. And so I'd like
to be able to play my game in my slides.
I don't want to have to run a program
like I was running before. I want to be
able to play Jetpac in a web browser
inside, because you know, all my other
emulators are in a web browser. So I
thought to myself, how could I possibly...
Oh, Cisco. Good. I love this game. This
is actually probably a... All
right. This is actually what I was
hoping to be able to do. This is my
ultra stretch goal. And yes, someone
from Rare is here. So yeah, please don't
get me in trouble with this one. You can
take the picture, but just please
don't come after me. Um, I have got the
cassette tape somewhere, uh, I promise,
for the purposes of the legal people.
Um, so yes, I wanted to get it working
in the web browser and I obviously
have, which is great. And it was a lot
easier than I thought it was going to
be. WebAssembly isn't that hard,
especially when you have a very
controlled environment where effectively
the entire emulator is an exercise in a
64K buffer making changes to itself over
and over again. Right? That's all the
CPU is doing to a buffer of numbers,
right? So there's not really many things
from the outside world that need to come
in. I just need to be able to take the
picture out occasionally and put the
keyboard presses in. So you have to grab
something called WASI, which is the
WebAssembly System Interface. Um, it's kind
of like the operating system that you're
cross-compiling to, because you can't
compile for a regular, um, C++ library
there. No such thing exists for
JavaScript, which is what we're going to
be embedding in. Um, and then you need
the component on the JavaScript side
that pretends to be like the operating
system and it says, hey, this big area
of memory, I can malloc and free it to you
by handing different amounts of
this RAM to you. And if you need to be
able to read and write files, then you
call these functions and it goes out
into JavaScript world and JavaScript
does whatever JavaScript do. And then
the build settings are as simple as
this. I just said target wasm32-wasi and I
pointed at where I got the, um,
sysroot for the cross compilation to, and
it just worked, by and large. There are a
couple of caveats which you can talk to
me about afterwards. There's some wiring you
have to do. Everything's C. So there was
one of the interop, uh,
conversations we went to earlier. We
talked about this. Of course
everything's C. There's some magical
clang stuff that you put in to sort of
say how to export things. Um, everything
is a number on the remote side, because
all you're really doing is looking for
the window of a computer's RAM, like this
virtual machine that your WebAssembly
runs in. So everything's virtual
machines and RAM blocks. Um, you
know, there's some magic in here on the
JavaScript side. You don't need to worry
about this. Um, I wrapped it in a
JavaScript class. So I think this is the
only JavaScript that's been shown at
this conference, maybe. Oh, no. There was...
we had, um, yeah, we had one.
Yeah, exactly. Um, and so all we're
really doing is just doing a little bit
of very light interoperability between
the things, where we're sort of
passing the this pointer in, uh, to a
function. And one of the things we can
do is we can form an array over a
subsection of the RAM. So I can call a
function in the C++ code that fills in a
std::vector with all of the RGB values
and then I can map that on the JavaScript
side as a Uint8Array and then pass it to
the blit function to draw it to the
screen. So that's how we can draw it to
the
screen.
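A rough sketch of what that C-level boundary might look like on the C++ side follows. Everything in it is an assumption for illustration: the Emulator class, the function names, and the SPECTRUM_EXPORT macro are invented, and the talk does not say exactly which clang mechanism was used. The sketch uses clang's export_name attribute, which applies to WebAssembly targets, and guards it so the same file still compiles natively.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Built for the browser with something like "target wasm32-wasi" plus a
// sysroot, as described in the talk. On other targets the macro is empty.
#if defined(__wasm__)
#define SPECTRUM_EXPORT(name) __attribute__((export_name(#name)))
#else
#define SPECTRUM_EXPORT(name)
#endif

// Hypothetical stand-in for the emulator: it only owns an RGBA frame buffer.
class Emulator {
public:
    static constexpr std::size_t kWidth = 256;
    static constexpr std::size_t kHeight = 192;

    Emulator() : rgba_(kWidth * kHeight * 4, 0) {}

    // Fill the frame with a flat grey shade (a real emulator would draw here).
    void render(std::uint8_t shade) {
        for (std::size_t i = 0; i < rgba_.size(); i += 4) {
            rgba_[i] = rgba_[i + 1] = rgba_[i + 2] = shade;
            rgba_[i + 3] = 255;  // opaque alpha
        }
    }

    const std::vector<std::uint8_t>& frame() const { return rgba_; }

private:
    std::vector<std::uint8_t> rgba_;
};

// Everything crossing the boundary is C: the "this" pointer is passed in
// explicitly, and pointers are just numbers (offsets into linear memory)
// as far as the JavaScript side is concerned.
extern "C" {

SPECTRUM_EXPORT(emulator_create)
Emulator* emulator_create() { return new Emulator; }

SPECTRUM_EXPORT(emulator_destroy)
void emulator_destroy(Emulator* emulator) { delete emulator; }

SPECTRUM_EXPORT(emulator_render)
void emulator_render(Emulator* emulator, std::uint8_t shade) {
    assert(emulator != nullptr);
    emulator->render(shade);
}

SPECTRUM_EXPORT(emulator_frame_ptr)
const std::uint8_t* emulator_frame_ptr(const Emulator* emulator) {
    return emulator->frame().data();
}

SPECTRUM_EXPORT(emulator_frame_size)
std::size_t emulator_frame_size(const Emulator* emulator) {
    return emulator->frame().size();
}

}  // extern "C"
```

On the JavaScript side, the pointer and size returned by the last two functions would be used to construct a Uint8Array view over the module's memory buffer, which can then be handed to whatever draws to the screen.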
And when it crashes you get
assembly in your web browser. How cool
is that? I couldn't not have a slide
with assembly on it from a web browser.
But this is not mine. This is built into
Chrome, right? So this is what you get.
It's a really weird thing.
It's a lot of fun. It runs at about 100
megahertz, so it's not as fast as even
the native version. I was using the
v2 version for this. It is live at
spectrum.xania.org if you want to go and
play around with it. And clearly C++ is
the future of web
development. All right, the real future
directions: the performance can be made
better. That is what I love doing. That's
really my happy place,
making things go faster. There are some
really neat tricks involving computed
gotos in regular emulators, to do with
how, um, the branch predictor fits into
all this, which I definitely don't have
time to go into. I'd love to support
more of the Spectrum family, and moreover
I'd really want to get coroutines
working. I think there has to be a way,
and it seems so natural, and certainly it
would allow me to add more peripherals
more easily without writing horrible
state machines myself.
Super stretch goal would be to get
just-in-time compilation: turning the, uh, Z80
into Intel x86 and then just remembering
the Intel x86 and like just calling it
over and over again. Turns out to be
spectacularly hard because, um, back in
the 8-bit days almost everyone used
self-modifying code. So the code would
change all the time. I know I did. Um,
all right. These are the extra bonus
slides I had to put in because Daisy.
Um, so Daisy nerd-sniped me into
thinking, when she was showing how, uh, the
Claude command line tool was able to
like make changes to the Clang codebase.
So I thought, well, AI is not coming for
my
job. Um, so I said this to Claude.
This is a shortened version. "This is
part of a presentation I'm doing on
modern C++. I'd love to demonstrate
something cool you could do in my
codebase. A suggestion would be to add
tests. What would you suggest?" You know,
I thought I would give it a really
lowball idea about adding some tests to a
project. That was one of the things that
she said it would be good at,
and it was. So it went off and it
ruminated on my codebase and it asked
some questions and I went back and forth
and then it came back with this. And,
you know, I said "think harder", right? We
learned that that was the hack, magic
hack.
This is what Claude came up with: "Based on my analysis: real-time ray
tracing AI..." It's no tests. I'm not writing tests. Even the AI doesn't want
to write tests. "C++26 reflection-based game state debugger." That would be
cool. "But my top recommendation would be: interactive Spectrum memory heat
map visualizer." This is a précised version. It was a lot more involved than
that. It was so much fun, right? So I said, "All right, go on then." And it
bloody did. It just did.
And I'm going to very quickly try and show you what it did here. So I have
to... it put it behind a command-line flag, so it's not on all the time. And
also it wrote me a README. And I keep moving the mouse pointer the wrong way.
Let's go over here. So there's an overlay over the screen now, and the
hotness, as in how often a particular memory location is being either read or
written to, is superimposed over it. So the whole 64K is kind of superimposed.
You can see the red areas are where it is currently reading and writing, and
it even put in some keys so I can control it. And it gave me three different
color schemes, which I'm cycling through: there's spectrum, there's grayscale,
and then there's heat, which aren't actually showing particularly well here,
and I'm just making a pig's ear of demoing it. But you can definitely see some
things going on here. These red marks are, you know, maybe some important
variables that the thing has been using all the time. I don't know. It's a way
of kind of getting a sense of what's going on. Again, I got it to compile last
night and I thought, "Oh, sugar. Now I need to write some slides about this
and I don't really know what it's doing." But that might actually be a cool
thing to add. But, like, it did it all itself. And so I am mildly worried for
my job, but not that much. Who was going to stand up in front of you all and
tell you about it?
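The generated visualizer's code is not shown in the talk, so the following is only a sketch of how such a heat map might be built, under stated assumptions. It keeps one saturating counter per byte of the 64K address space, bumps it on every read or write, fades it once per frame, and maps the intensity to a color. The class `MemoryHeatMap`, the bump and decay amounts, and the two color ramps are all hypothetical, and only two of the three schemes mentioned (grayscale and heat) are sketched.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical per-address activity tracker for a 64K address space.
class MemoryHeatMap {
public:
    static constexpr std::size_t kSize = 65536;

    // The emulator's memory accessors would call these on each access.
    void record_read(std::uint16_t address) { bump(address); }
    void record_write(std::uint16_t address) { bump(address); }

    // Call once per frame so that old activity fades out (halves each time).
    void decay() {
        for (auto& heat : heat_) heat /= 2;
    }

    std::uint8_t intensity(std::uint16_t address) const { return heat_[address]; }

private:
    void bump(std::uint16_t address) {
        auto& heat = heat_[address];
        // Saturating add: an assumed step of 64 per access, capped at 255.
        heat = static_cast<std::uint8_t>(std::min<int>(255, heat + 64));
    }

    std::array<std::uint8_t, kSize> heat_{};  // zero-initialized
};

struct Rgb {
    std::uint8_t r, g, b;
};

enum class ColorScheme { Grayscale, Heat };

// Maps an intensity to an overlay color. The ramps are illustrative only.
constexpr Rgb to_color(std::uint8_t intensity, ColorScheme scheme) {
    if (scheme == ColorScheme::Grayscale) {
        return Rgb{intensity, intensity, intensity};
    }
    // Heat: red rises first, then green joins in so hot areas turn yellow.
    const int green = intensity > 128 ? (intensity - 128) * 2 : 0;
    return Rgb{intensity, static_cast<std::uint8_t>(green), 0};
}
```

An overlay renderer would then call `to_color(map.intensity(address), scheme)` for each address and blend the result over the emulated screen, calling `decay()` once per frame.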
Eh? All right. Let me go click. So, in conclusion, as I am very much over: is
there hope for me as an old dog? I like to think so. I like to think so. So,
what are my takeaways? What am I going to say to you? What am I going to
exhort you to do as you leave this room and go home? Well, the main thing that
I learned as a sort of 30-ish-year veteran of doing this is: I felt
frustrated. I felt annoyed. I thought it was stupid. I thought the standards
committee have got no idea what they're doing. Everything's too difficult. And
then I remembered: learning is uncomfortable. That's what learning is, right?
You forget. If you're a senior person like myself who's been doing this a long
time, a lot of things come easily and you kind of get used to that. You get in
the groove of, like, yeah, I've just got to knock out a class, I'm going to do
a test, everything's easy. And then something turns out really hard and you
think, "This is dumb." And you go, "No, learn how to do it. Spend some time
with it. Get frustrated. Come back to it a week later and go, hey, that wasn't
half as bad as I thought." You know, that's definitely what I've learned from
this, and it's been so valuable. It's the reset I needed. Challenge your
assumptions. Maybe 10 seconds of build time is perfectly reasonable, Matt.
Maybe. Hopefully you realize that I love this, right? Do things that bring you
joy. I mean, I'm so lucky to have a time like this to be able to do this kind
of thing. And I'm so lucky that I can write a program that is fun for me at
every level. Well, even when it was frustrating, even when it took me six
weeks instead of the three days I originally thought, because software
engineers just can't estimate time. And then, you know, even though Claude
won't do it for you, testing is worth it, right? One thing I didn't say is
that the V2 version that I did worked first time once all the tests passed. As
in, I had my suite of tests and it would tell me, "Oh, no, the ALU is broken."
I'll fix that. Okay, now the add isn't working right. Okay, that's the add.
Okay, now this instruction is working. Okay. Oh, all the tests pass. Cool. All
right, throw Manic Miner in it. Oh, and off we went.
It was just a joy. So, write tests, they're good. We know that, right? But
it's kind of nice to be reminded of that. So, let's quickly revisit my
preconceptions in the no time I have left. Build times: they did get worse.
Does it matter? Probably not. No. Maybe in bigger projects we need to think
about this. And I'm hoping that the people in this room will sort out the
modules problems and will continue to give good advice about how to properly
segment parts of your codebase so that the build times in one area don't
necessarily impact the other parts. Bad error messages: no, they're all great.
Compilers have gotten so much better. You know, this looks bad, but it's
telling me exactly which constraint wasn't working, and it's pointing at the
exact thing that isn't fitting that. And like we saw in the earlier part, even
the stupid symbol names, where I'm packing names into types: it's now showing
the name inside the type. So that was great. Now I'm showing it there, right?
Modules and coroutines: what do I think about them? Modules: 2025 might be the
year of the module. It might be. Coroutines are ready. Go and use them. If
they make sense for you, go and use them. There are plenty of good libraries
out there that I haven't shown, but they make coroutines tractable. Things
like just a generator, if you can use it, that just works. It's great. Are
they too complicated? Maybe. Or maybe that's just this old dog going back to
his old ways.
Tooling wasn't that great. This is the otherwise excellent CLion getting very
confused about all the things that I was doing when I kept switching backwards
and forwards between the module view and the non-module view. So the tooling
support wasn't great. clangd would keep telling me that things were
uninitialized on the line that was initializing them. And I'm like, no, it's
there.
Build times, obviously: I'm going to complain about them forever. There was
just a woeful lack of any kind of context for compile-time string
manipulation. You have to kind of roll your own, and that's kind of annoying,
but it was not the end of the world. At one stage I resorted to turning the
register... sorry, a constant offset into an ASCII value by adding it to
single-quote zero, which is, you know... like, I just want to use std::format
here, please. So std::format would be nice. I did like the performance, and
actually that version two thing was great. Phrasing it in that elegant way
where the requires clause matches exactly what the spec said was just
beautiful, and knowing that it all just worked. The Java-like modules were
good, and of course I loved learning new things. So we get to the thanks
slide. Huge thanks to Hana Dusíková, who was really the co-author of all of
the intelligent parts of this presentation. Thanks to everyone else. I hope
you like your suitably awful Spectrum renditions of your photographs and
whatnot. In particular, the Compiler Explorer. If you've noticed how rubbish
it's been for the last six weeks, it's because I've been doing this and not
actually... I mean, I can feel my phone vibrating in my pocket as I'm being
texted about more things that are wrong with it, but never mind. So, at the
end of this presentation: go and build something cool and learn something from
it. Thank you very much indeed.
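One aside from the talk lends itself to a short illustration: turning a constant offset into an ASCII character at compile time by adding it to '0'. The sketch below is an assumption about what that might look like, not the speaker's code. `RegisterName` and the "r" prefix are hypothetical names, and it only handles a single decimal digit, which is exactly the limitation that makes this kind of hand-rolled formatting annoying.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical helper: builds a two-character name such as "r3" at compile
// time from a constant offset, using the '0' + digit trick.
template <std::size_t Offset>
struct RegisterName {
    static_assert(Offset < 10, "the '0' + offset trick only covers one digit");

    // The characters live in a static array so the string_view stays valid.
    static constexpr std::array<char, 2> chars{
        'r', static_cast<char>('0' + Offset)};

    static constexpr std::string_view value{chars.data(), chars.size()};
};

// Checked entirely at compile time.
static_assert(RegisterName<3>::value == "r3");
```

Anything beyond a single digit needs a hand-written loop that divides by ten and fills a buffer, which is the roll-your-own work being complained about.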