YouTube Transcript:
Keynote: Teaching an Old Dog New Tricks - Matt Godbolt - ACCU 2025


Summary

Core Theme

The speaker recounts their journey of learning modern C++ by attempting to build a ZX Spectrum emulator, contrasting an "old-school" implementation with a "modern" one, and sharing the challenges and insights gained from exploring new C++ features like constexpr, modules, and coroutines.


Wow. Well, what a

great conference, eh? Yeah. Once again,

I want to say thank you to Guy and

everyone who's made this possible

because I've had an amazing time this

week. It's been really, really great.

Um, I've learned a bunch of things and

as the person who gets to go last, that

means I have to change my slides in

light of what I've learned while I've

been here, which will throw off all

my timing. So, when it takes, you know,

the more than the two hours I've been

given, then all right, I'm just trying

to josh with Guy here. So, this is

a better form, right? Everyone can see

this. So, I don't know how many of you

know what the heck a ZX

Spectrum is. Show of hands? That's a lot of you.

That's impressive. And so, did you spot

like the lanyard, you know, and the

spectrum is a really appropriate thing

in so many ways. Obviously, it means

something to me. It can mean a lot of

things to other people as well. And it

kind of encompasses all of those

meanings, I think. But for me, the

Spectrum logo is all about my childhood

and my introduction to computing and

kind of the beginning of my journey to

where I am right here. Um, so I'm the

old dog in this. This is

me. Um, I'm in between jobs at the

moment. I finished my previous job and

for contractual reasons, I can't start

my new job until the end of this year,

which is boohoo, poor me. Um, I'll be

starting with HRT, one of the sponsors

of this conference. So, you know, thank

you HRT. Thank you. Um, I I don't work

for them yet. Contractually not allowed

to. All that good stuff. Anyway, so

I wanted to do something cool for this

keynote. Um, and uh, it occurred to me

that I'm kind of getting a bit older in

my career now and I have been giving out

wisdom to other folks that may be in

need of updating. In particular, um, I'm

quite cantankerous about build times and

compile-time things like templates and

constexpr and all. I know, there's

lots of people looking at me with

tutting eyes right now. I'm sort of an

old school well I'm an assembly

programmer at heart and I've sort of

clawed my way up slowly and it's kind of

hard to let go of the old ways. So I

thought I'd try and teach myself some

new C++ like the newest I could find and

I'd do it through the medium of writing

a program that I know well and I would

enjoy writing and I'm going to do it

twice. Once the old way the way that I

would write it naturally and then the

second time with all the modern things

and then I could compare and contrast

and see what came out of it. So, you all

know me from the website, right? That is

Miracle, the Sega Master System

emulator. No, of course I'm joking. No

one really knows that website. It's jsbeeb,

the BBC Micro emulator. Yeah. No, so

this is my hobby. I love doing these

things. Emulators are a great way of

learning about how hardware really works

and also um just having a fun program at

the end of it all that has some nice

output that you can play with. Right.

Spectrum was my first computer. I got it

on my 8th birthday in

1984. Um, one of my dad's colleague's

sons had written a happy birthday

program in Spectrum BASIC and gave me a

cassette alongside the computer which I

loaded up and I had, you know, happy

birthday Matthew and it was like a

little swimming person. And I was into

swimming at the time and I was like I

was blown away by that and I'm actually

getting quite emotional now because this

is actually like how I started. This is

what really put me on this journey to be

standing in front of you all. Right? And

so I, you know, hit break, looked at the

program and started learning how to

program. But obviously most of it we

were just playing games. Right. Okay.

There's going to be a lot of game shots

in this. What game is this? Manic Miner.

Exactly. Right. So let me just quickly

go over what my old dog cantankerous old

man uh view of C++ is. Contemporary C++.

I think we're beyond postmodern C++ now

as Tony Van Eerd several years ago said.

Um this is what I don't like about all

of the things in C++ right now. Long

build times. Yeah. No. No one. You're

fine. Everyone's happy with build times.

It's just me. Okay. Terrible error

messages, right? Yeah. Modules. I've

been told that they don't work. Anyone

using modules in

anger? Let the record show: no one. Maybe

one half hand went up. Uh, coroutines

are complicated. Now again, most of

these slides were written before I

attended this conference and there's a

lot of useful talks that I've been to

now that have changed my mind about some

of these. So you're going to see that.

So I'm going to challenge myself to

write and use all the things I say I

don't like, which is constexpr and

consteval. I'm pointing at Jason Turner

in the front row. I've just come out of

a talk of his where he goes into

when to use consteval. Uh, template

metaprogramming trickery. I've always hated

it. It feels like I'm programming in the

wrong language. I'm not really a

functional programming type of a person.

Uh, concepts, modules, coroutines, all

these things that like sound complicated

to my poor assembler programmer

brain. So I chose the Spectrum because

it was really important to me. It's part

of my childhood as I say and because I

thought it's a pretty straightforward

machine. There's hardly anything in it.

And so I, you know, chose to write this

twice, not because it's easy, but because I thought it would be

easy, and because a friend nerd-sniped me

into it. Because when I discovered it

was difficult, I mentioned it in passing

to somebody and that person was like,

"Yeah, but I had that one growing up as

well. I'd like to see that, too." And so

I was like, "All right, I guess I will."

And then she said, "I'll help you if you

do it." And that was Hana Dusíková,

for whom the only way I've got any of

the cool C++ things that you're going to

see today, if you think they're cool,

um, the only reason that they're working

is because I had a lot of help from

someone much smarter than me, and that

person was Hana. So, let's talk about

what a Spectrum 48K is. Um, it has a Z80

chip in it. This is a Z80 chip. There's

a photograph of it up there. Um, this is

actually the Z80 was in my Spectrum Plus

3, which I got a bit later on, and it's

the only part of the computer that

remains. Um, everything was socketed

back in those days. So, we've got an

8-bit CPU running at a blistering 3 and a

half megahertz. I know, the crowd goes

wild. 48K of RAM and 16K of

ROM and then this custom ULA which is

posh for like a sort of uh custom chip.

It's not that custom. It's just a pile

of logic gates but then I mean I guess

in strictly speaking all chips are just

a pile of logic gates really but you

know like it's a bit more

straightforward than that. And that

custom ULA did everything else in the

entire computer. It generated the sound,

it generated the picture. It dealt with

the loading and saving from mass

storage, which in the case of a Spectrum

was an audio cassette that where

horrible screeching noises were recorded

onto it and then played back to load up

the games. Inside that Z80, we've got a

bunch of registers. So, this is a quick

overview just so that we can explain

what's going on. We've got 8-bit

registers. The A register is kind of the

main register. It's the accumulator. Uh

the B, the C, the D, the E, the H, the

L, and the F. Obviously, you know, the

exception proves the rule of like having

something straightforward. The F

sort of holds extra special information about

whatever happened with the last

arithmetic thing that you did. If you

subtracted one number from another and the

result was negative, then the flags register has a

negative bit set in it. Things like

that. One of the cool tricks the Z80 did

is it let you pair up registers in a

particular way. So BC, DE, and HL.

Those are like the B register and the C register together, and you

could treat them as 16-bit values. So

you could do some limited 16-bit

arithmetic. And this is the only way you

got pointers. So HL could be used as a

pointer to point into memory. And then

you could kind of read and write. So you

know that gives us 16 bit pointers. We

were talking about this in the previous

session. So we have

64K's worth of address space to deal with,

a massive amount. Um, interestingly,

although it could do some 16- bit

arithmetic and obviously 8-bit

arithmetic, inside it was actually only

four bit. It was a cheaper part that

they just did twice. So, every single

arithmetic operation would take two

clock cycles to do two four-bit adds

and then kind of deal with the result.

Anyway, that's off topic. Um, it has

some strange shadow registers which you

could sort of switch in and out from.

Um, there were some genuinely 16-bit

registers: IX, IY, the stack pointer,

and the program counter, and then some

special registers we don't need to talk

about. So, let's look at a Z80 program.
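The slide itself is not part of the transcript. As a rough reconstruction from the bytes read out in the walkthrough that follows, the program might be written as the array below. The mnemonics use standard Z80 assembler syntax; the names `kProgram`, `read_le16` and `jr_target` are made up for this sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// The example program as raw machine code, one instruction per line.
constexpr std::array<std::uint8_t, 11> kProgram = {
    0x3E, 0xFF,        // LD A, 0xFF     load the constant FF into A
    0x01, 0x34, 0x12,  // LD BC, 0x1234  16-bit load, operand stored low byte first
    0x02,              // LD (BC), A     store A at the address held in BC
    0x03,              // INC BC
    0x3D,              // DEC A
    0x20, 0xFB,        // JR NZ, -5      loop back to LD (BC), A while A != 0
    0xC9,              // RET
};

// Combine two bytes stored little-endian (low byte first) into a 16-bit value.
constexpr std::uint16_t read_le16(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}

// A relative jump adds a signed 8-bit displacement to the address of the
// instruction that follows the two-byte JR.
constexpr std::size_t jr_target(std::size_t jr_address, std::uint8_t displacement) {
    return static_cast<std::size_t>(static_cast<std::ptrdiff_t>(jr_address) + 2 +
                                    static_cast<std::int8_t>(displacement));
}

static_assert(read_le16(kProgram[3], kProgram[4]) == 0x1234);
static_assert(jr_target(8, kProgram[9]) == 5);  // offset 5 holds LD (BC), A
```

The two `static_assert` lines check the byte order of the 16-bit operand and that the displacement FB lands back on the store instruction.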

Doesn't really matter what this is

doing. What I really want you to realize

is that computers just look at numbers

in memory. That's all they do, right?

They're complicated ways of staring at

numbers in boxes in memory, deciding

what to do with them, and then moving on

to the next number. So, this program,

um, the left hand side is the bytes in

memory, the machine code. That's what the

machine sees. On the right hand side, the

sort of ASCII text is the human-readable

version of it, and I use that advisedly,

"human readable". And,

you know, colloquially we just talk about

assembly and assembler code all the time

and we don't really make the distinction

between the machine code and the

assembly code which is the sort of like

source and and binary format of it but

anyway, so the machine is going to read

3E FF, and what that means is load

the value FF into the A register. So far, so good.

Uh, the machine reads 01 34 12, and the 01

means this is a 16-bit load of the BC register.

The 34 12 is the value 1234 the other way round, because, you know,

nothing's straightforward. This

is little-endian, right, which is

the correct endian. Anyone going to take

me up on this? There's little-endian and

wrong-endian, right?

Um, okay. Um, a single-byte instruction

here, just 02, means load (BC) with A, and

the brackets mean sort of treat it

indirectly. So that's where I'm treating

BC as if it's a 16 bit pointer and I'm

storing the A register into it. So the

the left hand side is sort of the

destination. The right hand side is the

source. Increment BC, decrement A, and

then JR NZ means, like, NZ (gosh, I've been in

America too long), NZ, um, means jump

relative if not zero. So that's

referring to the last sort of arithmetic

operation that happened. In this

instance that would be the decrement. So

we're saying we decrement the A register

and if it's not zero go back to the top

of the loop. And so that 20 means JR NZ

and the FB is like back five bytes

worth. So that's our loop and then we

return. So now you all know how to read

Z80 assembly and you understand all of

the bits and pieces going on here. Um

obviously you'll be putting the compiler

explorer into Z80 mode next time you're

using it I'm sure. Uh before we move on

um the memory map. So you're probably

used to virtual memory right? You can

page memory in and out. No such luxury

on the Z80. Uh well, on the Spectrum

specifically, the bottom 16K was just

the ROM, right? You read from it, you

get ROM, you write to it, nothing

happens because it's read only. No

faults, no traps, no nothing. It just

didn't do anything. Anything above that

was reading and writing to the RAM with

a little sort of exception that the

addresses between 4000 and 5800 in hex

actually refer to a piece of memory where

the video circuitry, and by that

I mean that ULA chip, is also reading

that memory and it uses it to produce a

TV picture. So if we read and write to

that we're seeing what's being displayed

on the screen. Fabulous. So we can write

an emulator now.
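A minimal sketch of that memory map, assuming a flat 64K array with the ROM in the bottom 16K and writes below 0x4000 silently dropped. The class name `Memory` and its member functions are invented for illustration and are not taken from the talk's code.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kAddressSpace = 0x10000;  // 64K of address space
constexpr std::uint16_t kRomEnd = 0x4000;       // the ROM occupies 0x0000-0x3FFF

class Memory {
public:
    // Copy a ROM image into the bottom 16K; any bytes beyond 16K are ignored.
    void load_rom(const std::uint8_t* data, std::size_t size) {
        std::copy_n(data, std::min<std::size_t>(size, kRomEnd), bytes_.begin());
    }

    std::uint8_t read(std::uint16_t address) const { return bytes_[address]; }

    // Writes to ROM do nothing: no fault, no trap, the value is just discarded.
    void write(std::uint16_t address, std::uint8_t value) {
        if (address < kRomEnd) return;
        bytes_[address] = value;
    }

private:
    std::array<std::uint8_t, kAddressSpace> bytes_{};
};
```

Because a 16-bit address can never exceed 0xFFFF, indexing the array needs no bounds check.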

Right. Here we go. We're going to have

um some chunk of 64k that I'm going to

say this is the whole memory of the

computer, the whole address space. I'm

going to uh take 16k of it and I'm going

to memcpy some ROM that I've got from

from somewhere. You know, if I kept the

ROM chip, I could have dumped it, but I

found it on the internet. Um and I could

have probably used #embed to get it

into the source code, right?

That would have been a cool new thing, which I

only thought of recently. Um, I'm going

to have a function to write to memory.

And all I'm going to do is I'm going to

make sure that you're not trying to

write to ROM. So I just

discard it if you're trying to write

below 4000. I'm going to have some storage

for all my registers and I'm going to

treat the registers as all 16 bit and

then if I'm reading the A register I'll

sort of shift it down or I'll mask out

the bits that I need. In C, you could use

nice unions for this, right, and my

assembly programmer brain wants to use

that but I can't do that in C++ without

Yeah, there's lots of shaking heads

about undefined behavior related things.
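One way the shift-and-mask approach could look, as a sketch rather than the speaker's actual code: each pair is stored as one 16-bit value and the 8-bit halves are derived on demand, which avoids reading an inactive union member. The `Registers` struct and its accessor names are assumptions made for this example; it relies only on the Z80 convention that A and B are the high bytes of AF and BC.

```cpp
#include <cstdint>

struct Registers {
    std::uint16_t af = 0;
    std::uint16_t bc = 0;
    std::uint16_t de = 0;
    std::uint16_t hl = 0;

    // 8-bit views: shift down for the high half, mask for the low half.
    std::uint8_t a() const { return high(af); }
    std::uint8_t f() const { return low(af); }
    std::uint8_t b() const { return high(bc); }
    std::uint8_t c() const { return low(bc); }

    void set_a(std::uint8_t v) { af = with_high(af, v); }
    void set_b(std::uint8_t v) { bc = with_high(bc, v); }
    void set_c(std::uint8_t v) { bc = with_low(bc, v); }

private:
    static std::uint8_t high(std::uint16_t pair) {
        return static_cast<std::uint8_t>(pair >> 8);
    }
    static std::uint8_t low(std::uint16_t pair) {
        return static_cast<std::uint8_t>(pair & 0xFF);
    }
    static std::uint16_t with_high(std::uint16_t pair, std::uint8_t v) {
        return static_cast<std::uint16_t>((pair & 0x00FF) | (v << 8));
    }
    static std::uint16_t with_low(std::uint16_t pair, std::uint8_t v) {
        return static_cast<std::uint16_t>((pair & 0xFF00) | v);
    }
};
```

D, E, H and L would follow the same pattern and are left out to keep the sketch short.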

So we can't do that. All right, let's

write the emulator then. How hard can

this be? Right, so we're going to say

forever: switch on the first byte where

the program counter is. Right, we're

going to read the memory at the program

counter and we're going to see what

number it is. And that number will tell

us what operation we need to do. If it's

zero, that just happens to be NOP.

That's the easiest one. Fantastic. We

just do nothing and we carry on and

we'll read the next byte the next time

round. If it's one, if you remember,

that was that load BC with the the thing

the 34 12, right? So the next two

values in

memory, PC++ and PC++ again, we're going

to put those into the BC register number

two is that indirect write through BC, so

we're going to treat BC like a pointer,

pass it to the write function and get

the A register out. Increment BC: BC++. Hey,

this is easy. How many more of these can there

be? Yeah, well, I mean, I just sit down... Well,

this is the table of all of the opcodes.

Um, there's 256 of them, but you

notice four of them are a different

color. Those different color ones are

like, actually, this is the first byte of

a multi-byte opcode. There are more. So

in total there's about 700 instructions.
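The naive fetch-and-switch loop described a moment ago might look something like this for the first four opcodes. It is a sketch under assumptions: the `Machine` struct and `step` function are hypothetical names, and everything other than opcodes 00 to 03 is reported as unhandled.

```cpp
#include <array>
#include <cstdint>

struct Machine {
    std::array<std::uint8_t, 0x10000> memory{};
    std::uint16_t pc = 0;
    std::uint16_t bc = 0;
    std::uint8_t a = 0;
};

// Execute one instruction. Returns false for an opcode this sketch
// does not handle yet.
inline bool step(Machine& m) {
    const std::uint8_t opcode = m.memory[m.pc++];
    switch (opcode) {
    case 0x00:  // NOP: do nothing and carry on
        return true;
    case 0x01: {  // LD BC, nn: the next two bytes, low byte first
        const std::uint8_t low = m.memory[m.pc++];
        const std::uint8_t high = m.memory[m.pc++];
        m.bc = static_cast<std::uint16_t>(low | (high << 8));
        return true;
    }
    case 0x02:  // LD (BC), A: treat BC as a pointer; writes to ROM are dropped
        if (m.bc >= 0x4000) m.memory[m.bc] = m.a;
        return true;
    case 0x03:  // INC BC
        ++m.bc;
        return true;
    default:
        return false;
    }
}
```

Every further opcode would need its own case, which is the scaling problem the opcode table makes obvious.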

Now most of them are very very similar

but still 700. That's a lot. And we

don't have illegal instructions on the

Z80. You know, like nowadays, if you

just executed any old random bit of

memory, soon enough you'd find a byte

that wasn't a valid instruction on

whatever CPU you're on, and it would throw

an exception at the hardware level; you'd

get SIGILL on Linux or whatever, or a

crash in Windows. I don't know what

Windows does actually. Um, no such

luxury here. Any sequence of bytes we

feed into the CPU has some kind of side

effect. And games programmers, and I'm

looking at games programmers here, they

will find useful sequences of bytes that

do something convenient and useful and

they will use them in their games. So,

we have to emulate everything. And it

turns out about 1,400 of them are

actually valuable and useful. And so, if

we want to emulate games, 1400

instructions. Gosh, that's going to take

a while, right? So, we've got an awful

lot to do. Okay. Which game? Jet Set Willy.

Correct. This is Maria the housekeeper

telling us off for leaving the house in a

state. All right. What about this one

then, Guy? Um, so I'm going to talk about

the first implementation. Well,

this is actually from the

Horizons tape that came with the

computer. You press play on it and

it drew a picture of all of the parts.

So I could have just shown you this, I guess.

Okay, so I've done this before a bunch

and the way that I typically write

emulators when I'm emulating a CPU is I

will grab the list of all of the

instructions as like a text file or if

you're really lucky HTML or XML or

whatever's actually, you know, built

this web page that I stole this

screenshot from. and I will parse that

out and then I will generate the code

because if I've got a you know a

4,000-line file that

first of all says NOP, then it says

LD BC,nn, like literally the strings, right,

that's what you can use to write a

disassembler and then I can write some

naff bit of Python to split it and then

work out what's happening and then I can

just emit C++ code and then I can

generate each case statement

automatically and that's typically how I

do it and in fact in those emulators you

saw with the BBC Micro and the Master

System, one of them uses a Perl script

to generate all of the opcodes. The

other one, because it's written in pure

JavaScript, the

thing that

generates all the instructions table is

written in JavaScript inside of itself

and it does this sort of

snake eating its own tail

thing where it creates text and then

evals it to turn it into a function and

then starts running it. So it's kind of

JIT-ing itself. It's

fun, but I didn't want to do it that way

this time because it feels like cheating. It

feels like I'm not doing it right. So,

I'm going to do it another way, which is

I'm going to try to decode it in code

like that switch statement, but I'm

going to try and do it a little bit more

principled, right? And the principle

that I would like to do is something

like this. I'm going to have some kind

of abstract notion of what an

instruction is. It has some string name

and op code. It has how many bytes long

this particular instruction is so I can

move the program counter on. It has some

abstract source and destination where

the source could be which register it is

or it might be hey no this is a value I

need to get from memory or it might be a

constant right and the reason it might

be a constant is because there are

increment and decrement instructions and

there are add instructions. How much more

convenient would it be if I decoded an

increment as just an add with the

constant one? Hey, I have reduced the

amount of code I need to write.
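A possible shape for that abstract instruction, sketched with invented names (`Operand`, `Operation`, `Instruction`); the enumerator lists are deliberately partial and are assumptions, not the talk's real tables.

```cpp
#include <cstdint>
#include <string_view>

// Where a value comes from or goes to: a register, memory, or a constant.
enum class Operand : std::uint8_t {
    None, A, B, C, D, E, H, L,   // 8-bit registers
    BC, DE, HL,                  // 16-bit register pairs
    Immediate8, Immediate16,     // bytes that follow the opcode
    IndirectBC, IndirectHL,      // memory addressed through a pair
    ConstOne,                    // the literal value 1
};

// What the ALU (or data path) should do with the operands.
enum class Operation : std::uint8_t { Nop, Load, Add8, Add16, Sub8 };

struct Instruction {
    std::string_view name;  // for the disassembler
    std::uint8_t length;    // bytes to advance the program counter
    Operation op;
    Operand dest;
    Operand source;
};

// INC B and ADD A, B share one operation; only the operands differ.
constexpr Instruction kIncB{"INC B", 1, Operation::Add8, Operand::B, Operand::ConstOne};
constexpr Instruction kAddAB{"ADD A, B", 1, Operation::Add8, Operand::A, Operand::B};
```

With this shape the executor only needs one code path per `Operation`, whatever the operand combination.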

Fabulous. So those operands can be 8-bit

registers, 16-bit registers. Uh, the

destination similarly and then the op is

the operation the actual thing we're

going to do and by doing this what we're

hopefully doing is taking that 1,400

space of all these possible instructions

and boiling it down to maybe a dozen

types of operand and then maybe a dozen

operations and as long as we can decode

those bytes into that I don't have to

keep writing all these different

permutations and combinations so the

code might look like this we do some

kind of decode whatever the heck the

program counter is pointing at go get

the operands that it says that we need

execute them and then write back the

result sort of looking a bit like you've

probably if you've ever studied uh you

know uh hardware design the standard way

that CPUs work which is cool because

we're emulating the CPU and then you

know this is maybe what our decode

routine could look like I can now group

together very common operations you know

like 04, 14, 24 and 34. Hm, pattern forming.

Maybe there's a pattern in this. Um, these

are all increments and so I'm going to

say it's an increment and it's an 8 bit

add and it's adding uh I decode whatever

the the destination is and then the

operand is one. Okay, this is cool. So

far so good. Um I have to emulate the

arithmetic unit. The ALU is the

arithmetic and logic unit that's going

to do all of the actual calculations.
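To make the grouping concrete, here is a small sketch of decoding the 8-bit increment family and running it through decode, fetch, execute and write-back. It relies on the Z80 encoding in which `INC r` is the bit pattern `00 rrr 100`, so 04, 14, 24 and 34 (and 0C, 1C, 2C, 3C) all match one mask. The names `Decoded`, `decode` and `step` are hypothetical, flags are ignored, and the memory operand (HL) is left unhandled.

```cpp
#include <array>
#include <cstdint>
#include <optional>

enum class Op : std::uint8_t { Add8 };

struct Decoded {
    Op op;
    std::uint8_t dest;      // register slot taken from opcode bits 3-5
    std::uint8_t constant;  // right-hand operand
};

// Register slots in opcode order: B C D E H L (HL) A.
constexpr std::uint8_t kIndirectHL = 6;

// Recognise the whole "INC r" family with a single mask test.
constexpr std::optional<Decoded> decode(std::uint8_t opcode) {
    if ((opcode & 0xC7) == 0x04) {
        return Decoded{Op::Add8, static_cast<std::uint8_t>((opcode >> 3) & 7), 1};
    }
    return std::nullopt;
}

// Decode, fetch the operand, execute, write back. Returns false if unhandled.
inline bool step(std::array<std::uint8_t, 8>& regs, std::uint8_t opcode) {
    const auto decoded = decode(opcode);
    if (!decoded || decoded->dest == kIndirectHL) return false;
    const std::uint8_t lhs = regs[decoded->dest];                               // fetch
    const auto result = static_cast<std::uint8_t>(lhs + decoded->constant);     // execute
    regs[decoded->dest] = result;                                               // write back
    return true;
}
```

The wrap from FF to 00 falls out of the 8-bit cast; a fuller version would also produce flags.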

And these are all the operations it's

going to do. We've got 16-bit adds and 8-bit

adds. And, you know, they're as

straightforward as this, right? So

doesn't take too long to get something

up and running. But it's of course never

that simple, right? Because two things.

one, I need to update the flags, right?

These additions might cause something to

become negative or it might cause

something to overflow or it might cause

something to have a carry. Um, and also,

as I discovered rather late into this

project, inside the chip, incrementing

isn't actually the same as adding one

because you can do increments a lot

cheaper in hardware if you know it's by

a single, like literally the value one.

It's got a clever little ripple thing

and that doesn't update all the flags in

the same way. So I had to sort of

scratch that which was a

pain. So this is what the flags look

like. We don't really need to worry

about them, you know, they're just

essentially a bit mask of interesting

things that happened in the last

instruction. And this is the awkward

code for adding two 8-bit numbers

together, right? So just one

operation on the Z80 is 15 lines of

pretty complicated and gnarly code. The

irony is, deep, deep, deep inside the x86,

right, which is derived from the 8080

ultimately, which is the sibling to the

Z80, because the person who designed the

8080 left Intel and went to form Zilog

and actually made the Z80 sort of in

the image of it. So deep inside the

x86 these flags also exist, and if I

could get access to them I might

have been able to use them, but I wanted to

write this in like straight nice C++ and

not use horrible assembly intrinsics and

things because that would kind of defeat

the purpose of this anyway um so anyway

putting it all together it looks like a

CPU we fetch we decode we execute and we

write back brilliant okay

done so I get to the point where I have

the 16K ROM of the Spectrum loaded into

memory And I write the world's worst GDB

equivalent here, which is, you know,

it's okay. I can single step through and

I can disassemble stuff and I can, you

know, run a few instructions. I could

even set break points if I had to

because, you know, debugging this is a

nightmare incidentally, right? You know,

there's one thing when you're trying to

debug a program that you have the source

code to. There's another thing where

you're debugging a program for which you

do not have the source code and you

can't trust that adding two numbers

together that actually

works. So this is the world you live

in. But I had the annotated disassembly

of the Spectrum ROM, because there

is a community of folks who do this kind

of stuff and we reverse engineer and

read back over old source and sorry old

assembly and like work out where

everything is and I could sort of step

through and I could develop some

confidence that it looked like it was

booting up. It was clearing all the

memory. It was starting to write to the

screen. Oh, maybe this is cool. So, it

would be nice to see what's going on,

right? So, let's talk a little bit about video

output. That area of memory between

4000 and 5800 is interpreted by the ULA

as this sort of one bit per pixel

display. So, it's a black and white

display except each 8 by 8 tile can have

a different foreground and background

color. So, there's kind of two parts.

It's quite clever. So there's some

really neat tricks it does internally

because the hardware is so

embarrassingly simple, but by and large

it's pretty straightforward. I don't

really have time to go into it now, but

that is um the way that the screen is.
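As an illustration of turning that memory into a colour, the sketch below uses the commonly documented 48K Spectrum layout, which the talk does not spell out: 256x192 pixels at one bit each from 0x4000 with interleaved rows, then one attribute byte per 8x8 cell from 0x5800 (ink in bits 0-2, paper in bits 3-5, bright in bit 6). The function names and the 0xD7 level for non-bright colours are assumptions, and the flash bit is ignored.

```cpp
#include <array>
#include <cstdint>

using Memory = std::array<std::uint8_t, 0x10000>;

// Address of the byte holding pixel (x, y); the rows are interleaved.
constexpr std::uint16_t pixel_address(unsigned x, unsigned y) {
    return static_cast<std::uint16_t>(0x4000u | ((y & 0xC0u) << 5) | ((y & 0x07u) << 8) |
                                      ((y & 0x38u) << 2) | (x >> 3));
}

// Address of the colour attribute for the 8x8 cell containing (x, y).
constexpr std::uint16_t attribute_address(unsigned x, unsigned y) {
    return static_cast<std::uint16_t>(0x5800u + (y / 8) * 32 + (x / 8));
}

// Colour of one pixel as 0xRRGGBB.
inline std::uint32_t pixel_rgb(const Memory& mem, unsigned x, unsigned y) {
    const bool ink = (mem[pixel_address(x, y)] & (0x80u >> (x & 7))) != 0;  // bit 7 is leftmost
    const std::uint8_t attr = mem[attribute_address(x, y)];
    const unsigned colour = ink ? (attr & 0x07u) : ((attr >> 3) & 0x07u);
    const std::uint32_t level = (attr & 0x40u) ? 0xFFu : 0xD7u;  // bright or normal
    const std::uint32_t red = (colour & 2u) ? level : 0u;
    const std::uint32_t green = (colour & 4u) ? level : 0u;
    const std::uint32_t blue = (colour & 1u) ? level : 0u;
    return (red << 16) | (green << 8) | blue;
}
```

Calling `pixel_rgb` for every x in 0..255 and y in 0..191 would fill a frame buffer that a windowing library can display.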

So you write some code that interprets

the memory in that array and turns it

into RGB values and then you get on with

your life, right? Then you write like an

SDL app. That's all the rage I heard. Um

I don't think it is. I just asked ChatGPT

to make it for me and it did. Um,

and I kind of then fixed a few bits and

pieces. And so basically the loop is

every 50th of a second, emulate a 50th

of a second's worth of CPU instructions

and then take the bit of the memory

inside the chip that corresponds to the

video, turn it into RGB and put it on

the screen. And we get to demo time,

which is, you know, you've been very

patient so far. And now we get to

test your patience further by me seeing

if I can actually do this. Oh, look. It

came the right way. Yay. Okay, there we

are. There is a spectrum. Now, there's

just No, no, no. Hold your applause. You

lot. No. Go on. You're very easily pleased.

So, who among us has not been into a

department store in the

80s where... Okay, lots of

hands are going up here, because

there's a definite divide in the crowd

and got or Dixons, right? for the sun

and broken into whatever demonstration

program they had for their computer and

Yeah. Or or local equivalent.

Right. There we go. Yay. And I now Yay.

Thank you. I pressed the right keys as

well. There we go. So, those of you who

might have looked at the spectrum that

our wonderful photographer brought in,

thank you photographer person wherever

you are. um or the Spectrum Next that

someone else has brought in as well.

You'll notice the keyboard is very odd

and I'm wearing the t-shirt that has

unlike some of you who are wearing the

conference t-shirt which has some of the

keys. This one has all of the keys. And

so you have to know which key has what

because in order to make your life

easier, all of the keywords of the BASIC

language were on each of the keys. And

so to type print, you press P. Great. To

type go to, you press G. That's easy to

remember. To type load, you press J. J.

Of course, everybody knows that it's

muscle memory for a whole bunch of

people in

here. So, you have to remember them and

there's symbol shift and all sorts of

things. Okay, well, that's boring and

everything. Let me just get my mouse

pointer onto the Linux terminal, the

other screen. Let's do this again. This

time, this time with Oh

gosh, over here. Sorry, it's too many.

And we won't put it in training in now.

I don't

know. This won't come out on the

recording, but some of you are being

tortured in the ears by the noises my

computer is making on stage, which are

just absolutely ghastly. But that was

it, right? I'm going to turn that off

now. Mute it. Oh gosh. So, this was like

one of the first and still probably best

platformers. Um, and I'm going to

demonstrate how excellent I am at it by

Oh, am I gonna make it? Yay. And I've

got the lag of looking at the secondary

screen here as well, which is how I'm

going to excuse how bad I am at this

game. But like this this passed for

high-end entertainment in 1982 or 1983.

So, and I'm you're going to die. Okay.

Right. Well, there you go. Anyway, so

the thing about emulators, if I can find

the control key, there we are. The thing

about emulators is that they have like a

hockey stick like reward function,

right? You got nothing, nothing, nothing,

nothing,

nothing. Runs for a bit and then hits an

instruction you haven't emulated yet.

All right. Nothing. Nothing. nothing.

Okay. And then games start working and

everything works and you're like and

then of course you lose weeks of playing

old games and getting back into, you

know, that's how it goes. And you know,

you get some amusing bugs along the

way. Yeah. Turns out the bits weren't

the way round I thought they were. Yeah.

Anyone remember the space-faring game Elite,

originally from the BBC? Yeah. The BBC B,

but then ported to the Spectrum. And

this one was particularly insidious.

this was just like some weird um index

was not straightforward. It's it's a

miracle that it booted and ran to this

point and had this error instead of just

getting stuck in an infinite loop

somewhere. Um I wrote some tests because

that's what you're meant to do. And in

fact, it was incredibly useful. Oh my

gosh. Right. So the Catch2 tests I

wrote for this had, um, you know, simple

tests of the ALU, like adding two numbers

together gives me the right answer and the

right flags. Of course, trying to work

out what the right flags were is

difficult because they're not

necessarily that well defined. Um, and

then luckily, because there are

lots of crazy people on the internet,

somebody has written like a test suite

in Z80 assembly that is a test for the

Z80, and it kind of runs every

instruction possible and then it kind of

hashes together all of the registers

and then compares it against

the known good value. And it's

been run on a real Z80 and so if you get

the right value, it's a perfect

emulation of a Z80. Congratulations. If

it doesn't, you've got no idea which one

of those instructions is

wrong. I pass that now. So, this is my

first approach. I should have said that

before. This is obviously just me

getting it to work and reminding myself

how this all fits together. Overall,

about a thousand lines of code split

between the decoder and the executor.

There's some sharable code that is code

that I'm not going to try and

reimplement in the second part because

frankly, it's just too miserable to get

right. The ALU, the flags, the

registers, that kind of stuff.
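To give a feel for why the ALU is the miserable part, here is a sketch of an 8-bit add that also produces the flag byte. The flag bit positions and rules follow the commonly documented Z80 behaviour, including the two undocumented bits copied from the result; the names `AddResult` and `add8` are invented here and this is not the code from the slide.

```cpp
#include <cstdint>

// Bit positions in the Z80 F register.
constexpr std::uint8_t kCarry = 0x01;
constexpr std::uint8_t kSubtract = 0x02;  // N: always cleared by an add
constexpr std::uint8_t kOverflow = 0x04;  // P/V
constexpr std::uint8_t kBit3 = 0x08;      // undocumented copy of result bit 3
constexpr std::uint8_t kHalfCarry = 0x10;
constexpr std::uint8_t kBit5 = 0x20;      // undocumented copy of result bit 5
constexpr std::uint8_t kZero = 0x40;
constexpr std::uint8_t kSign = 0x80;

struct AddResult {
    std::uint8_t value;
    std::uint8_t flags;
};

constexpr AddResult add8(std::uint8_t lhs, std::uint8_t rhs, bool carry_in = false) {
    const unsigned wide = lhs + rhs + (carry_in ? 1u : 0u);
    const auto value = static_cast<std::uint8_t>(wide);
    std::uint8_t flags = 0;  // kSubtract stays clear for additions
    if (value & 0x80) flags |= kSign;
    if (value == 0) flags |= kZero;
    if ((lhs ^ rhs ^ value) & 0x10) flags |= kHalfCarry;  // carry out of bit 3
    // Signed overflow: operands share a sign that the result does not.
    if ((~(lhs ^ rhs) & (lhs ^ value)) & 0x80) flags |= kOverflow;
    if (wide > 0xFF) flags |= kCarry;
    flags |= value & (kBit5 | kBit3);
    return {value, flags};
}
```

Because the function is `constexpr`, individual cases can be pinned down with `static_assert` as well as with runtime tests.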

Um, build times are the thing that I

care about because I'm a TDD person and

I'm also of short attention span. So,

what I really like to be able to do is

to add a test, hit the button, and get

some reward almost immediately. Right?

If it takes more than a few seconds,

then I'm starting to reread again and

then I'm like, "Oh, sh oh. Oh, yeah.

That's been sat there with a compile

error for the last, you know, 20 minutes

and I haven't done anything about it."

So what I do is I edit the

implementation, as I would if I

were testing a change, and I run and I get

to the tests and I did it three times

back to back. Each time I'm making that

same change. This is no caching or

anything like that and it takes two

seconds, 3 seconds. That's my happy

place. I'm really pleased if I can just

press buttons and I'm

seeing the fruits of my

labor. This is the code generation. This

is the x86 assembly of the decoder for

the Z80. I know it's confusing, right?

Um, the only thing you should note about

this is that that's massive, right? And

again, I tell people when I do

any kind of training, you should never

judge how fast a piece of code is by how

many instructions it's running. But this

is terrible and it's calling other

functions. So, blleh.

Anyway, that said, modern computers,

even this naff old laptop, are super fast.

We emulate, even with that nonsense code,

a 300 megahertz Spectrum, which is so

cool. That's roughly 100 times faster than the

original. So that's my first one. I was

underwhelmed with my achievements. I'll

be honest with you. It was fun. Of

course it was. Right. I got to play my

own games. I was hoping for a much

faster piece of code out of it all. And

ultimately it was a bit unrewarding. But

I was like, "No, this is cool. That's

great. I'm going to take that and I'm

going to move on to the second version

which is going to be all the cool C++

things." And I had a secret plan that

the way that I'd written the first

version would segue naturally into the

second version. And I could like you'll

see you'll see how well it worked out as

well. So second

implementation, I'm not going to ask

what this one

is. So what did I learn from the first

version? Well, I didn't really show it,

but that decoder, the one that had all

the switch statements, that was still

pretty ugly, right? There was still lots

of cases in there. And although I could

group a lot of them together (in fact,

at one point I was using a GNU extension

where you can do case value dot dot dot next

value, and it cases all of them

together, like between zero and

eight; not standard C++, which kind of

defeats the point), it didn't help that much.

And obviously I hated the code

generation. That's kind of my thing. Um, so

this is where I was going from the

beginning this is what I was hoping for

my dream implementation is to have a

relatively straightforward

understandable decoder that I can run on any

instruction. Then I can build a sort of

constexpr function that does exactly what

that opcode decoded to do. So if it

says it's an add and it takes this,

build the bespoke function, using constexpr

trickery, that does exactly that

function, and then make a table of 256

functions and then just call that

function. So I just literally read the

next byte of memory and jump to the

function that does exactly the right

thing for that instruction. Fantastic.

And all of this at compile time because

you know, const, constexpr, consteval, con-

something. So, you know, ultimately that

whole loop that I had before, with the

decode and all that nonsense, will be

this: read a byte, dispatch to it.
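
A minimal sketch of that "read a byte, dispatch to it" loop. The `Cpu` struct, the handler names and `make_table` are hypothetical stand-ins rather than the speaker's code, and the table here is filled at run time; building it at compile time is the goal the talk goes on to chase.

```cpp
#include <array>
#include <cstdint>

// Hypothetical, minimal machine state; not the speaker's actual classes.
struct Cpu {
  std::array<std::uint8_t, 65536> memory{};
  std::uint16_t pc = 0;
  std::uint8_t a = 0;
};

using Handler = void (*)(Cpu&);

// Two illustrative handlers. A real table needs one per opcode.
inline void op_nop(Cpu&) {}
inline void op_inc_a(Cpu& cpu) { ++cpu.a; }  // 0x3C is INC A on the Z80

// Build the 256-entry table. Every other slot defaults to NOP purely to keep
// the sketch short; that is not correct emulation.
inline std::array<Handler, 256> make_table() {
  std::array<Handler, 256> table{};
  table.fill(&op_nop);
  table[0x3C] = &op_inc_a;
  return table;
}

// The whole interpreter step: read a byte, dispatch to it.
inline void step(Cpu& cpu, const std::array<Handler, 256>& table) {
  const std::uint8_t opcode = cpu.memory[cpu.pc++];
  table[opcode](cpu);
}
```

Calling `step` repeatedly on a `Cpu` whose memory holds a program is then the entire main loop; all the decoding cost has moved into whatever builds the table.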

Wonderful. Okay. We're going to move

over to uh not valid C++. And I will

make no apology for reinventing CSS

flashing here. Took me a while to get

that working actually. This is how the

flash would work on the uh the spectrum.

There was like a hardware flashing

system. So what I really want to do is I

want to make an array of 256 pointers to

functions which is the result of taking

the values 0 through 255 inclusive

decoding them as if they were an

instruction and breaking them down into

what I need to do, somehow magically

compiling them into a function using, I

don't know, something something constexpr something

something

lambda, and then turning it all into an

array. Right? You can see where I'm going

with this perhaps And some of you are

smart enough to know where I am going to

be. So again, this is not valid C++

because there are some things, as the astute

among you will notice, that you can't

do in C++. So this is the compile

function I would like to be able to

write, right? It's a consteval

function, because that means it can only

happen at compile time, right? So I

should be able to do all these things

perfectly at compile time, right? It

takes a decoded instruction, but it also

has the word constexpr up there. You

can't do this it turns out because what

I really want to do is to do all of this

work at compile time. So given a

particular instruction I want to work

out how to read the value for that

particular operand. Both the operands I

want to work out how to write the value

for that operand and then I want to work

out what the executor is. What what's

the actual operation I need to do for

it? And that can be decoded statically

at compile time. And then at runtime

this bit runs and just runs those three

things. And the compiler should see

through all that and just do the perfect

thing for that particular decoded

instruction. And then we set it back.

And so this is what, you know, my... Again,

none of these things are valid

C++. You know, my consteval getter:

I give it an operand and I do a

switch constexpr, which was hilarious when

Daisy actually showed this yesterday in

her keynote. I was like, hey, I was

looking for that yesterday, like, in my

presentation. Um, and so, you know, for

each operand I want to say, like, well,

this is how to access the thing that has

been decoded for that. And this should

just be straightforward, right? But again,

you can't do that. Switch operand, uh,

switch constexpr isn't a thing.

Returning a little lambda that gets that

thing: I can't do this because I can't

have a constexpr

parameter. And I tried some other

tricks I tried getting you know uh

trying to use templates at this point

and you know here I'm saying like well

if I can't construct something maybe I

can make a template. I was trying to

avoid templates, right? This

is basically what this boils down

to: I hate programming using

templates. Templates are brilliant for

generic code, right? If you want a

vector of T and T is an int, fabulous, go

knock yourself out. But if I have to

encode the Fibonacci sequence by

recursively instantiating templates or

whatever, count me out. And so I see

constexpr as, like, a way of doing a lot

of the things that I'd otherwise have to

use templates for. And I was hoping to

use it for this. Um, but you can't do

this. I can't even use templates at this

point here because, as Jason said in the

last talk we were in, you know, as much as

we want it to be true, we can't take

anything that's a parameter to a

function and build anything that's

constexpr out of it, and we can't use it

to instantiate templates or anything

like that. So I can't make the values 0

to 255 in this way and call a normal

function with them and just do what I

was doing, right? It just doesn't work.
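
A small illustration of that restriction, with hypothetical names (`handler`, `pick`). The rejected form is left in a comment so the block still compiles; the accepted form passes the value as a template argument instead of a function parameter.

```cpp
#include <cstdint>

// Hypothetical handler template, standing in for "the bespoke function per opcode".
template <std::uint8_t Opcode>
constexpr int handler() { return Opcode * 2; }

// This is the shape that does NOT compile: `opcode` is a function parameter,
// so it is not a constant expression inside the body, even when the function
// is constexpr or consteval.
//
//   consteval int pick(std::uint8_t opcode) { return handler<opcode>(); }  // error

// Passing the value as a template argument instead does work.
template <std::uint8_t Opcode>
constexpr int pick() { return handler<Opcode>(); }

static_assert(pick<21>() == 42);
```

The cost is exactly what the speaker was hoping to avoid: the value has to arrive through template machinery rather than as an ordinary argument.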

Right? And this is the first

point of learning for me in my journey,

the first point of discovery in my becoming

a better programmer: constexpr doesn't

work like that. So the parameters to

constexpr functions are not constant

expressions, even when you make them

consteval and you want them to be. In

fact, Jason was just telling us this in

the last meeting, the last, um, session.

And so although the compiler absolutely

does know the value of all of these

things, you can't take advantage of that

in the way that you want to, that I want to.

You can't instantiate templates from

parameters that were passed to

consteval functions, um, and you can't do any

of the other tricks that I was trying to

do. So, I was very frustrated and

annoyed. And I was even more annoyed,

incidentally, that Thorin did not sit

down and start singing about gold in

this screenshot. I spent ages, right?

Anyone who's played this game, what is

it? The Hobbit. Yes, it's pretty

obvious, isn't it? So, at this point, I

was about to sort of give up and try a

completely different attack. Uh, so

we're going to do a brief

interlude and I went on a journey of

discovery because I thought to myself,

I've got like 3,000 lines of code and

I'm currently emulating a chip that has

8 and a half thousand transistors in it.

That's almost like one line of code per

three transistors. That doesn't seem

like the right ratio, right? Surely

modern programming techniques should

mean that I could do this in a more

succinct way. So somehow this chip is

doing something that I need to do much

more simply than I'm doing it. So this

is what a Z80 looks like. If you take

the top of it off and you take a bunch

of high resolution photographs, it's

about the size of your fingernail. So,

hiding in this big plastic package

really is a tiny little thing in the

middle of it and all of the wires are

just there because it has to be able to

plug onto a circuit board. Um, it looks

like this. And if you zoom into that bit

at the top, this here is called a PLA.

It's a programmable logic array.

It is

effectively sort of like a

ROM-ish thing on, uh, on a chip. And I was going

to say "sort of like a ROM" because it's

not just ones and zeros and strictly

looking up in a table. There are

effectively ANDs and ORs and NOTs in

here and they're arranged in a regular

pattern and then the shape of the thing

over the top of it says well if this bit

is set and this bit is not set and this

bit is set and this bit is not set or

whatever you pick then I am active. Now

obviously this is a picture. Thankfully

there are very lovely people on the

internet who have have done all the work

for me. This is Ken Shirriff, whose blog

is fantastic. If you love this stuff, go

look him up. Um, this is rotated 90

degrees from the picture you saw before.

What we've got coming in at the top are

all of the bits of the opcode. You

know, the value that I said: zero,

that's NOP, and one is load BC.

And then these are the different rows

that get energized when these particular

patterns of bits come in. And then the

the energized row goes off to turn on, I

don't know, the loading and storing

unit, or it turns on the adder, or it

turns on the thing that turns the adder

into the subtractor. All of these

things, all of these bits that are

hidden in the opcode, are actually

just relatively straightforward bit

manipulation. Okay, maybe we can use

this. So, even simpler than that, someone

even cleverer than Ken has said, okay,

there's a pattern hiding

even in that 100-odd-line table.

And so what we can do is we can take the

opcode coming in, that 8-bit value, and treat

it as x, y and zed, where x is the top two

bits, y is the middle three bits, and then

zed is the bottom three bits. And

then we can build a relatively simple

table that's kind of like the

unified theory of decoding, right? There's

not as many cases. And so what we'll do

is: if x is zero and zed is zero, then

there are eight possibilities for y. Zero

means NOP, one means exchange, two means

decrement and jump, three means jump, uh,

four through seven, sorry, means jump

conditionally, and now we can just look

up in the table what the four different

cases of that jump are. Okay, that's

slightly simpler, but not much of a win

so far. It's a bit like a switch

statement. Okay, how about if x is zero

and zed is one? Now we look at the q and

p bits, which are sub-bits of the y, and

there are two cases: we're either loading

a 16-bit value or we're adding a 16-bit

value to HL. Okay, right, this is simpler.
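
The field split described above can be sketched as follows. The `Fields` struct and `split` function are illustrative names. The talk only calls p and q "sub-bits of y"; the exact layout used here (p is the top two bits of y, q is the bottom bit) is the usual convention for this decoding scheme and is an assumption as far as the talk itself goes.

```cpp
#include <cstdint>

// Field names follow the x/y/z/p/q convention; the struct itself is illustrative.
struct Fields {
  std::uint8_t x, y, z, p, q;
};

constexpr Fields split(std::uint8_t opcode) {
  const auto x = static_cast<std::uint8_t>(opcode >> 6);          // top two bits
  const auto y = static_cast<std::uint8_t>((opcode >> 3) & 0x7);  // middle three bits
  const auto z = static_cast<std::uint8_t>(opcode & 0x7);         // bottom three bits
  const auto p = static_cast<std::uint8_t>(y >> 1);               // top two bits of y
  const auto q = static_cast<std::uint8_t>(y & 0x1);              // bottom bit of y
  return Fields{x, y, z, p, q};
}

// 0x01 is LD BC,nn: x = 0, z = 1, q = 0 (a 16-bit load), and p = 0 selects BC.
static_assert(split(0x01).x == 0 && split(0x01).z == 1 &&
              split(0x01).q == 0 && split(0x01).p == 0);
// 0x09 is ADD HL,BC: same x and z, but q = 1 (a 16-bit add to HL).
static_assert(split(0x09).q == 1 && split(0x09).p == 0);
```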

Increments and decrements are also quite

simple. This is the best one. If x is

two, now bear in mind we've only got two

bits. So there are four possibilities

for x. So having one case for, um,

one value of x means a quarter of all

opcodes decode as an ALU operation, which

we can look up in a table based on y. So

it's add, subtract, exclusive or, um, and

all that good stuff. And then the...

So, there's a

pattern in here that I didn't spot when

I was just doing my case statements. I

kind of could see that there was

something going on, but I thought it

would be far too complicated for me to

work out, and I'm glad someone else did.
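
As a rough sketch of why the x = 2 case is "the best one": a quarter of the opcode space collapses into two small table lookups. The tables list the Z80's ALU operations and 8-bit operands in encoding order; the helper names (`is_alu`, `decode_alu`, `AluOp`) are made up for this example.

```cpp
#include <array>
#include <cstdint>
#include <string_view>

// ALU operations indexed by y, and 8-bit operands indexed by z, in Z80 encoding order.
constexpr std::array<std::string_view, 8> alu_names{
    "ADD", "ADC", "SUB", "SBC", "AND", "XOR", "OR", "CP"};
constexpr std::array<std::string_view, 8> reg_names{
    "B", "C", "D", "E", "H", "L", "(HL)", "A"};

struct AluOp {
  std::string_view operation;
  std::string_view operand;
};

// True when the opcode falls in the x == 2 quarter of the opcode space.
constexpr bool is_alu(std::uint8_t opcode) { return (opcode >> 6) == 2; }

// Only meaningful when is_alu(opcode) holds: y picks the operation, z the operand.
constexpr AluOp decode_alu(std::uint8_t opcode) {
  return AluOp{alu_names[(opcode >> 3) & 0x7], reg_names[opcode & 0x7]};
}

static_assert(is_alu(0xA8) && decode_alu(0xA8).operation == "XOR" &&
              decode_alu(0xA8).operand == "B");
```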

So, I've got a new approach. Okay, good.

Right, we've been on our interlude. I'm

going to go back to my C++ code, right?

Because I'm losing some people here. So,

if I get some C++ code on the screen,

um, so my new approach, I'm going to

make a class. I'm going to accept my

fate. I'm going to accept that templates

are going to be part of my life. I'm

going to make a class for each category

of instruction. So 16-bit loads, one

type of class. Um,

uh, ALU operations, another type of

class. NOP is a type of class. All that

good stuff. For the ones that have

different behavior depending on some of

the degrees of freedom, you know, the

ones that are parameterized, say,

on whether they take the

carry or they don't take the carry,

things like that, the things that came

out of the table, we're going to

templatize that class on them. So I can

specialize the class for its

particular operands. And then a bit

vague now. Somehow I'm going to use the

x, y, and zed bits to select which of

these um classes is the right class to

do the work for this instruction. And

then somehow we're going to build a

compile time table just as we wanted at

the beginning to dispatch to the right

class. All right, this is the trick that

we came up with. And by we, I mean Hana

came up with it and told me I was stupid

for not thinking of it myself, but

that's Hana. Bless her. So this is

what a NOP looks like. We've got two

sort of fields, and they're two static

fields. Really these are just sort of

like namespace-ish kind of objects, but we

need to be able to address them. Um,

we've got a mnemonic class. A mnemonic

is just a string, but you can't have

constexpr std::strings because of

reasons. So this is just a fixed-size

buffer that's the size of the biggest mnemonic

that any opcode has. And we

just sort of copy the bytes in. Um, and

then we have an execute function.
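
A minimal sketch of what such a class could look like, assuming a 16-byte capacity and a placeholder `Cpu` type. None of these definitions are taken from the speaker's project; they only show the "fixed buffer plus two static members" shape.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// A fixed-size, constexpr-friendly string. The capacity of 16 is an assumption.
struct Mnemonic {
  std::array<char, 16> storage{};
  std::size_t length = 0;

  constexpr Mnemonic() = default;
  constexpr explicit Mnemonic(std::string_view text) {
    // Copy the bytes in, truncating rather than overflowing if the text is too long.
    for (; length < text.size() && length < storage.size(); ++length) {
      storage[length] = text[length];
    }
  }
  constexpr std::string_view view() const {
    return std::string_view{storage.data(), length};
  }
};

struct Cpu {};  // placeholder for the machine state

// Two static members: a name for the disassembler and the behaviour for the emulator.
struct Nop {
  static constexpr Mnemonic mnemonic{"NOP"};
  static constexpr void execute(Cpu&) {}  // nothing to do
};

static_assert(Nop::mnemonic.view() == "NOP");
```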

Obviously, NOP is again the easiest

instruction to implement. So, we do

nothing. Fantastic. What about a 16-bit

load? So, this one is a 16-bit load

into one of the four possible 16-bit

registers. And so, we need to get the

mnemonic. And the mnemonic can in

fact use std::string. So, the "LD"s there is

a std::string of a load. We're

going to add in the name of whatever the

P register is. And then we're going to

add in this comma NN, which is just

the placeholder for the number we're

going to be reading in. Um, and then the

result of that is a std::string, which

luckily we can copy into our mnemonic,

and then the std::string dies before

the end of the compilation. So we're

good on that front. We haven't let a

std::string live beyond, uh, compile

time. So far so good. Okay, I can do

some limited amount of string processing

at compile time. Who knew? And then the

execution. Well, again, we can look up

these things in a table. And this is all

constexpr. P is a compile-time constant.

These two tables are constexpr tables

of things. And we're just going to

read the two bytes that correspond to

the 16-bit value and poke them into

the relevant halves of the 16-bit register, which

are known at compile time. So the

compiler will generate the exact right

code. There's no further indirection

inside the compiler. Whereas with my

previous approach, with all of those operands

and things, although the compiler, if the

optimizer really tried hard, could

sometimes, um, constant propagate and work

out that I was actually reading a

particular, uh, register, it

would often

not. Okay, so how do we select which of

these classes is right for a particular

opcode? Right, I promised some C++

type things, right? Some new C++-y things,

new to me, probably not new to most of

you. Um, what I'm going to do is I'm

going to have a templated variable,

which is something I didn't know you

could do. You can template a

variable, and I'm going to specialize it

by instruction category. And that means

I can actually have a variable that is

different, has a different type,

depending on the parameters that are

passed into its template, which blew my mind.
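
A tiny sketch of the idea that one variable template can have a different type per specialization. The `Unknown`, `Nop` and `Halt` types are placeholders, and plain explicit specializations are used here to keep it short; the talk's version selects by category instead.

```cpp
#include <type_traits>

struct Nop {};
struct Halt {};
struct Unknown {};

// Primary variable template: every N gets an Unknown by default.
template <int N>
constexpr Unknown instruction{};

// Specializations may give the same variable name a different type.
template <>
constexpr Nop instruction<0x00>{};
template <>
constexpr Halt instruction<0x76>{};  // 0x76 is HALT on the Z80

static_assert(std::is_same_v<decltype(instruction<0x00>), const Nop>);
static_assert(std::is_same_v<decltype(instruction<0x76>), const Halt>);
static_assert(std::is_same_v<decltype(instruction<0x01>), const Unknown>);
```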

What do I mean? Well, let me talk about

this, right? First of all, we're going

to have a base case. This is a missing

instruction, which says I haven't covered

one of the cases, and I

absolutely have to have all the

instructions covered, otherwise it's

going to blow up. And I've used a new

feature where I said you can't

construct one of these things, and if you

do try and construct it, the constructor

is deleted, but print out this cool

message. I learned that, again, from a

recent C++ Weekly, which said that you

could do this. This is

apparently then, um, here is our base case

for all opcodes. This variable called

instruction is an instance of the

missing instruction. So if you tried to

instantiate this right now, you'd get a

compile-time error saying, "Hey, I don't

know what to do. It's a missing

instruction." And now I can phrase and

specialize each of the individual

instruction types using requires, which

is a cool feature I've discovered.

Again, no surprise to most of you in

this room, but it was new, relatively

new to me, that we can say, hey, if x is

zero and y is zero and zed is zero for

this opcode (which is, you know, this

is just a non-type template parameter,

this opcode thing that has the x, the

y, and the zed), then if you try and

instantiate an instruction with that,

you get a NOP. Fabulous. Similarly, if

you've got x = 0, z = 1, q = 0, it's one of

those 16-bit loads. It is a 16-bit

load, which is itself templatized on the

p value. And this format is exactly

that list of instructions I had on that

website that said this is how to decode

the instructions. I can literally paste

the, um, the text from the

website and put it in the requires

clause and say, hey, this is how you

decode this kind of instruction. So that was

awesome. So now we're going to work out

how to dispatch based on a dynamic value

to this compile time construction. So

what I really want to do again is have

this table and dispatch to it. How do I

write this table? And again this is

where my brain was blown, because, um,

this is how we do it, apparently. And this,

again, sort of raises my kind of

"oh my gosh, I'm just an assembly

programmer" hackles, because this looks

very complicated. But we're going to make

a parameter pack and we're going to

expand it and parameter pack expansion

has magical properties that lets me do

some clever things and let me show you

what I mean. So first of all, how do we

make a parameter pack of all of the

numbers between 0 and 255 inclusive?

Well, obviously we call std::make_index_sequence,

right? That returns a bespoke

type which is essentially a tuple of

zero and then one and then two and

then three and then four and five. So

when you get compiler error messages for

this, incidentally, it's quite a long

name, as it's just got all the numbers

from... So thankfully there's only 256.

I'm ignoring those extended instructions

for this, just for the purposes of making

it simpler. And then we pass an instance

of this class immediately to an

immediately invoked function... lambda,

sorry, immediately... yes, a lambda. And we

don't care about the value of it because

it's not the value that's important.

It's what's in that template list, which

we can unpack into a parameter pack, or

pack, rather, into a parameter pack. So

this has the side effect of turning num

into a parameter pack of 0 comma 1 comma

2 comma 3 comma and so on.

Then we create an array of lambdas. Each

lambda is an instantiation of this code

with a different op code number. Hooray.
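
The pack-expansion trick can be sketched as below, assuming C++20 for the templated lambda. `Cpu`, `execute` and `dispatch_table` are placeholder names; in the real project each entry would run the instruction class selected for that opcode rather than record a number.

```cpp
#include <array>
#include <cstddef>
#include <utility>

struct Cpu {
  std::size_t last_opcode = 0;  // placeholder state
};

// Hypothetical per-opcode implementation.
template <std::size_t Opcode>
void execute(Cpu& cpu) { cpu.last_opcode = Opcode; }

using Handler = void (*)(Cpu&);

// An immediately invoked template lambda. The argument's value is ignored, but
// its type carries 0, 1, ..., 255, which deduction unpacks into the pack Num.
inline constexpr std::array<Handler, 256> dispatch_table =
    []<std::size_t... Num>(std::index_sequence<Num...>) {
      // One captureless lambda per opcode, each converted to a function pointer.
      return std::array<Handler, 256>{
          static_cast<Handler>([](Cpu& cpu) { execute<Num>(cpu); })...};
    }(std::make_index_sequence<256>{});

static_assert(dispatch_table.size() == 256);
```

At run time the emulator loop then only needs `dispatch_table[opcode](cpu)`.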

Finally, we found a way to do this. I

really wanted to use that nice

constexpr, just write regular imperative

code like, you know, normal people would

write. But no, you can't do

metaprogramming in that way. And so I just

have to accept it. Fine. This is what

this is about, right? I'm an old dog.

I've got to learn some tricks. Here's

one. And then you know you just make a

std::array of those and we're done. So,

it wasn't quite as bad as I thought it

was going to be. It's sort of relatively

isolated to this area. And then coming

soon, those of you who are on even newer

pieces of code, this didn't work on

clang 20, which is what I was using. Um,

you can actually do this syntax to

unpack an index sequence directly into a

num, although there's some interesting

caveats with that. So, does it work?

Well, yes, of course it does. Yes.

Hooray. Um, this one, anyone?

What was it?

Oh, no, no, no, no. Yeah. Daley

Thompson's Decathlon. Which,

coincidentally, a few weeks ago I

was in a friend's museum (I live in

America now, and he's got a computer

museum in America) and I found a copy of

Daley Thompson's Decathlon in a drawer

somewhere in his... I don't know how he's

managed to get hold of it. Okay, it

works. Hooray. Let's look at the code

generation which is really what I'm here

for. Right. This is beautiful. I know

you don't appreciate it perhaps in the

same way that I do but this is how to

dispatch one instruction. It's exactly

what you'd want. In fact, it's so small,

I was able to annotate every single line

of it. I'm not going to go through it

here, but it just does the minimum

amount of work needed to work out which

thing to dispatch to and then jumps to

it at the end.

Fabulous. Let's look at one of those op

codes to see that there's nothing

extraneous going on inside of it here.

So, this is exchange AF with AF prime.

So, it's just switching two registers

and it just reads the two and writes the

two out again. No further indirection,

no table lookups, no nothing. This is gorgeous.

How long's the Oh, sorry. I actually

deliberately Yeah, thank you. Jason

asked how long the symbol name was. And

there's a reason why I deliberately left

overflow on. Yeah, this is one of those

things. Oh, hang on. If I can get my

mouse to do the thing. Oh, I'm scrolling

on the wrong window because I'm looking

at my notes. Somewhere in here is a

mouse pointer that can come over. Ah, we

are. Okay, I'm a professional. Yeah, now

I'm going to point something out here.

This is cool, right? So embedded in that

is, like, a mnemonic, one of those

classes that is a compile-time class

that just holds strings. And as of

Clang 20, but not Clang 19, it kind of

makes the guess that if you have an

array, a std::array of chars, it should

print them out as that, which was super

cool because otherwise it was 16 numbers

one after another, which obviously was

not very helpful, or 10. Anyway, thank

you for the distraction. That's cool.

Look, it keeps going.

Yeah. And there's the lambda. Hey,

right. Okay.

So that's exchange AF with AF prime. 16-bit

adds are a bit more complicated. Um, but,

you know, we look

it up in the table. Here's a little bit

of code. Um, it devolves to this.

Just read the two registers, add them

together, write them back out again.
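
A sketch of how a templated 16-bit add might reduce to "read, add, write back" once the register pair is a compile-time constant. The register layout and the names `pair` and `AddHl` are hypothetical, flag handling is left out, and the seven-cycle figure is simply the number quoted in the talk.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical register file; the real project's layout is not shown here.
struct Cpu {
  std::uint16_t bc = 0, de = 0, hl = 0, sp = 0;
  std::uint64_t cycles = 0;
};

// Select a register pair by compile-time index, so no run-time lookup remains.
template <std::size_t P>
constexpr std::uint16_t& pair(Cpu& cpu) {
  static_assert(P < 4, "only four register pairs");
  if constexpr (P == 0) return cpu.bc;
  else if constexpr (P == 1) return cpu.de;
  else if constexpr (P == 2) return cpu.hl;
  else return cpu.sp;
}

template <std::size_t P>
struct AddHl {
  static constexpr void execute(Cpu& cpu) {
    cpu.hl = static_cast<std::uint16_t>(cpu.hl + pair<P>(cpu));  // wraps at 16 bits
    cpu.cycles += 7;  // the figure quoted in the talk
  }
};
```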

Account for seven cycles. I haven't

really talked about time passing, but

that's an important thing to measure as

well. That's cool. Um I'm going to

ignore the flag handling code because

you saw how hideous the uh the C++ code

is. Nothing to see here, but it is

there. But that was the same in the old

implementation, and I have some ideas

about improving it. So let's get on to

build times because this is the thing

that's close to my heart. If I touch the

CPP file or one of the CPP files in this

half of the project, takes me three or

four seconds. I'm happy enough with

that. But realistically speaking, all of

the actual guts of the code is in a

header file now because it needs to be

included in a couple of places. One for

the disassembler, one for the um the

actual running part of the code, and for

the tests. And now suddenly my build has

jumped up to a very variable 11 to sort

of 20 seconds, and it really depended on

what else was going on on my computer,

because each of the compiles was

really big and weighty and required all

those massive instantiations of

templates, and so took up a lot of RAM.

So I noticed some swapping. Again this

is all measured on my computer so it's

not particularly representative but it's

representative of my flow. So not ideal

but also not really as bad as I thought

it was going to be and I'm sure there

are tricks to make it go down. So

maybe I should just not

keep banging on about build times. I

don't know. Performance: 200 times faster

than the Spectrum. Twice as fast as the

last one. It's fabulous. A 650 MHz, um,

computer. That's pretty good, actually,

given that, you know, this is only a, um,

like, one and a half, sorry, one

and a half GHz machine. That's not bad.

Not a bad clip. And in fact I can get it

up to about a gigahertz if I slightly

cheat. Cool. Well, okay. So, we've got an

experimental version two that we can

play with. And the first thing I want to

do is say, can I use modules because

I've heard they're a bit flaky,

but they might be the solution to all my

build woes and my build-time woes.

Right. There's some laughs in here for

the for the recording. Yeah. Um, and

also, anyone know what this game is?

It's not Knight Lore, but it is similar to

Knight Lore. Going... Head Over Heels? No,

again, it's a similar engine. It's

World. Oh, okay. Hey, I have found

something so obscure no one else knows

what it is. That's fantastic. All right.

Um, coincidentally, I was asking Claude

for suggestions for which Spectrum game

best represented each section, and it

picked this

one. Um, because of the sort of

modular look, I suppose, of the

isometric game. All right, so my hopes

here, cleaner separation because that's

what modules are really about, right? Is

to get rid of the sort of the

pre-processor and all the hell that

comes with it. And then obviously my

secret hope is a faster build. John

Lakos is in the front row here, and

he knows: early on, I started

a business on the basis of his book, um,

Large-Scale C++ Software Design, because it brings

down build times generally, and that was

fantastic. And ever since then I've wanted my

build times to be as low as possible, and

I'm hoping modules will do this for me,

but we'll see.

Okay, so this is what a module looks

like. So, we start with a big line that

says module. And the cool

thing about this is that you can't put

any pre-processor nonsense before the

word module or it doesn't work. So,

unfortunately, I can't conditionally

make something a module or not. So, I

have to either be fully module or not

module or do something else. And I chose

to do something else which I regret, but

we'll come back to

that. There's a lot of regret

in this talk. Okay, we sort of say what

we're exporting, and you can export a

module. You can also export a partition

of a module. And for what it's worth, I

am not trying to represent myself as an

expert at this at all. This is me trying

to get it to work and then randomly

posting on Bluesky and getting people

to help me until it worked. Um,

importing is so much nicer. I can just

import all of

std. No more remembering whether it's

algorithm, or is it, uh, numeric, or

is it, you know, whatever, ranges. Um,

you can also import local partitions

within your own module and say, "Hey,

there's another part of my little

localized area that is supporting the

mnemonic stuff. I'm going to import

that here as well." It's sort of the

equivalent of like including a local

header. And then you just put the magic

keyword export in front of things that

you want to be visible to the outside

world. Fabulous. And then the rest of

the code, you can kind of write it like

it's Java. I hate to say that at this

conference, but no, I don't think

anyone's done a talk on Java, but

anyway. Um, you just write it out

in the header. And I suppose, again, to

sort of explain myself here, most of you

all probably write stuff in headers

because, you know, you like header-only

library type things. I abhor them,

unfortunately, and I try and hide my

terrible, awful code in the CPP file, away from the

compiler. But this was kind of nice

because I didn't have to keep making the

decision: is this something that should

go in the header, or is it something that

should go in the CPP? It's just all in

there in a big blob. Brilliant.

Build support was surprisingly good. I

have a contemporary CMake and I had, you

know, Clang 19, and then I moved to Clang

20 during the middle of this project.

You just say, "Hey, treat this set of files as

CXX modules," and then I chose to call

them all CPPM files, for C++ modules.

That seems to work. Um, the files

are there, and I've got this module.cppm,

which is where I kind of import

the local parts and then re-export the

whole module as a whole to the outside

world, but with only the bits that I

want the outside world to see, outside

world for this component that is, right?

Hooray. And now unfortunately uh the the

bright and hopeful voice goes away and

we start talking about what actually

happened rather than what should have

happened. Import std didn't work. It

was lovely. There was this, um, ridiculous,

uh, SHA-256 value: I had to set some

mystical value inside CMake to make

import std work. And it did, in its

defense. It did work, inasmuch as I could

do import std. But if I tried to use

any other library that itself pulled in

standard headers, I would get not the

clashes you would expect, like, hey,

strlen is defined in two places, but some

unusual ABI tags would get noted as

being different. Like, it said, you know,

underscore V2 ABI tag is duplicate

defined. I'm like, I'm thinking ODR

panic. Oh my gosh, this is going to go

horribly wrong. And I asked around and

the received wisdom was, oh yeah, they

need to fix that. And I'm like, ah. I

also noticed that,

um, because I have, uh, link-time

optimization turned on for my release builds

(because I do tend to hide all of my code

away from the compiler in my CPP files,

but I do like inlining, so I'm kind of

torn), um, there was a weird, um, thing

where the import std and LTO just

fought each other and so every time I

was building a release build, it would

just rebuild everything from scratch

every time. I'm like just no, don't do

that. Don't do that. So, I couldn't get

it to work and I eventually turned it

off. And then I was kind of telling a

fib earlier. Um, this format over here

is not actually what I went with. What I

did is I had line 12 be hash include the

CPP file and then I could keep both the

CPP file and the CPPM file and have a

dual build. But then there were all sorts

of other horrible hash ifdefs I had

to do to stop me from including... It

wasn't the best choice. So in

fact after I wrote all these slides I

went away and made a branch and I did

modules kind of quote properly and I

still had problems and all the the

things you're about to see in the next

slide this um happened in that branch.

So although if you go to look at the

project now it's on GitHub it has that

horrible hash include hack um this is

what I saw when I did it quote properly

and that is the surprise to me right if

you have two components that share a

header file inside your own project then

if you make a change to that header file

the two files that include it will build

in parallel right they can both include

the header file they can both go off and

do their work it means they're probably

doing the same work twice because they

both have to parse that header file and

do some work with it, right? That's a

pain, but at least I can parallelize

it. If you're using modules, and again,

hopefully someone can correct me on

this, but if you're using modules, I

can't even import the other module until

it has been built, which means that it

runs and it compiles and then it then

and only then can I import from it and

see which symbols it actually has in it

and then use them in my compile. And so

what I found was, if I dump out the

build graph, as in, like, what happens if

I touch this file, in the

non-module case it was like clusters of

like these four files build together

then we link them and then these four

files build together separate somewhere

else and then we link them and then they

use each other's headers and then those

get linked together you know like the

usual pattern you're expecting the

header files are like the sort of like

the layer of separation. With modules, what it meant

is essentially everything was linear.

It's like, oh, this project needs all

these things and all of them need to

build individually first before they can

they can include each other and then

they get linked together and now finally

we can start using things that include

the headers. Does that make sense?

Sorry, I realize I went a bit gabbly there, but it serializes the build, and maybe I'm doing it wrong. And build times went up by 30 seconds. That was not what I was promised. You know, for reference, remember it was 10 seconds before and now it's like 40 seconds. That's no good. And a clean build, which I didn't show on the other ones, is like 2 minutes versus 1 minute. It was a disappointment.

So overall, I loved the modularity. I got so many, not ODR, but, like, fly-by includes discovered. I was using size_t instead of std::size_t in a bunch of places. I was using all sorts of headers. It was lovely for just getting that sense of: I'm definitely importing the right things and I'm using them from the right namespaces. Lovely.
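As a small illustration of the kind of fly-by include mentioned here: unqualified size_t often compiles only because some other header happens to drag it into the global namespace. The sketch below is not from the talk's codebase; the helper name count_pixels is made up. It includes the header that declares the type and uses the qualified name.

```cpp
#include <cstddef>  // declares std::size_t directly instead of relying on a transitive include

// Hypothetical helper for illustration: number of pixels in a width-by-height screen.
constexpr std::size_t count_pixels(std::size_t width, std::size_t height) {
    return width * height;
}

// Checked at compile time for the Spectrum's 256 by 192 display.
static_assert(count_pixels(256, 192) == 49152);
```

With headers, the unqualified spelling can silently keep working; the talk's point is that moving to modules made this kind of accidental dependency show up.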

Love it. Brilliant. Probably more experience is needed to be able to do this properly, and certainly with the linearized build, maybe I'm just doing it wrong. It's quite likely. Tooling would help. A lot of the tools didn't work all that well with modules. clang-tidy, for example, seems not to like it. We need some best practices. Luckily, I think there are people in this room who might be able to write some once things have settled down. But I do think it's close, right? I was expecting segfaulting compilers, or it just not working at all. I got it to work pretty quickly, and it mostly did what it said. It just needs to be better. And I think that's a great place to be.

So let's talk about coroutines. I wish to heck that I had been to this conference before I prepared this talk rather than afterwards, because there were three separate talks on coroutines two days ago, and I would have done a much better job of it if I'd been to those talks first. So take this with a pinch of salt, and speak to Phil or one of the other presenters about coroutines, because I think I'm doing it wrong. In fact, I know I'm doing it wrong now. But you can learn something from my mistakes. That's kind of why you're here, right? That's Head Over Heels, by the way. Yes, it says it, sorry, in big letters.

So why would I want to use coroutines in an emulator? What on earth is the point of that? Coroutines are for, like, concurrency or I/O or threading, something something. Well, first of all, they're not just for threading. They're for things that are running at the same time. And if you're emulating something accurately, you're not just emulating the Z80. I know I showed a picture earlier where I said I run the Z80 for a 50th of a second and then I put the picture on the screen. That isn't how it works really, because video game developers (I'm going to stare at the video game developer again) like to play tricks like: hey, if I change this color while the picture's being physically painted, I can make it look like there are more colors available than there are on the real computer. So if I emulate them separately, I won't see these effects that you get when game programmers use cool tricks.

So you actually want to do something like this: here, when I'm reading memory, it takes three cycles, so I pass time. When I did that add, the one 16-bit op, it takes seven cycles to do an add. And then what does the Z80's "pass time" look like? It looks like this: hey, just account for seven cycles, and then tell all of the bits of hardware, "do whatever you do in seven cycles, please." So the ULA, the video chip, the audio chip, the tape system when you're emulating a tape: you know, do some work. Do seven cycles' worth of work.
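The "pass time" idea described here can be sketched roughly as follows. This is a minimal illustration, not the talk's actual code: the names Peripheral, CycleCounter, Machine, run_for and pass_time are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical interface: any bit of hardware that must be told time has passed.
struct Peripheral {
    virtual ~Peripheral() = default;
    virtual void run_for(std::size_t cycles) = 0;
};

// The easy case: a device that is nothing more than a counter.
struct CycleCounter final : Peripheral {
    std::size_t total = 0;
    void run_for(std::size_t cycles) override { total += cycles; }
};

// Hypothetical CPU-side bookkeeping: account for the cycles, then fan them out.
class Machine {
public:
    void attach(Peripheral& p) { peripherals_.push_back(&p); }

    void pass_time(std::size_t cycles) {
        now_ += cycles;
        for (Peripheral* p : peripherals_) p->run_for(cycles);
    }

    std::size_t now() const { return now_; }

private:
    std::size_t now_ = 0;
    std::vector<Peripheral*> peripherals_;  // non-owning; devices must outlive the Machine
};
```

In this sketch a three-cycle memory read would be pass_time(3) and the seven-cycle add pass_time(7); every attached device is simply handed the same cycle count.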

That's easy if all you are is a counter: you just bump your number up by that. But if you're a sort of stateful system, I don't know, like a video chip, then what you end up writing is a horrible state machine where you have to manually say: well, if I'm in the bit where we're at the top border, then I need to write out some things. But if we get beyond some point, if this many cycles have passed, now I'm moving to the next state, which is: now I'm doing the left-hand border. Until eventually I do the bit where the screen is, and it's just a pain. It's a pain.
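For illustration, a hand-written state machine of the kind being complained about might look like the sketch below. It is a toy under stated assumptions: the class name, the three phases and the phase lengths are invented for the example and are not real ZX Spectrum timings.

```cpp
#include <cstddef>

// Hand-written state machine for a toy video chip.
// Phase lengths are invented for illustration, NOT real ZX Spectrum timings.
class VideoStateMachine {
public:
    enum class Phase { TopBorder, Screen, BottomBorder };

    static constexpr std::size_t kTopBorderCycles = 8;
    static constexpr std::size_t kScreenCycles = 16;
    static constexpr std::size_t kBottomBorderCycles = 8;

    // Called by the CPU side: "do this many cycles' worth of work".
    void run_for(std::size_t cycles) {
        for (std::size_t i = 0; i < cycles; ++i) step();
    }

    Phase phase() const { return phase_; }
    std::size_t border_pixels() const { return border_pixels_; }
    std::size_t screen_pixels() const { return screen_pixels_; }

private:
    // Every cycle has to rediscover where it was and decide whether to move on.
    void step() {
        switch (phase_) {
        case Phase::TopBorder:
            ++border_pixels_;
            if (++in_phase_ == kTopBorderCycles) enter(Phase::Screen);
            break;
        case Phase::Screen:
            ++screen_pixels_;
            if (++in_phase_ == kScreenCycles) enter(Phase::BottomBorder);
            break;
        case Phase::BottomBorder:
            ++border_pixels_;
            if (++in_phase_ == kBottomBorderCycles) enter(Phase::TopBorder);
            break;
        }
    }

    void enter(Phase next) {
        phase_ = next;
        in_phase_ = 0;
    }

    Phase phase_ = Phase::TopBorder;
    std::size_t in_phase_ = 0;       // cycles spent in the current phase
    std::size_t border_pixels_ = 0;
    std::size_t screen_pixels_ = 0;
};
```

Even with only three phases, the position within the frame has to be stored explicitly and re-examined on every cycle, which is the bookkeeping the talk calls a pain.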

And realistically, the Z80 should work this way too. It's just that I'm using the Z80 as the thing that primarily drives the rest of the system. You could imagine, if I had two Z80s, if this was a multiprocessing system: which Z80 is in charge, and which Z80 is told, "can you just run one cycle?" How do you run one cycle of a seven-cycle add? You have to write everything as a giant state machine, right? It would be really awkward. But that's what coroutines are for, right?

How much nicer if I could just say: my video chip, like the physical video chip, runs all the time, right? As soon as it's got power, that thing's doing what it does. And what it does is it starts at the top of the screen, and every cycle, for some number of cycles, it waits for the next clock cycle, right, the rising edge of the clock, and it writes out one pixel of the border color. Whatever the border color is currently, I'm going to write it out.

And then I'm going to go around in my loop. And then for the next 192 lines, the whole 192 lines of the screen... Yes, the whole screen is 256 by 192, I forgot to say, which is about the same as a capital W on an iPhone screen right now. How far we've come. I was actually speaking to someone in the pub and they were like, "Oh, I was doing a presentation on emulators and I kept trying to find a high-resolution screenshot to put in my presentation, and then I realized, oh no, they were all that size." You know, like, yeah, that is actually the right size.

So, you know, you can write it this way, and we just want to write "wait for the right amount of time", let the other things in the system do their thing until it's time to come back to me, and let the compiler do all the hard work for me. And so coroutines, and again, no expert, obviously no expert, I think by now you've worked out my M.O.: any routine that has any of the co_ statements, co_await, co_yield, co_return, gets magically transformed through a very arcane and complicated but very powerful process. We're not going to go into all these details; go watch one of the talks from this very conference on it. But I like to think of it as a bit like a lambda. You know how when you make a lambda, behind the scenes some structure is being made with a call operator, and any of the captures get copied into that lambda and held as variables, and then you just call the call operator and magic happens there.
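The lambda analogy can be made concrete with a small sketch. The names make_adder_lambda and AdderClosure are invented for the example; the hand-written class is only roughly what a compiler generates, not its exact output.

```cpp
// A lambda with one by-value capture...
auto make_adder_lambda(int offset) {
    return [offset](int value) { return value + offset; };
}

// ...behaves roughly like this hand-written class: the capture becomes a
// member variable and the lambda body becomes the call operator.
class AdderClosure {
public:
    explicit AdderClosure(int offset) : offset_(offset) {}
    int operator()(int value) const { return value + offset_; }

private:
    int offset_;  // copy of the captured variable
};
```

Calling make_adder_lambda(3)(4) and AdderClosure{3}(4) go through the same steps: construct an object holding the captured state, then invoke its call operator.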

Right? This is like that on steroids, where what happens is: each time a co_ magical operation happens, the code between that and either the top of the function or the previous co_ operation is kind of broken into its own sub-function. And then in the state, the lambda-like state, we hold all of the local variables that are shared between everything, and then we have like a pointer to which function is the function we would resume at if we were to come back to this coroutine. And we start off with some initial state, although that can be configured. Everything can be configured; that's what makes them awesome. Um, yes, that was a Freudian slip, should we say.

And so the init will do everything from the top of the function and set all the variables to their initial values, and then say "let's go to step one", and then it returns some magical thing with which you can control when it will get scheduled again, when it'll get resumed. Step one looks like this. This is the top of that bit where the border was happening. We wait that many cycles, and then when we reach the right point, we set the Y value to zero, which is now the Y at the top of the screen. We go to the next step.

I think you get the idea, right? Some magic has happened behind the scenes, and it's written the state machine for us. That's what I want from it. And that works beautifully. It works great. There is a little bit of overhead when we first call the function. Where are we? Right at the beginning here: when you call this video run and it spots the co_await, some machinery comes in, and all that state that we were looking at, that state object, gets allocated somewhere. So there's a bit of an allocation somewhere. Sometimes the compiler can elide them, but in the situations that I'm using here it won't be able to, because they're very, very long-lived. So that was a pain.

Oh, get out of there. The Z80 now would look something like this. You know, this is what the main loop looks like. I didn't show this before, but, you know, we do a... Oops, hang on, haven't done the things. So what we're going to do is say: reading from memory takes some time, so I have to co_await it. And that will effectively deschedule the Z80 at this point, let the video chip do three cycles of drawing pixels, and then it'll come back to me and go, "Hey, three cycles have passed. Read the memory now." And then I need to dispatch off to my opcode. And this is where the problem starts, right? Because this function here, the one from the instructions table for the opcode, also will need to read memory and let time pass. So it kind of needs to be a coroutine as well.

But I don't want to create an allocation here to do that. And it can't live on the stack, because I know I'm going to be yielding to God knows how many other processes. So there's going to be an allocation here. And so what I really want to be able to do is, you know, in our execute routine, I need to be able to do this co_await again. And I couldn't work out how to do that without causing an allocation pretty much every cycle of the machine, which is, you know, a hundred thousand, no, whatever, 100 megahertz. Yeah, a hundred million times a second, which doesn't sound like something that's very feasible.

Now, I could probably do something clever with allocators, but there's a lot of stuff going on there. And so this was a bit of a shame, and I basically gave up at this point, I'll be honest. And during this last week, every evening, I've been trying to go back to it, having channeled what I've learned from Phil and co. And I haven't been able to resurrect this and get it to work in the way that I want it to work, right, and the way it should work.

And I actually cheated, and I asked on a forum. Somebody had a C++ coroutine emulator of the Game Boy, and they sent me the link to their GitHub repository, and I'm like, oh, how are you doing this? It looks like it's doing what I'm doing. And every function call that it was making, you know, like "handle read", "handle add", "handle whatever", was a macro that was #defined. And so effectively it was just one big coroutine, the whole loop. Every single function call wasn't a function call; it was an inlined macro, which then meant that everything was living in the same coroutine. And that was kind of, oh, you know, I don't want to take on coroutines but have to use macros. That seems like the wrong choice there. So I'm going to have to come back to this.

Right: coroutines are awesome. There is a learning curve. Maybe they're too complex; I don't know that they're too complex. They're about as complex as they need to be. Obviously there are libraries on top of this that you could probably use. I've got a lot to learn.

Okay, we're doing pretty badly on time; I'm going to have to speed up a little bit. So, future directions. Where can I go with this? So I've done coroutines, we did concepts and constexpr, and I've shown you how terrible a programmer I am and how I shouldn't be allowed to write code anymore. What would be really lovely is to be able to take games and just play them wherever I am, right? Oops. Let me go back here. Whoops. Oh, I've given the game away now.

Bollocks. There we go. And so I'd like to be able to play my game in my slides. I don't want to have to run a program like I was running before. I want to be able to play Jetpac in a web browser inside my slides, because, you know, all my other emulators are in a web browser. So I thought to myself, how could I possibly... Oh, it's so good. I love this game. This is actually probably a... All right. This is actually what I was hoping to be able to do. This is my ultra stretch goal. And yes, someone from Rare is here. So yeah, please don't get me in trouble with this one. You can take the picture, but please don't come after me. I have got the cassette tape somewhere, I promise, for the purposes of the legal people.

So yes, I wanted to get it working in the web browser, and I obviously have, which is great. And it was a lot easier than I thought it was going to be. WebAssembly isn't that hard, especially when you have a very controlled environment where effectively the entire emulator is an exercise in a 64K buffer making changes to itself over and over again, right? That's all the CPU is doing, to a buffer of numbers. So there aren't really many things from the outside world that need to come in. I just need to be able to take the picture out occasionally and put the keyboard presses in.

So you have to grab something called WASI, which is the WebAssembly System Interface. It's kind of like the operating system that you're cross-compiling to, because you can't compile for a regular C++ library there; no such thing exists for JavaScript, which is what we're going to be embedding in. And then you need the component on the JavaScript side that pretends to be the operating system, and it says: hey, this big area of memory, I can malloc and free it for you by handing different amounts of this RAM to you. And if you need to be able to read and write files, then you call these functions, and it goes out into JavaScript world, and JavaScript does whatever JavaScript does. And then the build settings are as simple as this: I just said target wasm32-wasi, and I pointed it at where I got the system root for the cross compilation, and it just worked, by and large.

There are a couple of caveats, which you can talk to me about afterwards. There's some wiring you have to do. Everything's C. In one of the interop conversations we went to earlier, we talked about this: of course everything's C. There's some magical Clang stuff that you put in to say how to export things. Everything is a number on the remote side, because all you're really doing is looking through a window onto a computer's RAM, this virtual machine that your WebAssembly runs in. So everything's virtual machines and RAM blocks. You know, there's some magic in here on the JavaScript side; you don't need to worry about this. I wrapped it in a JavaScript class. So I think this is the only JavaScript that's been shown at this conference, maybe. Oh, no, there was, we had, yeah, we had one.

Yeah, exactly. And so all we're really doing is a little bit of very light interoperability between the two, where we're passing the this pointer in to a function. And one of the things we can do is form an array over a subsection of the RAM. So I can call a function in the C++ code that fills in a std::vector with all of the RGB values, and then I can map that on the JavaScript side as a Uint8Array and pass it to the blit function to draw it to the screen. So that's how we can draw it to the screen. And when it crashes, you get assembly in your web browser. How cool is that? I couldn't not have a slide with assembly on it from a web browser.
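The export-and-map step described a moment ago can be sketched on the C++ side roughly as below. This is a guess at the general shape, not the project's code: the macro EMU_EXPORT, the functions render_frame and frame_size, and the RGBA layout are assumptions. Clang's export_name attribute is specific to WebAssembly targets, so the sketch compiles it out elsewhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// export_name is a Clang attribute for WebAssembly targets only; on any other
// target the (hypothetical) macro expands to nothing.
#if defined(__wasm__)
#define EMU_EXPORT(name) __attribute__((export_name(name)))
#else
#define EMU_EXPORT(name)
#endif

namespace {
constexpr std::size_t kWidth = 256;
constexpr std::size_t kHeight = 192;
constexpr std::size_t kBytesPerPixel = 4;  // RGBA, an assumption for this sketch
std::vector<std::uint8_t> g_frame(kWidth * kHeight * kBytesPerPixel);
}  // namespace

// C linkage keeps the exported symbols unmangled; only numbers cross the boundary.
extern "C" {

// Fills the frame with a solid color and returns its address. Seen from
// JavaScript, that address is just an offset into the module's linear memory.
EMU_EXPORT("render_frame")
const std::uint8_t* render_frame(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    for (std::size_t i = 0; i < g_frame.size(); i += kBytesPerPixel) {
        g_frame[i] = r;
        g_frame[i + 1] = g;
        g_frame[i + 2] = b;
        g_frame[i + 3] = 255;  // opaque alpha
    }
    return g_frame.data();
}

// Number of bytes in the frame, so the caller knows how large a view to create.
EMU_EXPORT("frame_size")
std::size_t frame_size() { return g_frame.size(); }

}  // extern "C"
```

On the JavaScript side, the returned number and the size would be used to construct a Uint8Array view over the module's exported memory buffer, which is then handed to whatever draws the pixels.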

But this is not mine. This is built into Chrome, right? So this is what you get. It's a really weird thing. It's a lot of fun. It runs at about 100 megahertz, so it's not as fast as even the naive version. I was using the v2 version for this. It is live at spectrum.xania.org if you want to go and play around with it. And clearly C++ is the future of web development.

All right, the real future directions. The performance can be made better. That is what I love doing; that's really my happy place, making things go faster. There are some really neat tricks involving computed gotos in regular emulators, to do with how the branch predictor fits into all this, which I definitely don't have time to go into. I'd love to support more of the Spectrum family, and moreover I'd really like to get coroutines working. I think there has to be a way, and it seems so natural, and certainly it would allow me to add more peripherals more easily without writing horrible state machines myself.

Super stretch goal would be to get just-in-time compilation: turning the Z80 into Intel x86, and then just remembering the Intel x86 and just calling it over and over again. Turns out to be spectacularly hard because, back in the 8-bit days, almost everyone used self-modifying code. So the code would change all the time. I know I did.
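To make the self-modifying-code problem concrete, here is a minimal sketch (an assumption about how one might approach it, not anything from the talk): a cache of translated blocks that has to be checked on every guest memory write, so that stale translations are thrown away. `TranslatedBlock`, `BlockCache` and `on_write` are hypothetical names, and the block holds no real generated code.

```cpp
// Sketch only: bookkeeping for a hypothetical JIT's translation cache.
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Stand-in for a chunk of generated host code.
struct TranslatedBlock {
    std::uint16_t start;   // first guest address covered
    std::uint16_t length;  // number of guest bytes that were translated
};

class BlockCache {
public:
    void insert(std::uint16_t start, std::uint16_t length) {
        blocks_[start] = TranslatedBlock{start, length};
    }

    bool contains(std::uint16_t start) const { return blocks_.count(start) != 0; }
    std::size_t size() const { return blocks_.size(); }

    // Must be called on every guest write: drops any block covering `address`.
    // A linear scan is far too slow for real use; it keeps the sketch short.
    void on_write(std::uint16_t address) {
        for (auto it = blocks_.begin(); it != blocks_.end();) {
            const TranslatedBlock& block = it->second;
            const std::uint32_t end = std::uint32_t{block.start} + block.length;
            const bool covers = address >= block.start && address < end;
            if (covers) {
                it = blocks_.erase(it);
            } else {
                ++it;
            }
        }
    }

private:
    std::unordered_map<std::uint16_t, TranslatedBlock> blocks_;
};
```

If the guest rewrites its own code constantly, a cache like this spends its time invalidating and retranslating, which is one way to see why the speaker calls the JIT a super stretch goal.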

Um, all right. These are the extra bonus slides I had to put in because Daisy. So Daisy nerd-sniped me into thinking, when she was showing how the Claude command line tool was able to make changes to the Clang codebase. So I thought, well, AI is not coming for my job.

So I said this to Claude. This is a shortened version: "This is a part of a presentation I'm doing on modern C++. I'd love to demonstrate something cool you could do in my codebase. A suggestion would be to add tests. What would you suggest?" You know, I thought I would give it a really lowball idea about adding some tests to a project. That was one of the things that she said it would be good at, and it was. So it went off and it ruminated on my codebase, and it asked some questions, and I went back and forth, and then it came back with this. And, you know, I said "think harder", right? We learned that that was the hack, the magic hack.

This is what Claude came back with. "Based on my analysis": real-time ray tracing AI. It's no tests. "I'm not writing tests." Even the AI doesn't want to write tests. C++26 reflection-based game state debugger. That would be cool. "But my top recommendation would be": interactive Spectrum memory heat map visualizer. This is a précised version. It was a lot more involved than that. It was so much fun, right? So I said, "All right, go on then." And it bloody did. It just did.

And I'm going to very quickly try and show you what it did here. It put it behind a command line flag, so it's not on all the time. It also wrote me a README. And I keep moving the mouse pointer the wrong way. Let's go over here.

So there's an overlay over the screen now. And the hotness, as in how often a particular memory location is being either read or written to, is superimposed over it. So the whole 64K is kind of superimposed. You can see the red areas are where it is currently reading and writing. And it even put in some keys so I can control it, and it gave me three different color schemes, which I'm cycling through. So there's spectrum, there's grayscale, and then there's heat, which aren't showing particularly well here, and I'm just making a pig's ear of demoing it. But you can definitely see some things going on here. These red marks are, you know, maybe some important variables that the thing has been using all the time. I don't know. It's a way of kind of getting a sense about what's going on.
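The overlay described here can be approximated with very little state. The following is a minimal sketch under my own assumptions (it is not the code Claude generated, which the talk does not show): one saturating counter per address in the 64K space, bumped on each read or write, halved once per frame so old activity fades, and mapped to a red intensity. `HeatMap`, `touch`, `decay` and the bump value of 64 are all invented for the example.

```cpp
// Sketch only: a per-address access heat map for a 64K address space.
#include <array>
#include <cstddef>
#include <cstdint>

struct Rgb {
    std::uint8_t r, g, b;
};

class HeatMap {
public:
    // Record one read or write; heat saturates at 255.
    void touch(std::uint16_t address) {
        std::uint8_t& heat = heat_[address];
        heat = static_cast<std::uint8_t>(heat > 255 - kBump ? 255 : heat + kBump);
    }

    // Call once per frame so that locations no longer in use cool down.
    void decay() {
        for (std::uint8_t& heat : heat_) {
            heat = static_cast<std::uint8_t>(heat / 2);
        }
    }

    std::uint8_t heat_at(std::uint16_t address) const { return heat_[address]; }

    // One possible colour scheme: black when cold, pure red when hot.
    Rgb colour_at(std::uint16_t address) const {
        return Rgb{heat_[address], 0, 0};
    }

private:
    static constexpr int kBump = 64;          // heat added per access (arbitrary)
    std::array<std::uint8_t, 65536> heat_{};  // one counter per guest address
};
```

An emulator would call `touch` from its memory read and write paths and blend `colour_at` over the display, with the grayscale and spectrum schemes being alternative mappings from the same counters.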

Again, I got it to compile last night and I thought, "Oh, sugar. Now I need to write some slides about this and I don't really know what it's doing." But that might actually be a cool thing to add, and, like, it did it all itself. And so I am mildly worried for my job, but not that much. Who was going to stand up in front of you all and tell you about it?

Eh, all right. Let me go click. So, in conclusion, as I am very much over: is there hope for me as an old dog? I like to think so. I like to think so.

So, what are my takeaways? What am I going to say to you? What am I going to exhort you to do as you leave this room and go home? Well, the main thing that I learned, as a sort of 30ish-year veteran of doing this, is: I felt frustrated. I felt annoyed. I thought it was stupid. I thought the standards committee have got no idea what they're doing, everything's too difficult. And then I remembered: learning is uncomfortable. That's what learning is, right? You forget, if you're a senior person like myself who's been doing this a long time, a lot of things come easily and you kind of get used to that. You get in the groove of, like, yeah, I just got to knock out a class, I'm going to do a test, everything's easy. And then something turns out really hard and you think, "This is dumb." And you go, "No, learn how to do it. Spend some time with it. Get frustrated. Come back to it a week later and go, hey, that wasn't half as bad as I thought." You know, that's definitely what I've learned from this, and it's been so valuable. It's the reset I needed. Challenge your assumptions. Maybe 10 seconds of build time is perfectly reasonable, Matt. Maybe.

Hopefully, you realize that I love this, right? Do things that bring you joy. I mean, I'm so lucky to have a time like this to be able to do this kind of thing. And I'm so lucky that I can write a program that is fun for me at every level. Well, even when it was frustrating, even when it took me six weeks instead of the three days I originally thought, because software engineers just can't estimate time.

And then, you know, even though Claude won't do it for you, testing is worth it, right? One thing I didn't say is the V2 version that I did worked first time once all the tests passed. As in, I had my suite of tests, and it would tell me, "Oh, no, the ALU is broken." I'd fix that. Okay, now the ADD isn't working right. Okay, that's the ADD. Okay, now this instruction is working. Okay. Oh, all the tests pass. Cool. All right, throw Manic Miner in it. Oh, and off we went. It was just a joy. So, write tests, they're good. We know that, right? But it's kind of nice to be reminded of that.
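For a flavour of the kind of ALU test being described (the talk does not show its test suite at this point, so this is only a sketch), here is a small 8-bit add with the documented Z80 flag bits and a few checks. `add8` and `AddResult` are hypothetical names, and the undocumented flag bits 3 and 5 are deliberately left out.

```cpp
// Sketch only: an 8-bit ADD with Z80-style documented flags, plus checks.
#include <cstdint>

struct AddResult {
    std::uint8_t value;
    std::uint8_t flags;
};

// Documented Z80 flag bit positions (N is always reset by ADD).
constexpr std::uint8_t kCarry = 0x01;
constexpr std::uint8_t kOverflow = 0x04;
constexpr std::uint8_t kHalfCarry = 0x10;
constexpr std::uint8_t kZero = 0x40;
constexpr std::uint8_t kSign = 0x80;

constexpr AddResult add8(std::uint8_t a, std::uint8_t b) {
    const unsigned wide = unsigned{a} + unsigned{b};
    const auto result = static_cast<std::uint8_t>(wide);
    std::uint8_t flags = 0;
    if (wide > 0xff) flags |= kCarry;                           // carry out of bit 7
    if (((a & 0x0f) + (b & 0x0f)) > 0x0f) flags |= kHalfCarry;  // carry out of bit 3
    // Signed overflow: operands share a sign that the result does not.
    if ((~(a ^ b) & (a ^ result)) & 0x80) flags |= kOverflow;
    if (result == 0) flags |= kZero;
    if (result & 0x80) flags |= kSign;
    return AddResult{result, flags};
}

// Because add8 is constexpr, some tests can run at compile time.
static_assert(add8(0x01, 0x01).value == 0x02, "1 + 1 == 2");
static_assert(add8(0x7f, 0x01).flags == (kHalfCarry | kOverflow | kSign),
              "0x7f + 1 overflows into the sign bit");
```

When a check like this fails, it points at one instruction's flag logic, which is the "the ADD isn't working right" experience described above.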

So, let's quickly revisit my preconceptions in the no time I have left. Build times: they did get worse. Does it matter? Probably not. No. Maybe in bigger projects we need to think about this. And I'm hoping that the people in this room will sort out the modules problems and will continue to give good advice about how to properly segment parts of your codebase so that the build times in one area don't necessarily impact the other parts.

Bad error messages: no, they're all great. Compilers have gotten so much better. You know, this looks bad, but it's telling me exactly which constraint wasn't working, and it's pointing at the exact thing that isn't fitting that. And like we saw in the earlier part, even the stupid symbol names, where I'm packing names into types, it's now showing the name inside the type. So that was great. Now I'm showing it there, right?

Modules and coroutines. What do I think about them? Modules: 2025 might be the year of the module. It might be. Coroutines are ready. Go and use them. If they make sense for you, go and use them. There are plenty of good libraries out there that I haven't shown, but they make coroutines tractable. Things like just a generator, if you can use it, that just works, is great. Are they too complicated? Maybe. Or maybe that's just this old dog going back to his old ways.

Tooling wasn't that great. This is the otherwise excellent CLion getting very confused about all the things that I was doing when I kept switching backwards and forwards between the module view and the non-module view. So the tooling support wasn't great. clangd would keep telling me that things were uninitialized on the line that was initializing them. And I'm like, no, it's there.

Um, build times, obviously I'm going to complain about them forever. There was just a woeful lack of any kind of support for compile-time string manipulation. You have to kind of roll your own, and that's kind of annoying, but it was not the end of the world. At one stage I resorted to turning a constant offset into an ASCII value by adding it to '0', which is, you know, like, I just want to use std::format here, please. So std::format would be nice.
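The "add it to '0'" workaround looks roughly like the sketch below. This is an assumed reconstruction, not the talk's code: `digit_to_ascii` and `offset_suffix` are hypothetical helpers, and the "+5" style suffix is just an example of a compile-time string fragment. It only works for single digits, which is part of why it feels like a hack.

```cpp
// Sketch only: building a tiny string at compile time without std::format.
#include <array>

// Convert one decimal digit (0 to 9) to its character.
// C++ guarantees that '0' through '9' are contiguous and in order.
constexpr char digit_to_ascii(int digit) {
    return static_cast<char>('0' + digit);
}

// Hypothetical use: a NUL-terminated suffix such as "+5" for a mnemonic.
constexpr std::array<char, 3> offset_suffix(int offset) {
    return {'+', digit_to_ascii(offset), '\0'};
}

constexpr auto kExample = offset_suffix(5);
static_assert(digit_to_ascii(7) == '7', "single digits map to their characters");
static_assert(kExample[1] == '5', "suffix carries the digit");
```

Anything with more than one digit needs a hand-written loop over the decimal places, which is the roll-your-own work being complained about.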

Uh, I did like the performance, and actually that version two thing was great. Phrasing it in that elegant way, where the requires clause matches exactly what the spec said, was just beautiful, and knowing that it all just worked. The Java-lite modules were good, and of course I loved learning new things.

So we get to the thanks slide. Huge thanks to Hana Dusíková, who was really the co-author of all of the intelligent parts of this presentation. Thanks to everyone else. I hope you like your suitably awful Spectrum renditions of your photographs and whatnot. In particular, the Compiler Explorer. If you've noticed how rubbish it's been for the last six weeks, it's because I've been doing this and not actually... I mean, I can feel my phone vibrating in my pocket as I'm being texted about more things that are wrong with it, but never mind. So, as the end of this presentation: go and build something cool and learn something from it. Thank you very much indeed.
