0:01 So today's standup, we are going to talk
0:04 about TDD. Okay, we're not going to have
0:06 some topical topic. We're going to go
0:09 after the big stuff. Okay, TDD. Is it
0:13 good? Is it bad? Do you use it? TJ's
0:15 just smiling, looking like an absolute
0:16 champ in that freeze frame, dude.
0:19 Somehow TJ's the only human I know that
0:20 gets freeze-framed and it's a good picture. Yep. Yeah. Look at TJ. Look at
0:27 that. I can handle this right now.
0:31 I cannot handle this right now.
0:33 All right, Casey, you probably have the
0:35 strongest opinion potentially out of all
0:36 of us.
0:38 Okay. Okay.
0:41 I doubt that, but okay. All right.
0:42 Because I also have a strong opinion, but I'm willing to bet it's probably you then.
0:45 Yeah, I think Prime... Prime, you got to lead off the strong opinions here. I'll lead off with the strong one, which is... So, I recently just tried another round of TDD. Okay, Prime, you
0:56 have to say, "Hey everyone, welcome to
0:58 the standup. We got a great show for you
1:01 tonight. I'm your host, ThePrimeagen. With me as always are TJ. "Hello." Trash. "Hey everyone, sir. Great to be here, Prime. Thanks." Today's topic is testing.
1:15 Have you ever heard a podcast
1:18 before? Thanks everybody for joining.
1:19 You're watching the hottest podcast on
1:21 the greatest software engineering
1:23 topics, the standup. Today we have with
1:26 us Teej, who's in the basement fixing something.
1:32 Is that a Home Depot ad? Look, he's
1:35 showing us his router. We have with us TrashDev, who broke a toe winning a judo competition, still living off that high. And Casey Muratori, the actual good programmer among us. Oh, all right. I
1:46 got a lot to live up to. Yeah, you got... Don't worry, it's not hard to live up
1:50 to, at least in this crowd. I'll kind of
1:51 start us off on this one. So, we've been
1:52 developing a game called The Towers of
1:54 Mordoria. We came up with the name from an AI; an AI generated that name for us. And effectively, I decided
2:01 at one point I have this kind of really
2:04 blackbox experience: I have a deck of cards that I need to be able to draw cards out of. When they're played, they
2:09 either need to go to the discard pile or
2:12 the exhaust pile. When there's no more
2:13 cards in my hand or not enough to be
2:15 able to draw cards, I need to be able to
2:17 take all the discard cards. Dude, Teej is just flexing on us, touching grass right now. Mhm. Put all the discard cards back in, shuffle everything up, and then draw new cards back out. Replenish the draw pile, right? This
2:29 sounds like a
2:32 TDD pinnacle problem. There's black
2:34 boxes. There's a few operations. I'm
2:35 going to be able to do this. And so I
2:37 just did this like a week and a half ago
2:40 on stream. It was amazing. I did the
2:42 whole thing, designed the entire API in
2:44 the tests, then set up all the tests, set them all up to fail, then they failed, then I made it work. And it worked and it was great and everything was perfect and I was like, there we go, we just did it. I just TDD'd and it was a good experience. So then I
2:57 took what I made and I went to integrate
2:59 it into the game and realized I made an
3:01 interface that's super good for testing
3:03 and not actually good for the actual
3:06 thing I was making and I had to redo it.
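(For reference, a minimal sketch of the kind of deck interface being described, in TypeScript. The names here, Deck, draw, discard, exhaust, replenish, are invented for illustration, not the actual Towers of Mordoria code.)

// Hypothetical sketch of the deck described above; all names are invented.
type Card = { id: number; name: string };

class Deck {
  private drawPile: Card[];
  private discardPile: Card[] = [];
  private exhaustPile: Card[] = [];

  constructor(cards: Card[]) {
    this.drawPile = [...cards];
  }

  // Draw n cards, replenishing from the discard pile when the draw pile runs dry.
  draw(n: number): Card[] {
    const hand: Card[] = [];
    for (let i = 0; i < n; i++) {
      if (this.drawPile.length === 0) this.replenish();
      const card = this.drawPile.pop();
      if (card === undefined) break; // discard pile was empty too; nothing left
      hand.push(card);
    }
    return hand;
  }

  // Played cards go to the discard pile and eventually come back around.
  discard(card: Card): void {
    this.discardPile.push(card);
  }

  // Exhausted cards are out of rotation for good.
  exhaust(card: Card): void {
    this.exhaustPile.push(card);
  }

  // Move the discard pile into the draw pile and Fisher-Yates shuffle it.
  private replenish(): void {
    this.drawPile = this.discardPile;
    this.discardPile = [];
    for (let i = this.drawPile.length - 1; i > 0; i--) {
      const j = Math.floor(Math.random() * (i + 1));
      [this.drawPile[i], this.drawPile[j]] = [this.drawPile[j], this.drawPile[i]];
    }
  }
}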
3:08 And so, you know, that's kind
3:10 of been a lot of my experience with TDD
3:12 is that either the thing is simple enough and it's super blackbox enough that I can make it and both the testing interface and the usage interface are the same, or I create an interface that's really great for testing and actually not that great for the practical use case. And I know people
3:28 talk about how it's a great way to
3:29 develop your, you know, your
3:31 architecture and all this kind of stuff.
3:33 I've just yet to really buy it. But I
3:35 will say the thing that I do like doing
3:37 is, on problems that are hard to develop and have many moving constraints, to co-create tests with the code. Meaning
3:44 that I come up with an idea. How do I
3:47 want to use it? I program it out. Then I
3:49 go create a test which is backwards from
3:51 TDD. Go, okay, this is what I want to
3:53 see. Am I even correct on this? Can I
3:55 get a fast iteration loop being like,
3:58 no, no, no, yes. Okay, got it. Next
3:59 thing I want to make sure it does this.
4:01 No, no, no. Yes. Okay, got it. And so I
4:05 do like the fast iteration cycle of getting tests in early, but few problems feel like I can actually have that. And when I feel like all things need to be
4:13 tested immediately and need to be written first, you just write dumb interfaces, everything becomes inversion of control, everything's an interface, and you just have this abstract mess at the end of it, because testing was the requirement, not "hey, use it when it's
4:30 good." So, that's my TDD experience and I
4:31 feel fairly strong about that. And if
4:32 you say TDD is good, I feel like I can respond to that. Or Casey, if you have strong feelings. No, I think Trash... let's go to Trash. You first. You couldn't like deter me
4:53 from doing any task than to tell me I
4:54 need to like start off by writing tests
4:57 first off. Second, I kind of had a very
4:59 similar experience recently building.
5:00 Hold on. Can you Can you rewind for a
5:02 second? What What did you just say for
5:05 your first point? I said you couldn't
5:07 deter me from doing work if you let off
5:10 with me having to write tests first.
5:12 I don't even know what that means. You
5:14 mean you couldn't deter me anymore? Like
5:17 that's the most deterring thing or Oh,
5:20 man. Am I stupid? You are. Can you try using a different phrasing? Okay, let's reword this. Flip,
5:27 please take this out. Flip, leave it in.
5:28 I don't like test driven development.
5:33 However, okay, however, however, I just
5:34 built something recently at work which
5:36 was very kind of similar to what Prime
5:39 was saying. Um, I ended up doing a
5:45 but you wouldn't notice this, but I'm
5:47 actually really kind of passionate about
5:50 testing. So, when I test, I would say 2% of the time I actually lead off with test-driven development, and the rest is usually post-implementation, because, one, requirements and all these things change afterwards, right? So it's like
6:07 why write tests for things that are
6:08 potentially going to change and you
6:09 don't really know what's going to
6:12 happen. But I do write tests for very granular things that I think are going to be used all the time. So for instance, if I wrote a parser for my CLI, the parser is going to be reused by multiple commands and probably never going to change, so I typically will write tests for that.
6:25 And I'll also write tests for things that take multiple steps to test. Like, you know, people argue for end-to-end tests because you test just like a user. So that's kind of how I like to develop: I like to use it like I would use it for anybody else.
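(A rough sketch of the kind of granular parser test being described, using Node's built-in test runner. The parseArgs function and its behavior are invented stand-ins, not Trash's actual CLI code.)

import { strict as assert } from "node:assert";
import { test } from "node:test";

type ParsedArgs = { command: string; flags: Record<string, string | boolean> };

// Toy stand-in for the real parser: first token is the command,
// "--key=value" becomes a string flag, bare "--key" becomes a boolean flag.
function parseArgs(argv: string[]): ParsedArgs {
  const [command = "", ...rest] = argv;
  const flags: Record<string, string | boolean> = {};
  for (const token of rest) {
    if (!token.startsWith("--")) continue;
    const [key, value] = token.slice(2).split("=");
    flags[key] = value ?? true;
  }
  return { command, flags };
}

test("parses commands, value flags, and boolean flags", () => {
  const parsed = parseArgs(["deploy", "--env=prod", "--verbose"]);
  assert.equal(parsed.command, "deploy");
  assert.equal(parsed.flags.env, "prod");
  assert.equal(parsed.flags.verbose, true);
});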
6:37 But if I got to do like three steps to
6:39 get to the point I want to test, I'm
6:40 like, "All right, screw this. I'm just
6:42 going to write a test specifically for
6:44 this instance." And then I save time that way. That's usually the time I arrive at test-driven development, specifically in those scenarios. Otherwise it's purely just
6:54 like, you know, what do you call it, integration testing and stuff like that. Because you end up with stuff like Prime said: when you write tests, you write everything so granular and everything so modular that you just don't even know what it's like when
7:06 you actually put them together. So that's kind of the way I approach things. That being said, people that live and die by test-driven development, I don't get it. Um, pro-TDD... that's all I got for right
7:27 now. I'm pretty much right there with Trash on this one. My feeling on TDD is pretty much the same as my feeling on OOP, right? The problem is not if you end up with an object, or something that looks like an object, in your code; in OOP that's fine. The
7:45 problem is the object-oriented part: the thinking in terms of that wastes a lot of your time and leads you to bad architecture. I think TDD is the same way. I think testing is very good, but test-driven development, meaning forcing the programmer to think in terms of tests during development, like that is your primary thing, "I'm making
8:04 tests to drive the development," the TD part, that's actually the problem. I
8:07 think the reason for that is pretty straightforward. Uh, and it's kind of
8:11 weird because usually the people who
8:13 advocate for TDD are the same people who
8:16 would poo-poo any similar suggestion.
8:18 Like any suggestion that like, oh, we
8:20 should measure performance during
8:21 development. They'll be like, that's
8:22 ridiculous, you're wasting so much programmer time. Like, what about the 8,000
8:27 tests that no one ever needed because we
8:28 ended up deleting that system anyway.
8:30 And they're like, well, you know, test
8:32 driven development is very good. So, I
8:34 think like the problem with all of those
8:37 sorts of things, any of them, is just
8:40 putting emphasis on something instead of
8:42 talking about it like what it really is,
8:44 which is a trade-off. If you're spending
8:47 time making tests or you're orienting
8:49 things towards making tests, that is
8:51 necessarily programming time that isn't
8:53 being spent doing something else. Like
8:55 programming is zero sum. If you spend
8:57 time doing one thing, you're not
8:58 spending time doing another thing. And
9:00 so, for example, if you end up spending
9:02 a lot of your time making the interface
9:04 revolve around tests, you weren't
9:06 spending a lot of your time making the
9:08 interface revolve around actual use
9:09 cases because actual use cases don't
9:11 usually look like tests. Therefore,
9:12 sometimes you get the exact situation
9:14 that Prime was talking about where you
9:15 do test driven development and you end
9:17 up with a completely unusable mess,
9:19 which is actually very common in my
9:21 experience. Right. I wouldn't call it a mess. You know, I write pretty good. I mean, it was like pretty decent. Yeah, I just want... No, the API, the API is a mess. Not the code, right?
9:30 Like the code has been tested and it
9:32 probably works, but the API is a mess,
9:35 right? I mean, that's what happens. Uh
9:37 because it wasn't designed specifically
9:39 for the kinds of use cases it was meant
9:41 to tackle. And so you end up with that
9:42 problem. And again, shifting the developer focus was all it took to make
9:47 that happen. Good APIs are hard to make
9:49 in the first place, right? So if you're
9:51 adding more complexity to the
9:52 programmer's workflow when they're
9:54 trying to make an API that's good for
9:56 the, you know, the actual use case,
9:58 you're just going to get a worse result.
10:00 Again, zero sum. So what I would say is
10:02 like testing is better as something that you do as the thing takes shape, when
10:07 you're like, okay, we now have figured
10:09 out like how this system works properly.
10:11 We're really happy with it. The API is
10:13 working nicely. We check the performance
10:15 and we think we've got a reasonable path towards having this perform really well. Now, let's start seeing what are the parts we can now start to add some little substrate to, to test the things that we think are
10:30 going to have bugs that will be hard to find. Because that's usually the way I look at it: what are the parts of this system whose bugs will be either very hard to find or really catastrophic, like they'll cost us a lot of money or
10:40 they'll crash the user's computer or whatever, right? And then you put the tests in for those, right? And that's how I've
10:46 always approached it and I think that
10:49 works pretty well and uh you do want to
10:51 do that step, but I don't think you want
10:53 test-driven development. It just seems like a really bad idea. And to be honest, when I use software that comes from people who are, like, TDD advocates, this organization is TDD, whatever, it's
11:05 full of bugs all the time, right? Like
11:07 I'm not getting these pristine software
11:09 things. I mean, I would imagine that
11:10 the people at YouTube do a lot of
11:13 testing and code reviews and yet for the
11:16 past eight years, the play button cannot
11:17 properly represent whether something is
11:19 playing or not. For eight years, right? Sometimes it's paused and it has the play thing, and sometimes it's playing and it has the play thing, and
11:25 sometimes both of those times it shows
11:28 the pause thing. It's just it doesn't
11:31 seem like the testing is working, guys.
11:33 So, again, I'd say, you know, I think
11:35 you have to kind of rethink how you're
11:37 allocating that time. That's what I
11:40 would say. Can we get TJ in here? Can we get Playboy TJ on this one? Am I coming in loud and clear? Yeah, your voice sounds good. Your voice sounds like Trash. Looks good. Thank you. Thank you. So, I
11:57 think I agree mostly with what everybody
11:58 is saying. Although, I would say for a
12:00 few kinds of
12:04 problems... The problem is, I think when people say TDD, they just mean
12:07 10,000 different things. So, this is
12:09 like one of the problems, right? There's
12:11 like people who are like TDD is only if
12:13 you do red green cycle before you write
12:15 the code. you have to write the tests
12:16 and like you can't write any code till
12:18 you've written a failing test and blah
12:21 blah blah. Which, like, I get, maybe they're the people who invented the word, so they can define it that way. But I find that not very useful for some stuff. But there are projects where I think I test basically
12:40 everything and there's just ways that
12:41 you can do it that are like more
12:43 effective versus less effective. Like my
12:45 favorite kind of testing is if you can
12:47 find a way to do like snapshot or like
12:49 golden testing whatever you know people
12:51 have different names for it but where
12:54 you can basically just write a test and
12:56 then say assert that the output is the
12:59 same as it used to be, right? I've usually seen those called snapshot tests or expect tests; I find that those are often amazing. Yeah. Yeah. And people will use them for regression, but the kinds that I've done before: maybe I'm
13:14 building up a complex data structure to represent the state of a bunch of users, or maybe it's the parse tree of something, or maybe it's the type system, whatever it is. If you can find a way to represent that in a human-readable way, usually just by
13:30 printing it, then in tests you can just compare the diff between this one and before you made the changes. And they should be fast; if they're not fast, that's a separate problem. Like, your tests have to be fast for people to run them, otherwise they don't run them.
13:47 Um, those kinds of things tend to be, I find, very effective and give you a
13:51 lot of confidence that you haven't like
13:52 scuffed anything throughout the system
13:54 with your changes, right? Because you're
13:56 sort of like asserting the entire state
13:59 of something and if something changes
14:00 it's really easy to update because you
14:04 just say oh snapshot update this cool
14:06 done one second all your tests are
14:07 updated. You walk through them and
14:09 change them. You get some other side
14:11 benefits too like in code review you
14:14 literally see the diff of the snapshot
14:17 and you can be like um that looks wrong
14:19 that used to say false and now it says
14:20 true those things that shouldn't have
14:23 been flipped between those. So there are
14:25 like certain problems where I've employed
14:26 employed
14:30 like nearly TDD in this snapshot style
14:32 in the sense that the snapshot always
14:34 fails the first time you run it, because there's nothing to
14:39 expect and then you accept that snapshot
14:40 if it looks good or keep changing the
14:42 code until it does look good and then
14:44 you like move on to the next thing. So... but it's kind of a unique case. Like, I wouldn't want to snapshot test an HTML page, right? There's just no way that they're reproducible, it feels like. Uh, anyways.
15:01 So it's only for certain subsets of problems that it's been more effective, for me, because I've made a lot of dev tools in my career and stuff like that. So, you want to verify that all of the references of something look like XYZ?
15:17 Okay, very easy to do with snapshots.
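(A minimal sketch of that snapshot/golden pattern. The UPDATE_SNAPSHOTS flag and file layout are invented for illustration; real frameworks do the same bookkeeping for you.)

import * as fs from "node:fs";
import { strict as assert } from "node:assert";

// Render the system under test into a human-readable string.
function renderState(): string {
  const state = { users: ["a", "b"], active: true };
  return JSON.stringify(state, null, 2);
}

function checkSnapshot(name: string, actual: string): void {
  const path = `snapshots/${name}.txt`;
  // First run, or an explicit update: accept the current output as the golden.
  if (!fs.existsSync(path) || process.env.UPDATE_SNAPSHOTS === "1") {
    fs.mkdirSync("snapshots", { recursive: true });
    fs.writeFileSync(path, actual);
    return;
  }
  // Every later run: assert the output is the same as it used to be.
  assert.equal(actual, fs.readFileSync(path, "utf8"));
}

checkSnapshot("user-state", renderState());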
15:19 Um, so that's kind of the main thing that I wanted to throw out there: that style of testing I find really powerful, and I've used it a lot, across projects you wouldn't really expect as well. I've always had
15:31 the opposite experience with snapshot
15:34 tests, too. I feel like it's okay maybe if it's you and maybe some other
15:37 people that actually know the expected
15:41 output of a snapshot. My experience is
15:43 it changes and they're like, "Oh, let's
15:45 just update it." And then the test pass
15:47 again and then you're kind of just left
15:49 with this snapshot and you're just like,
15:50 "All right." So, it's to the point like
15:52 where it's like, "All right, let's just
15:53 get rid of them because they're just
15:54 changing every time requirements change
15:56 and no one actually knows what the
15:58 expected output is." Um, one more point
15:59 I want to make. Someone said something about tests helping with refactoring. I think that's only true after a product is actually mature, because before that it's kind of like a
16:09 snapshot test where you're just gonna keep updating your tests every time something changes, to the point where it's kind of pointless. So I always find, when tests are like that, I usually just
16:19 delete them immediately. Oh yeah, go
16:21 ahead, Trash... or, I called you Trash. Sorry, Trash. Your comment filled Prime
16:27 with so much joy it translated into
16:30 physical up and down motion. Yeah, I I
16:32 was literally watching him as I was
16:34 talking and I was like, "Let's just see
16:37 how hype we can get Prime." Yeah, I wish
16:41 I could see you guys. I see nothing.
16:43 Yeah. Yeah, we see you. We see you guys.
16:45 Yeah, we could. We can see you, TJ. Enjoy. TJ, if you see a strange mushroom in the grass, do not eat it. Oh, yeah. Don't eat a strange mushroom, TJ. When
16:54 you're mowing, there's a strange
16:57 mushroom. Guys, do I I don't have my AI
16:59 glasses. Do I eat this or not? Do I eat
17:03 it or not, guys? No, don't eat the
17:06 mushroom. Eat the mushroom. That's all I
17:08 heard. Okay, I'm going for it. Prime,
17:12 say your thing. All right.
17:14 All right. So, the reason why I actually
17:15 agree with Trash a lot on this one is
17:17 that I've done a lot of snapshot testing
17:19 and I think there probably does exist
17:21 a world where this makes a lot of sense.
17:23 It's just all the times I've ever done
17:26 it, whenever a requirement changes or I have to make a change that warrants changing the snapshot, I've also had to make some non-minor change to some logic throughout the
17:37 program. So when my snapshot breaks, I
17:39 also don't know if I got it right. So
17:40 now I'm trying to hand-eyeball through this thing line by line, like, okay, is this really what I meant for it to be in this moment? How do I know I should change this from true to false? I have to re-reason about everything.
17:52 And I find that true snapshot testing on
17:56 big things just feels very difficult, though I love it. Like, ultimately that's what I always want. I fall into the "hey, let's just make a golden" trap constantly. Like, my default go-to is just, it's golden time, and then I always get sad.
18:11 As far as the whole unit-tests-breaking and mature-product thing: I just find that, generally, I do not think testing helps for refactoring unless it's a true end-to-end test. Like, you use the product from the very
18:24 tippity top and you only look at the very bottom, the output. You have nothing else to look at. But often that's really hard to test, because I'll
18:33 write a function that I know is very
18:35 difficult to get right and I just want
18:37 to test those five things. So, I'm gonna
18:38 make a test that's
18:40 highly coupled to that piece. And if you
18:43 decide to refactor it, yeah, the tests
18:44 all break. Like, that's just a part of
18:46 it. I wrote it so that this one thing
18:48 works this one way and I wrote a test
18:49 that runs that one thing to make sure
18:51 you didn't screw it up if you had to
18:53 make a minor bug fix to it. I don't try
18:56 to do that whole... I just really think
18:58 this whole idea of unit testing and oh,
19:00 it will prevent you from having
19:02 headaches while refactoring has just
19:03 been a lie from the beginning. And then
19:05 what you'll hear is people do the exact
19:06 same thing every time. Oh, that's
19:07 because you're testing at the wrong
19:09 level. It's just like, I'm just testing
19:11 the things that I want to test. I don't
19:13 test all the things. And what happens when you're testing at the wrong level is simply that you broke a whole bunch of tests, but then some of your tests didn't break, and that gave you the
19:21 confidence that you did the right thing.
19:23 And I just don't I just am not going to
19:25 test the universe. I'm going to have a
19:26 few end to end. I'm going to have the
19:29 hard parts and that's that. No more. I
19:31 also think, I mean, when I'm thinking back to times when I've done testing, a lot of times when I will break out testing, it's for things that I know aren't really all that refactorable or changeable anyway. It's like, okay, you know, I made this
19:49 memory allocator and we pretty much know
19:51 what it does right you have these like
19:53 two ways you can allocate memory from it
19:54 and you have one way to free or
19:57 something like that and then I just have
19:59 these tests to make sure that you know
20:00 it doesn't end up in bad states or
20:02 whatever it is and and that's a really
20:04 useful thing right I've definitely done
20:06 that. I do that for math libraries. Like, we kind of all know what the sine of something should return, so if I'm making a math library that implements the sine function,
20:16 I don't have to think too hard about refactoring sine, because math has been around for, you know, hundreds of years and isn't going to change tomorrow. Like, oh, bad news, guys:
20:26 React updated, and now the mathematical definition of sine is different today. I'm so happy that's not even a possibility, that React does stuff with math, because we'd be... Guys, there's still time. There's still time.
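(That math-library case is about the easiest kind of test to write, because the expectations are fixed forever. A sketch, with mySin standing in for a hypothetical library implementation:)

import { strict as assert } from "node:assert";

// Stand-in for the library's sine; here a short Taylor series.
function mySin(x: number): number {
  let term = x;
  let sum = x;
  for (let n = 1; n < 10; n++) {
    term *= (-x * x) / ((2 * n) * (2 * n + 1));
    sum += term;
  }
  return sum;
}

// The mathematical definition of sine is not changing tomorrow,
// so these expectations never need to be refactored.
const cases: Array<[number, number]> = [
  [0, 0],
  [Math.PI / 2, 1],
  [Math.PI, 0],
  [-Math.PI / 2, -1],
];
for (const [x, expected] of cases) {
  assert.ok(Math.abs(mySin(x) - expected) < 1e-6, `sin(${x})`);
}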
20:42 Yeah. So I do like them for that. And actually, I can think of a specific time when I
20:50 was shipping game libraries when uh we
20:54 had a really bad bug uh that caused a
20:56 problem for one particular customer that
20:58 took forever to find and it was because
21:01 it was in a thing that I had added to
21:03 the library just for that customer as
21:05 like a back door, right? They they like,
21:06 "We need this one thing really quick."
21:08 And I was like, "Okay." And I put it in
21:10 there. And it had a bug in it. And of
21:12 course, it wasn't in any of my standard
21:13 testing or anything. It wasn't even
21:16 really in anyone's use case, right? And
21:20 uh and so like, you know, that was to me
21:21 I was like, "Okay, that's a pretty
21:23 that's pretty good evidence that like
21:25 the testing that I was doing before was
21:28 good and like wasn't a waste of time
21:30 because like the one time it didn't
21:32 happen, it caused a significant problem,
21:34 right? And so, you know, presumably I
21:37 would have had more of those with actual other customers had all the stuff that normally ships in the library
21:43 not been tested in the way that I
21:44 thought it should have been. And, you
21:46 know, from then I was always like, you
21:48 know, never do a one-off add. Like,
21:49 if I'm going to add something for a
21:51 customer, I'll add it like the right way
21:53 and actually put in the testing for it
21:54 or I'll just tell them I'm sorry, I
21:56 can't do that right now because, you
21:58 know, I'm too busy or whatever. And so I
22:02 do think like yeah when you can actually
22:04 talk about this is isolated this is what
22:06 this thing does I'm pretty sure it's not
22:07 going to change I think the tests are
22:10 great but before then yeah it's just and
22:12 also this was just kind of going off on
22:14 a random anecdote but uh just to circle
22:16 back the thing that prime was saying
22:17 absolutely true it's like refactor
22:20 testing not really like that's only you
22:22 can only really get like a very high
22:24 level confidence from that you can't
22:25 really know there could be all sorts of
22:26 things that aren't really working
22:28 because whole system testing like that
22:30 is going to miss it's going to miss like
22:33 weird edge cases that can overlap and
22:35 the only way you catch those with test
22:37 development is knowing about those edge
22:39 cases and and making tests to kind of
22:42 target them and you can't do that if you
22:43 just refactored everything because now
22:45 the edge cases are different right uh so
22:47 yeah I think I think TDD people
22:49 overstate the degree to which that
22:52 really works that's why I'm um I'm a
22:56 huge fan of PDD, prod-driven development. I think that every time you release and
22:59 you have a bunch of people that use your
23:02 product, you should do a slow release.
23:03 You should measure the outcome because
23:05 it's just going to test all the weird
23:06 things you're never going to be able to
23:09 think of or set up. And if it starts breaking, you know that you screwed up. The previous version worked better than the new version; you can't release this new version. You got to back it up. Not PD... PDD. Okay? Stop. There's no baby oil in this situation. Okay? But I mean, to me, shots
23:27 fired at Sonos or Sonus or whatever that
23:29 company is called. I don't know what... I don't know what they did. I don't know what... You guys are supposed to be the webdev people. What the hell are you talking about, Casey? Sonos, the company that makes the speakers. Nobody? Nobody.
23:49 Are you kidding me? You
23:52 guys... Beats by Dre. No, this is
23:54 like legendary. Like everyone knew about
23:56 this. So there's this company called
23:57 Sonos. They make like the most popular
23:59 internet powered speakers in the world.
24:01 Like they everyone has these things. I
24:03 mean, not me because I'll There's no way
24:05 in hell I'm connecting speakers to the
24:06 internet. That's just a way to make them
24:08 not work, which Sonos went ahead and
24:10 demonstrated for me. They had a product
24:12 that everyone liked. It was an extremely popular speaker. They then did a big software update where they were like, "We're rolling out our new system," kind of like how Google does every couple of years:
24:23 "Hey, guess what? Gmail is going to suck way harder than before today. You'll get five months of being able to switch between them and then you're on to the new bad one." They did that, only there
24:33 was no five months. They just basically
24:35 said like, "Here's the new rollout." And
24:37 it was awful. Like tons of feature
24:40 regressions. It didn't really work.
24:41 Customers couldn't do most of the things
24:43 they used to. And it like tanked the
24:45 company. Like, immediately everyone hated it and nobody wanted to buy their stuff, and they had to do this walk-back. And they couldn't roll back, because they had redeployed all their servers, and I guess
24:57 they didn't know how to undeploy them, so you couldn't go back to the old app. It was... Go read
25:03 about this. I can't believe no one knows
25:05 about this but me. It was a massive
25:08 failure. Never heard of them. I heard
25:09 it. Okay, Casey. I don't know if you
25:12 guys can hear me but I heard that. Okay,
25:14 we can hear you. We can hear you TJ. We
25:16 can't really see you. You have very few
25:19 expressions. That's fine. But I do.
25:21 Okay. Your point is that hardware is
25:23 definitely different than software. You
25:25 know, if Twitter has a small breaking
25:27 thing because they made something
25:28 Twitter, all these major sites, they
25:30 break all the time, right? I have much
25:32 more forgiveness for that. That's obviously different. You know, you got to play the field a little bit differently when I say
25:40 break production. If I'm building a game
25:42 that only, you know, so many people
25:43 play, you don't want it just crashing
25:46 non-stop while you're testing features,
25:47 there's probably some levels. But if
25:49 you're Twitter and all of a sudden you
25:52 can't post for 5 minutes, and only 0.001%
25:54 of the people couldn't post for 5
25:55 minutes, it's just not going to make a
25:56 huge splash and it's probably not going
25:57 to deter them from actually using the
25:59 product again. Well, I was actually... No, I was agreeing with you on the Sonos thing. What I was saying was, they should have done exactly what you're talking about. They should have taken and
26:08 deployed a small server bank for what
26:10 they were doing. Oh, I see. Given it to
26:12 1% of their users and their users would
26:13 have been like, "This is utter garbage
26:15 and I do not want it." And they would
26:17 have been like, "Okay, pull that on
26:19 back." Nobody saw it. We're fine. Right.
26:21 So, exactly what you're saying. I was
26:22 totally agreeing with you. I'm like,
26:24 Sonos, if they had done that, they would
26:26 have totally saved their company
26:28 literally like six months of hell. Had
26:31 they just given a slice of users this
26:33 thing for a while, they could have gone to, you know, some significant percentage above a beta test, right? Like you said, you know, 5 or 10% of the
26:40 people got this. They would all have
26:41 been like, "Nope, nope, nope, nope,
26:42 nope." And then they would have like,
26:45 you know, been like, "Okay, okay."
26:47 So that's all. I do want to... Can I say one thing just about the testing part, though, from before? Is that okay? Maybe. Sure. Sure. That's what the
26:56 podcast is about. Exactly. I like this.
26:59 I like this. Go for it, TJ. Um, my
27:01 favorite project that I've ever worked
27:04 on, Neovim, by the way. Um, can you go
27:06 back to the grass? You were, like, much better in the grass. Really? My phone... I got a notification that said your phone is overheating, cuz I was in the sunshine. I was actually going to bring that up in the beginning.
27:20 Like, it literally said your phone is overheating. Make sure your body's the one
27:24 covering. Okay. All right. Sit over.
27:26 Create a shadow, TJ. So they
27:29 can't get your phone. You got to plank
27:30 over your phone. Hey, what's up,
27:32 boys? Hey yo, what's up, boys? This is
27:34 This is like getting into like TJ
27:36 profile pic territory where it's like
27:38 on the dating app. Can you tell us some
27:41 of your pet peeves, TJ? Do you like long
27:43 walks on the beach? I'm a I'm a long
27:46 walks enjoyer. I'm really not a fan of
27:48 short conversations. I want to really
27:51 find out about the real you. Um I don't
27:52 know what
27:54 else. I don't even know if any of my
27:56 words are coming through. TJ's actually
27:58 just describing himself right now.
28:01 That's all TJ's doing. I love laughing.
28:03 I love having fun. I like to make you
28:05 laugh. This is just me to prime. This is
28:07 my pitch for getting on the podcast.
28:13 It's true. I'll make fun of trash.
28:16 Yo, if if there's a guy named Trash on
28:17 the podcast, I will make sure to make
28:20 fun of him every episode. He uses one
28:24 password. Are you joking?
28:25 All right. All right. All right. All
28:28 right. TJ, what's your take? My take is
28:30 my favorite project that I've worked on
28:31 and like definitely the one with the
28:34 most weird edge cases and probably
28:37 users: Neovim. Like, the test suite for Neovim is amazing. We have tons of... we have different styles of tests. We
28:44 have different ways to run them. We have
28:46 all this different stuff. Tons of work
28:48 went into making it. And maybe that's
28:50 just partially because Neovim is, like,
28:53 trying to maintain compatibility with
28:55 Vim on some stuff like literally
28:56 forever, right? So there's some other
28:58 requirements that are different, but
29:02 man, every time I did a PR and a random
29:05 test broke, it was my fault. It was not the test. It was not the snapshot tests. It was nothing else. And, like, random stuff. I mean, maybe it's just cuz Neovim is also, like, a really old
29:16 C project. So you like change something
29:19 and then a random global is destroyed. I
29:20 don't know, you know, it's like
29:21 whatever. It's maybe not always the
29:24 best. But in my experience, where people actually cared about it and cared to maintain snapshots and make those happen... like, no one's even getting paid to work on Neovim. You know what I'm saying? And
29:35 like the snapshots were valuable, the
29:36 unit tests are valuable, the functional
29:38 tests were valuable, and I used them in
29:40 a ton of different PRs and across like
29:44 really large refactors, adding Lua to
29:45 auto commands, like doing lots of
29:47 different stuff where I had to change
29:50 tons and tons of the code. I was just
29:52 wrong every time until the tests went green.
29:56 I think that probably goes to show that
29:58 tests are as valuable as the people who
30:00 wrote them. I was literally going to say
30:02 that. Yeah, that's what I've been waiting to say too. Since I've been lagging, I didn't want to yell, but I just wanted to yell "skill issue" at everything Trash has been saying.
30:12 I would disagree, actually. I will defend Trash, actually. Casey's smarter than you. Because here's the thing. That's true. Dang it. I suspect that some of what
30:27 you're seeing though is not so much the
30:29 skill of the people who did it, although
30:31 they may have been very skilled. It's
30:32 more just about when the tests were
30:35 added. So, at least for me, a lot of
30:37 times what I'll do is I'll go, okay... let's say I have a bug that I actually work on for a day. Because most bugs are like,
30:45 oh yeah, I know what it is and you just
30:47 fix it, right? But you get one and
30:48 you're just like, what the hell is
30:50 happening? And you actually, it's like
30:52 that once every few months where you're
30:55 just like, "Okay, this is really weird
30:57 and I'm not sure what it is." When I
31:00 find those, I typically try to add a
31:03 test or assertions or whatever I think
31:06 would have caught it, right? Because I'm
31:08 like, "Okay, this one's nasty and I
31:10 wasn't expecting it. What can I add that
31:13 will trip this kind of thing up and I'll
31:15 add it." And I suspect that much of what
31:17 gets seen in mature products like Neovim is just like, hey, if you've got some post-test discipline, like after we
31:25 made the product, when we're finding
31:27 bugs, we add the test that would have
31:30 caught the bugs. Those are often very
31:32 good because they come out of what are
31:34 the things that are likely to break when
31:36 we do stuff and we put in a net there to
31:38 catch them. And so those are like very
31:41 good, and they're the opposite of TDD. Or, they're TDD: test-driven debugging, maybe you could say, or something like that.
31:52 But, so, I do think there's something to that, that that's a lot more of it. I had a couple other hot takes that I'll get to in a second, but I just want to leave it at that one. Based on what you say, I don't know if that's true about Neovim, but I might suspect it, right? It definitely is. The
32:05 thing though that like I was more
32:07 pushing back on is, like, I find the snapshot tests in Neovim invaluable. Like, they're amazing. And I find a bunch of these good end-to-end tests, truly end to end: literally, it runs a test and does... I don't know if I just dropped...
32:23 it does user input and then asserts the outputs, right? I found those super valuable in Neovim. Probably
32:31 just dropped. Yeah. I also... No, you didn't drop. We heard your point. I think, also, to put on top of it: think about the horizon of Neovim versus you starting up a new dev tool at work. And maybe there's something to be said about when
32:46 the goldens are introduced during the exploration phase; or, like, Neovim's just not going to be changing. My assumption is that a lot of Neovim doesn't change super hard. It's not like, hey guys, let's do a whole new data model today. Wrong.
33:00 Changes, all the ones... everything you guys have poo-pooed on, that's where I found them the most valuable: when I changed literally the entire data model and how all autocommands work, all inside of Neovim. It
33:13 tests then. I mean your testing is going
33:15 to depend greatly on what you're
33:18 building 100%. Sure. Yeah. Yeah. But I'm
33:19 just saying, like, we were fundamentally changing the data models. Yeah. That's... Yeah. I'm just saying that's what happened. Yeah. I thought it was a pretty smart statement.
33:32 You're really smart, Trash. I wouldn't poo-poo that. I wouldn't poo-poo that at all. You've noticed I have not poo-pooed snapshot testing. True. Uh, good job, Casey. I love you. You're...
33:46 I love that I'm not getting ded. Can we
33:48 start rearranging the windows based on
33:51 who Teej thinks is smartest at any given
33:52 time? And I'm guessing that it has a
33:55 very high correlation to who is agreeing
33:57 with him. It seemed like I'm getting
33:58 that sense. I'll just always put trash
34:00 at the end. It's fine, Casey. You don't
34:01 have to worry. Today, I'm gonna tweet
34:04 that Casey protected me from Teej. Yes.
34:06 This is gonna This is a pinnacle moment
34:08 of my career. I can't even see your
34:11 guys' videos, so I have no idea what's
34:13 happening. I'm talking to a black screen
34:16 right now. Yeah. Yeah. Dude, I've been
34:18 just cracking up at your feed this whole
34:19 time. I just can't handle it. It's
34:21 pretty good. It's very good. Like, the
34:24 result of the feed is top-notch.
34:25 Unfortunately, I think this will be a
34:28 stream only feature. I don't know if TJ
34:29 will actually have this effect on the
34:30 video because Riverside will be
34:32 uploading it up. That's true, right?
34:34 Yeah, that's true. He'll still be in in
34:37 weird like uh it in his uh profile his
34:39 dating site profile pick phase, but but
34:42 it will be much smoother. Yeah. So, so
34:45 anyway, uh I I wanted to add a I I would
34:47 agree that snapshot testing can be good.
34:49 It depend. It just depends on whether
34:50 those snapshots are going to be stable.
34:52 I mean, that's the bottom line. If you
34:55 have uh if you're I I would imagine what
34:57 Prime and Trash were talking about is if
34:58 you're trying to do snapshot testing at
34:59 a level
35:01 below something that's actually
35:04 consistent across the refactor,
35:05 obviously that's not going to work,
35:07 right? And so if your snapshot testing
35:09 is like someone mentioned, a parse tree.
35:11 Well, if it's a parse of something that
35:13 could be parsed different ways depending
35:14 on how you chose to implement that
35:17 subsystem, those snapshots are useless.
35:19 If the parse always has to be the same, because it's parsing for the C language specification and it can't change, because, you know, it's hardcoded, so if the thing's in C99 mode
35:28 then it had better produce the exact
35:29 parse tree. That's a great snapshot
35:31 test, right? And so I think the
35:32 difference boils down to those things.
35:34 It has to be about something that actually is, you know, not going to change, idempotent across releases of the project; otherwise it's a limited-time test. But you know, that's just how
35:46 it is. So probably also the scope of the
35:48 breaking, right? If only 10% of your
35:50 golden is invalid because you made this
35:52 change, that's a lot different than 100% of your goldens being invalid and you having to rework all of them because
35:59 of big kind of refactors. That was my last experience: I had to do a bunch of testing against this games reporting thing at Netflix, and to be able to do it I had to create the
36:11 data model of the incoming events, and they were changing the incoming events like once a week, because it was still in an exploration phase. So it's just like, well, not only is my input incorrect, now my output's incorrect, so I have to
36:22 change both sides. So I'm just like, I guess I'm just redoing all the things again. Here we go. Yep. And so then it just
36:27 became super hard to know if I was
36:28 actually right or if I was breaking
36:30 things. So I think at the outset I made it pretty clear: I don't like test-driven development as an idea. I don't like focusing the developer's attention on tests. I think they should understand tests and know how to deploy
36:41 them smartly. But, right, I do want to say there is a place where I do recommend literal test-driven development, and it is as follows. That is, if the programmer is finding that they have low productivity on a particular task, and that task does not provide
37:03 feedback. So for example, if you have to
37:06 implement some low-level feature
37:10 component that has no visual output, no
37:13 auditory output, no anything. It's just
37:16 like an abstract thing that happens
37:18 inside the computer, right? Then
37:21 sometimes programmers get bored working on these sorts of things. But a lot of programmers will latch onto things like numbers that come out of a thing. Like, if you run a thing and a number, you know, 37%, pops out,
37:37 suddenly the programmer is super motivated to get it up to 45% and 50%, right? So sometimes it's like, oh, okay, if you can make a test that turns whatever you're working on into something with a concrete output that
37:55 you feel some reward about when it goes up or improves... that can be a valuable use of
38:00 test-driven development. And we do this
38:02 in performance a lot. We'll make a little test around something that outputs things like how many cache misses did it have, or how many cycles did it take to complete, or whatever. And it's like, oh, now I'm
38:15 trying to get that number down, and that's a lot more fun than just trying to make this system faster or whatever. So a lot of times you can make it so that those things go together, and then you get the
38:27 test, or the good statistics gathering, or whatever it is that you actually wanted anyway, and you get this motivation for
38:33 the programmer, and they all kind of work together in this nice way. So I just wanted to put that out there: there's one place where I think it does kind of work to drive development by a test of some kind.
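(A rough sketch of that kind of number-producing harness. The real versions count cache misses or cycles; this invented one just times a function, but it gives the programmer the same kind of score to chase.)

// Hypothetical micro-benchmark harness: prints a number you can drive down.
function timeIt(label: string, iterations: number, fn: () => void): void {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedNs = Number(process.hrtime.bigint() - start);
  // Nanoseconds per call is the score to try to improve.
  console.log(`${label}: ${(elapsedNs / iterations).toFixed(1)} ns/call`);
}

// Example: the "abstract thing inside the computer" being optimized.
function sumOfSquares(): number {
  let total = 0;
  for (let i = 0; i < 1000; i++) total += i * i;
  return total;
}

timeIt("sumOfSquares", 10_000, () => { sumOfSquares(); });

38:43 First thing that comes to my mind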
38:47 is, like, code coverage reports: you see that number really low, and I have the opposite reaction, where I just want to quit. I don't think I've ever seen a low code coverage report and gone,
39:01 you know what, today we're making that number higher. I'm kind of just like, "Guys, we suck, and I'm not going to help us get better. There's no way." Oh, no.
39:10 All right. I mean, to be fair, if I go to a place with 100% code coverage, I also go, "Uh-oh," right? Like, I'm also kind of worried about the 100% code coverage cases as well. I'm not
39:20 going to say the company I was at, I've been at many... Netflix. And there was a team that required 100% test coverage, to the point where people were just writing the most worthless tests just to
39:36 get this PR bot to shut up. And, to make it even funnier, bugs every day in production. I was just like, what's happening here? I don't know if anyone's
39:47 ever experienced a 100% test coverage quota from their team.
39:51 Yes, I did with Falkor when I was at
39:53 Netflix. 100% code coverage bugs every
39:57 day of my life. Oh... Falkor, the one... Wait. Did you say Falkor? Uh-huh. Like the luck dragon from The Never-Ending Story?
40:12 What? Why was it called that? You don't know what Falkor is? Because it was named after... I do know what Falkor is. It's the luck dragon from The Never-Ending Story. Yes, named after the luck dragon from The Never-Ending Story.
40:23 Tell the story. Tell it all so Casey can
40:25 hear it and then we'll be done. I'll do
40:27 a quick one. Casey, so how it works is that you would provide effectively an array with keys in it as the path into your data model in the back end. So say you want videos, some ID, title: it would return to you the
40:38 title as a tree, or as a graph, or maybe a forest is a better way to say it: videos, ID, title. So if you wanted, say, a list: I want list, 0 through 10, title. It would go list,
40:50 0 through 10, which would be IDs, which would then link to videos by ID, title. So it's kind of a graph-like data structure that makes it so that if you make duplicate requests, it's able to go: oh, okay, I already
41:00 have videos, whatever; I already have list item zero; I can kind of not request so much from the back end. The
41:06 problem was: what happens if list 7 doesn't exist? You don't know. So we have to return something back to you that says it does not exist. So we return something called... it's a boxed value. It's an atom
41:20 of undefined. It lets you know: hey, there is a value there, and the value there is nothing. You don't need to request it again; it's not going to be something. So we created something
41:27 called materialize on the back end, which means you need to materialize missing values. Well, what happens if you request list, 0 through 10,000, 0 through 10,000, 0 through 10,000, title? Well, you just
41:39 created a gigantic number of values that we're going to materialize for you.
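(To make that concrete, a loose sketch of path expansion in that style of API. The names are invented, not the actual library code; the point is that ranges multiply, which is exactly the footgun being described.)

type PathKey = string | number;
type KeyRange = { from: number; to: number };
type PathSet = Array<PathKey | KeyRange>;

// Expand a path set like ["videos", {from: 0, to: 10}, "title"] into concrete paths.
function expandPaths(pathSet: PathSet): PathKey[][] {
  let paths: PathKey[][] = [[]];
  for (const part of pathSet) {
    const keys: PathKey[] =
      typeof part === "object"
        ? Array.from({ length: part.to - part.from + 1 }, (_, i) => part.from + i)
        : [part];
    paths = paths.flatMap((prefix) => keys.map((k) => [...prefix, k]));
  }
  return paths;
}

// ["list", 0..10, "title"] is 11 paths; each missing one comes back as an
// atom of undefined so the client knows not to ask again.
console.log(expandPaths(["list", { from: 0, to: 10 }, "title"]).length); // 11

// The footgun: three 0-through-10,000 ranges multiply to 10,001^3,
// about a trillion paths, each dutifully materialized by the server.
console.log(10_001 ** 3); // 1000300030001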
41:44 And so with one simple line of code, a while loop in bash, you could do a DoS. At one point, 2016, you could have DoSed and taken down Netflix permanently. There was no rolling back. It was in
41:55 production for like two years. You could do it for all endpoints, for all devices, for everything, and it would just never run again. And that's it. It was a simple while loop. And I wrote a
42:04 lot of beautiful code around it, and I did a lot of beautiful stuff with it, and it was very nice code. And then I also discovered it and reported it. And then later on it got named the Repulsive
42:16 Grizzly attack. Repulsive Grizzly. Yeah.
42:19 So has no one at Netflix watched The Never-Ending Story? Shouldn't it be called the Gmork or something like that? It's a movie. It's probably on Netflix. Okay. No. No. This wasn't some... it was a completely
42:30 separate team that, when they found out about it, had to do a bunch of work, and it ended up being, you know, a fix on the Zuul gateway and all this kind of stuff. All right, because there
42:40 are no grizzly bears in The Never-Ending Story. I didn't read the original book, because I think it's in German or something. I don't know. But yeah. Yeah.
42:45 But the nice news is, I originally didn't come up with the materialize idea. But I did do a lot of refactoring, and while refactoring and recreating everything, I stopped and went, "Wait a second. Something seems
42:58 funny here." So then I trashed staging for a day, where no one could use staging for a day. And then that's when I realized we made a mistake.
43:08 When I say we, I mean me. Yeah, I didn't.
43:13 And that's it. And he's off.
43:15 I told my story and I'm not
43:17 participating in this podcast anymore.
43:19 The transitions at the end of this
43:21 podcast are just getting worse and
43:25 worse. We just end stream. He walks off.
43:27 What's happening? All right. Do you have
43:28 anything you want to add to this at all? He ends stream. Are we still live? We're
43:34 still live, dude. Find a friend that
43:36 smiles at you like TJ smiles at us. I
43:38 think today's big takeaway is don't
43:40 bring trash dev into a project that
43:43 needs help. He'll just quit.
43:46 Yeah. No, actually, I will help your product get better. Bye. TJ, drive your code coverage metrics down to zero.
43:54 Yeah. Code coverage is a lie anyways.
43:57 Are we live? And then quit. Oh, TJ's
43:59 back again.
44:03 All right. Well, what is happening?
44:14 I think we had a really productive
44:16 meeting today, everyone. Yeah. Everybody
44:18 enjoy your weekend. Next quarter is
44:20 going to be big. It's going to be the
44:21 biggest quarter of our lifetime.
44:26 Absolutely. We got to hit our OKRs. Yep.
44:28 Well, okay
44:30 then. Bye, ladies and gentlemen. Another
44:33 professional podcast from the standup.
44:36 All right. And ending it. All right.
44:37 Hold on one second. Let me... I'm going to end the stream. Hey, thanks
44:41 everybody for being around here. And TJ