0:02 OpenAI just released their Codex app,
0:04 which I'm pretty sure is just a GUI for
0:06 their Codex CLI to help you handle
0:09 agents. I wanted to see what this has to
0:11 offer. So, let's check it out. So,
0:13 here's their marketing page.
0:15 It's kind of showing a little bit of the
0:18 Codex UI. Seems like you've got
0:21 projects on one side. We got a chat and
0:24 then preview on the other side. Seems
0:26 like there's some built-in commits
0:28 and maybe you can open your project
0:31 there. You install this and just point
0:32 it to the repo or project that you're
0:36 working on locally and have it vibe code
0:38 for you. So it says, "The best way to
0:40 build with agents. Built to drive real
0:42 engineering work, from routine pull
0:44 requests to your hardest problems.
0:47 Codex reliably completes tasks end to
0:49 end like building features, complex
0:51 refactors, migrations, and more. Powered
0:54 by OpenAI's frontier coding models."
0:56 They really shoved in a bunch of
0:58 buzzwords in there, didn't they? You
1:00 know, I'm always skeptical of these
1:03 coding tools because having used coding
1:05 agents, having used different coding
1:07 tools, I know how reliable their
1:10 completion of tasks can be and their end
1:12 to end work usually has a lot of holes
1:15 in it. And complex refactors and
1:17 migrations tend to need a lot of
1:18 handholding when it comes to using
1:21 agents and any coding tool. At least
1:23 from my experience, it one-shots some
1:25 stuff, but that's not always the case. So,
1:28 designed for multi-agent workflows, the
1:30 Codex app is a command center for
1:33 agentic coding with built-in worktrees
1:35 and cloud environments. Agents work in
1:37 parallel across projects, completing
1:40 weeks of work in days. Really trying to
1:42 push this as like we'll get more work
1:44 done than your developers will, aren't they?
1:47 They're probably right. So it seems like
1:49 you can have agents and subagents and
1:51 have multiple projects going with
1:53 multiple agents running in the
1:54 background, which I'm pretty sure you
1:56 could already do. So this does just seem
1:58 like a way to manage it in a GUI rather
2:02 than having to use a terminal UI and
2:04 CLI. But let's keep going.
2:06 Let's see. Adapts to how your team
2:09 builds with skills. Codex goes beyond
2:10 writing code to directly contribute to
2:13 the work that turns pull requests into
2:15 products like code understanding,
2:18 prototyping, and documentation aligned
2:20 with your team's standards. Like it
2:22 says, generate a cloud theme hero image
2:26 for my landing page using image gen so
2:30 that you can add AI-generated images to
2:32 your AI-generated code. So it's slop
2:35 on slop on slop. You got your AI images
2:38 with your AI code on your vibe-coded
2:40 application that you're shipping over a
2:42 weekend. Let's continue. I digress.
2:44 So, what's this? Made for always-on
2:47 background work with automations. Codex
2:50 works unprompted picking up routine but
2:52 important work like issue triage, alert
2:55 monitoring, CI/CD, and more. So, you can
2:57 stay focused on building. Well, let's be
2:59 honest, we're not building. The AI is
3:00 building. So, you can stay focused on
3:03 vibing. Raises the bar across your team.
3:06 Codex raises baseline quality with more
3:08 thorough designs, comprehensive testing,
3:10 and high-signal code reviews. So, issues
3:13 are caught early and your team ships
3:14 with confidence. I can tell you that
3:17 anybody that's shipping just
3:19 AI-generated code isn't shipping with
3:22 confidence. And hopefully you've got
3:25 a human reviewing the code that you're
3:28 pushing to your production codebase, because
3:30 soon AI is going to write it, review it,
3:31 and deploy it all. And we're just going
3:34 to sit back pressing that Tab key. Or
3:35 are we already doing that? I don't know.
3:37 Let's go check out the demo video.
3:39 Today, we're excited to introduce the
3:41 Codex app. Now you have one place to
3:44 manage your projects and delegate real
3:46 work to Codex. Instead of juggling
3:48 terminal windows, you get a single
3:50 command center to run and supervise
3:53 agents. Let me show you how it works. On
3:55 the left, you can see my projects. And
3:57 for each of them, I can see the tasks
3:59 that Codex completed, and some of them
4:01 are running live as we speak. On the
4:03 right, this is the main conversation
4:05 screen. Let's start with this iOS app
4:08 I've been building. It tracks the ISS in
4:10 real time. And let's say I want to build
4:12 new features for it. I can type or
4:15 better yet, I can just speak my mind and
4:18 dictate. Add a new screen.
4:20 >> So, I will say that that's something
4:22 that's kind of nice, having the
4:24 built-in dictation. I mean, you could do
4:26 it with other tools. I'm using, like,
4:29 Super Whisper. I found it gets
4:32 tiresome typing out prompts over
4:34 and over. So having built-in dictation
4:36 is kind of nice, but you can have that
4:39 with other tools and still use a CLI. I
4:40 don't know. I like GUIs, too.
4:42 Like, sometimes working in the terminal
4:44 with OpenCode, you know, you get
4:47 one long terminal window and it's not
4:50 the most user-friendly thing to kind of
4:51 like look through
4:52 >> that shows the astronomy picture of the
4:54 day from NASA and that's it. Codex is
4:56 now taking care of all that for me and
4:58 will figure out the right APIs to build
5:01 it. While that's running, let's take a
5:02 look at another project I've been
5:05 meaning to migrate for a while. Here,
5:08 Codex is updating all the dependencies.
5:10 And in that task, it's even migrating
5:13 from WebSockets to WebRTC for our
5:15 speech-to-speech integration.
5:17 >> Okay, so it's just handling like all
5:19 your agents on different windows, right?
5:22 You're just running multiple agents. It
5:23 just keeps up with that. Now, some of
5:25 these tasks take a while, minutes if not
5:27 hours, especially when you're working on
5:30 a large codebase. And that's a big shift
5:32 in how we build software. Now, you can
5:34 supervise your agents and simply check
5:37 in when they're done. But let's go back
5:39 to our first task. Here you can see how
5:41 Codex has been approaching the problem,
5:43 the steps it's taking, and the progress
5:46 as it goes. And it's done. Now, if I
5:48 click on the right side, I see the diff.
5:50 I can review exactly what changed in the
5:52 Swift code. I can leave inline comments
5:55 for feedback, ask for another iteration
5:58 or just merge if it looks right. Now, if
5:59 I need to go deeper, I mean, that's kind
6:01 of convenient if you can just point out
6:03 exactly what line instead of having to
6:05 like prompt it and tell it what line or
6:07 highlight the block and paste it into
6:09 the terminal to be like, you know,
6:11 change this block. Letting you do it
6:13 inline is kind of nice. You know, the
6:16 one thing I don't like about these tools
6:17 is that you kind of get that vendor
6:20 lock-in. All right. Now, everybody's going to
6:22 make one of these. I'm assuming you'll
6:24 see, you know, something like this
6:25 come out for Claude Code if it doesn't
6:27 already exist. Like, I'm not super up on
6:29 the times with a lot of this stuff, but
6:31 it'd be nice to see, like, an OpenCode
6:32 version of this where you can just use
6:35 whatever model you want since models are
6:36 changing all the time and there's always
6:39 like a new best model. It's really nice
6:41 like what OpenCode did where you could
6:42 just bring in your model and not have to
6:44 like use Anthropic if you're using
6:47 Claude Code or use OpenAI if you're
6:50 using the Codex CLI. So, it'd be nice to
6:52 see one of these GUIs that, you know,
6:54 lets you import your own keys and
6:56 use whatever models you want. I can
6:59 always open the changes in Xcode, but in
7:01 this case, I'll just build and run the
7:03 app directly from here. Working with the
7:05 Codex app makes building more fun. You
7:06 know, I don't have to like spend a lot
7:09 of time editing code, but rather just
7:11 like thinking about how to shape the app
7:12 the way I want it to be. Let's bring the
7:15 iOS simulator on screen and here it is.
7:18 I just gave one sentence to Codex and
7:20 it built this entire new feature for me.
7:21 Did they, like, speed through
7:24 that because it looked terrible and just,
7:27 like, here's a big blob of text with no
7:29 line breaks or anything, but it's
7:31 like here it did this and then they like
7:34 hurried it up real quick. I mean, it
7:36 executed what it was asked for, but not
7:38 the best looking UI. How is this for
7:40 developers if you can't see the code? I
7:42 mean, it's the same idea as using
7:44 the terminal, right? Like the terminal
7:45 shows you the previews of the changes
7:47 that it's made, but if you're running in
7:49 the project, you just look at the code
7:51 in your IDE. I mean, it really is just
7:53 like a GUI with organization and
7:57 whatnot for what I'm assuming Codex CLI
7:58 already does.
8:00 >> Sentence to Codex and it built this
8:02 entire new feature for me. Isn't that
8:05 amazing? Let's switch to another project
8:07 I've been working on. It's a web-based
8:09 fitness tracker. Well, for visual tasks that
8:11 >> I got to say, I don't know if it's
8:14 because of the OpenAI models, but both
8:16 UIs. I'm more of a UI developer, right?
8:18 Like, I work a lot on the front end
8:21 and I've worked in design and stuff.
8:24 This looks terrible. It looks like a
8:27 junior developer put this UI together.
8:30 Super basic, like not very visually
8:32 appealing. kind of looks like crap. Like
8:34 it looks like if somebody had just
8:36 started learning how to code and they
8:38 built out like an interface, I'd be
8:39 like, "All right, that looks good."
8:40 You know, there's room for
8:42 improvement, but that's a good start. To
8:44 actually, like, demo this as, "Hey,
8:47 here's our app that this new tool
8:50 has built for us. Look how great it is."
8:51 And again, they kind of like speed
8:54 through it super fast in the demo. Just
8:56 seems like they've got a
8:57 little bit of shame behind it. Like that
9:00 last bit that they showed on the
9:03 mobile app that they demoed. It's just
9:05 like the UI doesn't look very good.
9:07 And I don't know if that's a model issue
9:09 or if they just didn't care for
9:09 the demo.
9:11 >> Another project I've been working on,
9:13 it's a web-based fitness tracker. Well,
9:16 for visual tasks like this one, I want to
9:18 stick closer to the experience I'm
9:20 building. So, by clicking on the top
9:22 right corner, I can pop the conversation
9:25 out like this and bring Codex with me
9:27 and I can dictate
9:30 >> animate the bars to simulate progress.
9:32 And now I'll be able in a few seconds to
9:34 see the changes apply live. It really
9:36 feels like collaborating with a
9:39 teammate. And here we go. Now, another
9:41 powerful part of the Codex app is
9:44 skills. Skills let Codex connect to all of
9:45 your favorite
9:47 >> What's tricky about these demos, right?
9:49 Because it's super happy path, and then
9:52 it's a recorded demo that you don't know
9:54 how many times it took it to nail that
9:56 one little feature that it asked, right?
9:58 They're showing you like the perfect
9:59 scenario where it's like animate the
10:02 bars and do this. All of that work
10:04 waiting for the agents and the model
10:07 and the code to generate and have it
10:09 work, if it one-shots it or if it doesn't
10:10 and having to tweak it like all of that
10:13 has been edited out. It's like, hey, I
10:15 asked it to do this. And then
10:17 immediately it's like, there it is.
10:18 Let's move on to the next feature that
10:20 we do. And it's just like, well,
10:22 I would like to see how much time
10:25 it took for that simple thing to do
10:28 that. And I'd like to know, like, how many
10:29 prompts did you have to give it? Did
10:31 it one-shot it or did you have to
10:33 give it more information? Because the
10:37 prompt that he gave was super simple and
10:39 I feel like maybe it wasn't enough to
10:41 tell it exactly how you wanted it to
10:43 behave and how you wanted that feature
10:45 to be implemented. So they're
10:48 really targeting it for non-technical
10:50 people. It seems like that's who
10:51 they're trying to target with a lot of
10:54 things now, but I think that most
10:56 non-technical people aren't going to be
11:00 using this tool. And if they are,
11:03 they're moving into way more technical
11:06 stuff that could become like a footgun
11:07 if they start trying to use this and
11:10 then get stuck and now they have this
11:13 nice UI that they don't know how to use
11:16 because it's gotten too complicated and
11:18 things got real. In fact, this app was
11:20 actually implemented using the Figma
11:22 skill which automatically used MCP
11:25 behind the scenes and set things up
11:27 for me. Let me quickly show you the
11:29 design. This is the homepage and if I
11:31 click one component for instance, you
11:35 can see how the spacing, the text styles
11:37 and all of that are defined right there.
11:39 So, Codex has not been working from a
11:41 screenshot. It's actually reading the
11:43 structure of the design file including
11:45 those variables to generate real code
11:47 using our design system.
11:48 >> Buddy of mine was just telling me about
11:52 Figma's MCP. I mean, seems like a good
11:55 tool like this exact example that he's
11:57 showing. Like, I mean, developers would
11:59 use this because if I was just being
12:02 handed over some design files in Figma
12:03 and I can just click a button and have
12:05 it at least start it for me, like that
12:07 reduces a lot of workload. But again,
12:09 that's not really something specific to
12:12 this tool. That's just using Figma's
12:13 MCP, which you can use,
12:14 >> and that's why it matches the design so
12:16 well. Of course, you can also create
12:18 your own skill for yourself or for your
12:20 team, so it can really fit your
12:21 workflow. Now, what's really powerful
12:23 with the Codex app is that you can even
12:26 turn these skills into automations.
12:27 Imagine you have tasks that you want
12:29 running at a specific cadence. Maybe
12:32 that's triaging, you know, alerts from
12:35 Sentry or bugs and tickets from Linear.
12:37 Well, you can now have that handled in
12:38 the background while you focus on
12:41 building. There are many more features
12:43 in the Codex app, including working
12:46 with isolated environments called
12:48 worktrees. So this way you can give each
12:49 agent a copy of the code, so you don't
12:51 have to worry about conflicts or
12:54 breaking your setup. You can also
12:56 delegate any task, especially the
12:58 long-running ones, to Codex in the cloud
13:00 with the exact same interface. This was
13:04 just a quick tour. Eh, I mean, I
13:07 see the value, but really all this means
13:10 is that now everybody's going to come
13:13 out with their next, you know, Codex-like
13:15 app that does a lot of the same
13:18 stuff. It's awesome to be, like, a
13:20 consumer, especially like a technical
13:23 consumer right now in this day and age
13:25 because there's just all of these
13:28 companies fighting for us to use their
13:30 product. So, it's kind of nice because
13:32 we get a lot of these new tools. Yeah,
13:35 that demo was pretty weak. Again, it
13:37 doesn't seem like it's marketed to the
13:41 technical people. It's like a C-level,
13:46 management-type-people demo. Like, if it
13:47 was marketed to technical people, it
13:49 would have done a much deeper dive into
13:51 like how it works and stuff like that.
13:54 It was really surface level. Here's this
13:56 cool new shiny thing we made. Check it