This video introduces an AI-powered rendering workflow that significantly accelerates the process of transforming 3D layouts into high-quality 2D visuals, enabling rapid style exploration and content generation using free, open-source tools.
2 years ago, I said AI is the future of
rendering, but I'm honestly surprised
how far we've come since then. AI now
lets you reimagine your 3D layouts with
simple prompts and reference images. But
it doesn't just add textures, lighting,
and depth of field. It will also
generate smoke simulations, water
splashes, and explosive debris based on
the movement in your scene. That way,
you can go from rough layout to fine
rendering in minutes. You can easily
change the style by swapping out the
reference image, or you can even use
multiple reference images for different
parts of your rendering. We also built a custom node pack that allows you to render scenes of any length without crashing
your PC. So, today I'm going to show you
how you can set this up using free
open-source tools that run entirely on your own computer.
This video took a lot of time to
research, and developing these workflows
was weeks of trial and error. The fact
that we can share them for free is made
possible entirely by our amazing Patreon
supporters. If you want to support our
work, get access to advanced workflows
and our amazing Discord community, check
out the link in the description. So,
traditional rendering is the process of
turning a three-dimensional scene into a
2D representation, a 2D image, for
example, by calculating how light
bounces off the surfaces: ray tracing. Generative AI rendering works differently. And that's probably why I
got a community note when I last used
that term on X. Instead of
mathematically simulating light
transport, a neural network predicts
what the image should look like based on
patterns it learned from millions and
millions of training images. We feed in
additional information like depth maps, outlines, or pose data from our 3D scenes, and these are called control
nets. When we then add a reference image
and a prompt, the model is able to
interpolate the style of the reference
image over the duration of the shot. The
challenge was to find a model that
adheres to the scene geometry precisely,
but still has the freedom to generate
new detail, and understands the reference image well enough that it is able to generate new scene information in areas that were previously obstructed.
And we were very close to giving up
because every model we tested only had
some parts of these functionalities. But
then we found this model merge by Inner Reflections. Inner Reflections combined two different video models. SkyReels' reference-to-video is designed to really
understand reference images. You can
load in references for characters,
backgrounds, and styles, and then the
model will merge them all together,
creating your final scene. The problem
is it can't understand control nets, so
we can't feed it our 3D geometry. Wan VACE solves this problem. You can feed it control nets, and it's able to follow them pretty precisely. The problem is that VACE's reference feature is just not as good. So when the camera moves to a new area of the scene, VACE is not able to generate new detail there that matches the style and vibrance of the original reference. The merge model completely fixes that because SkyReels' references just work so much better, but
they still work in tandem with the
control nets. It's just like the model
understands two languages. Now, before we can render our scene, we need to
create our control net passes in our 3D
software. And I recommend two. The
outline pass is good if you want to
preserve the exact composition, but give
the model freedom to generate new detail
between these lines. The depth pass is
better if you need the model to follow
your geometry more precisely. I
recommend exporting both. That way, you
can test out which one works better for
your scene. You can also merge them
together to have the best of both
worlds. You can use any toon shader with
outlines in any 3D program to create the
outline pass. In Blender, I used the
freestyle tool in the past, but that can
be super slow for some reason. So
instead, I recommend changing the render engine to Workbench, selecting Flat lighting, going to Color, choosing Single and making it black, and then activating Outline and making it white. But you can see that it only does outlines for one object at a time. So if you need more detail, I recommend activating Freestyle. You can find the Freestyle settings in the View Layer tab. Scroll down here, activate
as render pass. And if we now render an
image, we can come over to the
compositing tab, use nodes, create a
viewer node, connect the freestyle, and
you can see the outlines. They are
looking good, but they are black. So for this, go to View Layer, scroll down to the settings, and change the Freestyle color to white. Render the image again. Maybe we can even make them a little bit thinner; let's try two. And we can just deactivate Use Alpha. Now create a File Output node and connect it. And that's it for the outline pass.
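If you'd rather script this setup, here's a minimal bpy sketch of the same Workbench and Freestyle configuration. Treat it as a starting point: the property names are my assumptions based on recent Blender versions, not taken from the video.

```python
# Minimal sketch (assuming a recent Blender and the default view layer) of
# the Workbench outline setup described above, via Blender's Python API.
import bpy

scene = bpy.context.scene
view_layer = bpy.context.view_layer

# Workbench render engine with flat lighting and a single black color
scene.render.engine = 'BLENDER_WORKBENCH'
shading = scene.display.shading
shading.light = 'FLAT'
shading.color_type = 'SINGLE'
shading.single_color = (0.0, 0.0, 0.0)

# Per-object outlines, drawn in white
shading.show_object_outline = True
shading.object_outline_color = (1.0, 1.0, 1.0)

# Freestyle as its own render pass for more detailed lines
scene.render.use_freestyle = True
fs = view_layer.freestyle_settings
fs.as_render_pass = True
if not fs.linesets:
    fs.linesets.new("LineSet")  # make sure a line set exists
linestyle = fs.linesets.active.linestyle
linestyle.color = (1.0, 1.0, 1.0)  # white lines instead of the default black
linestyle.thickness = 2.0          # the thinner lines we settled on above
```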
Next, let's set up the depth pass. For this, we go to View Layer, activate Z. Now we
need to render the image again. Connect
the viewer to the new depth output.
Let's add a normalize node to use it as
a control net. We need to invert it.
After that, I just added this curve node here and pretty much increased the contrast. So you can see we now have more separation in the character's depth. Create another output node, connect it, and save out your image sequence.
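Here's the same depth chain sketched in bpy, again assuming a recent Blender version. The curve/contrast node is left out for brevity, and the output folder is a placeholder.

```python
# Sketch of the depth-pass compositor chain: Render Layers (Z) -> Normalize
# -> Invert -> File Output. Node identifiers assumed from Blender's API docs.
import bpy

scene = bpy.context.scene
bpy.context.view_layer.use_pass_z = True  # activate the Z pass
scene.use_nodes = True
tree = scene.node_tree

rl = tree.nodes.new('CompositorNodeRLayers')      # render layers input
norm = tree.nodes.new('CompositorNodeNormalize')  # map raw depth to 0..1
inv = tree.nodes.new('CompositorNodeInvert')      # near = white, far = black
out = tree.nodes.new('CompositorNodeOutputFile')
out.base_path = '//depth_pass/'                   # placeholder output folder

tree.links.new(rl.outputs['Depth'], norm.inputs[0])
tree.links.new(norm.outputs[0], inv.inputs['Color'])
tree.links.new(inv.outputs['Color'], out.inputs[0])
```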
Another thing you can do is just render out your layout,
whatever you have in your viewport, and
then use our free AI preparation
workflow to create the outlines and the
depth map. But keep in mind that these
are just approximations and will not be
as good as the rendered ones. We have
two options for guiding the style of our
rendering. We can create one reference
start frame or we can throw in multiple
references. Let's start with the first
option as it renders a bit faster. The
easiest is probably just to use ChatGPT or Nano Banana or whatever you have access to. Load in your start depth or outline image and describe what you want.
Something like this. And this worked
perfectly well. But there are also some
amazing local options. So let's switch
over to ComfyUI. I think one of the best models for that currently is Z-Image Turbo together with the new control nets
for it. And I created this free workflow
for you that lets you transform your
start image. Just drag and drop it into
ComfyUI. Go to the manager and install missing custom nodes if you have any red nodes. Then you need to download these models that you can find right here.
Download them, put them in the right
folders and make sure that they are
loaded right here. Then you can come
over here and just drag and drop in your
control image for the first frame of
your sequence. And I'm choosing the
outline pass. If you already combined
all these images into a video sequence,
you can also come down here and upload
that right here. This will then load
only the first frame if you activate the option right here. Next, we need to
create a prompt right here. And I like
to start simple and see what it gives
me. Before we run this, let's come down
here. This is where you set the control
net strength. And I usually recommend
going as low as possible as this will
give you the maximum quality. Let's try something like this. Click run. And
this looks pretty cool. You can see the
tail is now in a different position than
it is in the control net image, but that
is usually not a problem. This is just a
reference. It doesn't need to be 100%
perfect. So, this is good enough. Let's
create another image in another style.
For example, let's try this anime style.
This is looking really cute. If you like
the vibe, but want to try out more
options, you can also just change the
seed right here. Yep, really cute. And
it got the tail right this time. So,
let's render our scene. Now, I like to
first convert all these images into H.264 video sequences just so they're a bit easier to handle. For this, I'm using a setup like this. You can of course use any editing program, but you can also do it in ComfyUI. Create a Load Images node and then a Video Combine node from the Video Helper Suite. Select the frame rate. In my case, it's 24 frames per second. Give
it a name and change the format to this
one right here. Click run. And here's
our video. Let's do the same thing for
the outlines. Just copy this in here.
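If you'd rather do that conversion outside of ComfyUI, a tiny Python wrapper around ffmpeg does the same job. This is just one way to do it; the frame pattern and file names below are placeholders, and it assumes ffmpeg is installed on your PATH.

```python
# Sketch: turn a numbered PNG sequence into an H.264 MP4 with ffmpeg.
import subprocess

subprocess.run([
    "ffmpeg",
    "-framerate", "24",        # match your scene's frame rate
    "-i", "outline_%04d.png",  # placeholder input pattern
    "-c:v", "libx264",         # H.264 encoder
    "-pix_fmt", "yuv420p",     # broad player compatibility
    "outline_pass.mp4",        # placeholder output name
], check=True)
```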
Let's now install the main workflow, the
free AI renderer. Installing this is
very easy. Just drag and drop the
workflow into the ComfyUI interface. Open the ComfyUI manager, click install missing custom nodes, and install all of them. Once it's done, restart ComfyUI.
Next, you need to download the models.
And you can find all the download links
in the notes on the left side of the
workflow right next to the corresponding
model loaders. To help you get set up,
we've also created a free guide that you
can check out on Patreon and our
website, along with a detailed
step-by-step video installation
tutorial. If you don't have a powerful
GPU, it might make sense to run these
workflows on RunPod. We've prepared ready-to-use templates that handle the
setup for you. Plus, if you sign up with
our link, you'll get a random credit
bonus between $5 and $500 when you spend
the first $10 on the platform. Next, we
come over to the video input up here.
You can set the resolution. For our lizard, we're using a square format. So,
I'm going to select this one right here.
And we don't want to skip any frames.
Just drag and drop the outline pass
right here. And I'm also going to drag
in the depth pass right here. This node
will actually blend them together if you
want that. But for now, I want to test
it with only the outline pass as it
gives the video generation process more
freedom. So I will disable this option
right here. When we now click run, you
can see it actually just loads in the
first one. If we activate it, it would
look something like this. Up here, you
can also change the blend value.
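Conceptually, blending the two control passes is just a weighted per-pixel mix. Here's a small illustrative sketch of that idea; it's not the node's actual implementation, and the file names are placeholders.

```python
# Illustrative only: a weighted per-frame mix of outline and depth controls.
import numpy as np
from PIL import Image

def blend_controls(outline_path, depth_path, blend=0.5):
    """blend=0.0 keeps only the outline pass, blend=1.0 only the depth pass."""
    outline = np.asarray(Image.open(outline_path).convert("L"), dtype=np.float32)
    depth = np.asarray(Image.open(depth_path).convert("L"), dtype=np.float32)
    mixed = (1.0 - blend) * outline + blend * depth
    return Image.fromarray(mixed.astype(np.uint8))

# Example: lean toward the outlines for more generation freedom.
# blend_controls("outline_0001.png", "depth_0001.png", blend=0.3).save("control_0001.png")
```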
Next, come here and drag and drop in your
reference image. Let's start with the
realistic one. And next, you can just
create a simple prompt like this
describing the shot that you want to
create. Make sure to add in little
details, like for example, if you want
to see wrinkles on the suit, just add
that in. You can also copy this prompt
over to any large language model and
refine it there. And that's it for the
workflow. Now, you can just click run
and wait for it to finish. And this is
looking really good. Let's quickly try
out the anime image that we created
earlier just to see how it compares. And
I just copied this prompt over to Claude and said to change the style to anime. Well, and this worked really nicely. I love how it added details. For example, it's blinking now. But as I
said, you can use this workflow not just
with a start image, but you can also
combine multiple reference images. But
to demonstrate that, let me show you
this other scene here. I created this
one just by downloading Mixamo animations and putting that inside of
Blender. I created my outline pass like
this. And then we can create our
characters. I like to do that by
creating character sheets. You can use
any AI image generator to do that, but
because I showed you this already, I'm
going to load in the Z-Image Turbo control net workflow. And I'm just going
to deactivate this control net right
here. Create this empty latent node and
plug it into the latent image. So, this
is basically just the normal Z-Image workflow. Now, we can create a prompt
like this. Let's run that. That's a
pretty cool fox. Let's create another
character. Actually, let's just reuse
one of the lizards that we already have.
Maybe this lizard right here. And now we
activate the second reference by
selecting these two nodes and pressing Ctrl+B. Once two of these image inputs
are selected, the workflow automatically
switches from start image mode to
reference image mode. So, all you need
to do is just also drag and drop in the
image of the fox. And then I'm
activating the third reference because
we need an environment. Okay, I found
this futuristic city with a river. Next,
I'm adjusting the frame load cap to 81.
And I'm also going to switch the format
to 720p 16:9. Prompting for this reference
approach is pretty easy. You just need
to tell the model what to do in natural
language. So, a prompt could look like
this. Two characters are dancing in a
futuristic city. They are standing in a
shallow river. On the left side of the
image, a humanoid lizard wearing a suit
and holding a brown leather bag is
jumping up and down. On the right side,
a humanoid fox wearing sunglasses and
green thief's clothing is dancing. Yep,
and that just worked super well. It
combined all these elements into one
video. But you can see the outfit of the
fox changed a little bit, and you can
usually improve that by being more
specific in the prompt. So, this
workflow can generate up to 120 frames
at 24 frames per second. So, around 5
seconds of video. You can actually go a
bit higher than that, but eventually
you'll run out of VRAM or the quality
will degrade too much. But if you
support us on Patreon, you can not only
get all the example files and test renderings we created for this video, you can also get
your hands on the advanced version of
this workflow. This one includes a
custom node pack that automatically
splits your video into batches and runs
them one iteration after another, taking
the last frames of the previous
iteration as start frames, ensuring that
it's a super smooth and consistent
video. And you can see it looks actually
really similar because all the changes
are under the hood if you go into the
sampler subgraph here. So all you need
to do is use it the exact same way. Let me load in this test shot right here. This is a very, very challenging shot. We have a lot of
detail. We have a creature with a lot of
tentacles. The camera is moving. So,
let's see what this will create. Let's
go back to the top here. I'm going to
set the resolution to 720p 16:9.
And let's load in around maybe 300
frames. Something like this. We're not
blending it with the depth map at this
time. This is our start frame. So, we
have a stormy ocean. And then this is
our prompt. Now we have a few more
options below here. For example, you can
set how many frames should be created
per iteration. Depending on your GPU, I recommend keeping this between 41 and 121; 81 is a sweet spot. The number of
start frames is important for the
blending process. You see, once you hit
generate, it will generate the first 81
frames. And this is set so it will
actually use the last 11 frames as start
frames for the next generation, ensuring that the transition is super smooth.
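To make that overlap logic concrete, here is a small sketch of how those batch windows could be computed. This is my own illustration of the idea, not code from the node pack, using the defaults mentioned above: 81 frames per iteration with the last 11 reused as start frames.

```python
# Sketch: split a long shot into overlapping generation windows, where each
# iteration reuses the last `overlap` frames of the previous one as start frames.
def iteration_ranges(total_frames, frames_per_iter=81, overlap=11):
    """Yield (start, end) frame windows; consecutive windows share `overlap` frames."""
    step = frames_per_iter - overlap  # 70 net new frames per iteration
    start = 0
    while start + overlap < total_frames:
        end = min(start + frames_per_iter, total_frames)
        yield (start, end)
        start += step

# A ~300-frame shot comes out to five iterations:
# [(0, 81), (70, 151), (140, 221), (210, 291), (280, 300)]
print(list(iteration_ranges(300)))
```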
Now, if you already generated a sequence with this shot and, for example, just the ending is off, you can actually start this workflow again, resuming from a later iteration. Down here you can set the total iterations, and for our longer shot we actually need five. Okay, I'll shorten it a little bit and do four. And finally, below that you have the seed. And yep, that's all you need to set.
And here's the final shot. Look at these
amazing effects. Look at the
consistency. And look at that water splashing. That's so cool. So here are
some more variations with other start
frames. For example, this mossy forest
or this anime style. I think the speed
here is the real benefit. You can
explore 10 visual directions in the time it takes to set up one traditional render. But of course, you're trading
control for speed. And the physics are
unsimulated, meaning sometimes they look really convincing and sometimes they are a bit off, and all you can do is run the workflow again, changing the seed or adjusting the prompt. Still, I think you can reach really impressive quality with this open-source solution, and for rapid prototyping or previs work, this is
really amazing. You can also do many
other things with this workflow which I
will show in future videos. So,
make sure to subscribe.
But that's it for this one. If you
create anything with these workflows,
feel free to tag me in your work or show
it to me on Discord. I always love to
see what you come up with. And huge
thanks to our amazing Patreon supporters
who make these deep dives possible.
Thanks for watching and see you next time.