0:00 Hello, Replay attendees, glad to see you today. As you can probably tell by my accent, I'm not from around here; I'm from Belarus, which is quite a different place. But the reason I'm here today is that five years ago a single tweet actually changed the whole direction of how I think about and perceive software implementation and software architecture. Today I'd like to tell you the story from that tweet to the moment we started implementing AI workflows in our applications, and in the applications we build for our customers.
0:33 My name is Anton, and I'm the CTO of the company Spiral Scout. We have provided software development services for customers around the globe for around 15 years. As the person maintaining the team and tasked with making sure we do a good job, as a tech leader I always have to make sure the tools we use are optimal and that we don't spend extra time on typical bootstrap code or anything we, well, don't want to do.
1:05 As a passionate coder, I love to mitigate that by creating my own tools, and over the span of my career I've created a number of open-source and closed-source instruments: everything from frameworks to ORMs to database layers, template engines, DSLs, and so on.
1:23 But as time passed, our client pool grew, and the complexity grew with it, we soon realized that even with our own toolkit, and a team that knew how to use it (back in the day it was mostly PHP), we lacked one very large abstraction that seemed hard to get. That abstraction, as you know today from this presentation and this conference, is a workflow engine. So what's the first logical solution any engineer reaches for when he can't get the instrument he wants in his stack? Well, let's build it ourselves. Very smart idea. So we started doing the research and looking into ways we could implement a workflow engine in our products. We used to work with Amazon SWF, and it looked like a very nice solution for many things, but it was still quite proprietary and hard to use in ecosystems like open source, or outside of Amazon.
2:16 By the time I had the first prototypes, we soon realized that the number of edge cases we uncovered by running this engine just grew exponentially; every day and every moment we saw more and more problems arise from things we expected to just work. That's when I decided to step back and do additional research, to see whether there were new tools on the market, new solutions, or some better patterns.
2:41 Around that time I found a similarly, well, experienced guy on Twitter who talked all about workflows and durable execution, the power they bring, and applications that can run for the span of days and months. So I thought to myself: OK, he has his solution, I have my own stack; why not try to talk to him and see if we can collaborate to bring it in? So I wrote a Twitter message, and to my surprise this person said, "yeah, let's talk." So five years ago, my conversation with Maxim Fateev kicked off a, well, quite long collaboration in which we created the Temporal PHP SDK, and we began to use and adopt Temporal for our own products and for the products of our customers.
3:29 At this point everything looks very nice and cozy: we have one stack, we have a powerful workflow engine; what else can you dream about, what else do you want? And that's about the moment GPT-3 dropped on the market. Once you see this model and realize what a state-of-the-art LLM can do (interpret your user requests, write haiku, make jokes, or help you process any information), it becomes very obvious that there is immense potential in using these solutions to build something more complex.
4:03 Yet we saw, while implementing these solutions and building our first pipelines (summarizing tweets, doing pull-request reviews, and so on), the same pattern over and over: even with this powerful technology, backed by state-of-the-art models built by the most powerful companies in the world, the actual implementation process is not that different from 20 years ago. You still go through the planning phase, the design phase, implementation, and iteration.
4:31 So we found ourselves in a situation where we have the keys to a Lamborghini but only use it to drive to Costco. Why do we have these powerful models yet can't actually use them to enhance our main work? These days we obviously have Copilot and many other tools, but we decided to come back to the drawing board and challenge ourselves with a slightly different question: can we create software that is not only programmed ahead of time by engineers, but that can actually program itself and expand its own functionality as it goes, in collaboration with the user, and, why not, maybe by itself, just trying to see how it can be optimized?
5:08 By this point we clearly knew this was going to be a very challenging task: a very complex architecture spanning many domains and many parts of the system that have to collaborate seamlessly, viable only if you have some nice engine that helps you cope with this complexity. This engine is obviously Temporal, since I'm speaking here today. So let's dive in and see what we can do in terms of LLM payloads and LLM workflows within your Temporal application.
5:39 The first thing we have to do, to talk about this, is to properly define the boundaries: how do we actually define LLM calls within our workflows? Surprisingly, in terms of the actual workflow implementation and the actual workflow data flow, the LLM can be defined quite easily: it's a black box, and many engineers actually treat it as a black box as well. It's a very powerful, magical abstraction: you put some data in, you get some data out. Sometimes this data is good, sometimes this data is just garbage; well, that's something we have to live with.
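In Temporal terms, that black box maps naturally onto an activity. A minimal sketch in the PHP SDK (the interface name and method shape are my own illustration, not from the talk):

```php
<?php

use Temporal\Activity\ActivityInterface;
use Temporal\Activity\ActivityMethod;

// The LLM as a black box behind an activity boundary: a prompt goes in,
// text comes out. Provider errors, timeouts, and refusals surface as
// activity failures that Temporal can retry.
#[ActivityInterface(prefix: 'llm.')]
interface LlmActivityInterface
{
    #[ActivityMethod(name: 'complete')]
    public function complete(string $prompt): string;
}
```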
6:13 At the same time, as you know if you use LLMs, while this solution is extremely powerful and extremely versatile, it's also quite unreliable. You will see everything from failures on API calls, to timeouts, to the plain situation where the AI just says, "oh, you know what, I don't want to do this job." Well, what can you do about that? If you take a look at this implementation pattern, you'll see that on one side of the equation you have an extremely powerful abstraction that is highly non-deterministic and highly unreliable; on the other side of the equation you have an engine that was designed precisely to mitigate things like that: to write deterministic, very durable workflows and implement them in quite an easy fashion. Put that way, it makes total sense to combine them: you use one engine to mitigate the issues caused by the other.
7:09 If you're seeking to implement an LLM application, you're most likely going to start with two quite simplistic patterns, which in many cases will probably cover 80 or 90% of your whole LLM workload. You're going to start with RAG pipelines: pipelines designed to go to some data source, maybe a vector database, maybe an external website, gather the information most relevant to the user query, and return that information in a way that the user, or maybe another AI, can comprehend and act on. On the other side, you have the type of workload that does much the same, with one main difference: instead of returning text to the user, you perform some arbitrary action based on a decision made by the LLM on behalf of the user. It can be as easy as an email asking to cancel your order; well, some order gets canceled and an account deleted; be careful what you wish for when you work with LLMs.
8:10 Looking deeper into RAG pipelines: in pretty much every paper you'll find about them, you will see they have distinctive steps. You always have parts that collect, aggregate, normalize, and chunk the data, embed it into a vector store, maybe reshuffle or cluster it; on the other side, you have parts responsible essentially for retrieving that data and pushing the answer to the user. But here's the most curious part about RAG pipelines: if you look at how they're displayed inside these papers, and inside pretty much every article people write, they all have distinctive steps, blocks with arrows between them, which, surprisingly, looks exactly like what we need. It is simply a data workflow with data passing.
8:54 I'll be showing examples today in PHP, but that's mostly for visual purposes; it can easily be done in any language you love, Python or Node.js; Temporal allows you to switch stacks quite easily.
9:06 If you're going to implement a RAG pipeline, the very simplistic approach will most likely look like the sketch below. It doesn't require much thinking; it's just a number of steps. Some steps use the LLM, for example to summarize the query; some steps go to an external source to find the information and push it back into the pipeline. They can span many, many actions and have some branching or additional conditions.
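A minimal sketch of such a pipeline, reusing the LlmActivityInterface from the earlier sketch plus a hypothetical SearchActivityInterface with a topK method for the vector store:

```php
<?php

use Temporal\Activity\ActivityOptions;
use Temporal\Workflow;
use Temporal\Workflow\WorkflowInterface;
use Temporal\Workflow\WorkflowMethod;

#[WorkflowInterface]
class RagWorkflow
{
    #[WorkflowMethod]
    public function answer(string $userQuery)
    {
        $options = ActivityOptions::new()->withStartToCloseTimeout(120); // seconds
        $llm = Workflow::newActivityStub(LlmActivityInterface::class, $options);
        $store = Workflow::newActivityStub(SearchActivityInterface::class, $options);

        // Step 1: let the LLM condense the user query into a search query.
        $condensed = yield $llm->complete("Rewrite as a search query: {$userQuery}");

        // Step 2: pull the most relevant documents from the vector store.
        $documents = yield $store->topK($condensed, 5);

        // Step 3: compose the final answer from the query plus retrieved context.
        return yield $llm->complete(
            "Answer using only this context:\n"
            . implode("\n", $documents)
            . "\nQuestion: {$userQuery}"
        );
    }
}
```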
9:30 The action pipelines, once again, are not that different from a Temporal perspective. The only major difference is that instead of giving the response back to the user, you try to act based on that response, and Temporal makes this approach quite simple: when you act, you execute something within your environment, and Temporal already connects to your whole environment, so gluing that to your activities and calling one of your services is extremely simple.
9:55 If you take a look at LLM activities (and this will become important over a lot of slides), you will also notice an interesting thing right inside. Every time you make an LLM call, the first step is to assemble the context that will be sent to the LLM, or the prompt, as we call it. In simplistic terms, this context can be represented as a simple template: you have a number of variables, things you found in a knowledge base, something from the user, maybe from the internet, who knows; you put them all together and send them to the AI, wait for the AI's response, and then interpret the result into some structured form.
10:29 The first thing we noticed while building pipelines and writing actions like that is that it is extremely important to validate the AI's response within a single activity. Generally speaking, you could get the AI response, send it back to Temporal, and do the execution in a different activity, but the problem is that you can't actually trust the AI. So what starts happening is that in some cases your activity executes successfully, everything is OK, the activity is done, but the payload that was generated is completely invalid, and your workflow just gets stuck: you cannot execute the next activity at all. So it does make sense to combine them, to make sure you never leave an activity with invalid data generated by the AI.
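A sketch of that rule, with hypothetical LlmClient and CancelDecision types: prompt assembly, the model call, and validation all live in one activity, so an invalid payload fails the activity and gets retried instead of leaking into the workflow.

```php
<?php

use Temporal\Activity\ActivityInterface;

// Hypothetical client for the model provider.
interface LlmClient
{
    public function complete(string $prompt): string;
}

// Structured, validated result returned from the activity.
final class CancelDecision
{
    public function __construct(
        public readonly bool $cancel,
        public readonly string $reason,
    ) {}
}

#[ActivityInterface(prefix: 'order.')]
class CancelOrderActivity
{
    public function __construct(private LlmClient $client) {}

    public function decide(string $orderId, string $userMessage): CancelDecision
    {
        // 1. Assemble the context (the prompt) from a simple template.
        $prompt = "Customer message: {$userMessage}\n"
            . "Order: {$orderId}\n"
            . "Reply as JSON: {\"cancel\": bool, \"reason\": string}";

        // 2. Send it to the model and wait for the response.
        $raw = $this->client->complete($prompt);

        // 3. Interpret and validate in the same activity. Throwing here fails
        //    the activity, so Temporal retries it rather than letting an
        //    invalid payload escape into the workflow.
        $data = json_decode($raw, true);
        if (!is_array($data) || !isset($data['cancel'], $data['reason'])) {
            throw new \RuntimeException('LLM returned an invalid payload');
        }

        return new CancelDecision((bool) $data['cancel'], (string) $data['reason']);
    }
}
```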
11:16 So far, if you take a look at these workflows, they pose no threat to any engineer. They're quite linear, in some cases a DAG; in some cases you can even describe them in a DSL. At the end of the day, they're just Temporal workflows: the only thing you do is replace some actions inside the pipeline, swapping normal activities for activities that go to the LLM, and it just works. There is no additional magic and nothing extra you have to do beyond assembling the workflow.
11:47 The problems start arising once you make these workflows long enough and complex enough to process more and more information, because modern-day LLM models are quite hungry for tokens, and some models can comprehend up to one million tokens, which is a lot of pages of text. If your workflow keeps growing and information keeps passing through it, remember that Temporal stores all the payloads that pass in and out of your activities in the workflow history. This causes a very nasty problem later on, because you will never have full confidence that your workflow won't die simply because some LLM decided to write a poem instead of giving you the correct action.
12:27 So here's how we decided to solve it, and how you can solve it; you have multiple options. Option number one is to do nothing: just write smaller pipelines. In many cases, when you're doing something very simplistic, it just works; you don't necessarily care, and you can always retry, or maybe just ask the AI to be a bit shorter. In other cases, you can be a bit smarter and try implicit data referencing, where you implement your own data converter and your own interceptor layer that detects that a payload is larger than you want and uploads it to an external data store to be used later.
13:02 But what we found works best for us (and this is why I wanted you to remember how prompts are assembled) is to use explicit referencing. At the end of the day, all the information you feed to the AI, all the information the AI is trying to act on, is only needed at the moment you compile your prompt. You don't actually need any of this data, or any of the user's PII, inside your workflow, so keep it out altogether: keep it outside and use referencing via links, IDs, or database keys. This becomes handy when you're trying to assemble information from multiple systems, because by implementing a universal referencing mechanism, you can combine information from multiple parts of your application and resolve it all in one distinct place, right where you actually send the information to the AI. This way your workflows are completely free of any user information, and yet they still orchestrate the whole process.
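A minimal sketch of explicit referencing (DocRef, RefResolver, and PromptBuilder are illustrative names, not from the talk): the workflow passes only opaque references; the raw content materializes at prompt-compile time and never enters workflow history.

```php
<?php

// An opaque reference the workflow can carry safely: no payload, no PII.
final class DocRef
{
    public function __construct(
        public readonly string $store, // e.g. 'orders-db', 'kb-vectors'
        public readonly string $id,
    ) {}
}

// Universal resolver: one place that knows how to load content from
// every system behind a reference.
interface RefResolver
{
    public function resolve(DocRef $ref): string;
}

final class PromptBuilder
{
    public function __construct(private RefResolver $resolver) {}

    /** @param DocRef[] $refs */
    public function compile(string $template, array $refs): string
    {
        // Only here, right before the LLM call, do the payloads materialize.
        $context = array_map(
            fn(DocRef $ref): string => $this->resolver->resolve($ref),
            $refs,
        );

        return str_replace('{context}', implode("\n---\n", $context), $template);
    }
}
```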
14:01 OK, so we have RAG workflows and action workflows, we've done the dereferencing; probably there's nothing else we want to do, and users are happy, right? No. Users don't want just a button they click and expect something from; they actually want to talk to the AI, because that's how many users in the market perceive AI today. What you see in this picture is actually a GIF of one of the sessions we had with one of our agents, which, based on the user request, performed additional actions, ran some of the activities, and pulled information in to give the correct answer.
14:35 The implementation of these workflows might look somewhat complex at first, until you realize it's actually not that complex, because the Temporal model allows you to write not only linear workflows that begin and end, but also workflows in which you can implement such a thing as a main loop. By making the main loop, running the LLM activity inside it, and populating the loop with the information the workflow receives via signals, you can implement quite a sophisticated system that actually lives alongside the user and answers their questions in real time. At the same time, you maintain the whole state and full control of the process: you can see how many tokens the LLM has already consumed, you can see how fast it responds, and you can act based on that. The implementation, once again, can be done in any language, and it fits on one screen; it's not that large. Temporal makes it so easy because, by exposing the code level to you, you can simply implement this loop like the sketch below, and voilà, it just works.
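A condensed sketch of that main-loop pattern in the PHP SDK, reusing the hypothetical LlmActivityInterface; signal methods feed the loop while Workflow::await keeps it deterministic:

```php
<?php

use Temporal\Activity\ActivityOptions;
use Temporal\Workflow;
use Temporal\Workflow\SignalMethod;
use Temporal\Workflow\WorkflowInterface;
use Temporal\Workflow\WorkflowMethod;

#[WorkflowInterface]
class ChatAgentWorkflow
{
    private array $inbox = [];
    private bool $closed = false;

    #[WorkflowMethod]
    public function run(string $sessionId)
    {
        $llm = Workflow::newActivityStub(
            LlmActivityInterface::class,
            ActivityOptions::new()->withStartToCloseTimeout(120), // seconds
        );

        while (!$this->closed) {
            // Sleep deterministically until a signal delivers a message.
            yield Workflow::await(fn() => $this->inbox !== [] || $this->closed);

            while ($this->inbox !== []) {
                $message = array_shift($this->inbox);
                // One LLM call per loop iteration; state lives in the workflow.
                $reply = yield $llm->complete($message);
                // ... push $reply back to the user via another activity.
            }
        }
    }

    #[SignalMethod]
    public function say(string $message): void
    {
        $this->inbox[] = $message;
    }

    #[SignalMethod]
    public function close(): void
    {
        $this->closed = true;
    }
}
```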
15:38 Also, by doing that, you get a lot of benefits from Temporal's composability model. From the user's perspective, they send a message and get a response back, but that doesn't necessarily mean you have to perform a single action by going to the AI; you can do something else. Specifically, before the message is sent to the AI, you can enrich it with additional context: replace that block with your pipeline that connects to your knowledge source, say, information about your product, and voilà, you have a customer support bot that now talks to you specifically about your product.
16:09 If you're trying to hold long conversations, or conversations that span days and months (maybe it's an email thread), sooner or later you'll hit the situation where the context of your agent is overfilled and the agent can't act anymore. Once again, because you run this whole process inside Temporal, inside the main loop, it is exceptionally easy to detect this moment, see how many tokens the AI has consumed, and use that to offload the past conversation and restart a new LLM session with the conversational history. In essence, all you do is summarize the past messages, put the summary back into the history, or context, or prompt, and run again. The user won't even notice; from the agent's perspective, though, it starts from a blank slate, knowing only a summary of the past conversation.
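A fragment-level sketch of that compaction step inside the main loop from the previous sketch; the threshold, $history string, and $tokensUsed counter are illustrative:

```php
// Detect context overflow inside the main loop and compact the history
// before the next LLM call (threshold is illustrative).
if ($tokensUsed > 100_000) {
    // Ask the model to compress everything said so far.
    $summary = yield $llm->complete(
        "Summarize this conversation; keep facts and open tasks:\n" . $history
    );

    // Start the next LLM session from a blank slate plus the summary.
    $history = "Summary of the earlier conversation:\n" . $summary;
    $tokensUsed = 0; // reset the counter for the fresh session
}
```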
16:59 OK, so we can talk; now let's see what else we can do. And that's the next thing you'll probably learn when working with a lot of models: most of them now expose a new way for the model to communicate with your environment, and it's called tool calling. On the screen you can actually see the agent creating a tool on demand, which later gets executed to run some analytical query based on the user request. But what do you essentially do to make a tool call inside Temporal? Well, again, it's easy; "easy" is probably going to be the keyword of today's presentation. Once you tell the AI which kinds of functions it can call, and once you get these function calls as a result from an activity, all you have to do is map them to one of your activities, or one of your workflows, why not, get the results, and push them back into the queue. But be careful: you want to make sure a message the user sends cannot land in between these tool calls, otherwise the LLM model will die; they all want to get the tool response immediately, without any interruption. So use a blocking mechanism and implement it inside your signal method; it's not that complex.
18:09 The code, once again, is quite straightforward. All you have to do is receive the list of tools the model wants to call and map them to parts of your system, such as activities, maybe other workflows, maybe something else, then get the results back. You can gather the results sequentially or in parallel; Temporal provides you abstractions to do that in every language. Get the results back and push them into the message queue. Easy. The next call the user makes, or the AI invokes, will receive the responses, and the AI will be able to act based on them.
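As a fragment-level sketch of that mapping, run inside the main loop from earlier (the ToolCall DTO, tool names, activity stubs, and encodeToolResults helper are all hypothetical):

```php
// Dispatch the tool calls the model requested onto our own activities.
/** @var ToolCall[] $toolCalls decoded from the LLM response */
$promises = [];
foreach ($toolCalls as $call) {
    // Map the tool name the model asked for onto one of our activities.
    $promises[$call->id] = match ($call->name) {
        'search_orders' => $orders->search($call->arguments),
        'get_customer'  => $crm->getCustomer($call->arguments),
        default => throw new \RuntimeException("Unknown tool: {$call->name}"),
    };
}

// Resolve all tool calls in parallel; Temporal keeps this deterministic.
$results = yield \Temporal\Promise::all($promises);

// Push the results back into the message queue, so the next LLM call sees
// the tool responses before any new user message slips in between.
$this->inbox[] = $this->encodeToolResults($results);
```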
18:44 So now you have tool calling; you have models you can talk to, models you can communicate with, that can look up information, that can execute arbitrary actions, and in some cases even do retries by themselves: in many cases the AI will notice that a tool call didn't work and say, "let me try that once again." You might be asking: so what's next? What can you do with these patterns? And here's a question you can ask yourself: do we even need a user?
19:09 When you run these workflows, while they open up many challenges, such as hallucinated tool calls or skipped tool calls, they also open up a huge ability to run workflows, or agents in our case, agentic workflows, that execute by themselves: autonomously gathering information and writing the solution as they go. The main problem you're going to have in this case is that while you communicate with an agent directly as a user, you can supervise it; you can say, "yeah, you know what, you're doing it wrong, please try something else; don't call this tool; give me information from a different part of the system." When you run agents autonomously, you don't have the user. So what should you do? You should replace the user. And what can replace the user inside Temporal workflows? Another workflow.
19:59 So in this setup you create your own supervision layer, which essentially plays the role of the user. This supervision layer is responsible for receiving the command (you still need some kind of trigger: a webhook, a user, or something else), and based on this command it will automatically form the first prompt, the first message, and task the agent with executing it.
20:25 The tricky part here is how to evaluate whether the agent actually did any valuable work. The first thing you might notice in applications like that, and it's a very nasty thing to see, is that agents love to loop. The moment an agent makes a mistake and tries to correct it in a different, but still incorrect, way, it has now made two erroneous calls. And: "OK, I'm an agent, I made two error calls, they're in my context; well, what should I do next? Probably make another call," because it seems so logical. So what you might see in some cases is the agent calling your tools over and over and over again, especially when the tools were created dynamically and eventually fail because they self-destruct; the agents simply overpopulate their context and offload, and, well, you just can't do anything about it.
21:11 Thankfully, because you run Temporal, you orchestrate and collect all the information about all the tools the AI calls, all the payloads, and all the errors, so you can implement many mechanisms to detect that the AI is not doing what you want. You can do that statically, by simply looking at the patterns of tool calls and spotting loops where the same thing happens over and over (a sketch of such a detector follows), or you can do something more complex and use another AI model, another AI agent, to look at the result and decide whether this agent is faulty or the result doesn't fit the purpose.
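A sketch of the static variant, assuming a hypothetical ToolCall record with name and arguments fields: flag the agent once the same call repeats too often within a recent window.

```php
<?php

// Flags a likely loop: the same tool call (name + arguments) repeated
// $limit times within the last $window calls.
function isLooping(array $recentCalls, int $window = 10, int $limit = 3): bool
{
    $counts = [];
    foreach (array_slice($recentCalls, -$window) as $call) {
        $key = $call->name . ':' . json_encode($call->arguments);
        $counts[$key] = ($counts[$key] ?? 0) + 1;
        if ($counts[$key] >= $limit) {
            return true; // same call keeps repeating: the agent is stuck
        }
    }

    return false;
}
```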
21:42 You're going to be creating deeper and deeper chains, which leads us to the next question: if you can have one agent, why not many agents? Can we use them in collaboration, or embed them into much deeper chains of decisions and use them to run more and more sophisticated workflows inside your system? Well, the answer is obviously yes, because, again, we're at a Temporal conference, and there is nothing impossible inside Temporal. You will mostly have to use signals and child workflows to compose applications like that, but the composition of an application that runs multiple agents in parallel, or has a collaboration factor, is not that complex and not that different. In this video we see a single agent that communicates and delegates tasks to other sub-agents, which execute tools written by yet other agents, in order to execute some arbitrary command and return the result back to the hub the user communicates with.
22:41 To implement a pattern like that, here's what we found works best for us (and I'm pretty sure there are a lot of patterns for composing applications like this): you create a common supervision layer, or as we call it, an agentic pool. It's a single place inside your system, a workflow that essentially orchestrates the commands between multiple child workflows, your agents. You designate one of these workflows to be the hub, the arbiter agent you communicate with from outside; that's your entry point, or maybe the thing the user talks to, and you let this agent communicate with the other agents.
23:17 So how can you do that? Well, tool calling. From the perspective of your hub agent, delegating a task is not that different from calling a single activity inside your system: all you have to do is take the payload the AI decided to attach to this delegated task and send it to the other agent. And that's another nice place where Temporal helps you tremendously. Temporal's architecture, and especially the way you write workflows, allows you to say that this tool call is not an activity; this tool call is a signal. You can use this signal to send the command to the parent supervisor loop, the pool, which will automatically spawn the child workflow, your agent, delegate the task to that child agent, and wait for a resulting signal containing the resulting payload; it takes this payload and sends it back to your hub agent. So you have the ability to delegate tasks while your hub agent doesn't even know how it works; it just thinks it made a tool call that was very smart and did some very good work inside.
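A fragment-level sketch of that "tool call as a signal" delegation, with hypothetical names throughout (AgentPoolInterface, poolExecution, agentId, and the delegated result map):

```php
// Inside the hub agent's tool-dispatch loop: one "tool" is really a
// delegation to the supervising pool workflow.
if ($call->name === 'delegate_task') {
    $pool = Workflow::newExternalWorkflowStub(
        AgentPoolInterface::class,
        $this->poolExecution, // WorkflowExecution of the supervising pool
    );

    // Signal the pool to run this task in a child agent workflow.
    yield $pool->delegate($this->agentId, $call->id, $call->arguments['task']);

    // Block until the pool signals the child agent's result back to us.
    yield Workflow::await(fn() => isset($this->delegated[$call->id]));
    $results[$call->id] = $this->delegated[$call->id];
}
```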
24:26 this is something which we experimented
24:27 a lot is the ability to start composing
24:30 these agents and composing them in
24:33 combination with more deterministic and
24:34 more simplistic functions you might
24:36 create the process of code generation or
24:39 code analysis which spans for very long
24:42 time some parts of this process are very
24:44 deterministic let's say do a git pool
24:46 some parts of this process are very
24:47 simplistic let's do simple AI llm
24:50 analysis we don't really need agents for
24:52 that but composing them together and
24:54 using ability of temporal to converge
24:57 few of the abstraction but yet very
24:59 powerful obstructions to one common
25:01 system inside the workflow you can start
25:03 creating deeper and deeper and deeper
25:06 networks that are able to execute much
25:08 more complex commands but yet while
25:11 you're doing that you still retain all
25:13 the visibility you still know every step
25:15 that agent took you still know every
25:17 step that has been delegated uh or where
25:20 the error happened and you can correlate
25:22 for that and you can compensate for that
25:24 once again if you're trying to implement
25:26 that at the end of the day all you do
25:27 you're create create a number of
25:29 processes that depend on other processes
25:31 that depend on other processes doing
25:33 that classically possible you can do it
25:35 in many languages some languages
25:37 specifically designed for that like
25:39 maybe airong but temporal makes it so
25:42 easy to use in any stack and temporal
25:44 makes it as well durable because even if
25:46 you shut down your worker if you kill
25:48 your agents it will still complete well
25:50 thanks to their
25:53 So how do we use solutions like that? How do we use them for our own purposes? We create applications, for ourselves and our customers, that are able to solve arbitrary tasks that previously would require engineering time no one actually wants to spend. Do you really want your senior engineer building yet another Excel mapper every week because a new form arrived from your vendor? We found there is a huge number of patterns, and a huge number of parts of applications, that we don't actually want to build by hand. So let's ask agents to do them: we can ask agents to create them, validate them, execute them, and test them as they go, spending not weeks but minutes to get a working result.
26:37 So why do we think Temporal is the best solution for AI? Well, if the presentation didn't say it explicitly: the two are made at completely opposite ends of the spectrum. This very powerful abstraction that lets you run NLP, generally speaking "thinking", to execute some action is kind of pointless to run on its own; you need to embed it into something, and Temporal provides a very rich environment that makes this embedding easy, simple, and at the same time durable. Combining them allows you to create very complex chains and very complex applications that can pull information from many sources over the span of minutes, hours, maybe days, and then execute on it to deliver you the result.
27:19 I could talk about this for days, maybe weeks, but if you want to chat more, please visit us at our booth, or let's get a drink later today. Thank you.
27:36 I guess... any questions?
27:49 Q: Have you run it, and how much does it cost? A: A lot. Q: But how much engineering time does it cost? I'm just curious. A: A significant amount for a medium-sized company, but it lets us iterate at a pace we've never seen before; in many cases we can deliver a working PoC in 20 minutes while on a call with stakeholders, where that process would have involved five people back in the day. We don't want to use AI for everything, because we kind of don't trust it in many cases, but there will always be use cases in your work and your process that you just don't care about: mappings, API calls, data transformations, things you can easily verify and see with your own eyes.
28:41 Q: How do you evaluate it? A: Sorry? Q: The question is how you evaluate an agentic pipeline. A: Well, that's the beautiful part about Temporal. From the Temporal perspective, an agentic pipeline is one huge workflow, which, yes, you can test step by step, making sure all the steps work from the workflow perspective. From the user's perspective, however, when you send the command it's just a function, so you evaluate the result by evaluating the quality of the function's result. If you're doing something that fetches information from your database, like a RAG pipeline, there is a bunch of solutions on the market that can run it and score how well the answer correlates with the actual information. So you evaluate the result without actually evaluating all the steps taken inside; you kind of don't even worry about them; the agent is willing to do whatever it thinks is best.
29:40 Q: Sorry, why would we choose to use the child agent? A: Good question. The reason is that the context window of each agent is limited; it's quite large, but still limited. So if you have to perform a simple action that can only be done based on information collected from many different parts of the system, then just by collecting this information you're already going to over-pollute the agent's memory, and it's going to work much slower, much harder, and much more expensively. So instead, you want to isolate that process and only get the result back.
30:23 Q: The generated tools, can they be in a different language, and how do you run them? A: We run them right in Temporal. The thing I didn't say in this presentation: the referencing layer, which was a single slide, is actually where we spent most of our time, because when an agent defines a tool, we define it as part of our system, and we use Temporal as the syncing layer to sync it to our runtimes, which makes it immediately available for the AI to use. So basically, when the AI creates a tool inside the system, it is automatically declared and made available to any AI it's been connected to. As for the declaration of this agent's tool, it can be any language at the end of the day, and we think that eventually, the language you use when working with the application is probably not going to matter at all.
31:16 Q: Are there any other Temporal-specific limitations, or things you ran into that you didn't expect? A: There are a few, but they're not that large, and they're not that different from what you'd normally deal with in Temporal. If you run a very long decision chain, an agent that can span many files and run many iterations, you're eventually going to reach the point where your workflow just has to be restarted, and restarting a workflow that potentially has hundreds of child workflows in a tree is quite a challenge. So you might need to implement your own mechanism to properly collapse all these workflows and restart them for the next iteration.
32:04 Q: You mentioned at the beginning that... [rest of the question is inaudible; it asks how the generated tools are validated]. A: Well, right now we validate by user observation: you just test it right in the mix, so you see whether it works or not. We're not trying to build huge application servers out of these tool calls; we just create simple integrations, which are much easier to test. But at the end of the day, you can actually feed the tool back into the agent, and that's another property of the referencing layer we created: every tool an agent creates actually becomes part of the knowledge base agents can use, to learn to create new tools, to read existing tools and analyze whether they work correctly, or to just generate tests.
33:06 We have one more question.
33:24 Q: [inaudible; about rate-limiting LLM calls across workflows]. A: Well, you can move the LLM call to a separate task queue and put a rate limit on that task queue; that's about it. In our case, though, we actually have our own backend that encapsulates all the LLM calls, where we have an additional priority queue with additional rate limiting. It's a simple side effect of allowing multiple organizations to use the same model: at the same time, we can split model usage between organizations so they never collide in this regard. But at the end of the day, even if you don't have that and a call fails, well, it's just going to be retried.
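A minimal sketch of that first option, assuming the Temporal PHP SDK's worker options (the queue name, the limit, and LlmActivity are illustrative); workflows would route their LLM activities to this queue via ActivityOptions::withTaskQueue('llm'):

```php
<?php

use Temporal\Worker\WorkerOptions;
use Temporal\WorkerFactory;

// A dedicated worker for the 'llm' task queue, with a server-side rate
// limit shared by every workflow that sends LLM activities to this queue.
$factory = WorkerFactory::create();

$worker = $factory->newWorker(
    'llm',
    WorkerOptions::new()
        // Cap how many LLM activities may start per second on this queue.
        ->withMaxTaskQueueActivitiesPerSecond(5.0),
);

$worker->registerActivityImplementations(new LlmActivity(/* model client */));
$factory->run();
```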