0:06 [Music]
0:09 okay quick recap from part one our
0:11 culture is based on agile principles all
0:14 engineering happens in squads and we try
0:16 to keep them loosely coupled and tightly
0:19 aligned we like cross-pollination and
0:20 have an internal open source model for
0:23 code squads do small and frequent
0:26 releases which is enabled by decoupling
0:28 our self-service model minimizes the
0:30 need for handoffs and we use release
0:33 trains and feature toggles to get stuff
0:33 into production
0:35 early and often and since culture is all
0:38 about the people we focus on motivation
0:40 community and trust rather than
0:43 structure and control that was part one
0:45 and now I'd like to talk about failure
0:48 our founder Daniel put it nicely we aim
0:50 to make mistakes faster than anyone else
0:52 the idea is to build something really
0:54 cool we will inevitably make some
0:56 mistakes along the way but each failure
0:59 is also a learning so if we fail fast we
1:01 learn fast and therefore improve fast
1:03 it's a strategy for long-term success
1:05 it's like with kids you can keep a
1:07 toddler in the crib and she'll be safe
1:09 but she won't learn much and won't be
1:11 very happy if you instead let her run
1:13 around and explore the world she'll fail
1:15 and fall sometimes but she'll be happier
1:17 and develop faster and the wounds well
1:19 they usually heal so Spotify is a
1:22 fail-friendly environment we focus more on
1:24 failure recovery than failure avoidance
1:26 our internal blog has articles like
1:29 celebrate failure and stories like how
1:30 we shot ourselves in the foot
1:32 some squads even have a fail wall where
1:34 people show off their failures and
1:36 learnings failing without learning is
1:39 well just failing so when something goes
1:41 wrong we follow up with a post mortem
1:43 this is never about whose fault was it
1:45 it's about what happened what did we
1:47 learn what will we change post mortems
1:48 are actually part of our incident
1:50 management workflow so an incident
1:52 ticket isn't closed when the problem is
1:54 solved it's closed when we've captured
1:56 the learnings to avoid the same problem
1:57 in the future
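The closing rule described here can be sketched in a few lines of Python. This is only an illustration, not Spotify's actual incident tooling: the `IncidentTicket` class, its fields, and the `can_close` method are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentTicket:
    """Hypothetical incident ticket; names are assumptions for illustration."""
    title: str
    problem_solved: bool = False
    # Learnings captured in the post mortem (what happened, what will change).
    learnings: List[str] = field(default_factory=list)

    def can_close(self) -> bool:
        # Solving the problem is not enough: the ticket may only be closed
        # once at least one learning has been recorded as well.
        return self.problem_solved and len(self.learnings) > 0
```

A ticket whose problem is fixed but which has no recorded learnings stays open; adding one learning makes `can_close()` return `True`.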
2:00 fix the process not just the product in
2:02 addition all squads do retrospectives
2:04 every few weeks to talk about what's
2:05 working well and what to improve next
2:07 all-in-all Spotify has a strong culture
2:09 of continuous improvement driven from
2:11 below and supported from above
2:13 failure must be non-lethal or we
2:16 don't live to fail again so we promote
2:19 the concept of limited blast radius the
2:21 architecture is quite decoupled so if a
2:23 squad makes a mistake it will usually
2:25 only impact a small part of the system
2:27 and not bring everything down and since
2:29 the squad has end-to-end responsibility
2:31 for their stuff without handoffs
2:33 they can usually fix the problem fast
2:35 also most new features are rolled out
2:38 gradually starting with just a tiny
2:39 percent of all users and closely
2:40 monitored
2:42 once the feature proves to be stable we
2:44 gradually roll it out to the rest of the
2:46 world so if something goes wrong it
2:48 normally only affects a small part of
2:50 the system for a small number of users
2:51 for a short period of time
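One common way to implement this kind of gradual rollout is to hash each user into a stable bucket and compare it against the current rollout percentage. The sketch below assumes that approach; the talk does not say how Spotify does it, and the function names `rollout_bucket` and `is_enabled` are made up for this example.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> float:
    """Map a user to a stable value in [0, 100) for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then scale it
    # to a percentage with two decimal places of resolution.
    value = int.from_bytes(digest[:8], "big")
    return (value % 10_000) / 100.0


def is_enabled(user_id: str, feature: str, rollout_percent: float) -> bool:
    """Return True if the feature is on for this user at the given percentage."""
    # The bucket never changes for a given user and feature, so raising
    # rollout_percent only adds users; nobody who had the feature loses it.
    return rollout_bucket(user_id, feature) < rollout_percent
```

Starting at a low `rollout_percent` and raising it step by step keeps the blast radius small: a bug only reaches the users whose bucket falls below the current percentage, and setting the percentage back to 0 turns the feature off for everyone.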
2:54 this limited blast radius gives squads
2:57 courage to do lots of small experiments
2:58 and learn really fast instead of wasting
3:00 time trying to predict and control all
3:03 the risk in advance Mario Andretti puts
3:04 it nicely if everything is under control
3:07 you're going too slow all right let's talk
3:10 about product development our product
3:11 development approach is based on Lean
3:14 Startup principles and is summarized by
3:16 the mantra think it build it ship it
3:18 tweak it the biggest risk is always
3:21 building the wrong thing so before
3:22 building a new product or major feature
3:26 we try to define a narrative kind of
3:27 like a press release or elevator pitch
3:30 showing off the benefits for example
3:32 radio you can save or follow your
3:34 favorite artists we also create
3:36 prototypes to get a sense of what the
3:38 feature might feel like and define
3:40 hypotheses how will this feature impact
3:42 user behavior and our core metrics will
3:44 they share more music will they log in
3:47 more often whenever possible we put real
3:49 prototypes in front of real users once
3:51 we feel confident this thing is worth
3:53 building we go ahead and build an MVP
3:56 Minimum Viable Product just enough to
3:58 fulfill the narrative but far from
4:00 feature complete you might call it the
4:03 minimum lovable product anyway the real
4:04 learning happens once we put something
4:06 into production so we want to get there
4:07 as quickly as possible
4:10 again we deploy the MVP to just a few
4:12 percent of all users and use techniques
4:14 like A/B testing to measure the impact
4:16 and test our hypotheses the squad
4:18 monitors data and continues tweaking and
4:21 redeploying until they see the desired
4:23 impact then they gradually roll out to
4:25 the rest of the world while taking the
4:27 time needed to sort out practical
4:29 stuff like operational issues and
4:31 scaling by the time the product or
4:33 feature is fully rolled out we already
4:35 know it's a success because if it isn't
4:38 we don't roll it out impact is always
4:40 more important than velocity so a
4:41 feature isn't considered done until it
4:44 has achieved the desired impact so with
4:46 all this experimentation going on how do
4:48 we actually plan how do we know what's
4:50 going to be released by which date well
4:53 the short answer is we mostly don't we
4:54 care more about innovation than
4:57 predictability and 100% predictability
5:00 means 0% innovation on a scale we'd
5:03 probably be somewhere around here of
5:04 course sometimes we do need to make
5:06 delivery commitments like for partner
5:08 integrations or marketing events and
5:10 that sometimes involves standard agile
5:11 planning techniques like velocity and
5:13 burn up charts but if we have to promise
5:15 a date we generally defer that
5:17 commitment until the feature is already
5:20 proven and close to ready by minimizing
5:22 the need for predictability squads can
5:24 focus on delivering value instead of
5:25 being enslaved by someone's arbitrary
5:28 plan one product owner said I think of
5:30 my squad as a group of volunteers that
5:32 are here to work on something they are
5:34 super passionate about an amazing new
5:36 product always starts with a person and
5:38 a spark of inspiration but it will only
5:40 become real if people are allowed to
5:42 play around and try things out so we
5:44 encourage everyone to spend about 10
5:45 percent of their time doing hack days or
5:47 hack weeks that's when people get to
5:49 experiment and build whatever they want
5:51 no limits like this dial a song product
5:53 basically a Spotify enabled analog phone
5:55 just dial the number of the song you
5:57 want to listen to is it useful doesn't
5:59 matter the point is if we try enough
6:02 ideas we're bound to strike gold from
6:04 time to time and quite often the
6:05 knowledge gained is worth more than the
6:08 actual hack itself plus it's fun in
6:11 addition twice per year we do a Spotify
6:13 wide hack week hundreds of people
6:15 hacking away for a whole week the mantra
6:17 is make cool things real do whatever you
6:19 want with whoever you want in whatever
6:21 way you want and then we have a big demo
6:24 and party on Friday it's amazing how
6:26 much cool stuff can be built in just a
6:27 week with this kind of creative freedom
6:29 whether it's a helicopter made of
6:31 lollipop sticks or a whole new way of
6:33 discovering music turns out that
6:35 innovation isn't really that hard people
6:37 are natural innovators so just get out
6:40 of their way and let them try things out
6:42 you may have noticed we have an
6:45 experiment-friendly culture tool A or tool B let's
6:47 try both and compare do we really need
6:49 sprint planning meetings don't know
6:51 let's skip a few and see if we miss them
6:54 should this button be in the middle or
6:57 in the corner let's try both and A/B test
6:59 even the Spotify wide hack week started
7:01 as an experiment and now it's part of
7:03 our culture so instead of arguing an
7:05 issue to death we talk about things
7:08 like what's the hypothesis what did we
7:10 learn and what will we try next this
7:12 gives us more data-driven decisions and
7:14 less opinion-driven ego-driven or
7:17 authority-driven decisions although we
7:19 are happy to experiment and try
7:20 different ways of doing things our
7:23 culture is very waste repellent or lean
7:25 if you prefer that means people will
7:27 quickly stop doing anything that doesn't
7:29 add value if it works
7:32 keep it otherwise dump it for example
7:34 some things that work for us so far are
7:37 retrospectives daily stand-ups Google
7:40 Docs Git and guild unconferences and
7:41 some things that don't work for us are
7:44 time reports handoffs separate test
7:46 teams or test phases and task estimates
7:49 we mostly just don't do these things we
7:51 are also strongly allergic to useless
7:53 meetings and anything remotely near
7:56 corporate bs one common source of waste
7:59 is what we call big projects basically
8:01 anything that requires a bunch of squads
8:04 to work tightly coordinated for several
8:07 months big project means big risk so we
8:09 are organized to minimize the need and
8:11 instead try to break projects into a
8:13 series of smaller efforts however
8:16 sometimes a big project is necessary and
8:18 in those cases we found some practices
8:21 to be essential visualize progress using
8:23 various combinations of physical and
8:25 electronic boards do a daily sync
8:28 meeting where all squads involved meet
8:31 up to resolve dependencies do a demo
8:33 every week or two where all the pieces
8:35 come together so we can evaluate the
8:36 integrated product together with the
8:39 stakeholders these practices reduce risk
8:41 and waste because of the improved
8:43 collaboration and short feedback loop we
8:45 found that a project also needs a small
8:47 tight leadership group to keep an eye on
8:48 the big picture
8:51 typically a tech lead product lead and
8:53 sometimes a design lead no project
8:54 manager
8:56 so far but that might change in general
8:58 we're still experimenting a lot with how
9:00 to do big projects and we're not so good
9:03 at it yet one of our big challenges is
9:05 growth pain as we grow we risk
9:07 falling into chaos but if we
9:08 overcompensate and add too much
9:10 structure and process we risk getting
9:12 stuck in bureaucracy instead and that's
9:15 even worse so the key question is really
9:17 what is the minimum viable bureaucracy
9:19 the least amount of structure and
9:20 process we can get away with to avoid
9:24 total chaos both sides cause waste but
9:25 in different ways so the waste repellent
9:28 culture and agile mindset help us stay
9:30 balanced the key thing about reducing
9:32 waste is to visualize it and talk about
9:34 it often so in addition to
9:35 retrospectives and post-mortems
9:38 many squads and tribes have improvement
9:40 boards that show things like what's
9:41 blocking us and what are we doing about
9:44 it we also like to talk about definition
9:44 of awesome
9:47 for example awesome for this squad means
9:49 things like really finishing stuff
9:51 easily ramping up new team members and
9:54 no recurring tasks or bugs and our
9:55 definition of awesome architecture
9:58 includes I can build test and ship my
10:01 feature within a week I use data to
10:02 learn from it and my improved version is
10:05 live in week 2 awesome is a direction
10:07 not a place so it doesn't always have to
10:09 be realistic but if we can agree on what
10:11 awesome would look like it helps focus
10:12 our improvement efforts and track
10:15 progress here's an example of an
10:16 improvement tracking board inspired by a
10:18 technique called Toyota improvement
10:21 kata top left shows what is the current
10:22 situation in this case the squad was
10:25 having quality problems bottom left
10:27 shows definition of Awesome in a perfect
10:29 world we have no quality problems at all
10:31 top right is a realistic target
10:33 condition if we were one step closer to
10:35 awesome what would that look like and
10:37 finally the bottom right shows the next
10:39 three concrete actions that will move us
10:41 towards the target condition as these
10:43 get done new actions are identified by
10:46 the squad boards like this live on the
10:47 wall in the squad room and are typically
10:49 followed up at the next retrospective
10:51 all right I realize that maybe this
10:53 video makes it seem like everything at
10:55 Spotify is just great
10:57 well truth is we have plenty of problems
10:58 to deal with and I could give you a long
11:01 list of pain points but I won't because
11:03 it would go out of date quickly we grow
11:05 fast and change fast and quite often a
11:07 seemingly brilliant solution today
11:09 will cause a nasty new problem tomorrow
11:11 just because we've grown and everything
11:13 is different however most problems are
11:15 short-lived because people actually do
11:17 something about them this company's
11:18 pretty good at changing the architecture
11:20 process organization or whatever is
11:22 needed to solve a problem and that's
11:24 really the key point healthy culture
11:27 heals broken process since culture is
11:29 so important we put a lot of effort into
11:31 strengthening it this video is just one
11:33 small example no one actually owns
11:35 culture but we do have quite a lot of
11:37 people focusing on it groups such as
11:39 people operations and about 30 or so
11:41 agile coaches spread across all squads
11:44 and we do boot camps where new hires
11:47 form a temporary squad they get to solve
11:49 a real problem while also learning about
11:50 our tech stack and processes and
11:52 learning to work together as a team all
11:53 in one week
11:56 it's like cultural shock therapy they
11:57 often manage to put code into production
12:00 in that time which is impressive but
12:02 again failing is okay as long as they
12:04 learn mainly though culture spreads
12:06 through storytelling whether it happens
12:09 on the blog at a post mortem a demo or
12:11 at lunch as long as we keep sharing our
12:13 successes and failures and learnings
12:14 with each other I think the culture will
12:16 stay healthy at the end of the day
12:19 culture in any organization is really
12:21 just the sum of everyone's attitudes and
12:24 actions you are the culture so model the
12:27 behavior you want to see that's it I
12:29 hope you enjoyed this story thanks for
12:30 listening
12:33 [Music]