This content traces the development and impact of Very Long Instruction Word (VLIW) architecture, a radical approach to computer processing that achieved significant speedups through aggressive compiler-driven parallelism, despite ultimately facing market challenges.
You are walking down an alleyway late one night, and suddenly behind you,
a mysterious dark figure appears and says:
Sir, I promise you a computer. A computer up to 20 times faster than what is on the market.
You think. Well, what sort of unholy, insane 2-nanometer silicon will this computer be using?
No, the dark figure replies. Ordinary hardware. Same as what everyone else is
using. Maybe even a bit slower, clock time-wise.
So what is the catch?
If you accept this computer, then you must also accept some
of the most brilliant but maniacal compiler software ever created. Oh.
In today's video, we trace a radical idea to a computer 10-30 times faster than anything
thought possible. They said it was impossible. An intrepid platoon of geniuses proved them wrong.
## Beginnings
I want to start with a video game that I liked to play during the pandemic called Overcooked.
In this game, you play a chef in a relatively bizarre kitchen. And
you must cook and serve finished plates of food to fulfill incoming customer orders.
The recipes are largely simple. Two that I remember - and are illustrative
of where we are going - are the salad and the hamburger. Let's begin with the salad first.
The basic salad is easy. Your chef character gets the lettuce from the bin, brings it to
the chop station and chops it, sticks it onto a plate, and then delivers to the customer.
The burger is a bit more complex. You take a bun, lettuce, and beef steak out
of the bin. You must chop the lettuce and beef steak. Do not chop the bun.
However, the beef steak also has to be cooked on the oven.
Do not cook the steak until you have chopped it first.
The world of the CPU is a little like Overcooked. A CPU is a kitchen that
takes in raw data inputs/ingredients and transforms them to get finished outputs.
The various steps to produce a salad or burger dish are Instructions.
An instruction is essentially just a string of 1s and 0s that tell the CPU what specific
action to take and on what data. Get the lettuce. Chop the meat. Plate the burger.
The CPU's life is to fetch instructions out of memory, decode them, and then execute them,
keeping the values it is working on in small, high-speed storage called registers.
There are more steps in this lifecycle but fetch, decode, and execute are the basics.
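To make the fetch, decode, execute loop concrete, here is a minimal sketch of a toy machine. The three-operand instruction format, the `run` function, and the register names are all made up for illustration and do not correspond to any real instruction set.

```python
# Toy machine with a hypothetical instruction format: (op, dest, source_a, source_b).
def run(program, registers):
    """Fetch, decode, and execute each instruction in order."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        instruction = program[pc]        # fetch
        op, dest, a, b = instruction     # decode
        if op == "add":                  # execute
            registers[dest] = registers[a] + registers[b]
        elif op == "mul":
            registers[dest] = registers[a] * registers[b]
        else:
            raise ValueError(f"unknown op: {op}")
        pc += 1
    return registers


program = [
    ("add", "r2", "r0", "r1"),  # r2 = r0 + r1
    ("mul", "r3", "r2", "r2"),  # r3 = r2 * r2
]
result = run(program, {"r0": 2, "r1": 3, "r2": 0, "r3": 0})
print(result["r3"])  # 25
```

Note that the second instruction needs the result of the first, which is exactly the kind of dependency that comes up later.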
When a programmer writes software - or vibe-codes it using Claude Code - they are most often writing
that program in a higher level language like C or Fortran that they can read.
But the CPU hardware can't read that. So the vibe coder’s program
must be translated or compiled into instructions that the CPU
hardware can follow. Chop the tomato. Chop the lettuce. Plate the salad.
This is done by a special software called the compiler. The compiler
can also do a whole bunch of other stuff, but let me get to that later.
Put together, all the instructions that a CPU can handle make up what we call the instruction set.
In a way, we can consider the CPU itself the literal physical manifestation of its instruction
set. After all, what sits inside the kitchen tells you what that kitchen is capable of, no?
## Parallelism
This all makes sense, right? Now, how might we get our food faster?
A simple way is to just make the chefs move around faster. In that,
I mean to raise the CPU's clock speed. So straight Moore's Law.
We shrink the transistor so signals travel faster from source to drain.
The fabs were doing that. But is there anything else? Early in computer history,
various scientists proposed concepts to speed up operations with a little cleverness.
Let us consider the burger. You might decide to do all the steps - bun,
lettuce, meat - one after the other.
But if you think about it, we have time spent doing nothing while waiting for the meat to cook.
Why not use that time to chop the lettuce?
So let us do the meat first. Then chop lettuce.
This is parallelism - overlapping independent
instructions, even out of their original order, for faster overall processing.
Specifically, this is a category of techniques that we call Instruction-Level Parallelism or ILP.
I hope this made sense. I am now leaving the
Overcooked metaphor behind before it ... overcooks.
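One last visit to the kitchen, as a rough sketch. The durations below are invented, and the model assumes that a step can start as soon as everything it depends on is done (so the meat cooks unattended while the lettuce gets chopped). Under those assumptions we can compare doing everything back to back against overlapping the independent steps.

```python
from functools import lru_cache

# Hypothetical durations (in seconds) and dependencies for the burger.
DURATION = {"chop_meat": 3, "cook_meat": 10, "chop_lettuce": 3, "plate": 1}
DEPENDS_ON = {
    "chop_meat": [],
    "cook_meat": ["chop_meat"],        # do not cook before chopping
    "chop_lettuce": [],
    "plate": ["cook_meat", "chop_lettuce"],
}


def serial_time():
    """Every step runs one after the other."""
    return sum(DURATION.values())


@lru_cache(maxsize=None)
def finish_time(task):
    """Earliest finish if a task starts the moment its dependencies are done."""
    start = max((finish_time(dep) for dep in DEPENDS_ON[task]), default=0)
    return start + DURATION[task]


def overlapped_time():
    """Total time when independent steps are allowed to overlap."""
    return max(finish_time(task) for task in DURATION)


print(serial_time(), overlapped_time())  # 17 14
```

The saving comes entirely from chopping the lettuce while the meat cooks. The chain of chop meat, cook meat, plate still sets the floor.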
## Out of Order
The trouble with parallelism however is that
many instructions depend on the outputs of their priors.
You cannot grill the burger meat unless you have chopped it first. You can’t decode something until
you have fetched it first. You cannot add A and B together until we know what A and B actually are.
We generally break up a program's code into blocks.
Each block is usually only a few instructions long. Just about six.
Each block tends to end with a conditional branch like an if-else statement, function call or loop.
The name tells it all, the conditional branches create branching paths in the code.
The presence of such branches means that we don’t
know what instructions will run or what data they will run on.
Without knowing that, we cannot run that many instructions in parallel. A famous 1970 paper looked into this,
and with a few assumptions concluded that program code on average contained
enough independent work to only do about 2 operations at the same time.
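To get a feel for why a small block offers so little, here is a rough sketch that measures the parallelism available inside one block. It is not the method of the 1970 paper. It simply assumes every operation takes one cycle, each result is written once, and only true data dependencies matter. The six-operation block is invented.

```python
def available_parallelism(block):
    """block: list of (dest, sources) in program order.

    Returns (number of operations) / (length of the longest dependency chain),
    a rough upper bound on how many operations can run at once on average.
    """
    depth = {}   # dest name -> cycle in which its value becomes ready
    longest = 0
    for dest, sources in block:
        # An operation can run one cycle after its latest-arriving input.
        level = 1 + max((depth.get(src, 0) for src in sources), default=0)
        depth[dest] = level
        longest = max(longest, level)
    return len(block) / longest


# Hypothetical six-operation block; a..g are inputs computed elsewhere.
block = [
    ("t1", ["a", "b"]),
    ("t2", ["c", "d"]),
    ("t3", ["t1", "t2"]),
    ("t4", ["e", "f"]),
    ("t5", ["t3", "t4"]),
    ("t6", ["t5", "g"]),
]
print(available_parallelism(block))  # 1.5
```

Six operations, but a chain of four that must run in sequence, so on average only 1.5 operations per cycle.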
So that seemed to settle it. Parallelism can’t do that much for us ... right? But what if we decided
to throw out those assumptions? Free ourselves from all the legacy stuff? What then is possible?
## A Radical Idea
In the late 1970s, a graduate student at the Courant Institute of Mathematics at NYU named Josh
Fisher joined a project to build an emulator of the supercomputer CDC-6600. They called it PUMA.
The CDC-6600 is an iconic computer, but PUMA would try to shrink it using modern
integrated circuits. Fisher's role at the start involved making tools for computer
aided chip design. Programs to layout and route wires as well as simulation.
He then started working on the 64-bit microcode for the emulator. This microcode sought to break
down pieces of program code - originally meant to be run linearly - into smaller,
simpler blocks that can be run in parallel.
As he did this, it occurred to him that these two tasks - chip layout and code scheduling - were
conceptually similar. Both ingest one-dimensional lists and put out two-dimensional maps or grids.
A program producing a chip layout takes in a netlist - which is a
simple one-dimensional list of an IC’s transistors - and produces a layout:
A two-dimensional map of each of those transistors' placements.
With code scheduling, it was similar.
Take a list of operations normally performed linearly - like fetch, decode,
execute - and turn it into a two-dimensional grid where those operations can be run in parallel.
This led him to an optimization technique known
as Trace Scheduling. Gosh, how am I going to do this one ...
Let me revisit the concept of a program being broken up into many
blocks. Each block is maybe just a few instructions long and as I said before,
ends with a conditional jump like an if-else statement or loop.
Such things create branching paths in the code. Since it had been assumed that
we cannot predict the future state of the program, we are left to only search
for parallelism opportunities within these tiny blocks. There are not many.
Trace scheduling busts through this assumption.
The compiler traces through the entire program as if it were a single block and predicts
a likely code execution path using either heuristics or actual performance information.
In essence, assuming that the program's conditional jumps do not exist.
This magical compiler then aggressively schedules all
the instructions in that trace - moving up anything that can be parallelized.
These instructions are bundled into a very long instruction word.
Sometimes the trace gets it right. For example, most of the time with a loop,
it’s probably most likely that the program just jumps back to
the start of the loop again. It can just presume that is the case.
If the trace gets it right, then we blast through the program like the Millennium Falcon - achieving
parallelism speedups of 10 to 30 times, far beyond what was previously thought possible.
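As a very simplified sketch of the "predict a likely path" step: give each branch an estimated probability and greedily follow the most likely successor until the path ends or loops back on itself. The control-flow graph, the probabilities, and the `pick_trace` helper are all hypothetical. A real trace scheduler does far more than this.

```python
# Hypothetical control-flow graph: block -> list of (successor, probability).
CFG = {
    "entry": [("loop_body", 1.0)],
    "loop_body": [("rare_error", 0.05), ("loop_tail", 0.95)],
    "rare_error": [("exit", 1.0)],
    "loop_tail": [("loop_body", 0.9), ("exit", 0.1)],
    "exit": [],
}


def pick_trace(cfg, start):
    """Greedily follow the most probable successor until we exit or revisit a block."""
    trace, seen = [], set()
    block = start
    while block is not None and block not in seen:
        trace.append(block)
        seen.add(block)
        successors = cfg[block]
        block = max(successors, key=lambda s: s[1])[0] if successors else None
    return trace


print(pick_trace(CFG, "entry"))  # ['entry', 'loop_body', 'loop_tail']
```

The trace skips the rare error path and stops when the loop jumps back to its own start, which matches the "just presume the loop repeats" guess.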
Sooo what happens if the trace doesn't get it right?
With existing CPUs, there are extra circuits or hardware to fix such compiler mistakes on
the fly. It is nicer for them because they know the variables at runtime.
Trace scheduling eschews that because it means more complex hardware, which goes against its
philosophy of shifting all complexity to the software. So we need something else.
Imagine we are cave diving - and I will never go cave diving - and we come across a divergence.
We might attach a string or line so we can backtrack.
Compensating Code is the compiler's cave dive string. If the compiler decides to move up an
instruction, it must add code so that if the trace goes wrong, the computer can backtrack or redo.
The compiler might also continue on tracing a different path, hoping that it eventually
returns to the fold. There is a real risk of code bloat, where the compiler adds SO much
compensating code that the thing cannot achieve the promised performance goals.
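Here is a tiny illustration of the idea, written as two ordinary Python functions rather than real machine code. The example is invented. It shows one common case: an operation that originally sat after two paths rejoin gets moved up into the likely path, so a copy has to be added to the unlikely path to keep the program correct.

```python
def original(a, b, likely):
    if likely:
        t = a + 1      # block on the predicted trace
    else:
        t = a - 1      # off-trace block
    y = b * 2          # runs after the two paths rejoin
    return t * y


def scheduled(a, b, likely):
    if likely:
        t = a + 1
        y = b * 2      # moved up into the trace, free to run alongside t = a + 1
    else:
        t = a - 1
        y = b * 2      # compensation copy so the off-trace path still computes y
    return t * y


# Both versions must agree no matter which way the branch goes.
for likely in (True, False):
    assert original(3, 4, likely) == scheduled(3, 4, likely)
```

Every such move costs an extra copy somewhere, which is where the code bloat comes from.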
Hopefully, you can see why this can get a bit tricky. We must guess the
program's future before we run it. This compiler practically has to be a time-traveling Mary Sue.
## Hardware and Compiler Together
After finishing his PhD,
Fisher found a tenure-track post-doc job at Yale, moving there in 1979.
Soon thereafter, he gathered a team of smart and talented students and tried to
implement in software the trace scheduling technique that he had written about before.
In 1980, Fisher consulted for General Electric, working to adapt trace scheduling for a computer
made by a company called Floating Point Systems. But it ended up failing because
the hardware’s fussiness and complexity made it difficult to achieve much parallelism.
Embarrassed by the failure, Fisher dug up manuals for other computers like the CDC Cyberplus and
found that those too were too complicated to get the degree of parallelism he wanted. It
always seemed like the highest possible parallelism speedup was about 2-3 times.
Fisher got frustrated. Eventually, he came to believe that the only way to get the gains he
sought was to implement both the compiler and hardware simultaneously. And he started
thinking about what kind of architecture such a computer would need to have.
## VLIW
In 1982, Fisher submitted a paper to the International
Symposium on Computer Architecture about his work.
It is titled "Very Long Instruction Word architectures and the ELI-512",
and details his new trace scheduling-centric computer: The ELI-512.
The ELI stands for “Enormously Long Instruction”,
and also serves as an inside joke for Yale people. It was a simplified RISC
device, but modified to run a pack of multiple instructions. Fisher writes:
> Everyone wants to use cheap hardware in parallel to speed up computation.
One obvious approach would be to take your favorite Reduced Instruction Set Computer ...
> let it be capable of executing 10 to 30 RISC-level operations
per cycle controlled by a very long instruction word. (In fact,
call it a VLIW.) A VLIW looks like very parallel horizontal microcode.
The paper's real audacity was less its hardware than its
Trace Scheduling-enabled compiler, named Bulldog.
Today this paper is acknowledged as one of the greats. But when it was released,
people greeted it with polite incredulity. A graduate student at Carnegie Mellon named
Bob Colwell recalls reading the VLIW paper and thinking this guy was nuts:
> I thought, he wants to do what with a compiler? This guy is nuts.
He wants to move code all over the place and then make up for that intentional code
misbehavior with yet more code. You'll never get away with it, it seemed to me,
and even if you can patch things back up, the new overhead will kill performance.
He thought the complexity would eat them all alive. Thusly,
it was inevitable that Colwell would later on down the line work with Josh Fisher.
## Notorious H.A.R.D.W.A.R.E.
As the months passed, Fisher discovered a curious effect.
He discovered that whenever he came to talk about the VLIW computer, people filled the room - mostly
to tell him that he was wrong and that his computer was impossible. The energy was electric.
By contrast, whenever he came to talk about the trace scheduling compiler technique,
it was crickets. With gentle, general agreement that why yes this can work. He later reminisced:
> You can get more people, a LOT more people,
to come to your talk if you promise them bizarre sounding hardware instead of a compiler technique.
To Fisher, this made little sense because to him the hardware was the easy part! The compiler does
all the hard work of arranging and scheduling the instructions across blocks for parallelism.
The hardware just does whatever the compiler tells it to do.
Get the compiler right and the rest falls into place. And why does this
approach feel so alien to people? Is this not what John Cocke of RISC fame also advocated?
Fisher started writing papers evangelizing the VLIW approach,
with provocative titles that got the people going. It brought him notoriety,
but young and restless, he soon realized that he wanted more.
People kept telling him that his computer was impossible. He wanted to prove them wrong.
But building such a computer from scratch took more resources than
what could be found within academia. And the big computer companies like IBM or
DEC seemed to have no interest in funding something so unproven as the VLIW technique.
## Founding Multiflow
The early 1980s were an interesting time for hardware startups.
In 1979, new US laws allowed pension funds to invest into venture capital funds,
greatly expanding their assets under management.
Total venture capital funds would grow ten-fold between 1980 and 1989.
A lot of this VC funding went into computer startups seeking to challenge
entrenched players or just try new things. Apollo Computer, Silicon Graphics, Compaq,
Thinking Machines. They were all founded around this time.
As the VC boom grew, computer science academics
began leaving to found their own computer hardware startups. Much
like how many university researchers in AI today are leaving to do AI Neolabs.
After five months of pondering and consulting with family and colleagues,
Fisher too decided to leave Yale and do a startup. He was joined by his
graduate student John Ruttenberg and systems manager John O'Donnell.
Since this was the VC boom, they naturally wanted to take VC money. But in late 1983,
they met with Apollo Computer, the famed workstation maker. Apollo offered to fund
the VLIW computer's development. Once it was done,
Apollo would market it. To get started, the company offered a $500,000 loan.
Now for a name. The obvious one was VLIW Technology - there was a company out there
called VLSI Technology too - but Josh Fisher wanted something warm and fuzzy,
because the term VLIW by then had already gained a little notoriety.
They pondered "Mercury" - because it was in the same vein as Apollo - and Elm
City Supercomputer - because that was their city.
In the end, Ruttenberg coined the name "Multiflow", which seemed to
convey the visual of logic flowing through the computer. Despite concerns that people
might think their company drilled for oil or produced high-tech toilets, they chose it.
And thus in April 1984, Multiflow got started. Just six months later,
the Apollo deal collapsed after its CEO was replaced. With money running low, Fisher and the
other cofounders went to the VCs - personally borrowing money to keep paying employees.
In February 1985, they closed a $7 million round. Another $26
million over two rounds would be raised over the next two years.
By the way, before we continue, I want to highly recommend the book "Multiflow
Computer" by Elizabeth Fisher. Elizabeth is Josh's wife, and thus had a front-row
view to the whole saga. It is fantastic. Read it for an in-depth look into life at a startup.
## The Mini-Super Boom
Multiflow sought to develop and sell high performance computers
for the scientific and engineering markets.
These silicon monsters do the most complex, time-consuming calculations. Often with many
decimal points of accuracy. They also cost upwards of tens of millions of
dollars which restricted use to the biggest government labs on a time-share basis.
What if we can produce a supercomputer with a significant percentage of performance at
a fraction of the price and size? This would make more compute power available
to companies that needed it to run increasingly specialized calculations.
There had always been "small supercomputers", but those truly were supercomputers - made by
traditional supercomputer makers like Cray or Fujitsu. They were still quite hefty.
Then in 1985, a small startup in Texas called Convex released the C-1 computer.
They marketed it as a "mini-supercomputer", or "mini-super" or "super-minicomputer" - and called
them a new category of computer. Some people dubbed them "Crayettes" which is funny to me.
Enabled by advancing VLSI semiconductor technology, the C-1 was less a small
supercomputer than a souped-up minicomputer. That made them more of a threat to the famed
minicomputer-maker DEC than Cray. But anyhow the category took off with a wave of new entrants.
There were a lot of these guys. There was Alliant Computer Systems, founded the same year as Convex.
Scientific Computer Systems booted up in 1983. And then after that,
a half dozen new companies like Cydrome (remember this name), Gould, and of course Multiflow.
Multiflow would arrive in the second wave of this minisuper boom,
meaning that they had to take on several established players. They
had to turn the ELI-512 paper's concepts into a working, manufacturable product. Then
convince actual customers that the "impossible computer" was indeed real and worth adopting.
Over the span of two years, the team frantically worked days,
nights and weekends to 2 AM or later to put together their first computer: the TRACE 7/200.
## The Hardware
The VLIW philosophy says to keep the hardware simple.
Keep it simple so that it can be manufactured fast and at scale.
The TRACE computer had multiple execution units to do arithmetic, logic, floating-point numbers,
loading from/storing to memory, plus a conditional branch unit.
If the computer was to be highly parallel ... if you want to give the compiler the greatest
freedom to do whatever, whenever ... then the device had to also be unusually interconnected.
Fisher's original paper showed a rough sketch of his vision. The global interconnection
had 16 clusters - which contain the various execution units together with their memories.
They are interconnected both to their sides and across with buses, or wires.
The whole thing looks vaguely abyssal - as
if we are trying to summon the VLIW demon from the architectural depths.
Manufacturing such a complicated structure conjured similar terrors. The TRACE computers
targeted scientific use cases, which demanded larger 64-bit double precision floating point.
That is a lot of data, so very large buses. 64 separate copper pins plus
the control signals on a connector. And with dozens of buses - remember
to count the buses to the sides as well as across - that adds up to thousands of pins.
The persistent proliferation of petite pins occasionally made
it difficult to plug them all into the computer's backplane.
The aforementioned Bob Colwell writes that the hardware lab had a big, heavy, and world-weary
rubber mallet to coax these VERY expensive pins into their place. They called it "the Persuader".
Elizabeth Fisher relates a story that happened as
the hardware design approached ship date to its fabricator.
It was the night before the deadline,
and the hardware team was trying to fit all the memory registers into the computer,
which was a struggle because the hardware had to be so interconnected and space was so limited.
If you recall, the register refers to the high speed memory holding
the chip's runtime variables as it does stuff.
The spec called for the registers and arithmetic units to be fully interconnected. But the hardware
team couldn’t fit and connect all that together. There simply was no space.
Desperate to make the deadline and with no one on the compiler team around,
the hardware people saw no choice but to split the
pair of register chips - attaching one to the outsides of the two arithmetic units.
The compiler team was infuriated because this in certain cases can create a split
brain problem where the two arithmetic units see different things and disagree. But it was
too late. The silicon was locked in. Fortunately it wasn't catastrophic.
## The Compiler
But the core of the TRACE series was not hardware. It was the software, the compiler.
As I mentioned, the hardware is so simple because the compiler takes
on all the complexity. Anything that can be shifted over, was.
Each cycle during operations, the CPU fetches its very long instruction
word from its memory with the multiple instructions that the compiler bundled
together. After the word unbundles, its instructions go straight to the units.
For this first 7/200 computer, each word had up to 7 instructions bundled together
and was about 256 bits long. Multiflow later released the 14/200 - which packed
together 14 instructions - as well as the 28/200 with 28 instructions and 1024 bits.
Without hardware circuits to direct the flow of data through the buses
or resolve memory conflicts at runtime like with other CPUs,
it is all on the compiler to coordinate that. It's got to do everything.
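To show what "the compiler bundles the instructions" means mechanically, here is a toy packer. It makes big simplifying assumptions that the real TRACE compiler did not get to make: every operation takes one cycle, any operation can go in any slot, and there are no bus or memory conflicts. The `pack_words` function and the operation names are hypothetical.

```python
def pack_words(ops, width):
    """Pack operations into long instruction words of `width` slots.

    ops: list of (name, [names it depends on]) in program order.
    An operation goes into the earliest word that comes after all of its
    dependencies and still has a free slot. Unused slots become "nop".
    """
    word_of = {}   # operation name -> index of the word it was placed in
    words = []
    for name, deps in ops:
        slot = max((word_of[d] + 1 for d in deps), default=0)
        while slot < len(words) and len(words[slot]) >= width:
            slot += 1                      # that word is full, try the next one
        while len(words) <= slot:
            words.append([])
        words[slot].append(name)
        word_of[name] = slot
    return [word + ["nop"] * (width - len(word)) for word in words]


ops = [
    ("a", []), ("b", []), ("c", []),   # independent of each other
    ("d", ["a", "b"]),                 # needs a and b first
    ("e", ["d"]),                      # needs d first
]
for word in pack_words(ops, width=3):
    print(word)
# ['a', 'b', 'c']
# ['d', 'nop', 'nop']
# ['e', 'nop', 'nop']
```

All those "nop" slots are wasted capacity, which is why the compiler works so hard to find more independent operations to fill each word.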
Modeled on the Yale Bulldog compiler, the Multiflow TRACE compiler turns programs
written in Fortran or C into high performance code for the computer. It does this over three phases.
In phase 1, the compiler takes the Fortran and C
code and turns it into an intermediate representation called IL-1. The idea
is to capture language-specific rules and programmer intent for later phases.
In phase 2, the compiler takes the IL-1 representation and reinterprets
it again at a lower level for the machine. It runs an optimization
step to reduce the amount of computation and increase the amount of parallelism.
For example, the compiler addresses loops by unrolling them - copying the loop's body for
some number of iterations as determined by some heuristic for the scheduler. After cleaning up
some variable names, the unrolled loop is ready to be exploited for max parallelism.
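As a hand-written sketch of what unrolling looks like, here is a dot product before and after unrolling by four. The unroll factor and the use of four separate accumulators are my own illustrative choices, not something taken from the Multiflow compiler.

```python
def dot_rolled(xs, ys):
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]   # each iteration waits on the previous total
    return total


def dot_unrolled_by_4(xs, ys):
    n = len(xs)
    s0 = s1 = s2 = s3 = 0.0      # renamed accumulators: four independent chains
    i = 0
    while i + 4 <= n:
        s0 += xs[i] * ys[i]
        s1 += xs[i + 1] * ys[i + 1]
        s2 += xs[i + 2] * ys[i + 2]
        s3 += xs[i + 3] * ys[i + 3]
        i += 4
    total = (s0 + s1) + (s2 + s3)
    for j in range(i, n):        # leftover iterations when n is not a multiple of 4
        total += xs[j] * ys[j]
    return total


xs = [float(v) for v in range(1, 11)]
ys = [2.0] * 10
print(dot_rolled(xs, ys), dot_unrolled_by_4(xs, ys))  # 110.0 110.0
```

The four multiply-adds in the unrolled body do not depend on each other, so a scheduler is free to put them side by side. One caveat: summing in a different order can change floating-point rounding, so the two versions agree exactly here only because the values are small whole numbers.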
The output of phase 2 is another intermediate representation called
IL-2. We are now finally ready for Phase 3. This is where the
actual trace scheduling algorithm is run and instructions are scheduled.
As I said earlier, the algorithm runs through the program code and guesses a
likely path using heuristics or profiled data given to the compiler by the user.
After scheduling, the algorithm will insert compensation code
to cover up any potential off-track branches.
The result is a compiler that enables the TRACE computers to outperform a RISC-based
MIPS computer in well known benchmarks like LINPACK by anywhere from 2 to 10 times.
Real world performance however did depend on the individual program, which infuriated
salespeople on both sides to no end. "Your mileage will vary" was a common refrain.
It's definitely not perfect. One flagged issue was that the compiler runs very
slowly - four times slower than one of DEC's RISC-based workstations. In part
because the Multiflow compiler creates six representations of
the program throughout its three phases. Compilation sometimes took up to 3 days.
Nevertheless. It was a marvel. The trace scheduling algorithm worked. The Multiflow
compiler team were wizards, and produced a software program that was surprisingly
reliable for something that had to literally predict the future.
## Debut
Multiflow debuted the TRACE series in April 1987 at a glitzy event at the World Trade Center.
Multiflow lined up three beta customers including the Supercomputer Research
Center - a division of the US NSA. They all gave glowing endorsements. Grumman
Data Systems said that the computer was running their software two hours after being uncrated.
They then took the TRACE to a 1988 supercomputing conference held in Santa Clara. For years,
people had told Josh Fisher that VLIW was impossible. But now here it was,
running UNIX and working like a real computer.
The CAD chief at Sikorsky Aircraft said about the TRACE:
> To many of us, what the Multiflow people told us it could do seemed
like black magic ... [but now] not only do you have a reasonably priced
supercomputer, but you don't have to rewrite software significantly
To convince people that they were not taking a risk on some "radical architecture",
Multiflow launched a massive PR and marketing program.
The campaign was masterminded by Brian Cohen - PR machine and future angel investor.
He threw himself into the task and so completely believed in it that he even named his son Trace.
It helped that the computer was blazing fast too. The TRACE 7/200
did 53 million instructions per second and 30 million floating point operations per
second. The followup 28/200 boasted specs four times higher than that.
The story also sold well. Fisher was a willing subject with a compelling
personal story. And the VLIW technology itself was intriguing. The notion of this
compiler correctly guessing some 90% of the branches in a program is eye-catching.
The computer's debut got covered by a wide variety of press outlets, including a full page
in Business Week. All in all, it was a triumphal moment. The impossible computer was real.
## Selling the TRACE
Multiflow originally targeted scientific customers: University and government labs.
Such labs wanted supercomputer-like performance at a fraction of the price.
A Cray would cost maybe $5 million as compared to a TRACE's $300,000.
They can load in their Fortran-coded programs and
let the compiler optimize it without additional modifications. Moreover,
scientific application programs were thought to have lots of opportunities for parallelism.
The computer turned out to be very useful for commercial users too. By the end of 1989,
they sold about 100 machines to 75 customers and about half
of those were commercial like P&G, Hewlett Packard, Motorola, and more.
In 1989, Multiflow released an upgraded line of machines:
The 7/300 series - four times faster than their predecessors. Impressively,
those gains were almost entirely achieved with an improved compiler. Just software.
Unfortunately, it was also too late. By then,
the company was already in a financial tailspin that it would not escape from.
## The End of the Minisuper
The minisuper boom had attracted a rogues' gallery of players like Convex, Alliant,
Cydrome and DEC. You can count up to 20 vendors in the market.
But in 1987, analysts had estimated the whole market size to be just about
$350 million. Considering it might take $20-30 million to develop a minisuper,
you don't need to think a long time to conclude that 20 is too many.
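The back-of-the-envelope math, assuming (unrealistically) that the market splits evenly across all vendors:

```python
# Figures from the text, in millions of US dollars.
market_size = 350
vendors = 20
dev_cost_low, dev_cost_high = 20, 30

# Assumption for illustration only: every vendor gets an equal share.
revenue_per_vendor = market_size / vendors
print(revenue_per_vendor)                  # 17.5
print(revenue_per_vendor < dev_cost_low)   # True
```

Even an equal slice of a year's market would not cover the low-end estimate for developing one machine.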
The presumption had been that the Crayettes would go after the real
deal. But supercomputer vendors like Cray upped their game. At the high-end,
they added the Cray Y-MP with way more power than the minisuper can offer.
Cray then shored up their low-end flanks with an extension of the older Cray X-MP.
Though pricier than a minisuper at $14 million, it offered compelling
price-performance for labs and weather stations that could not afford the Y-MP.
Moreover, Cray defanged one of the minisuper vendors' top selling points by
adopting a flavor of UNIX called UNICOS in 1985. This grew their platform and
gave national labs confidence that their applications would work on Cray hardware.
## The Killer Micros
But it was on the low end where the most serious competition was: The "Killer Micros".
It is a phrase that emerged in the 1980s that refers to powerful UNIX
workstations equipped with highly integrated single-chip CMOS CPUs.
Such RISC chips like Sun's SPARC, IBM's RS/6000, MIPS, and even Intel's i860
were getting more powerful each year - subsuming computer categories that once
existed like minicomputers and the minisupers. Put another way: Convergence, driven by the CPU.
Multiflow gained some benefits from having its CPU set up as clusters of discrete
compute modules. But this separation also meant that they could not benefit
from the exponential scaling of Moore's Law - which granted both size and power advantages.
And while Multiflow might have had the best software in the business,
their hardware was painfully lacking. George Weiss at Gartner Group would say about them:
> "The technology was good; they put tremendous effort into software,
but they needed to duplicate that on the hardware side,"
The low-end i860 ran at 25 megahertz and used just a few watts. The TRACE on the other hand ran
at 8 megahertz and needed multiple kilowatts and big copper buses to distribute that power. In the end,
ever faster cycle times let the Killer Micros make up for any architectural disadvantage.
And at just $100,000, these workstations' price points could not be beaten. Frankly,
big iron mainframes just fundamentally could not
keep up with the greatest cost scaling items in human history.
Analysts had once estimated that the mini-supercomputer market would almost
quadruple to over a billion dollars in 1991. That never happened. Instead,
starting in the summer of 1988, the whole category began to implode.
Vendors resorted to steep price cuts. And when that didn't work, exited stage
left. Celerity Computing, a San Diego-based UNIX vendor who tried to enter the space,
fell apart and was acquired in 1988 by Floating Point Systems.
But Floating Point itself was also struggling. Alliant too reported quarterly losses.
Cydrome - the only other minisuper company pursuing VLIW - folded without
commercially shipping a product. The losses came fast and hard.
## The Flow Ends
When the US market started to crash in 1989, Multiflow found itself on the wrong course.
The company lost money from the start - always one step behind
Convex and Alliant. They took too long to enter international markets
like Japan and Europe - not expanding there until it was way too late in 1989.
Weiss, the Gartner analyst, added that great but not game-changing
performance gave the company little chance to overcome its ecosystem disadvantages:
> "Multiflow never really delivered a dramatic performance — certainly
not enough to grab market attention ... Convex has a more aggressive sales force,
a more sophisticated hardware platform and had more software ported earlier in the game."
As 1989 came to a close, management increasingly focused on an acquisition by DEC as their last
chance. DEC deeply evaluated the VLIW technology as a potential platform for future work.
And while it passed many of their evaluations,
powerful voices both in and out of the company shot it down. Particularly
those behind a competing high-speed computer project called the VAX 9000.
In Christmas 1989, DEC started to backpedal - saying that it could
not do a deal right then because adding Multiflow's expenses to
the income statement would cause them to report their first financial loss.
And then in March 1990, DEC told Multiflow that the acquisition deal
was truly dead. Two of the company's venture capitalists tried to salvage it but failed.
There was no Plan B. They were out of money. With that, the board decided that Multiflow
should voluntarily liquidate. At the time, the company had about 160 employees. They gathered
for a meeting the next day to hear the news - and then got to work disassembling the company.
## Conclusion
In the end, Multiflow the company did not find the economic success that it desired.
But judging by what they were able to do and their influence on the computing world,
they succeeded beyond their wildest dreams. And ironically,
going out of business perhaps helped spread their ideas farther and wider.
The sheer amount of talent they had gathered was shocking,
considering how small they were. Fisher is a winner of the Eckert-Mauchly Award - widely
acknowledged as the most prestigious citation for computer architecture.
Another winner was the aforementioned Robert Colwell, who joined Intel and
became the chief architect for iconic chips like the Pentium Pro, Pentium II,
III, and Pentium 4 CPUs. Man is a legend.
To many of these employees or "Multifloids" as they called themselves, the company's failure
had less to do with the architecture than the business environment. This thing worked,
and it was capable of incredible performance.
So when the business wound down, the talents joined other companies like Hewlett-Packard,
Intel, DEC, and others and evangelized their ideas.
So VLIW lived on. Most famously - or infamously - with Hewlett-Packard and then of course,
Intel. But that is a story for another day.