YouTube transcript:
VLIW: The “Impossible” Computer


Summary

Core Theme

This content traces the development and impact of Very Long Instruction Word (VLIW) architecture, a radical approach to computer processing that achieved significant speedups through aggressive compiler-driven parallelism, despite ultimately facing market challenges.


You are walking down an alleyway late one night, and suddenly behind you,

a mysterious dark figure appears and says:

Sir, I promise you a computer. A computer up to 20 times faster than what is on the market.

You think. Well, what sort of unholy, insane 2-nanometer silicon will this computer be using?

No, the dark figure replies. Ordinary hardware. Same as what everyone else is

using. Maybe even a bit slower, clock time-wise.

So what is the catch?

If you accept this computer, then you must also accept some

of the most brilliant but maniacal compiler software ever created. Oh.

In today's video, we trace a radical idea to a computer 10-30 times faster than anything

thought possible. They said it was impossible. An intrepid platoon of geniuses proved them wrong.

## Beginnings

I want to start with a video game that I liked to play during the pandemic called Overcooked.

In this game, you play a chef in a relatively bizarre kitchen. And

you must cook and serve finished plates of food to fulfill incoming customer orders.

The recipes are largely simple. Two that I remember - and are illustrative

of where we are going - are the salad and the hamburger. Let's begin with the salad first.

The basic salad is easy. Your chef character gets the lettuce from the bin, brings it to

the chop station and chops it, sticks it onto a plate, and then delivers to the customer.

The burger is a bit more complex. You take a bun, lettuce, and beef steak out

of the bin. You must chop the lettuce and beef steak. Do not chop the bun.

However, the beef steak also has to be cooked on the oven.

Do not cook the steak until you have chopped it first.

The world of the CPU is a little like Overcooked. A CPU is a kitchen that

takes in raw data inputs/ingredients and transforms them into finished outputs.

The various steps to produce a salad or burger dish are Instructions.

An instruction is essentially just a string of 1s and 0s that tell the CPU what specific

action to take and on what data. Get the lettuce. Chop the meat. Plate the burger.

The CPU's life is to fetch instructions out of memory, decode them, and then execute them,

keeping the data it is working on in small high-speed memory of its own (called "registers").

There are more steps in this lifecycle but fetch, decode, and execute are the basics.
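As a rough illustration of that fetch, decode, execute loop, here is a toy interpreter in Python. It is only a sketch: the opcodes (`LOAD`, `ADD`, `HALT`), the register names, and the tuple format are made up for this example, and a real CPU works on bit patterns rather than Python tuples.

```python
# Toy fetch-decode-execute loop. The instruction format is invented for illustration.
PROGRAM = [
    ("LOAD", "r0", 2),           # r0 = 2
    ("LOAD", "r1", 3),           # r1 = 3
    ("ADD", "r2", "r0", "r1"),   # r2 = r0 + r1
    ("HALT",),
]


def run(program):
    registers = {}
    pc = 0  # program counter: index of the next instruction
    while True:
        instruction = program[pc]        # fetch
        opcode, *operands = instruction  # decode
        pc += 1
        if opcode == "LOAD":             # execute
            dest, value = operands
            registers[dest] = value
        elif opcode == "ADD":
            dest, a, b = operands
            registers[dest] = registers[a] + registers[b]
        elif opcode == "HALT":
            return registers
        else:
            raise ValueError(f"unknown opcode: {opcode}")


print(run(PROGRAM))  # {'r0': 2, 'r1': 3, 'r2': 5}
```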

When a programmer writes software - or vibe-codes it using Claude Code - they are most often writing

that program in a higher level language like C or Fortran that they can read.

But the CPU hardware can't read that. So the vibe coder’s program

must be translated or compiled into instructions that the CPU

hardware can follow. Chop the tomato. Chop the lettuce. Plate the salad.

This is done by a special software called the compiler. The compiler

can also do a whole bunch of other stuff, but let me get to that later.

Put together, all the instructions that a CPU can handle make up what we call the instruction set.

In a way, we can consider the CPU itself the literal physical manifestation of its instruction

set. After all, what sits inside the kitchen tells you what that kitchen is capable of, no?

## Parallelism

This all makes sense, right? Now, how might we get our food faster?

A simple way is to just make the chefs move around faster. In that,

I mean to raise the CPU's clock speed. So straight Moore's Law.

We shrink the transistor so signals travel faster from source to drain.

The fabs were doing that. But is there anything else? Early in computer history,

various scientists proposed concepts to speed up operations with a little cleverness.

Let us consider the burger. You might decide to do all the steps - bun,

lettuce, meat - one after the other.

But if you think about it, we have time spent doing nothing while waiting for the meat to cook.

Why not use that time to chop the lettuce?

So let us do the meat first. Then chop lettuce.

This is parallelism - reordering and overlapping independent

instructions for faster overall processing.

Specifically, this is a category of techniques that we call Instruction-Level Parallelism or ILP.

I hope this made sense. I am now leaving the

Overcooked metaphor behind before it ... overcooks.

## Out of Order

The trouble with parallelism however is that

many instructions depend on the outputs of their priors.

You cannot grill the burger meat unless you have chopped it first. You can’t decode something until

you have fetched it first. You cannot add A and B together until you know what A and B actually are.

We generally break up a program's code into blocks.

Each block is usually only a few instructions long. Just about six.

Each block tends to end with a conditional branch like an if-else statement, function call or loop.

The name says it all: conditional branches create branching paths in the code.

The presence of such branches means that we don’t

know what instructions will run or what data they will run on.

Without knowing that, we cannot run that many instructions in parallel. A famous 1970 paper looked into this,

and with a few assumptions concluded that program code on average contained

only enough independent work to do about 2 operations at the same time.
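To make "independent work" a bit more concrete, here is a small Python sketch that estimates how many operations a single block could run at once. This is not the method from the 1970 paper. It assumes every operation takes one cycle, that each result name is written only once, and that only true data dependencies matter. The function name and the example block are made up.

```python
def available_parallelism(ops):
    """ops: list of (dest, [source names]) in program order.

    Returns (number of ops) / (length of the longest dependency chain).
    """
    if not ops:
        return 0.0
    ready_at = {}  # value name -> cycle at which it becomes available
    depth = 0
    for dest, sources in ops:
        # An op can start once all of its inputs have been produced.
        start = max((ready_at.get(s, 0) for s in sources), default=0)
        ready_at[dest] = start + 1  # assumption: every op takes one cycle
        depth = max(depth, ready_at[dest])
    return len(ops) / depth


BLOCK = [
    ("a", []),           # load a
    ("b", []),           # load b
    ("c", ["a", "b"]),   # c = a + b
    ("d", ["c"]),        # d = c * 2
    ("e", []),           # load e
    ("f", ["e", "d"]),   # f = e - d
]

print(available_parallelism(BLOCK))  # 1.5
```

Six operations sit on a dependency chain four deep, so even with unlimited hardware this block averages 1.5 operations per cycle.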

So that seemed to settle it. Parallelism can’t do that much for us ... right? But what if we decided

to throw out those assumptions? Free ourselves from all the legacy stuff? What then is possible?

## A Radical Idea

In the late 1970s, a graduate student at the Courant Institute of Mathematical Sciences at NYU named Josh

Fisher joined a project to build an emulator of the supercomputer CDC-6600. They called it PUMA.

The CDC-6600 is an iconic computer, but PUMA would try to shrink it using modern

integrated circuits. Fisher's role at the start involved making tools for computer-aided

chip design: programs to lay out and route wires, as well as simulation.

He then started working on the 64-bit microcode for the emulator. This microcode sought to break

down pieces of program code - originally meant to be run linearly - into smaller,

simpler blocks that can be run in parallel.

As he did this, it occurred to him that these two tasks - chip layout and code scheduling - were

conceptually similar. Both ingest one-dimensional lists and put out two-dimensional maps or grids.

A program producing a chip layout takes in a netlist - which is a

simple one-dimensional list of an IC’s transistors - and produces a layout:

A two-dimensional map of each of those transistors' placements.

With code scheduling, it was similar.

Take a list of operations normally performed linearly - like fetch, decode,

execute - and turn it into a two-dimensional grid where those operations can be run in parallel.
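The list-to-grid idea can be sketched with a generic greedy list scheduler in Python. This is not Fisher's algorithm, just a minimal stand-in. It assumes every operation takes one cycle, that any slot can run any operation, and that dependencies always appear earlier in the list. The operation names are made up.

```python
def schedule(ops, width):
    """ops: list of (name, [names it depends on]) in program order.

    Returns a grid: one row per cycle, each row holding up to `width` op names.
    """
    cycle_of = {}
    grid = []
    for name, deps in ops:
        # Earliest cycle strictly after every dependency has executed.
        cycle = max((cycle_of[d] + 1 for d in deps), default=0)
        while cycle < len(grid) and len(grid[cycle]) >= width:
            cycle += 1  # that row is full, try the next cycle
        while len(grid) <= cycle:
            grid.append([])
        grid[cycle].append(name)
        cycle_of[name] = cycle
    return grid


OPS = [
    ("load_a", []),
    ("load_b", []),
    ("add", ["load_a", "load_b"]),
    ("load_c", []),
    ("mul", ["add", "load_c"]),
    ("store", ["mul"]),
]

for row in schedule(OPS, width=2):
    print(row)
# ['load_a', 'load_b']
# ['add', 'load_c']
# ['mul']
# ['store']
```

The one-dimensional list of six operations becomes a four-row grid, two slots wide.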

This led him to an optimization technique known

as Trace Scheduling. Gosh, how am I going to do this one ...

Let me revisit the concept of a program being broken up into many

blocks. Each block is maybe just a few instructions long and as I said before,

ends with a conditional jump like an if-else statement or loop.

Such things create branching paths in the code. Since it had been assumed that

we cannot predict the future state of the program, we are left to only search

for parallelism opportunities within these tiny blocks. There are not many.

Trace scheduling busts through this assumption.

The compiler traces through the entire program as if it were a single block and predicts

a likely code execution path using either heuristics or actual performance information.

In essence, it assumes that the program's conditional jumps do not exist.

This magical compiler then aggressively schedules all

the instructions in that trace - moving up anything that can be parallelized.

These instructions are bundled into a very long instruction word.

Sometimes the trace gets it right. For example, with a loop,

it is most likely that the program just jumps back to

the start of the loop again. The compiler can just presume that is the case.

If the trace gets it right, then we blast through the program like the Millennium Falcon - achieving

parallelism speedups of 10 to 30 times, far beyond what was previously thought possible.
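Picking a trace can be sketched in a few lines of Python. This is a simplified stand-in, not the actual Bulldog or Multiflow algorithm: it starts at the entry block and always follows the most likely outgoing edge until it hits an exit or a block it has already visited. The block names and the probabilities are invented for the example.

```python
def pick_trace(cfg, entry):
    """cfg maps block name -> {successor name: estimated probability}."""
    trace, seen = [], set()
    block = entry
    while block is not None and block not in seen:
        trace.append(block)
        seen.add(block)
        successors = cfg.get(block, {})
        # Follow the most likely edge; a block with no successors is an exit.
        block = max(successors, key=successors.get) if successors else None
    return trace


CFG = {
    "entry": {"loop": 1.0},
    "loop": {"body": 0.9, "exit": 0.1},
    "body": {"rare_fix": 0.05, "latch": 0.95},
    "rare_fix": {"latch": 1.0},
    "latch": {"loop": 1.0},
    "exit": {},
}

print(pick_trace(CFG, "entry"))  # ['entry', 'loop', 'body', 'latch']
```

The blocks on the chosen trace are then scheduled together as if they were one long block. The blocks left out (`rare_fix` and `exit` here) are the off-trace paths.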

Sooo what happens if the trace doesn't get it right?

With existing CPUs, there are extra circuits or hardware to fix such compiler mistakes on

the fly. It is nicer for them because they know the variables at runtime.

Trace scheduling eschews that because it means more complex hardware, which goes against its

philosophy of shifting all complexity to the software. So we need something else.

Imagine we are cave diving - and I will never go cave diving. When we come across a fork,

we might attach a string or line so we can backtrack.

Compensation code is the compiler's cave dive string. If the compiler decides to move up an

instruction, it must add code so that if the trace goes wrong, the computer can backtrack or redo.

The compiler might also continue on tracing a different path, hoping that it eventually

returns to the fold. There is a real risk of code bloat, where the compiler adds SO much

compensation code that the thing cannot achieve the promised performance goals.
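Here is a hand-written Python stand-in for what that looks like. It is a sketch only: a real compiler does this on machine operations, not Python source, and the function names are made up. In `original`, the operation `t = b + 1` runs after the two paths rejoin. In `scheduled`, it has been moved up into the likely path, so a copy has to be added on the unlikely path to keep the result the same.

```python
def original(a, b, rare):
    if rare:
        a = a + 10   # off-trace block, rarely taken
    else:
        b = b * 2    # on-trace block, the predicted path
    t = b + 1        # runs after the two paths rejoin
    return a + t


def scheduled(a, b, rare):
    if rare:
        a = a + 10
        t = b + 1    # compensation copy: this path no longer reaches the original op
    else:
        b = b * 2
        t = b + 1    # moved up from below the join into the on-trace block
    return a + t


# Both versions must agree on every path, likely or not.
for rare in (False, True):
    print(rare, original(1, 2, rare), scheduled(1, 2, rare))
# False 6 6
# True 14 14
```

Every operation moved this way can add a copy somewhere else, which is where the code bloat comes from.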

Hopefully, you can see why this can get a bit tricky. We must guess the

program's future before we run it. This compiler practically has to be a time-traveling Mary Sue.

## Hardware and Compiler Together

After finishing his PhD,

Fisher found a tenure-track faculty job at Yale, moving there in 1979.

Soon thereafter, he gathered a team of smart and talented students and tried to

implement in software the trace scheduling technique that he had written about before.

In 1980, Fisher consulted for General Electric, working to adapt trace scheduling for a computer

made by a company called Floating Point Systems. But it ended up failing because

the hardware’s fussiness and complexity made it difficult to achieve much parallelism.

Embarrassed by the failure, Fisher dug up manuals for other computers like the CDC Cyberplus and

found that those too were too complicated to get the degree of parallelism he wanted. It

always seemed like the highest possible parallelism speedup was about 2-3 times.

Fisher got frustrated. Eventually, he came to believe that the only way to get the gains he

sought was to implement both the compiler and hardware simultaneously. And he started

thinking about what kind of architecture such a computer would need to have.

## VLIW

In 1982, Fisher submitted a paper to the International

Symposium on Computer Architecture about his work.

It is titled “Very Long Instruction Word Architectures and the ELI-512”,

and details his new trace scheduling-centric computer: The ELI-512.

The ELI stands for “Enormously Long Instruction”,

and also serves as an inside joke for Yale people. It was a simplified RISC

device, but modified to run a pack of multiple instructions. Fisher writes:

> Everyone wants to use cheap hardware in parallel to speed up computation. One obvious approach would be to take your favorite Reduced Instruction Set Computer ... let it be capable of executing 10 to 30 RISC-level operations per cycle controlled by a very long instruction word. (In fact, call it a VLIW.) A VLIW looks like very parallel horizontal microcode.

The paper's real audacity was less its hardware than its

Trace Scheduling-enabled compiler, named Bulldog.

Today this paper is acknowledged as one of the greats. But when it was released,

people greeted it with polite incredulity. A graduate student at Carnegie Mellon named

Bob Colwell recalls reading the VLIW paper and thinking this guy was nuts:

> I thought, he wants to do what with a compiler? This guy is nuts. He wants to move code all over the place and then make up for that intentional code misbehavior with yet more code. You'll never get away with it, it seemed to me, and even if you can patch things back up, the new overhead will kill performance.

He thought the complexity would eat them all alive. Thusly,

it was inevitable that Colwell would later on down the line work with Josh Fisher.

## Notorious H.A.R.D.W.A.R.E.

As the months passed, Fisher discovered a curious effect.

He discovered that whenever he came to talk about the VLIW computer, people filled the room - mostly

to tell him that he was wrong and that his computer was impossible. The energy was electric.

By contrast, whenever he came to talk about the trace scheduling compiler technique,

it was crickets. With gentle, general agreement that why yes this can work. He later reminisced:

> You can get more people, a LOT more people, to come to your talk if you promise them bizarre sounding hardware instead of a compiler technique.

To Fisher, this made little sense because to him the hardware was the easy part! The compiler does

all the hard work of arranging and scheduling the instructions across blocks for parallelism.

The hardware just does whatever the compiler tells it to do.

Get the compiler right and the rest falls into place. And why does this

approach feel so alien to people? Is this not what John Cocke of RISC fame also advocated?

Fisher started writing papers evangelizing the VLIW approach,

with provocative titles that got the people going. It brought him notoriety,

but young and restless, he soon realized that he wanted more.

People kept telling him that his computer was impossible. He wanted to prove them wrong.

But building such a computer from scratch took more resources than

what could be found within academia. And the big computer companies like IBM or

DEC seemed to have no interest in funding something so unproven as the VLIW technique.

## Founding Multiflow

The early 1980s were an interesting time for hardware startups.

In 1979, new US laws allowed pension funds to invest in venture capital funds,

greatly expanding their assets under management.

Total venture capital funds would grow ten-fold between 1980 and 1989.

A lot of this VC funding went into computer startups seeking to challenge

entrenched players or just try new things. Apollo Computer, Silicon Graphics, Compaq,

Thinking Machines. They were all founded around this time.

As the VC boom grew, computer science academics

began leaving to found their own computer hardware startups. Much

like how many university researchers in AI today are leaving to do AI Neolabs.

After five months of pondering and consulting with family and colleagues,

Fisher too decided to leave Yale and do a startup. He was joined by his

graduate student John Ruttenberg and systems manager John O'Donnell.

Since this was the VC boom, they naturally wanted to take VC money. But in late 1983,

they met with Apollo Computer, the famed workstation maker. Apollo offered to fund

the VLIW computer's development. Once it was done,

Apollo would market it. To get started, the company offered a $500,000 loan.

Now for a name. The obvious one was VLIW Technology - there was a company out there

called VLSI Technology too - but Josh Fisher wanted something warm and fuzzy,

because the term VLIW by then had already gained a little notoriety.

They pondered "Mercury" - because it was in a similar vein to Apollo - and Elm

City Supercomputer - because that was their city.

In the end, Ruttenberg coined the name "Multiflow", which seemed to

convey the visual of logic flowing through the computer. Despite concerns that people

might think their company drilled for oil or produced high-tech toilets, they chose it.

And thus in April 1984, Multiflow got started. Just six months later,

the Apollo deal collapsed after its CEO was replaced. With money running low, Fisher and the

other cofounders went to the VCs - personally borrowing money to keep paying employees.

In February 1985, they closed a $7 million round. Another $26

million over two rounds would be raised over the next two years.

By the way, before we continue, I want to highly recommend the book "Multiflow

Computer" by Elizabeth Fisher. Elizabeth is Josh's wife, and thus had a front-row

view to the whole saga. It is fantastic. Read it for an in-depth look into life at a startup.

## The Mini-Super Boom

Multiflow sought to develop and sell high performance computers

for the scientific and engineering markets.

These silicon monsters do the most complex, time-consuming calculations. Often with many

decimal places of accuracy. They also cost upwards of tens of millions of

dollars which restricted use to the biggest government labs on a time-share basis.

What if we can produce a supercomputer with a significant percentage of performance at

a fraction of the price and size? This would make more compute power available

to companies that needed it to run increasingly specialized calculations.

There had always been "small supercomputers", but those truly were supercomputers - made by

traditional supercomputer makers like Cray or Fujitsu. They were still quite hefty.

Then in 1985, a small startup in Texas called Convex released the C-1 computer.

They marketed it as a "mini-supercomputer", or "mini-super" or "super-minicomputer" - and called

it a new category of computer. Some people dubbed them "Crayettes" which is funny to me.

Enabled by advancing VLSI semiconductor technology, the C-1 was less a small

supercomputer than a souped-up minicomputer. That made them more of a threat to the famed

minicomputer-maker DEC than Cray. But anyhow the category took off with a wave of new entrants.

There were a lot of these guys. There was Alliant Computer Systems, founded the same year as Convex.

There was Scientific Computer Systems, which booted up in 1983. And then after that,

a half dozen new companies like Cydrome (remember this name), Gould, and of course Multiflow.

Multiflow would arrive in the second wave of this minisuper boom,

meaning that they had to take on several established players. They

had to turn the ELI-512 paper's concepts into a working, manufacturable product. Then

convince actual customers that the "impossible computer" was indeed real and worth adopting.

Over the span of two years, the team frantically worked days,

nights and weekends to 2 AM or later to put together their first computer: the TRACE 7/200.

## The Hardware

The VLIW philosophy says to keep the hardware simple.

Keep it simple so that it can be manufactured fast and at scale.

The TRACE computer had multiple execution units to do arithmetic, logic, floating-point numbers,

loading from/storing to memory, plus a conditional branch unit.

If the computer was to be highly parallel ... if you want to give the compiler the greatest

freedom to do whatever, whenever ... then the device had to also be unusually interconnected.

Fisher's original paper showed a rough sketch of his vision. The global interconnection

had 16 clusters - which contain the various execution units together with their memories.

They are interconnected both to their sides and across with buses, or wires.

The whole thing looks vaguely abyssal - as

if we are trying to summon the VLIW demon from the architectural depths.

Manufacturing such a complicated structure conjured similar terrors. The TRACE computers

targeted scientific use cases, which demanded larger 64-bit double precision floating point.

That is a lot of data, so very large buses. 64 separate copper pins plus

the control signals on a connector. And with dozens of buses - remember

to count the buses to the sides as well as across - that adds up to thousands of pins.

The persistent proliferation of petite pins occasionally made

it difficult to plug them all into the computer's backplane.

The aforementioned Bob Colwell writes that the hardware lab had a big, heavy, and world-weary

rubber mallet to coax these VERY expensive pins into their place. They called it "the Persuader".

Elizabeth Fisher relates a story that happened as

the hardware design approached its ship date to the fabricator.

It was the night before the deadline,

and the hardware team was trying to fit all the memory registers into the computer,

which was a struggle because the hardware had to be so interconnected and space was so limited.

If you recall, the register refers to the high speed memory holding

the chip's runtime variables as it does stuff.

The spec called for the registers and arithmetic units to be fully interconnected. But the hardware

team couldn’t fit and connect all that together. There simply was no space.

Desperate to make the deadline and with no one on the compiler team around,

the hardware people saw no choice but to split the

pair of register chips - attaching one to the outside of each of the two arithmetic units.

The compiler team was infuriated because this in certain cases can create a split

brain problem where the two arithmetic units see different things and disagree. But it was

too late. The silicon was locked in. Fortunately it wasn't catastrophic.

## The Compiler

But the core of the TRACE series was not hardware. It was the software, the compiler.

As I mentioned, the hardware is so simple because the compiler takes

on all the complexity. Anything that can be shifted over, was.

Each cycle during operations, the CPU fetches its very long instruction

word from its memory with the multiple instructions that the compiler bundled

together. After the word unbundles, its instructions go straight to the units.
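A rough Python sketch of what "unbundling" a word might look like is below. The slot format, the three operation names, and the rule that every slot reads its inputs before any slot writes its result are all assumptions made for illustration. They are not taken from the TRACE hardware manuals.

```python
import operator

# Hypothetical execution units, one Python function per operation name.
UNITS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def execute_word(word, regs):
    """word: list of slots, each (op, dest, src1, src2) or None for an empty slot."""
    # Read phase: every slot sees the register values from before this cycle.
    results = []
    for slot in word:
        if slot is None:
            continue  # an empty slot is a no-op
        op, dest, src1, src2 = slot
        results.append((dest, UNITS[op](regs[src1], regs[src2])))
    # Write phase: all results land together at the end of the cycle.
    new_regs = dict(regs)
    for dest, value in results:
        new_regs[dest] = value
    return new_regs


REGS = {"r0": 6, "r1": 2}
WORD = [
    ("add", "r2", "r0", "r1"),
    ("sub", "r3", "r0", "r1"),
    ("mul", "r4", "r0", "r1"),
    None,
]

print(execute_word(WORD, REGS))
# {'r0': 6, 'r1': 2, 'r2': 8, 'r3': 4, 'r4': 12}
```

Nothing here checks whether two slots write the same register or depend on each other. In this model that job belongs entirely to the compiler that built the word.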

For this first 7/200 computer, each word had up to 7 instructions bundled together

and was about 256 bits wide. Multiflow later released the 14/200 - which packed

together 14 instructions - as well as the 28/200 with 28 instructions and 1024 bits.

Without hardware circuits to direct the flow of data through the buses

or resolve memory conflicts at runtime like with other CPUs,

it is all on the compiler to coordinate that. It's got to do everything.

Modeled on the Yale Bulldog compiler, the Multiflow TRACE compiler turns programs

written in Fortran or C into high performance code for the computer. It does this over three phases.

In phase 1, the compiler takes the Fortran and C

code and turns it into an intermediate representation called IL-1. The idea

is to capture language-specific rules and programmer intent for later phases.

In phase 2, the compiler takes the IL-1 representation and reinterprets

it again at a lower level for the machine. It runs an optimization

step to reduce the amount of computation and increase the amount of parallelism.

For example, the compiler addresses loops by unrolling them - copying the loop's body for

some number of iterations as determined by some heuristic for the scheduler. After cleaning up

some variable names, the unrolled loop is ready to be exploited for max parallelism.
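Unrolling by hand looks something like the Python sketch below. The factor of 4 and the four separate accumulators are assumptions picked for the example, not what the Multiflow compiler necessarily chose. Python itself gains nothing from this. The point is the shape of the code a scheduler would get to work with.

```python
def dot(xs, ys):
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]  # each iteration waits on the previous total
    return total


def dot_unrolled4(xs, ys):
    n = len(xs)
    # Renamed accumulators: the four multiply-adds in the body no longer
    # depend on each other, so they could be issued side by side.
    t0 = t1 = t2 = t3 = 0.0
    i = 0
    while i + 4 <= n:
        t0 += xs[i] * ys[i]
        t1 += xs[i + 1] * ys[i + 1]
        t2 += xs[i + 2] * ys[i + 2]
        t3 += xs[i + 3] * ys[i + 3]
        i += 4
    total = (t0 + t1) + (t2 + t3)
    while i < n:  # leftover iterations when n is not a multiple of 4
        total += xs[i] * ys[i]
        i += 1
    return total


XS = [float(i) for i in range(10)]
YS = [float(i + 1) for i in range(10)]
print(dot(XS, YS), dot_unrolled4(XS, YS))  # 330.0 330.0
```

With general floating-point inputs, summing in a different order can change the rounding slightly, so the two versions are only guaranteed to match exactly on values like these small whole numbers.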

The output of phase 2 is another intermediate representation called

IL-2. We are now finally ready for Phase 3. This is where the

actual trace scheduling algorithm is run and instructions are scheduled.

As I said earlier, the algorithm runs through the program code and guesses a

likely path using heuristics or profiled data given to the compiler by the user.

After scheduling, the algorithm will insert compensation code

to cover up any potential off-track branches.

The result is a compiler that enables the TRACE computers to outperform a RISC-based

MIPS computer in well known benchmarks like LINPACK by anywhere from 2 to 10 times.

Real world performance however did depend on the individual program, which infuriated

salespeople on both sides to no end. "Your mileage will vary" was a common refrain.

It's definitely not perfect. One flagged issue was that the compiler runs very

slowly - four times slower than one of DEC's RISC-based workstations, in part

because the Multiflow compiler creates six representations of

the program throughout its three phases. Compilation sometimes took up to 3 days.

Nevertheless. It was a marvel. The trace scheduling algorithm worked. The Multiflow

compiler team were wizards, and produced a software program that was surprisingly

reliable for something that had to literally predict the future.

## Debut

Multiflow debuted the TRACE series in April 1987 at a glitzy event at the World Trade Center.

Multiflow lined up three beta customers including the Supercomputer Research

Center - a division of the US NSA. They all gave glowing endorsements. Grumman

Data Systems said that the computer was running their software two hours after being uncrated.

They then took the TRACE to a 1988 supercomputing conference held in Santa Clara. For years,

people had told Josh Fisher that VLIW was impossible. But now here it was,

running UNIX and working like a real computer.

The CAD chief at Sikorsky Aircraft said about the TRACE:

> To many of us, what the Multiflow people told us it could do seemed like black magic ... [but now] not only do you have a reasonably priced supercomputer, but you don't have to rewrite software significantly

To convince people that they were not taking a risk on some "radical architecture",

Multiflow launched a massive PR and marketing program.

The campaign was masterminded by Brian Cohen - PR machine and future angel investor.

He threw himself into the task and so completely believed in it that he even named his son Trace.

It helped that the computer was blazing fast too. The TRACE 7/200

did 53 million instructions per second and 30 million floating point operations per

second. The followup 28/200 boasted specs four times higher than that.

The story also sold well. Fisher was a willing subject with a compelling

personal story. And the VLIW technology itself was intriguing. The notion of this

compiler correctly guessing some 90% of the branches in a program is eye-catching.

The computer's debut got covered by a wide variety of press outlets, including a full page

in Business Week. All in all, it was a triumphal moment. The impossible computer was real.

## Selling the TRACE

Multiflow originally targeted scientific customers: University and government labs.

Such labs wanted supercomputer-like performance at a fraction of the price.

A Cray would cost maybe $5 million as compared to a TRACE's $300,000.

They could load in their Fortran-coded programs and

let the compiler optimize them without additional modifications. Moreover,

scientific application programs were thought to have lots of opportunities for parallelism.

The computer turned out to be very useful for commercial users too. By the end of 1989,

they sold about 100 machines to 75 customers and about half

of those were commercial like P&G, Hewlett Packard, Motorola, and more.

In 1989, Multiflow released an upgraded line of machines:

The 7/300 series - four times faster than their predecessors. Impressively,

those gains were almost entirely achieved with an improved compiler. Just software.

Unfortunately, it was also too late. By then,

the company was already in a financial tailspin that it would not escape from.

## The End of the Minisuper

The minisuper boom had attracted a rogues' gallery of players like Convex, Alliant,

Cydrome and DEC. You can count up to 20 vendors in the market.

But in 1987, analysts had estimated the whole market size to be just about

$350 million. Considering it might take $20-30 million to develop a minisuper,

you don't need to think a long time to conclude that 20 is too many.

The presumption had been that the Crayettes would go after the real

deal. But supercomputer vendors like Cray upped their game. At the high-end,

they added the Cray Y-MP with way more power than the minisuper could offer.

Cray then shored up their low-end flanks with an extension of the older Cray X-MP.

Though pricier than a minisuper at $14 million, it offered compelling

price-performance for labs and weather stations that could not afford the Y-MP.

Moreover, Cray defanged one of the minisuper vendors' top selling points by

adopting a flavor of UNIX called UNICOS in 1985. This grew their platform and

gave national labs confidence that their applications would work on Cray hardware.

## The Killer Micros

But it was on the low end where the most serious competition was: The "Killer Micros".

It is a phrase that emerged in the 1980s that refers to powerful UNIX

workstations equipped with highly integrated single-chip CMOS CPUs.

Such RISC chips like Sun's SPARC, IBM's RS/6000, MIPS, and even Intel's i860

were getting more powerful each year - subsuming computer categories that once

existed like minicomputers and the minisupers. Put another way: Convergence, driven by the CPU.

Multiflow gained some benefits from having its CPU set up as clusters of discrete

compute modules. But this separation also meant that they could not benefit

from the exponential scaling of Moore's Law - which granted both size and power advantages.

And while Multiflow might have had the best software in the business,

their hardware was painfully lacking. George Weiss at Gartner Group would say about them:

> "The technology was good; they put tremendous effort into software, but they needed to duplicate that on the hardware side,"

The low-end i860 ran at 25 megahertz and used just a few watts. The TRACE on the other hand ran

at 8 megahertz and needed multiple kilowatts and big copper buses to distribute that power. In the end,

ever faster cycle times let the Killer Micros make up for any architectural disadvantage.

And at just $100,000, these workstations' price points could not be beaten. Frankly,

big iron mainframes just fundamentally could not

keep up with the greatest cost scaling items in human history.

Analysts had once estimated that the mini-supercomputer market would almost

quadruple to over a billion dollars in 1991. That never happened. Instead,

starting in the summer of 1988, the whole category began to implode.

Vendors resorted to steep price cuts. And when that didn't work, exited stage

left. Celerity Computing, a San Diego-based UNIX vendor who tried to enter the space,

fell apart and was acquired in 1988 by Floating Point Systems.

But Floating Point itself was also struggling. Alliant too reported quarterly losses.

Cydrome - the only other minisuper company pursuing VLIW - folded without

commercially shipping a product. The losses came fast and hard.

## The Flow Ends

When the US market started to crash in 1989, Multiflow found itself on the wrong course.

The company lost money from the start - always one step behind

Convex and Alliant. They took too long to enter international markets

like Japan and Europe - not expanding there until it was way too late in 1989.

Weiss, the Gartner analyst, added that great but not game-changing

performance gave the company little chance to overcome its ecosystem disadvantages:

> "Multiflow never really delivered a dramatic performance — certainly not enough to grab market attention ... Convex has a more aggressive sales force, a more sophisticated hardware platform and had more software ported earlier in the game."

As 1989 came to a close, management increasingly focused on an acquisition by DEC as their last

chance. DEC deeply evaluated the VLIW technology as a potential platform for future work.

And while it passed many of their evaluations,

powerful voices both in and out of the company shot it down. Particularly

those behind a competing high-speed computer project called the VAX 9000.

Around Christmas 1989, DEC started to backpedal - saying that it could

not do a deal right then because adding Multiflow's expenses to

the income statement would cause them to report their first financial loss.

And then in March 1990, DEC told Multiflow that the acquisition deal

was truly dead. Two of the company's venture capitalists tried to salvage it but failed.

There was no Plan B. They were out of money. With that, the board decided that Multiflow

should voluntarily liquidate. At the time, the company had about 160 employees. They gathered

for a meeting the next day to hear the news - and then got to work dismantling the company.

## Conclusion

In the end, Multiflow the company did not find the economic success that it desired.

But judging by what they were able to do and their influence on the computing world,

they succeeded beyond their wildest dreams. And ironically,

going out of business perhaps helped spread their ideas farther and wider.

The sheer amount of talent they had gathered was shocking,

considering how small they were. Fisher is a winner of the Eckert-Mauchly Award - widely

acknowledged as the most prestigious citation for computer architecture.

Another winner was the aforementioned Robert Colwell, who joined Intel and

became the chief architect for iconic chips like the Pentium Pro, Pentium II,

III, and Pentium 4 CPUs. Man is a legend.

To many of these employees or "Multifloids" as they called themselves, the company's failure

had less to do with the architecture than the business environment. This thing worked,

and it was capable of incredible performance.

So when the business wound down, the talents joined other companies like Hewlett-Packard,

Intel, DEC, and others and evangelized their ideas.

So VLIW lived on. Most famously - or infamously - with Hewlett-Packard and then of course,

Intel. But that is a story for another day.


https://www.youtube.com/watch?v=UF8uR6Z6KLc
