YouTube Transcript:
Benchmarking ZSTD for our Linux distribution!
Welcome everyone to more open source Linux development stuff here. Today I wanted to benchmark zstd, the latest and greatest compression. Twenty years ago ROCK Linux was one of the first to adopt bzip2, which at the time was the latest and greatest, and now T2 was one of the first to adopt Zstandard — maybe even nearly, kind of, sort of the first. So what that means is compressing our binary packages and the build sources, like Firefox — compressing those using zstd.
I made a video about this probably three years ago. In the meantime some other Linux distributions also switched to Zstandard for its extremely outstanding decompression speed while still maintaining quite some compression. So the thing is, we have two use cases here: we download the sources and recompress them for local storage and for our mirrors, because they might come in a lesser format, and we also have the binary packages. For the mirror we actually use zstd ultra 20, and what is confusing: I thought that was the highest level — or maybe it's not the highest.
Because I only now realized the man page says ultra — by the way, the higher levels are guarded behind --ultra, so that people don't accidentally shoot themselves in the foot, because levels 20 and up, to the maximum of 22, require significantly more memory.
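For reference, here is roughly what that looks like on the command line; the archive name is just a placeholder:

```sh
# Levels up to 19 are available directly:
zstd -19 llvm.tar -o llvm.tar.zst

# Levels 20-22 are refused unless --ultra is passed explicitly,
# because they need much larger windows and therefore more memory:
zstd --ultra -22 llvm.tar -o llvm.tar.zst
```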
This is also why I changed the tarball creation for 32-bit platforms — for the PS3 probably, but in general for vintage hardware, even a 486 — because otherwise you need too much swap and it obviously becomes less performant. It helps nothing if it's the highest-compression stuff if it needs so much memory that it's constantly swapping to death. That is why I changed this just recently, some months ago, and that is also kind of the point of this video — and probably I should make more presentation-like videos for tutoring people, but anyway, today we are quickly live streaming the work. Because now I've noticed: here we use 21, or — oh, I changed this already, sorry — we use 16 and 20, and I was just toying around here, so I can also live stream that. So it was 20.
So I guess we will not — yeah, the thing is, instead of just ballparking it: I think I came to 20 some three years ago, probably because beyond 20 it became just way too long to compress, and 16 — I don't remember why. Which is a lot of the point of this video — I read now that it's ultra, I think because we briefly touched this yesterday, right. So it sounds like maybe 19 is enough, also because yesterday I noticed the PowerPC ISO has become so much larger. And what I actually — quickly, aka the famous "quickly" — five hours later — welcome everyone, epson — what are you even freaking posting there, anyway.
We should actually benchmark that a little bit more properly, theoretically. It's also a little bit funny that, with all the Zstandard fame — I mean, basically probably even I could make a better video if I wanted to — but it's a little bit funny that this major project didn't update their stuff. You would think such big open source projects, like at Facebook, could theoretically update that benchmark graph for each version, but I guess it's mostly a year or two old, GCC 7 and so on. I mean, I understand compiling this takes quite some time, as you see here live, but I would expect people working at Facebook to automate that.
So let's take something not as large, or so — maybe let's take LLVM. Here, on the Epyc — Threadripper — this was originally around 400 megabytes. So we should really create some nice graph for that, in time.
And then — I keep meaning to read that up somewhere all the time. All right — it's also like, user — yeah, what did I use, okay. So what I want to know is — the point is also to test this stuff properly, right. So that should be the maximum resident set size — I would hope that covers all allocations; I mean, do they explain that better anywhere? Yeah, you really need to know your Unix stuff to be sure. Anyway, we also get all the times: elapsed real time...
Combined, what we want to know is how long it took — so that would be total user time (for multi-processing systems we also want... okay, let's just use it like that for now) — and the maximum memory size. This is not even exactly what we really want to know, but let's first start with that: zstd compresses llvm and writes out the archive, also with the level flag — so maybe always pass --ultra; I hope that just unlocks the higher levels and nothing else, because otherwise we would need to pass it only for 20 and above.
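As a rough sketch of the measurement being set up here (file names are placeholders), GNU time's verbose mode reports elapsed time, user time and the maximum resident set size in one go:

```sh
# Wall time, CPU time and peak memory (RSS) for a single compression run.
# "Maximum resident set size (kbytes)" is the peak physical memory used,
# not the sum of every allocation ever made.
/usr/bin/time -v zstd --ultra -20 llvm.tar -o llvm.tar.zst
```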
So let's see — let's print this out. Theoretically we should make some nice graph. So this is compression; the compression we actually don't care that much about, we just need those files. I mean, on my Epyc — Threadripper — Ryzen system you can basically compress for as long as you want until it becomes too annoying, but you also already see how much the time increases there. This might also be why I chose 20 and not 23 or whatever the maximum was. This is also the memory — all right, so this should be the memory — and yeah, it apparently needs increasingly more memory. I should actually research and clarify exactly what the maximum resident set size is, whether that is the maximum of all allocations.
So actually I'm more interested in the decompression stage for the various levels, which is also why I'm capturing those here thoroughly. What we probably should do for a proper YouTube video is record: time, compressed size, maybe even megabytes per second, and memory. And we could do better scripting than typing it manually on your precious 486. And I'm not re-running all of this by hand, because when you sit there for a day you obviously don't want to generate 20 ISOs — obviously this is how you test this. So I'm mostly interested — or only interested — to see whether we can use 19 instead of 16, so that our ISOs do not become that large — but you see how long this already takes.
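A minimal sketch of the kind of scripting meant here — loop over a few levels and record time, size and peak memory; the input name is only an example:

```sh
#!/bin/sh
# Benchmark several zstd levels against one input archive.
in=llvm.tar
for level in 16 19 20 21 22; do
    /usr/bin/time -v zstd --ultra -$level -f "$in" -o "$in.$level.zst" \
        2> "time-$level.log"
    printf 'level %s: %s bytes\n' "$level" "$(wc -c < "$in.$level.zst")"
done
```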
I also changed something recently, in this massive optimization and improvement phase of T2 here, three years live on YouTube: I changed our build scripts to compress in the background, because obviously if you compress such large archives — you know what, maybe we could even... or maybe not — because obviously you stall the build quite a lot, right. Although this should be relatively parallelizable with the multi-threaded implementation, the next configure run can still stall, even if your Epyc — Threadripper — 32-thread machine is saturated with a stationary compression. Which actually, by the way — wait a second — darn, we didn't — okay, this was stupid, you know what — we don't even multi-thread that here, so we actually could have split this up.
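A sketch of the background-compression idea in a build script, under the assumption that the packaging step simply forks the compressor and synchronizes later (names are illustrative):

```sh
# Kick off archive compression in the background so the next package's
# configure/build can start right away instead of stalling on it.
zstd -T0 -19 pkg.tar -o pkg.tar.zst &
zstd_pid=$!

# ... continue with the next build steps here ...

wait "$zstd_pid"   # make sure the archive is complete before publishing it
```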
But yeah, I was already wondering why that is so slow. You see, 15 to 16 nearly doubles the time already, right — or 14 to 16. Let's actually Ctrl-C that and do it fully multi-threaded. Could it be — I realize — -T — should it not be -T0 — or is that only for compression? The compression level... so yeah, that should obviously be embarrassingly parallel — interesting, it's surprisingly not fully maxing out the threads. Even more of a point to continue. Cores instead of threads? Or is it simply that this is not further threadable? Yeah, thanks graver, I figured as much.
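For context, this is the multi-threading switch in question; how well it saturates the machine depends on the level and the input:

```sh
# -T0 lets zstd detect the number of physical CPU cores and use that
# many worker threads; -T8 would pin it to 8 workers.
zstd -T0 --ultra -20 llvm.tar -o llvm.tar.zst

# Decompression in the zstd CLI is effectively single-threaded,
# so -T makes little difference there.
zstd -d llvm.tar.zst -o llvm.tar
```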
It's crazy — this is the stuff you always assume is fully multi-threaded, and then, only live on YouTube, it's like: hold on a minute, it's not as perfect as I hoped it would be. Anyway, let's test. Memory-wise it goes further — so this is only using four megs of memory, surprisingly — oh, wait a second, or is it in — okay, maybe we should actually check the man page — in kilobytes, okay. So that is actually quite significantly more memory indeed. What was it — I said four megabytes, and it is four gigabytes — is it 4.7 freaking gigabytes?
And welcome everyone. Yeah, so you also see why I didn't use more: look how slow that is, right — 21 apparently still does not multi-thread any further, interesting. That also explains it, because last night I was wondering why I didn't use more, but I vaguely remember it became super slow — and that is what benchmarks are for, right. So this was 21 — still — what was it even — one gigabyte? I'm slightly unsure, if that is true, how we calculate the gigabytes and megabytes from that.
Because the archive was 400 megabytes — and it should be significantly faster than three years ago when I tested this; it was constantly in the news: new optimizations, more optimizations, better compression. But of course sometimes people should actually — how large is it even — oh, wait a second, can we finish that. What you're saying about -M, the memory usage limit: by default decompression does not use more than the 128 megs — at worst with some swap space — so I wouldn't really limit it further; we obviously want quite some significant compression.
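This is the decompression-side memory cap from the chat pointer; a sketch, with illustrative values — check the man page for the exact suffix syntax:

```sh
# zstd -d refuses frames whose window would exceed the default limit
# of about 128 MiB; -M / --memory raises or lowers that cap.
zstd -d --memory=2048MB archive.tar.zst -o archive.tar   # allow big windows
zstd -d --memory=64MB   package.tar.zst -o package.tar   # stricter cap
```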
But thanks for that pointer. One thing that would be cool: if we wanted to optimize the heck out of it, we could theoretically let it create a dictionary from all the ISOs or so — but that is a little bit too much. Basically we could have a training set for this dictionary stuff and pre-train it; there is a dictionary builder. Very theoretically, that is what we could do — but it's getting a little bit complex then, right. It would certainly be an interesting YouTube video, potentially, how much we could get out of it. I mean, not to delay building all the packages: at the end we could actually train it with the first binary package we build. Say the first package might be the zero-zero one that just creates the filesystem hierarchy standard directory tree — it doesn't contain a binary, so it wouldn't use a dictionary — but all the other packages, starting with the first thing installing some binary, glibc or such, would have it trained. The dictionary certainly should match, like, ARM executables versus x86 dictionaries. So once the first dictionary is built from the first binary package installing something binary, you have all the other stuff compressed with that dictionary and then include it in the ISO. That would potentially be an interesting endeavor for a rainy day — it's not my priority, I only wanted —
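A minimal sketch of the dictionary workflow being described, using zstd's built-in trainer; the paths and the dictionary name are hypothetical:

```sh
# Train a dictionary on sample files (say, the files of the first
# binary package), then use the same dictionary everywhere else.
zstd --train samples/* -o pkg.dict --maxdict=112640

zstd -19 -D pkg.dict some-package.tar -o some-package.tar.zst
zstd -d  -D pkg.dict some-package.tar.zst -o some-package.tar
```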
Okay, so actually 23 doesn't exist — fair enough. Also, maybe we should note how much — so we have the time here, that is the size, and the memory use. Interesting: the memory was actually even slightly lower, by pure chance or something. And how long did it take — 170 — so it took even longer. So basically, we are right now using 20 — interesting, that was 20 — I mean, I thought — there are some control characters in there. It's crazy that it's slightly faster — I mean, it could always be that, due to a larger window, it matches earlier and farther and compresses more, or something. So this is a crazy find, really, to be honest.
Yeah, so — is it significantly smaller? So basically, for a proper YouTube channel we should actually — this is basically just — oh, this is how it would be, and I thought it was sorted by size. So yes, the sizes are pretty much sorted as you would expect, and to be fair it squeezes quite some additional compression out of there. I mean, it's actually kind of impressive: it's like three megabytes here, and 22 is actually crazy better compression — 10 freaking megabytes, I'll give them that; this is actually freaking impressive, to be honest. And this is the stuff that I love, right — not this IBM-breaking-binutils-for-page-sizes and other random nonsense — zstd really is amazing compression technology. I did not expect that kind of advancement. That is really cool.
Should we actually — yeah, let's test decompression and then make a decision. Right now we are using 16 — we of course see it here — yeah, so that is 15... 13 megabytes larger than — I'm kind of thinking we could even increase that to at least 20; this level is really impressive. Let's see what the decompression memory usage is; and theoretically there's also this long window mode — I wonder if maybe we should do the same — let's just run that as an experiment when we're done with this stuff. Right now I only want to determine the decompression memory use, and we don't need to save the output, so just decompress the stuff there — best like that, let's see. Okay — 23 written — I mean, this decompression is so blazing fast, it is actually crazy.
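A sketch of checking decompression time and memory without keeping the output around (file name is a placeholder):

```sh
# -d decompresses, -c writes to stdout; piping to /dev/null exercises
# the full decompression path without saving the result.
/usr/bin/time -v zstd -d -c llvm.tar.19.zst > /dev/null
```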
I think it doesn't support multi-threaded decompression, does it — let's throw all of that at it — that was -T0, but I think decompression is not multi-threaded yet; I wonder, maybe it's not even easy to multi-thread. So yeah, that is extremely fast, as you see. So if these are kilobytes, that would be 41 megabytes — that is 19. And the difference — actually, 16 here is significantly lower. So wait a second, that is 25 — but we are still talking, if these are kilobytes — wait a second, I'm getting confused — so they said kilobytes — wait, this should be kilobytes, so that is 25 megabytes. I mean, that is actually not unreasonable for a large package, if that is what it is.
Do that again with — if not — I'm confused, but it even has an effect. This is a little bit — yeah, people always say read the effing manual, all right — and then I read it and wonder: did it actually, because they're not equal. It's a little bit — let's maybe — should we — how much better it apparently is — we will see with the memory requirements; this needed — that's quite far away, yeah, I wonder. But the thing is, doing this once already takes quite some time, as you see here live, and making nice bar graphs takes even more time. We are at twenty — so that is 10 freaking megabytes.
Alexander says: train a dictionary using all available archives. I wouldn't use all of them — there's also a limit; I know that from machine learning, I did quite some machine learning too. (It's getting a little bit dark — you don't see this on camera, the camera is so bright; actually it's pretty dim here, my eyes start to hurt from the contrast of the dark background; it's a little bit overcast here.) You don't need to train a dictionary with all the archives, I think — that is too much training; you can't put all the knowledge in there, and at that point you could just compress it directly. I think it's enough for this dictionary stuff to somewhat recognize common patterns. Of course it will not intelligently recognize patterns like PowerPC versus RISC-V assembly, but it will do so statistically, by dictionary heuristics. So one of the first binaries should be enough — any binary should be enough to recognize the common things: the ELF header, the various sections, some strings, the common jumps and moves, all the bit patterns that occur most frequently in a binary. Especially — think of the previous ARM video, right: this old-fashioned 32-bit ARM instruction architecture has so many unused bits that are often the same.
Yeah, let's see how the decompression goes. I'm actually surprised how small the memory usage was for decompression — obviously the compression needs to build the whole dictionary and window in memory, and certainly with dual-channel or quad-channel memory there's plenty of that here. It changed significantly though, look at that — I think previously it went up significantly after 16, and now it actually went up significantly — well, at 14, 15 it already decreased — probably due to better, earlier — 22, yeah.
I want to run the next thing. Okay, let's take a look at the sizes here while we do that. Are you freaking kidding me — oh wait, no, this is in progress, right; this can't be — yeah, this is in progress. So, 21 — yeah, 10 freaking megabytes smaller, right. That is crazy. I'm kind of tempted to rebuild all those ISOs again at the last minute — I mean, it's fully automated, right: I just leave the Epyc — Threadripper — Ryzen here running for 30 hours and I have all the ISOs built, as we cross-compile everything.
i did not
[Music]
there's any reason that it doesn't download
download
ah come on we need
we we are ready for
and that is the thing right do i buy a
threat ripper i need also single
threaded performance right and
i think the threadripper will not really
give me more single threaded performance
so probably for content making a proper
review video
i don't think only it
and where are we wait a second released
oh did it again i did this with okay
okay um
so yeah i'm so ready for i did not wait what
yeah
so ready for the next sun architecture
so um
So, memory-wise — yeah — was it seriously running for — wait a second, this is accumulated, so this is not wall-clock time, this is accumulated across the threads. I mean, yeah, it's excessive. Otherwise — we care more about decompression, so let's decompress them and see what happens. Yeah, it needs more memory though, I think — significantly; previously it had 25 megs of memory, right, so it's no wonder. Here is the decompression — one and two decompressed. So yeah, we probably can't do that — I mean, we are not using 11 megabytes, but — we previously had 16, so that would be 25. I would even say 19 at 41 should be fine, so I guess we can use the regular 19 for 32-bit architectures. You can even decompress that if you have a high-end 486 — if there is such a thing as a high-end 486. I guess it's fine even on the P3; we have some memory free after decompression, and you can actually, well, swap — so yeah, we can do that, I would say. And yes — we can't do that, though — we can't do long, however.
I think what we will do for 64-bit mode — because this long window — it was at 500-something, right — even the Octane — yeah, that is quite a thing — but I would say for all reasonable 64-bit platforms — why not make it more convenient for everyone, right. I like to support vintage retro edge cases, but in a reasonable way, I would say. And I would say 512-ish megs of memory means the only problem is the SGI Octane, but maybe we are using something else there anyway, because of what we default to on the Octane, really. I mean, okay — even UltraSPARC — for vintage, the thing is, supporting 20-year-old 64-bit UltraSPARC stuff — even I didn't have that much memory in my UltraSPARC, so maybe it's not — let's think, what else do we have — okay, PA-RISC we build with a 32-bit userland; what else do we have — Alpha, Itanium. I mean, it's no wonder that with such memory usage and a long window of like 512 megabytes we get so much better compression — basically, one level with long is as good as four or five levels up without it.
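For reference, this is the long-distance matching mode being compared here; the window sizes are illustrative:

```sh
# --long[=windowLog] enables long-distance matching with a bigger window:
# plain --long means 2^27 = 128 MiB, --long=30 means a 1 GiB window.
zstd --long=30 -19 llvm.tar -o llvm.tar.zst

# Windows above 128 MiB exceed the decompressor's default memory cap,
# so the same --long (or a raised --memory limit) is needed to unpack:
zstd -d --long=30 llvm.tar.zst -o llvm.tar
```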
Okay, wait — ah, this is why the 32 didn't delete, because it was 23. How much — I mean, 22 long actually doesn't — given that, okay, all those are — so basically 18. What this also means: you can significantly improve compression without investing that many CPU cycles, as 18 long is in this case equivalent to — so previously we had — 22, okay, I think it was 22. Yeah — this is decompressing, okay — this was already using significant memory in the 22 case, so this is why long doesn't improve it further: it was already using a large window.
So what does it mean for us — this was 21, that was 270 — what have we been — we were using 20 — and 21 gives us a little bit — 21 long, sorry, is basically — what was it, zero point — I mean, the decompression time is kind of constant; actually the long decompression time is at times, at the lower levels, even faster. I mean, if you use so much memory — and the default is a 27-or-so window log — does long simply increase the window size, with however much memory it's supposed to use, to compress and decompress and improve compression, or is it doing something more? I wonder what results we get when we set it, for like 10 to 22, and compress the stuff with that — the window must already have some value that is — so, 10 — yeah, this is also worse, okay. I think it doesn't make sense to manually specify another window log.
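For completeness, this is roughly how the window log can be forced by hand through the advanced parameter interface (which, as just noted, did not help here); values are illustrative:

```sh
# Pin the window log explicitly instead of letting the level choose it;
# wlog=30 asks for a 2^30 = 1 GiB match window.
zstd -19 --zstd=wlog=30 llvm.tar -o llvm.tar.zst
```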
Okay, what does it mean for us? We want to be able to install on those machines, and that means our 20 choice wasn't better — I mean, yeah, 20 — maybe I should — anyway, we probably just leave it like that, I guess, because 20 ultra — but good, it's good that — so that means we can safely use 19. So it's still an improvement. And that's also often how it goes, right: you think you need to double-check something and prove it, and then it doesn't improve, but at least we understand that this is not better — also something you wouldn't get immediately from the manual page, especially this window: you would think specifying it always makes the window longer, but apparently, if you specify it, it is sometimes already shorter than the default. The only thing is — for the building it doesn't change our default mirror compression. It would be significantly smaller, right — 10 megs less — maybe we could increase that a little bit — unless you build on vintage systems again. Okay, you know what, let's play it a little bit safe: we did some research, and with 20 — yeah, that way you can still use T2 to download and build at least reasonably small and medium-sized stuff on vintage hardware, I guess — some compromise. Now that we did some research on the memory usage, I have significantly smaller ISOs again.
i hope you found this interesting probably
in the future i need to make proper
graphs maybe even live i know i need
some better graphing tools obviously
if you have
tips up in the 20 years ago i was using
plot equals but someone's also probably
not really maintained anymore you can
actually check that also mean it's not
the most outstanding it was just
better for what i use than a lot of stuff
but i think as the last i checked
probably the last update is decades ago 242
242 yeah
yeah
news for more you click news on like a
okay so loss of sounds they have up and
down this for some new poison stuff or something
something um
um
what's the greatest poison fan
good for them um
um
not good for me but it's there maybe
should we package that
let's put this on the to-do can i have
[Music]
The future of the Linux desktop — anyway, that's it then for today. I mean, I could recompress this — maybe I'll just manually recompress those packages; I'm not feeling like rebuilding everything just for that. But anyway, that's it for this video — let's — should we do that? Nice — so how much would we save? You have 925 megabytes for — hmm — removed — why is that — the removed thing — yeah.
All right, so it could be consistent with — projects like this — it's also a thing, right, that I try a little to encourage people to work a bit more automated, versioned, reading the manual, thinking stuff through and not always YOLO-ing everything together — a little critical and long-term, with some surrounding thinking. It's not best practice to recompress the stuff, but for a one-time thing I think it's okay. I'll also probably only do this for the PowerPC ISO, not for the other ones — that was one of the largest.
that was some of the largest
maybe the others are not firefox and stuff
is it even fully mighty yeah those
mighty threading is
i love my wish you could see my rgb
animations on
my cpu utilizations memory sticks
year was actually thinking we could
recompress this more in the background
actually run all of them in parallel
maybe i should have done this
now i forgot already how much was it 100
yep but for the most part so it's not is amazing
i really wonder yeah we probably maybe
we make this another video trying this i
actually didn't try this dictionary
based training thing there
and that's then we probably better do
where are we linux i mean there are
so very soon the further updates data is done
done
data certainly also the best invention
yeah but so this workload also doesn't
uh profit from uh threadripper right so
the horizon here is as you see this
this
parallel compression stuff only goes so far
far
i'm actually surprised it doesn't do more
maybe we should create a freaking manual
compressing using threads
detecting a physical cpu cores
[Music] hmm
Okay, now we're done. Yeah, so from 125 — so that is over 57, so 75 megabytes — that is certainly something. The only other thing I wonder is how much the multi-threading is reducing the compression, but I guess we tested enough for today. So yeah, that is the summary, and how to weigh the pros and cons of this stuff — ultra 20. And I'm kind of in the flow — I wonder if — I should test how much — I don't really have the time now — using zstd nineteen — how much was it, this was 800 — was it 850 of 925, I think — so zstd 19 instead of 16 — something of that order, I would say.
So yeah, that's it for this video. The only other thing is the dictionary stuff — we'll do that in another video; maybe I should actually make a note, otherwise I might eventually forget about it. There are additional complications: not only would we need some surrounding scripting for training the dictionary and then making sure, obviously, to use the same dictionary for all the tarballs, we would additionally need to change our ISO creation to actually include this dictionary — right, my understanding is you then need this dictionary with you, in your ISO. So it would require a bit of surrounding scripting and tooling to make sure of this, and even the installer — we would even need to patch our freaking outdated vintage mine binary tarball extraction thing to run zstd with just this dictionary. So yeah, probably not really worth it; we will test it another day, obviously, just to test it, be smarter and learn something. I hope you don't forget to subscribe. Have a great day or night, and I hope to see you soon for the next fun stuff to come.