YouTube Transcript:
Transcriptome analysis: Learn library preparation and data analysis from scratch
Video Transcript
[Music]
hello and welcome to explore bio
rna-seq or transcriptome sequencing is
a powerful technique to characterize and
quantify gene expression
in my previous video i explained that
transcriptome is the entire set of rna
expressed
at a specified time in a particular
biological sample
i also explained its importance in identifying and
studying gene expression
which is useful to study development
disease and response to stresses
i also mentioned two major
techniques of transcriptome analysis
namely microarray and rna-seq and their
applications
if you are new to transcriptome and its
analysis you should watch my
introductory video
the link is provided in the description
below
this is the second video in the
transcriptome series
in the first part of the video i will
cover the basic steps
involved in transcriptome library
preparation for sequencing
in the second part i will cover the
basic workflow for transcriptome data
analysis
i hope the video will be useful for
beginners who have little or no idea
about transcriptome and are willing to learn
more about it
it would also help the researchers who
are planning or currently dealing with
some kind of transcriptome work
i request you to stay tuned and watch
the complete series of videos on
transcriptome
at last i will mention some of the
important things to remember and
consider
before you plan a transcriptome
experiment
so let's begin with the basic steps
involved in transcriptome library
preparation
here i will focus on one of the popular
illumina mrna enrichment library
preparations
there are separate protocols for other
library preparations such as small rna
and microrna too
the first and foremost thing you
need to start with is
high quality rna extracted from the
biological samples to be studied
along with appropriate controls for
comparison
next you proceed for mrna enrichment if
your target is protein coding rnas
else this step can be omitted here poly
a tail containing rna is captured using
magnetic beads with oligo dt attached to
them
next comes the cdna library preparation
which involves a series of steps
the mrna is fragmented appropriately
using chemical or heat treatment to
shorter fragments
of usually 100 to 300 base pairs that
can be sequenced
note that full length mrnas are not
sequenced unless you are using a long read
chemistry such as oxford nanopore sequencing
the fragmented rna is now reverse
transcribed to double stranded cdna
using reverse transcriptase
after end repair and addition of adenine
at the 3 prime end
the adapters which are short double
stranded oligonucleotides are ligated at
both the ends of cdna fragments
these adapters serve as the site for
primer binding to facilitate clonal
amplification in pcr in the next
step the adapter ligated cdnas are
termed as
cdna library which represents the
complete set of rnas expressed in the
sample and is ready to be sequenced
multiple samples are ligated with
different adapters so that they can be
pooled together for sequencing in a
single run on a machine
this is known as multiplexing after the
sequencing is over the data generated
can be demultiplexed based on the
different
adapters used cdna libraries can be
sequenced from one
or both the ends which is termed as
single end or paired end sequencing using
a suitable ngs platform
the amount of sequence data generated in
the form of short reads depends upon the
sequencing platform
and the need of the experiment usually 10 to
30 million reads for each sample
are appropriate for analysis
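
The demultiplexing step described above can be sketched in a few lines of Python. This is only a toy illustration: the index sequences, sample names and reads are made up, and real demultiplexing is done by the sequencer software, which also tolerates sequencing errors in the index.

```python
from collections import defaultdict

# Hypothetical index (barcode) sequences, one per sample.
# Real library kits define their own index sets.
SAMPLE_INDEXES = {
    "ACGTAC": "control",
    "TGCATG": "treatment",
}


def demultiplex(reads, sample_indexes):
    """Group (index, sequence) pairs by the sample that owns the index."""
    by_sample = defaultdict(list)
    for index_seq, read_seq in reads:
        # Reads whose index is not recognised go to an 'undetermined' bin.
        sample = sample_indexes.get(index_seq, "undetermined")
        by_sample[sample].append(read_seq)
    return dict(by_sample)


# Made-up pooled reads from one sequencing run: (index, read sequence).
pooled_reads = [
    ("ACGTAC", "GGATCCA"),
    ("TGCATG", "TTAGGCA"),
    ("ACGTAC", "CCATGGA"),
    ("NNNNNN", "AAAAAAA"),
]
demuxed = demultiplex(pooled_reads, SAMPLE_INDEXES)
for sample, seqs in demuxed.items():
    print(sample, len(seqs))
```

Each sample gets back only its own reads, which is what makes it possible to pool several libraries in one run.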
coming on to the second part which is
the basic workflow for transcriptome
analysis
once the sequencing run is complete you
will get sequence data in the form of
raw reads
the read files are usually in fastq
format which contain the information
about the sequence and base quality
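
To make the fastq format concrete, here is a minimal sketch of a parser, assuming the common four lines per read layout and Phred+33 quality encoding. The read name and bases are invented for the example, and the function name `parse_fastq` is just an illustrative choice.

```python
import io


def parse_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a FASTQ text stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of file
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line
        quality = handle.readline().strip()
        # Phred+33 encoding: quality score = ASCII code of the character - 33
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores  # drop the leading '@'


# One made-up record: header, bases, separator, quality string.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(parse_fastq(example))
for read_id, sequence, scores in records:
    print(read_id, sequence, scores)
```

A higher score means a more reliable base call, so the last base of this read (score 0) would be a candidate for trimming.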
qc or quality check of the sequence reads
is the first step of transcriptome
analysis
generally done using a tool like fastqc
raw reads generated after transcriptome
sequencing using a next generation
sequencing platform such as illumina or
roche
are processed to remove low quality reads
and adapter sequences
used during transcriptome library
preparation sometimes
read end trimming is also required as
the bases sequenced at the end of a
sequencing run
may be of lower quality some of the
commonly used tools for
raw read filtering are ngs qc toolkit and fastp
so next comes the read alignment or
mapping the short high quality reads are
then aligned or mapped back to the
reference genome or transcriptome if
available
this is known as reference based
assembly if reference is not available
for example in case of non-model
organisms
de novo or fresh assembly is done in
case of genome guided assembly
spliced aligner tools and for
transcriptome guided or de novo assembly
unspliced aligners are used
examples of routinely used aligners are
bowtie and tophat
short reads are meaningless to us unless
they are assembled to larger
and more complete sequences termed as
transcripts or contigs that actually
represent
the mrna from which they are derived the
assembly is done based on sequence
overlaps in the reads to form a much
longer sequence in case of reference
guided assembly reads are first aligned
to the reference genome or transcriptome
and then the overlapping reads are
assembled together in case of de novo
assembly the reads are assembled into
transcripts without a reference
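
The idea of joining reads through their sequence overlaps can be shown with a toy greedy merger. This is only a sketch of the overlap principle under simplifying assumptions (error free reads, one strand, a minimum overlap of 3 bases); real assemblers use far more elaborate graph based methods, and the reads below are invented.

```python
def overlap_length(left, right, min_overlap):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while True:
        best = (0, None, None)
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best[0]:
                    best = (size, i, j)
        size, i, j = best
        if size == 0:
            return contigs  # nothing left to merge
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)


reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]  # made-up overlapping reads
contigs = greedy_assemble(reads)
print(contigs)
```

Three 7 base reads collapse into a single 13 base contig, which is the sense in which short reads become a longer and more complete sequence.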
the most popular tools for transcriptome
assembly are trinity oases
clc genomics workbench and cufflinks
sometimes assembly is done with multiple
tools before finding the best one
transcripts or the contigs are further
clustered using tools like
cd-hit-est to reduce the redundancy
once the assembly is done based on the
alignment with the conserved orthologous
genes in related lineage the
completeness of the assembly may be
checked
an example of one such tool is busco
to quantify the expression of individual
transcripts the mapping file generated
during read alignment is used as input
gene level or transcript level abundance
is determined using different tools such
as rsem
salmon or cufflinks the abundance or
expression level of transcripts
is represented as normalized read counts
that are mapped to the transcript
major ways to represent normalized read
counts are tpm
fpkm rpkm or cpm
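
The normalized units named above can be written out as small formulas. The sketch below uses the standard definitions of cpm, fpkm and tpm; the gene names, read counts and transcript lengths are made up, and for single end data fpkm and rpkm give the same value.

```python
def cpm(counts):
    """Counts per million: scale each count by the total number of reads."""
    total = sum(counts.values())
    return {gene: c / total * 1e6 for gene, c in counts.items()}


def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts.values())
    return {gene: c * 1e9 / (lengths_bp[gene] * total) for gene, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {gene: c / lengths_bp[gene] for gene, c in counts.items()}
    total_rate = sum(rate.values())
    return {gene: r / total_rate * 1e6 for gene, r in rate.items()}


# Made-up read counts and transcript lengths (in base pairs).
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

print(cpm(counts))
print(fpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

Note how geneB has three times the reads of geneA but the same tpm, because it is also three times longer. The tpm values of a sample always add up to one million, which makes them easier to compare across samples than fpkm.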
to compare the change in expression in
treatment versus control samples
differential expression analysis or dge
is done
various programs such as edger deseq
and cuffdiff
perform differential expression
analysis between samples after
normalizing the abundance data
p-value and fdr tell how significant
the differential expression results are
and whether they should or should not be
considered for further analysis
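
Two of the numbers reported in a differential expression table can be illustrated directly: the log2 fold change between treatment and control, and the fdr obtained by adjusting p-values. The sketch below uses the Benjamini-Hochberg procedure, a widely used fdr method, as an assumption for illustration. It does not reproduce the statistical models those programs fit, and the pseudocount of 1 and all input values are made up.

```python
import math


def log2_fold_change(treatment, control, pseudocount=1.0):
    """log2 ratio of treatment to control; the pseudocount avoids dividing by zero."""
    return math.log2((treatment + pseudocount) / (control + pseudocount))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never decrease as the raw p-values increase.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four transcripts.
p_values = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(p_values)
print(fdr)
print(log2_fold_change(7, 1))
```

A transcript is usually kept for further analysis only if both its fold change and its fdr pass cutoffs chosen by the researcher.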
later using real-time pcr the
transcriptome expression is validated
to learn more about it i highly
recommend you to watch my video on
real-time pcr and how it is done
to predict the function of transcripts
or contigs after assembly they are
assigned
functions based on the sequence homology
against known proteins in the databases
such as nr
swiss-prot and tair using blast search
i have made a separate tutorial on how
to perform standalone ncbi blast on your
computer
you may watch it later subsequently go
classification and pathway analysis may
be done
so these are the major steps involved in
the transcriptome sequencing and
analysis
coming on to the last part of the video
about the things to consider for
planning a transcriptome experiment
the following questions should be asked
is the aim to identify or quantify
transcript sequences
what are the biological samples controls
and number of replicates you are going
to take
how much sequencing data needs to be
generated what sequencing platform you
will use
will it be reference based or de novo
this will determine the sequencing depth
you will need
is mrna enrichment required sequencing
should be single end or paired end
do you have budget for sequencing and
analysis and is access to high-end
computing available
so that's all for today's video you
can do a lot more things once you have
the transcriptome data for example you
can study gene enrichment pathway
enrichment classify the genes
based on their ontologies kegg analysis
identify orthologous groups coexpression
analysis and generate a heat map
develop protein protein interaction
network to identify interacting partners
and a lot more to make the most use of the
transcriptome data generated
some of these may be the part of my
subsequent videos
in my next video i will be covering the
important terms such as reads
transcripts annotation
blast e-value bit score read count
dge n50 fasta format rpkm
fpkm tpm cpm and others which are
routinely used in transcriptome analysis
if you like the information do share
with your friends comment about what new
you want to learn
subscribe and check my playlist to stay
tuned with other informative videos
and finally thanks for watching
[Music]