This presentation introduces a large-scale, multi-omics population cohort study aimed at understanding the complex interplay between genetics, environment, and health to develop personalized medicine strategies.
my name is Jordan Marino I'm an
associate professor and group leader
here at CBMR welcome to everyone and
today I'm delighted to introduce the next
speaker Eran Segal uh Eran is a professor
uh at the department of computer science
and Applied Mathematics at the Weizmann
Institute of Science in Israel his
research focuses on how genetics the gut
microbiome and nutrition affect health
and disease with a special focus on
developing bioinformatic tools to
better understand the relationship
between health and disease his aim is to
develop personalized medicine strategies
based on large scale and deeply phenotyped
uh cohorts Eran has been highly
successful in publishing at
top journals uh he published more
than 200 uh manuscripts and has received
several Awards and honors including the
Overton prize and the Michael Bruno award
for his contributions to the field of
bioinformatics and with all of this Eran
I'm delighted to have you
here thank you
yeah thank you Jordy and thanks to the
organizers for the kind invitation to uh
speak to you today uh and tell you about
a large scale population cohort that uh
we've been uh collecting um we were
motivated to do this by um seeing other
uh data collection efforts uh starting
from the Human Genome Project uh which
if you look from the year 2000 until now
has been really a transformative project
uh taking us from knowing just uh a
handful of genetic variants involved in
some diseases to where we are today
after profiling and genotyping tens of
millions of people where we now know
hundreds if not thousands of genetic
variants for virtually every disease uh
but of course the genome is uh just what
we're born with and doesn't take into
account environmental factors exercise
nutrition uh and so on and for that
reason like we had the Human Genome
Project we decided to initiate the human
phenotype project to phenotype people uh
very deeply with the hope that this type
of data would allow us to identify
trajectories of disease and find uh
novel um disease biomarkers uh and so uh
to that end what we did was several
years ago we started a clinic at the
Weizmann Institute uh where um before
participants uh come in
uh as you can see on the right hand
side uh they upload their
electronic health records they fill
out hundreds of questionnaires about
medical uh history family history of
disease and uh and so on uh so we know
all of their um medical status then they
come to the clinic uh they undergo uh
various anthropometric measurements so
various body measures hip and waist
circumference uh hand grip strength
cardiovascular assessments like ECG uh
voice recordings which have correlates to disease
then various Imaging modalities we use
dexa to look at full body composition
and bone density high resolution scans
of the retina from which we can see
blood vessels it's considered to be a
window to the heart we use ultrasound to
look at fat in the liver and the carotids
also for cardiovascular health uh then
various sensors that people go home with
continuous glucose monitors track
glucose levels every 15 minutes uh
sleeping devices that track quality of
sleep I'll I'll show that later in the
talk um smart watches track physical
activity while people are connected to
these CGM continuous glucose monitors
they log all their dietary intake which
they're incentivized to do because they
get a report of which meals Spike and
which do not Spike their glucose levels
they track medication they take and
physical activity and we biobank uh
samples we have uh both live cells from
everybody PBMCs and uh serum uh and
maybe the most distinctive feature is
connecting all of this with the more
Advanced genomics and multiomics uh
testing which includes whole genome
sequencing so we also look at the human
genome but then also uh gut and recently
vaginal microbiome metabolomics which
looks by mass spectrometry at thousands
of different small molecules uh in the
blood proteomics from blood RNA
sequencing in bulk also from the PBMCs
and a novel immune assay that we developed
whereby uh we can synthesize hundreds of
thousands of antigens and
then see which of these are bound by a
person's antibodies in the serum so if
you will this is a snapshot of the
entire immune history of a person so all
of this we did on over 12,000 people in
Israel uh it's a longitudinal cohort uh
people get profiled every two years um
we can look at this data also and
partition it by body systems we're
profiling 17 different systems of the
body on the right hand side you can see
uh also some uh kind of very global map
of how the features of different body
systems relate to each other uh this
cohort has about 8,000 people who have
one or more um of these morbidities
shown here mostly cardiometabolic
morbidities and about 4,000 by exclusion
for not having any of these are
considered uh to be healthy uh and it's
very nice to be able to uh study on a
standardized platform everybody measured
in the same way and compare all of these
diseases uh to each other we also have
about uh 30 what we call medical
conditions not diseases but still
interesting to study like various
vitamin deficiencies allergies and uh
and so
on um now um um because we see the value
of this data which will be the main
focus of of the talk uh today our our
discoveries we're now in the process of
trying to significantly expand this with
the next Milestone we hope to achieve is
100,000 people both by uh doubling and
tripling this cohort in Israel but also
expanding to other countries where we
can explore more genetic cultural ethnic
and so on diversity and in 2024 we're
actually now building clinics in Abu
Dhabi and uh Japan we'll start to uh to
add these uh to the cohort um now of
course this is not the only large scale
population cohort there are quite a few
uh that are being carried out worldwide
the most impressive one in my view is
the one uh in the UK called the UK
biobank it started 17 years ago uh half
a million people profiled also uh
longitudinally and the impact that this
project uh had is is really enormous it
really changed the way that we do
biomedical research uh 40,000
researchers from 100 countries are
working on this data collectively they
published over uh 10,000 papers um you
can see how it grew uh over the years
with close to 3,000 Publications in
PubMed uh just last year uh in
comparison we obviously don't have uh a
cohort that's as big but we have several uh
unique features like uh it's going to be
the first cohort to be International um
as I mentioned spanning multiple
countries and then all of these other um
measurements shown here mostly the more
advanced uh multiomic measurements that
we have that are not part of the UK
biobank are unique features of this
cohort which is why I believe that this
will be complementary to the UK
biobank and uh like the UK biobank we
are also giving uh access to uh the
cohort um free for uh academics
uh for research so uh in
the remainder of the talk I'll tell you
about discoveries that we've been making
uh from studying uh this cohort uh and
I'll go um pretty briefly on on uh on
each of these different projects just so
that I could uh cover and show you the
breadth of what could be done with this
type of data and the first project I'll
start with is something we actually
started 10 years ago it was even a
precursor cohort to the large
cohort where we profiled a thousand
individuals with continuous glucose
monitors and found uh that people have
um very different blood glucose response
to food as you can see by the four
different curves here of uh the two
hours after eating four slices of bread
very different between individuals but
uh lines of the same color correspond to
the same person on different days
showing very consistent uh Behavior
within a person different between people
and using uh clinical data and gut
microbiome data we showed that we could
fairly accurately predict um the actual
personalized responses every dot is um
an individual meal uh where the x-axis
is the area under the glucose curve
that's actually measured after uh eating
that meal and the y- axis is the
predicted one uh according to the model
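[Illustrative sketch, not from the talk.] The quantity on both axes of that plot is an area under the post-meal glucose curve. A minimal way to compute such an area from CGM readings, and to score predictions against measurements, might look like the following. It assumes an incremental area above the pre-meal reading (the talk only says "area under the glucose curve"), a 15-minute sampling interval as mentioned earlier, and made-up glucose traces; the function names are hypothetical.

```python
import numpy as np

def incremental_auc(glucose_mg_dl, minutes_between_readings=15.0):
    """Trapezoid-rule area of the glucose curve above its first (pre-meal) reading."""
    g = np.asarray(glucose_mg_dl, dtype=float)
    above_baseline = np.clip(g - g[0], 0.0, None)  # ignore dips below baseline
    trapezoids = (above_baseline[1:] + above_baseline[:-1]) / 2.0
    return float(trapezoids.sum() * minutes_between_readings)

def pearson_r(measured, predicted):
    """Pearson correlation between measured and predicted responses."""
    return float(np.corrcoef(measured, predicted)[0, 1])

# Hypothetical two-hour traces (mg/dL), one reading every 15 minutes.
flat_response = [90, 92, 95, 94, 92, 91, 90, 90, 90]
spiky_response = [90, 120, 150, 140, 120, 105, 95, 90, 90]

print(incremental_auc(flat_response))   # small area: the meal barely moves glucose
print(incremental_auc(spiky_response))  # large area: the same meal spikes glucose
```

Given one measured and one predicted area per meal, `pearson_r` summarizes how closely the dots in such a scatter plot follow a line.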
um and then uh we could show that you
could uh take such an algorithm
and take individuals who are
pre-diabetic they have many spikes in
their blood glucose levels over the
course of a week and with a diet that's
equal in the amount of calories for
every single meal we could uh prescribe
a diet that fully balances uh their
blood sugar levels uh after publishing
this on a short-term intervention we
went and did a full-blown uh randomized
clinical trial on 200 people with
pre-diabetes randomized them into either
a standard of care Mediterranean diet or
uh the algorithm diet uh and you can see
anecdotally one participant before the
intervention many spikes in glucose
levels at the top and at the bottom in
the one month of CGM tracking after the
intervention on the algorithm diet
virtually uh eliminating these uh
spikes in glucose levels and
statistically looking at the primary
outcomes in the center panel uh
hemoglobin A1c which is the average
glucose levels in the past 3 months in
the red curve you can see an improvement
on the Mediterranean diet after 6 months
of intervention but we get double the
Improvement on the algorithm diet and on
the right hand side we have multiple
metabolic parameters that also uh
improved um uh in people following uh
following this diet some of them
improving more significantly on the
algorithm diet compared to uh the
Mediterranean diet so uh after we
published this uh we were also
interested in kind of going a little bit
deeper into uh gaining insights on
the mechanism by which this dietary
intervention actually uh had an effect
on various metabolic parameters and I'll
show you um a few analyses that we did
that convinced us that at least part of
the effects uh part of the way in which
the diet induced beneficial metabolic
effects was mediated by the uh by the
microbiome namely by the changes that
the diet had on the microbiome and we
showed this in two different ways uh
first um just by looking at the uh overall
intervention uh and what happened uh
when you compare samples at Baseline
samples after 6 months of intervention
we saw many changes at the center in um
gut bacterial species and at the right
hand side on different metabolites that
we measure with mass spec uh through uh
metabolomics and then uh in previous
work done on a different cohort uh we
showed actually that uh quite a lot of
the metabolites several hundreds of the
metabolites could be very well predicted
by gut microbiome composition so here's
an example of four metabolites that on
held-out samples are predicted quite well
from the microbiome composition of
individuals and so what we did was we
took these models developed um prior to
this work and on a different cohort and
we applied them to the samples on the
dietary intervention both to Baseline
samples and samples taken after 6 months
and because there were changes in the
microbiome there were also changes in
the predicted levels of certain
metabolites according to the model and
we then compared them to the actual
changes in the same metabolites as
measured by metabolomics on the dietary
intervention between Baseline and uh
post intervention 6 months samples and
we saw a fairly good uh correspondence
meaning that uh we believe that some of
the metabolite changes that we saw in
people are in fact mediated by the
changes that the diet induced in the
microbiome because we could predict
these changes in the metabolites based
on the changes in uh the gut
microbiome um another way in which we
looked at this was by mediation analysis
and before doing the mediation analysis
we just looked at a direct
relationship between how diet uh affects
uh changes in the gut microbiome and so
for each gut bacterial species we
predicted uh its levels um on the
big cohort that we have of uh 12,000
people predicted it based on dietary
data that we had and uh quite a lot of
uh the gut bacteria we can actually
predict from uh what people are eating
not with perfect accuracy but with
fairly uh definitely significant and
fairly good uh Pearson correlations um
and uh if you look at some of the uh uh
best predicted species you can see that
just by diet alone you could predict um
quite robustly what the levels of uh the
bacterial species would be in
individuals and so then we could go and
do what's called a mediation analysis to
try and get at causality where in
mediation analysis you're trying to uh
assess whether
the relationship between the X in
this case diet and the Y in this case
various metabolic improvements that we
saw are uh also mediated in part by some
other factor in this case uh the
microbiome so to do this analysis you uh
construct models that try to explain uh
in this case the metabolic improvements
from uh from the diet um and uh you know
that you have the microbiome which also
uh is affected by the diet and you ask
whether the microbiome also contributes
to explaining these beneficial metabolic
outputs beyond what you can explain uh
with the diet and we found quite a
few of these cases they're all
summarized um in this work I won't go
into the details but several of the
metabolic improvements like for example
the uh time spent above 140 on
the CGM so how many spikes of glucose
levels were mediated by particular
bacterial uh species and a particular uh
dietary component so overall by these
two different uh analyses uh we believe
that at least some of the beneficial
effects of the diet are actually
mediated by the changes that the diet
induces uh to uh the
microbiome uh continuing on uh the
thread of the microbiome we also have um
efforts to uh try and develop direct
Therapeutics based uh on the
microbiome and uh for that what we uh
did and and recently published is uh
very analogous to what people do in
human genetics where the main method of
analysis is called GWAS genome-wide
Association studies where you associate
a genetic variation across individuals
with a human trait of Interest so uh
what we did was uh to develop uh and
adapt that methodology to the microbiome
whereby we look at genetic variation not
on the human genome but genetic
variation on bacterial species at the
single nucleotide level and associate
that with um uh human traits uh and so
here is a result of that this is a
standard Manhattan plot that many of you
probably saw from Human genetics except
that instead of the x-axis being
human chromosomes the x-axis is now
bacterial species every dot corresponds
to uh one particular position within a
specific bacterial genome and the y-axis
corresponds to the strength of the
association between genetic variation at
that position and a human trait of
interest in this case we're looking at
BMI um and so overall we found over a
thousand uh of these SNPs
many of them actually come in
blocks so after you uh remove uh
SNPs that are correlated to each other
we're left with about 40 uh different uh
independent uh positions in uh 27
different bacterial species that showed
strong Association interestingly
in about 12 of these cases if you look at
the abundance of the species that
contains uh the SNP
the abundance of that species is not
correlated to BMI but when you look at
the genetic variation uh it
is and uh uh I think even more
interesting than uh just um the strength
of the association is actually the uh
effect size that uh this could explain
so now I'm plotting exactly the same
data the y-axis is still the minus log P
value the strength of the association
but now the x-axis is actually how many
points of BMI are explained by each of
these different uh SNPs uh and you can
see that there's quite a few SNPs that
explain one sometimes even approaching
two points of BMI that are explained by
just a genetic variation at a single uh
bacterial um position and
uh even the size of
these effects is stronger than some of
the strongest effects that people have
uh seen on the human genome uh and the
way I rationalize that is because I
believe that the microbiome is an
environmental factor and we know that
environmental factors have big effects
on many different traits definitely on
BMI and the reason we think it's an
environmental Factor we we did work
several years ago where we showed that
genetics explains very little of our
microbiome composition so on the cohort
in Israel uh when we do just a simple
PCA plot U on the left hand side of the
genetics we can see a very clear
partitioning of different populations in
Israel coming from different locations
in the world that are known to be
genetically uh distinct but when you
look at the same microbiome composition
uh all of these populations that are
genetically uh different uh they're
actually completely mixed in terms of uh
the microbiome and there were other
analyses in this work and follow-up
works also by others to suggest that
um human genetics
explains a very small portion we
estimate 2 to 4% of the microbiome
composition and the rest is is
environmental factors you get your your
microbes from uh the
environment uh we also validated uh
these uh findings on uh the BMI the
particular SNPs on a Dutch uh cohort of
a similar size and we showed that the
majority of these SNPs were also
significant in the Dutch cohort and
almost all of them with the exception of
one uh even if they were not significant
the direction of the association whether
it's a positive or negative association
to BMI was maintained also um in the Dutch
cohort um if we look at some of these uh
SNPs um individually uh we can see for
example here one of the strongest
associations if you look at uh the
center here we have about 2,000
individuals that in this particular
species had the major allele for this SNP
they have an average BMI of 25 and about
400 individuals on the right that had the
minor allele that on average have a BMI of
almost 27 1.9 points of BMI explained
only when you partition people based on
this SNP so a very uh strong
Association uh what's also nice about
association at the resolution of
uh SNPs is that you can look at not
just the bacteria but the individual
genes that are being affected so here's
an example of another SNP explaining
1.3 uh points of BMI
where if you look at this region it
contains uh various energy uh production
genes lipid metabolism genes that you
might rationalize could be
mechanistically involved in the way in
which bacteria would have an effect on a
trait uh like BMI and another example uh
in an inflammatory pathway also uh makes
uh kind of sense that could be involved
in uh in in a trait having an effect on
a trait like
BMI um using exactly the same pipeline
I'll just briefly mention we're also
looking at Association not to human
traits but to um abundance of other
bacteria the reason is that this may
allow us to find genetic variation in
some bacteria that affect the presence
of other bacteria namely they could be
some novel uh antibiotics uh we also
look at associations uh to the same
bacteria that harbors uh the SNP that
can allow us to identify uh genetic
variation that uh um
induces or inhibits the growth of the same
bacteria and quite
interestingly just from a systems level
perspective we see that um uh
genetic variation within the same
bacteria typically has positive
associations to abundance of the same
bacteria namely evolutionarily as you
might expect bacteria evolve their
genomes such that uh the mutations
support the growth of the organism whereas
if we look at associations with other
bacteria most of these are actually
negative supporting kind of the Warfare
that we know uh is going on between uh
between bacteria and also the actual
genes are very different genes in the
same bacteria that uh are associated are
mainly housekeeping genes genes having
to do with uh with growth whereas genes
uh involved in the association with
other bacteria are mainly um uh related
to metabolism
genes uh maybe in the interest of time
I'll skip this uh short uh piece um and
tell you now uh a little bit about the
uh sleep data that I mentioned so I
think this is also uh quite unique data
that we are collecting here so uh
there's a a company in Israel who
developed a device uh FDA approved for
diagnosing sleep apnea it has several
sensors uh like actigraph
accelerometer um and uh another device
on the finger that measures blood
oxygen saturation so overall uh by tracking
three nights of sleep from each
participant we get several different uh
analog channels throughout the night
from which we can uh extract multiple
features and and something like this was
never measured on a scale of thousands
of people definitely not when it's
coupled with all the other parameters
that that we're measuring so for example
we can see uh differences between males
and females and how uh these these
measures um
of of sleep apnea actually change with
age we see that they change with age
some known Association we also see that
if you try to diagnose sleep apnea based
on these which is what the device is
used for uh there's a difference uh in
the diagnosis people would get if they
use it for one night or three nights
supporting the use of
multiple nights in order to uh get
robustness uh and also we see uh
different associations between all of
these different uh sleep features and
and the multiple uh body systems that we
have such that we can use the Sleep data
to predict um um with some uh uh decent
accuracy several uh measures about
people having to do with their sometimes
their blood test sometimes uh food that
they eat dietary patterns uh and so on
and also various associations with with
disease um and then the final project I
want to tell you about before uh a short last
segment on uh AI models that we're
building with this is uh looking
at uh human aging uh so um because we
have so many different types of data
there there's an opportunity to see how
this data changes uh in people uh with
age uh and so first we took just the
metabolomic data um and we use the
metabolomic data to predict the
chronological age of uh of people uh
which you can see here uh the
predictions are working fairly well the
y- axis is uh the predicted
chronological age of a person using only
the metabolomic data the x-axis is the
actual chronological age of a person but
you can also see that these predictions
are far from perfect people at the top
uh have a predicted age which uh
according to the model um is is older so
uh when the model looks at their
metabolomic data it seems uh to be
more similar to somebody who's older and
conversely people at the bottom
according to the model uh they have
metabolomic data that resembles more somebody
who's younger and so we asked whether
these um errors in the model are
actually just errors in the model or
whether they actually carry some meaning
meaning that uh people at the top maybe
are actually aging faster and indeed
when we compare people at the top to
people at the bottom and look at other
independent parameters that were not
used as part of the model we see that
people at the top are in worse health
condition they have higher BMI on
average higher triglycerides hemoglobin
A1c blood pressure cholesterol and liver
enzymes so um this
model which we call um a biological
clock in this case a metabolomic
biological clock is also informing us
of uh the clinical uh condition of
people and it might allow us to identify
biomarkers uh that are
indicative of accelerated aging at
least according to the metabolomics data
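[Illustrative sketch, not from the talk.] The "biological clock" idea described here can be written down in a few lines: fit a model that predicts chronological age from one data type, take predicted minus actual age on held-out people as an "age gap", and check whether people with a large gap differ on a marker the model never saw. The sketch below assumes synthetic data and ordinary least squares; the talk does not say which model was used, and every variable name and number is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_features = 2000, 20

# Synthetic stand-in data (not from the cohort): a hidden "pace of aging"
# shifts both the omics-like features and an independent marker such as BMI.
age = rng.uniform(40, 70, n_people)
pace = rng.normal(0.0, 1.0, n_people)
weights = rng.uniform(0.5, 1.5, n_features)
features = np.outer(age + 5.0 * pace, weights) + rng.normal(0.0, 5.0, (n_people, n_features))
bmi = 25.0 + pace + rng.normal(0.0, 1.0, n_people)  # never shown to the model

def add_intercept(x):
    """Prepend a column of ones so the linear model has an intercept."""
    return np.column_stack([np.ones(len(x)), x])

# Fit on the first half of the people, predict on the held-out second half.
is_train = np.arange(n_people) < n_people // 2
is_test = ~is_train
coef, *_ = np.linalg.lstsq(add_intercept(features[is_train]), age[is_train], rcond=None)
predicted_age = add_intercept(features[is_test]) @ coef

# Age gap: positive means the model thinks the person looks older than they are.
age_gap = predicted_age - age[is_test]
low_cut, high_cut = np.quantile(age_gap, [0.25, 0.75])
bmi_looks_older = bmi[is_test][age_gap >= high_cut].mean()
bmi_looks_younger = bmi[is_test][age_gap <= low_cut].mean()
accuracy_r = np.corrcoef(predicted_age, age[is_test])[0, 1]

print(f"prediction vs. actual age, Pearson r: {accuracy_r:.2f}")
print(f"mean BMI, top vs. bottom quartile of age gap: "
      f"{bmi_looks_older:.1f} vs. {bmi_looks_younger:.1f}")
```

In this toy setup the gap carries information only because the hidden pace variable was built into both the features and BMI; in real data, whether the gap is meaningful is exactly the empirical question the talk addresses.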
uh but we have many different types of
data so now looking at just the retina
scans the original image is this one uh
the other images are what a deep
learning algorithm applied to
these images sees in terms of blood
vessels and and various uh features
having to do with their uh their widths
their curvature their fractal Dimension
and so on that that you can extract and
we can also look at how these parameters
uh change with age and some of them uh
actually uh change with age and um more
broadly we can look at all different
systems of the body that we measure uh
in this uh cohort that I showed you uh
before and for each of these body
systems we can develop this this model
that tries to predict chronological age
uh all of these models um that we
developed had a significant predictive
power for chronological age some with
better accuracy uh some with worse uh
accuracy but they were all uh significant
interestingly we found that these Clocks
Were by and large independent from each
other meaning that if um in one
individual one body system is aging
faster it doesn't necessarily mean that
another body system will age faster this
is a result that somewhat surprised us
uh the second uh key result that we
found is that these clocks are
clinically meaningful like I showed you
in the case of the metabolomics clock uh
that uh people with a higher age were in
worse clinical condition typically
people with a higher age in one body
system were in worse clinical condition
but not just worse clinical condition
they were at higher risk typically for
developing a disease related to the body
system on which the clock was developed
so for example a clock developed for
insulin resistance parameters people who
had a higher biological age were at
higher risk for developing diabetes and
uh a final result that we saw was a
marked difference between males and
females with males uh kind of Aging
pretty much linearly but females having
this abrupt change right between the age
of 50 to 55 uh which we could link to
menopause because when we looked at
women of the same chronological age but
comparing pre to postmenopausal women we
saw several biological clocks that were
higher in uh postmenopausal women of the
same chronological age as compared to uh
premenopausal women and this
might be uh important for
example for uh identifying um the
transition uh into menopause because
that's a time where uh some treatments
could be given to delay the onset that
has been shown uh to have um health
benefits um so in the final uh few
minutes uh I want to tell you um
about more recent uh efforts
that we have on trying to take all of
this data and try to work towards
eventually integrating it all together
to hopefully get a more holistic view of
uh of human health um and so so far I
showed you various projects in which we
focused on uh one particular question
addressing uh typically uh one
particular data type or developing models for
a particular task and in recent years uh
we're seeing uh kind of um a change
or new types of approaches being
developed in the field of AI whereby
you can take um just the data itself
and without labels work on what is
called a foundation model it's called a
foundation model because it's not
developed for any particular task it's
developed to kind of learn the
statistics and the patterns uh in the
data and then you could fine-tune it and
use it for specific Downstream tasks
for example uh a large language model
LLM is just trained on a big Corpus of
uh textual uh data it learns a model of
language and then one application of
that could be a chatbot like ChatGPT
so uh the idea is to try and take all of
our data and try to uh uh integrate them
together into into a single model that
would have uh many different uh
applications and um just to give you an
idea of how this works so uh
rather than um as in the past taking
labeled data like for example taking
digit data and saying that the first two
digits are zero the next one is seven
the next one is two and
having those labels and trying to
predict those what the new type of
approaches uh in AI try to do is a
framework uh called contrastive learning
whereby you would take for example the
two top images and without saying that
these are zero you just say that these
are similar to each other these should
be the same and the two digits at the
bottom seven and two are different so
they should be different so uh what you
ask the model to do is you um
run a model and you tell it that uh
you want uh the model to embed the uh
images at the top in a space such that
they would be close to each other and
the two other images to be uh far from
each other there's a mathematical uh
framework to that there's other
approaches in which you can do that for
example you can take a picture uh of a
dog or a picture of a chair you can
break it down into uh you can take
different slices of the dog it's still a
dog and you tell the model that these
are similar to each other uh
similarly different patches of the chair
are similar to each other but the chair
and the dog should be different so uh by
doing that and doing it also for uh text
and images in uh what's called DALL-E that
um you probably played with um from
OpenAI um uh what you do is you
take the
caption you take the
image and uh basically everything on the
diagonal are what's called positive
examples the text corresponds to the
image and everything on the off
diagonal um should be
different because it's an image and a
caption that do not correspond to each
other and just by doing that you're
basically aligning the text to an image
such that then you could just give a
text and you would get back and generate
an image that pretty much corresponds
quite well uh to the text so uh the
idea that we're trying to follow is
kind of the same to try and take all
of these different uh data modalities
that we have on a person they each are
related to each other in
the sense that they each inform us of a
different aspect of of human health but
they might be measuring uh different
body systems and still we want to uh put
them together so the one example that
I'll give is is a recent work uh that we
did on um looking at the cardiovascular
system and looking at uh two different
data modalities that we have the retina
scans on the one hand that I showed you
before that show all the
intricacies of the blood vessels and uh
it's the only non-invasive way in which you
can see uh blood vessels uh in
the body and on the right hand side an
ultrasound of the carotids also informing
us of cardiovascular health but they do
so in two very different ways the
measurements are done from
different parts of the body obviously
and also by different methodologies
different Technologies so the idea here
is to build a model where the only
thing that uh that we do using this
contrastive learning framework is that
uh when we have a retina scan and a
carotid ultrasound that come from the same
person we tell the model these should be
embedded to be close to each other and
then we take retina scans and carotid
ultrasounds of different people and we
tell the model these actually should be
distinct from each other we can also
do that uh not just on
the images from the
same person at the same time point but
we can also do it across different time
points so we can take the retina scan of
a person done at Baseline and the retina
scan of the same person uh profiled two
or four years uh later and tell the
model these should be similar to each
other whereas um different um again
different images from different people
should be uh different from each other
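[Editor's sketch, not from the talk: the pairing rule described above (same-person retina and carotid images pulled together, different-person images pushed apart) is commonly written as a symmetric contrastive loss. The version below is a minimal illustration in NumPy under assumptions: the embeddings are random stand-ins rather than real image encoder outputs, and the function names and the temperature value are hypothetical, not the speaker's implementation.]

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(retina_emb, carotid_emb, temperature=0.1):
    # Row i of both inputs is assumed to come from the same person.
    r = l2_normalize(retina_emb)
    c = l2_normalize(carotid_emb)
    logits = r @ c.T / temperature  # (n_people, n_people) similarity matrix
    idx = np.arange(len(r))
    # Each retina scan should pick out its own person's carotid image...
    loss_r2c = -log_softmax(logits, axis=1)[idx, idx].mean()
    # ...and each carotid image should pick out its own person's retina scan.
    loss_c2r = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_r2c + loss_c2r)

# Toy data (assumption): 8 people, 16-dimensional embeddings that share a
# per-person signal across the two modalities.
rng = np.random.default_rng(0)
person = rng.normal(size=(8, 16))
retina = person + 0.05 * rng.normal(size=(8, 16))
carotid = person + 0.05 * rng.normal(size=(8, 16))

loss_aligned = contrastive_loss(retina, carotid)
# Reversing the row order pairs every retina scan with another person's carotid image.
loss_shuffled = contrastive_loss(retina, carotid[::-1])
print(loss_aligned, loss_shuffled)
```

[In this toy setup the correctly paired embeddings give a much lower loss than the mismatched ones, which is the signal a model trained with such an objective would follow. The same loss could be applied to two retina scans of one person taken years apart by passing them as the two inputs.]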
and when we do that similar to
kind of the magic that happens with
these models with text and
images and so on
we find that we can do tasks that
medical doctors or humans probably
can't do which is to take just the retina
scan of a person and then have the model rank
all the carotid ultrasounds all the 10,000
images that we have and we see that the
accuracy is not perfect but if you look
at the blue curve for example
here at the top then
out of the 10,000 images
in 70% of the cases it will put
the corresponding carotid ultrasound
image of the person
within the top 100 images so this
is something that probably
you can't see and what the model is
doing is probably picking up on some
particular patterns in that person
and somehow aligning those so the
idea is that these types of
representations could be useful for uh
many downstream tasks we show that it
helps us to predict future
cardiovascular events and also
to predict responders and non-responders
in clinical trials so with that
I'll just kind of go back to the
high level which is after seeing
what I think is a great utility in
being able to learn many things from
these types of population
level cohorts we're trying to now
significantly expand this cohort on
multiple different dimensions
growing the number of people to
hopefully reach 100,000 looking at more
geographies more countries to look at
more cultural and ethnic and genetic
diversity also profiling more phenotypes
that we are not profiling
today and looking at more diseases
and I believe that these types of
data when applied to these
types of next generation AI
models will really take us to the
next level in terms of precision
medicine and with that I'll just put the
acknowledgement slide of all the people
in my group who contributed and led
this research and thanks for your
attention happy to take any questions
thank you [Applause]
thank you Eran very nice and
inspiring talk so let's open the
session for questions from the
public there's a question over
here hi thank you for the talk Monica
linsa from
gobra I have a quick question about the
cause and consequence of your
bacterial SNPs and
BMI where is the egg and where is the
chicken is it dietary choices that
led to high BMI that then
preferentially caused the growth of
those SNP-harboring bacteria or the
other way around and whether you did
some antibiotic therapies to prove that
or fecal matter transfer or something yeah
so that's a really great
question and that's the
million dollar question in this
field the cause and the effect unlike in
genetics when we do these associations
in the microbiome because it's an
environmental factor maybe it's uh
actually affected by by BMI so
ultimately uh the way to show that is
with experiments uh which we are uh in
the process of doing this will take time
but we're now in the process of
isolating the relevant bacteria from
people who harbor the strains and the
SNPs that we think help with weight
loss uh after isolating these bacteria
we want to give them in clinical trials
as probiotics to see if they indeed uh
induce weight loss and this is
ultimately what we have to do um and and
are working to do uh I think that all
the evidence points to the fact that a
lot of these actually uh probably would
be causal uh which is why we're putting
the effort in doing these experiments
and and the reason I think that is uh
because I think it's um otherwise very
hard to imagine how a trait like
BMI or other traits would actually
cause the bacteria to evolve
genetically such that there
would be concordant changes in
multiple people at the genetic level
these are things that we only see and
actually we do see when you give
antibiotics and and we know that that
happens uh that's a very strong
selective pressure but I don't believe
that uh there is such a selective
pressure in uh a trait like BMI and and
um the mechanism I think that's going on
is more that you kind of randomly get
your bacteria from the environment some
bacteria have a propensity for obesity
and some people you know got a few of
those unlucky ones but then they got a
few other ones that may be good and it
all kind of balances out to give an
overall propensity for uh weight loss or
weight gain by the microbiome very
similar to how we think about genetics
that you kind of randomly got your
different variants and some uh um uh
increase and some decrease your
propensity for some trait like obesity
and overall there's some net effect I
think that's the simpler explanation but
um that's a long answer to say in the
end that we still we have to prove that
and we're working on proving it and
maybe just to follow up have you
considered including in that modeling
whether the patients or
subjects live together in the same
household therefore exposed to the same or
similar say dietary causes I
don't know preservatives dyes what
not that could be environmental or
coming from the same household yeah so
we do try to look at
other confounding variables that we
have and sometimes we have people living
in the same household
sometimes not but we try to collect all
of these factors and the SNPs
that I showed are the SNPs that we
found beyond the confounding factors
so we can't explain them by
anything else that we
measured there's another question here
in the first row tuna has from the
University of Copenhagen I I was
interested or curious about the
foundation models are you building them
based on larger models that you're
fine-tuning or do you have
enough data to just build them from
scratch
good question so
we typically try to take pre-trained
models as a starting point and we find
that that often actually helps
so for example with the imaging
data we use DINOv2 as a starting
point of a pre-trained model and then we
fine-tune it on the images that we have
uh with other types of data like the
continuous glucose monitor or the
microbiome where we're also trying to
build these models our data is the
largest so there's no other data
so we're just using that
and it's working I can't
tell you if we had 10 times
or 100 times more data whether that
would get us to the next big leap
probably it would help so yeah
whenever possible we're trying to
take pre-trained models from the
literature there's another question over
here hi uh very interesting talk um just
building on this topic of AI models and
in this context and of course in
personalized medicine I'm kind of not
worried but a little bit apprehensive when it
comes to these models because at the end
of the day it's what you're putting into
it what you're asking it to do that's
the output that you're getting so are we
asking the right questions are we
training it the way that we want it to
get the answers that we're getting and
is it getting a little bit out of hand
in terms of we're feeding data to it but
there's going to be a point where we
don't know what we're getting out of it
and maybe statistics with it the
relevance of it what scientific
questions we also want to ask the
model is not really thinking for us
right so of course it's a valuable tool
but do we have to be careful moving
forward and what is it actually telling
us uh what do you think about this yeah
I think it's a very fair
question so you know I think
the models that are not
these foundation models are typically
developed for one particular task and
then you know exactly what you're doing
you have a hypothesis you're asking a
specific question I think with these
Foundation models initially you're
trying to build a model that just
captures the patterns in the data but
then you're right that um I mean you
don't stop there but then you go and you
ask okay so what is this good for um so
these are the types of things that
we're trying to work on to
show that in various downstream tasks
you can use these models so I
showed a little bit that the model I
showed for the carotid ultrasound and
the retina scans could help us
better predict future cardiovascular
events so that would be something it's
relevant for but ultimately
you know like with the large language
models you want to show that
they don't just learn a model of
language but then you can do something
useful with it and in language models we
know that you can do many useful things
with it I think here we're still in
early days we're starting to see some
signs but I think this
is what the field will be working on to
try and show what applications these
thanks so there is another question here
in the first row
thank you very much absolutely
fascinating so um your organ specific
clocks I just wonder in terms of you
know people don't care about you know
individual my kidney is older uh or my
brain is older so ultimately do you have
an insight into what is the kind of
clinical or longevity or healthy
longevity relevance of these different
systems for your overall kind of health
and survival I mean it's yeah so I
hinted a little bit at that
typically when we find that these
clocks are disrupted in people
meaning that you're
older compared to your peers of the same
chronological age in one particular
system we show that that has clinical
relevance in the sense that then those
people are at higher risk for disease
but not just any disease typically it
would be a disease related to the body
system on which the clock was developed
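[Editor's sketch, not from the talk: one common way to express "older compared to your peers of the same chronological age" is an age gap, that is, the age a model predicts from one body system's measurements minus the person's actual age. The example below is a minimal illustration in NumPy under assumptions: the data are simulated, the three "liver-related" features are hypothetical, and ordinary least squares stands in for whatever model the speaker's group actually uses.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
age = rng.uniform(40, 70, size=n)  # chronological age in years

# Hypothetical measurements from one body system (assumption): two drift
# with age plus noise, and one is unrelated to age.
features = np.column_stack([
    0.8 * age + rng.normal(0, 4, n),
    -0.5 * age + rng.normal(0, 4, n),
    rng.normal(0, 1, n),
])

def fit_clock(features, age):
    # Ordinary least squares with an intercept: age ~ features.
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, age, rcond=None)
    return coef

def predict_age(coef, features):
    X = np.column_stack([np.ones(len(features)), features])
    return X @ coef

coef = fit_clock(features, age)
predicted = predict_age(coef, features)

# Positive gap: this system looks older than the person's chronological age.
age_gap = predicted - age
print(age_gap[:5])
```

[In a real analysis the clock would be fit and evaluated on separate people, and the gap is often further adjusted for chronological age, since simple regression clocks tend to overestimate the young and underestimate the old. The gap for each system would then be tested against diagnoses related to that system.]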
so in that sense the clock for
the liver is telling us a
little bit about the risk for developing
liver disease fatty liver disease
and so on so that's where we see
the relevance yeah sorry if I wasn't
clear I meant more the relative
importance of each of the organ systems
for overall survival so oh so yeah
overall survival
um we don't have data on overall
survival uh for this population but I
would say that I think for any of
these diseases that we're finding you
have elevated risk for then first of
all you are at higher risk
of suffering from that disease and
probably many of these diseases also
that's known in the literature lead
to a reduction in overall survival
perfect thanks Eran one more time for a