0:08 my name is Jordi Merino I'm an
0:09 associate professor and group leader
0:12 here at cbmr welcome to everyone and I'm
0:14 today delighted to introduce the next
0:19 speaker Eran Segal Eran is a professor
0:21 at the department of computer science
0:23 and applied mathematics at the Weizmann
0:26 Institute of Science in Israel his
0:29 research focuses on how genetics gut
0:32 microbiome and nutrition affect health
0:35 and disease with a special focus on
0:38 developing bioinformatic tools to
0:39 better understand the relationship
0:42 between health and disease his aim is to
0:44 develop personalized medicine strategies
0:47 based on large-scale and deeply phenotyped
0:50 cohorts Eran has been highly
0:53 successful in publishing in
0:55 top journals he has published more
0:59 than 200 manuscripts and has received
1:02 several awards and honors including the
1:05 Overton Prize and the Michael Bruno award
1:08 for his contributions to the field of
1:10 bioinformatics and with all of this Eran
1:12 I'm delighted to have you
1:14 here thank you
1:18 yeah thank you Jordi and thanks to the
1:20 organizers for the kind invitation to
1:22 speak to you today uh and tell you about
1:25 a large scale population cohort that uh
1:28 we've been uh collecting um we were
1:31 motivated to do this by um seeing other
1:34 uh data collection efforts uh starting
1:36 from the Human Genome Project uh which
1:38 if you look from the year 2000 until now
1:40 has been really a transformative project
1:42 uh taking us from knowing just uh a
1:44 handful of genetic variants involved in
1:46 some diseases to where we are today
1:48 after profiling and genotyping tens of
1:50 millions of people where we now know
1:52 hundreds if not thousands of genetic
1:54 variants for virtually every disease uh
1:56 but of course the genome is uh just what
1:58 we're born with and doesn't take into
2:00 account environmental factors exercise
2:02 nutrition uh and so on and for that
2:04 reason like we had the Human Genome
2:07 Project we decided to initiate the human
2:10 phenotype project to phenotype people uh
2:12 very deeply with the hope that this type
2:14 of data would allow us to identify
2:17 trajectories of disease and find uh
2:21 novel um disease biomarkers uh and so uh
2:23 to that end what we did was several
2:25 years ago we started a clinic at the
2:28 Weizmann Institute where before
2:30 participants come in
2:32 as you can see on the right hand
2:34 side they upload their
2:36 electronic health records they fill
2:38 out hundreds of questionnaires about
2:41 medical history family history of
2:43 disease and so on so we know
2:45 all of their medical status then they
2:48 come to the clinic uh they undergo uh
2:50 various anthropometric measurements so
2:52 various body measures hip and waist
2:54 circumference uh hand grip strength
2:57 cardiovascular assessments like ECG uh
2:59 voice recordings which have correlates to
3:00 disease
3:02 then various Imaging modalities we use
3:04 DEXA to look at full body composition
3:07 and bone density high resolution scans
3:09 of the retina from which we can see
3:11 blood vessels it's considered to be a
3:13 window to the heart we use ultrasound to
3:16 look at fat in the liver and the carotids
3:18 also for cardiovascular health uh then
3:20 various sensors that people go home with
3:22 continuous glucose monitors track
3:25 glucose levels every 15 minutes uh
3:28 sleeping devices that track quality of
3:30 sleep I'll I'll show that later in the
3:33 talk um smart watches track physical
3:35 activity while people are connected to
3:37 these CGM continuous glucose monitors
3:39 they log all their dietary intake which
3:41 they're incentivized to do because they
3:43 get a report of which meals spike and
3:45 which do not spike their glucose levels
3:47 they track medication they take and
3:49 physical activity and we biobank uh
3:52 samples we have both live cells from
3:56 everybody PBMCs and serum and
3:58 maybe the most distinctive feature is
3:59 connecting all of this with the more
4:02 advanced genomics and multiomics
4:04 testing which includes whole genome
4:06 sequencing so we also look at the human
4:10 genome but then also gut and recently
4:13 vaginal microbiome metabolomics which
4:15 looks by mass spectrometry at thousands
4:17 of different small molecules uh in the
4:19 blood proteomics from blood RNA
4:23 sequencing in bulk also from the PBMCs
4:25 and a novel immune assay that we developed
4:28 whereby uh we can synthesize hundreds of
4:30 thousands of antigens and
4:32 then see which of these are bound by a
4:34 person's antibodies in the serum so if
4:37 you will this is a snapshot of the
4:40 entire immune history of a person so all
4:43 of this we did on over 12,000 people in
4:46 Israel uh it's a longitudinal cohort uh
4:49 people get profiled every two years um
4:51 we can look at this data also and
4:53 partition it by body systems we're
4:56 profiling 17 different systems of the
4:58 body on the right hand side you can see
5:01 uh also some uh kind of very global map
5:02 of how the features of different body
5:05 systems relate to each other uh this
5:08 cohort has about 8,000 people who have
5:12 one or more um of these morbidities
5:15 shown here mostly cardiometabolic
5:18 morbidities and about 4,000 by exclusion
5:19 for not having any of these are
5:22 considered uh to be healthy uh and it's
5:24 very nice to be able to uh study on a
5:26 standardized platform everybody measured
5:28 in the same way and compare all of these
5:30 diseases uh to each other we also have
5:32 about uh 30 what we call medical
5:34 conditions not diseases but still
5:36 interesting to study like various
5:39 vitamin deficiencies allergies and uh
5:40 and so
5:45 on now because we see the value
5:46 of this data which will be the main
5:49 focus of the talk today our
5:51 discoveries we're now in the process of
5:53 trying to significantly expand this
5:55 the next milestone we hope to achieve is
5:58 100,000 people both by doubling and
6:00 tripling this cohort in Israel but also
6:02 expanding to other countries where we
6:06 can explore more genetic cultural ethnic
6:08 and so on diversity and in 2024 we're
6:10 actually now building clinics in Abu
6:13 Dhabi and Japan and we'll start to
6:16 add these to the cohort now of
6:19 course this is not the only large-scale
6:21 population cohort there are quite a few
6:23 uh that are being carried out worldwide
6:25 the most impressive one in my view is
6:27 the one uh in the UK called the UK
6:31 Biobank it started 17 years ago half
6:33 a million people profiled also uh
6:36 longitudinally and the impact that this
6:38 project uh had is is really enormous it
6:40 really changed the way that we do
6:42 biomedical research uh 40,000
6:44 researchers from 100 countries are
6:46 working on this data collectively they
6:49 published over uh 10,000 papers um you
6:51 can see how it grew over the years
6:54 with close to 3,000 Publications in
6:57 PubMed uh just last year uh in
6:59 comparison we obviously don't have a
7:01 cohort that's as big but we have several
7:03 unique features like it's going to be
7:06 the first cohort to be international
7:08 as I mentioned spanning multiple
7:11 countries and then all of these other um
7:13 measurements shown here mostly the more
7:16 advanced uh multiomic measurements that
7:18 we have that are not part of the UK
7:20 Biobank are unique features to this
7:23 cohort which is why I believe that this
7:25 will be complementary to the UK
7:27 Biobank and like the UK Biobank we
7:30 are also giving access to the
7:34 cohort free for academics
7:37 for research so in
7:38 the remainder of the talk I'll tell you
7:41 about discoveries that we've been making
7:44 uh from studying uh this cohort uh and
7:47 I'll go pretty briefly over
7:49 each of these different projects just so
7:51 that I can cover and show you the
7:53 breadth of what could be done with this
7:55 type of data and the first project I'll
7:56 start with is something we actually
7:59 started 10 years ago it was even a
8:01 precursor cohort to the large
8:03 cohort where we profiled a thousand
8:05 individuals with continuous glucose
8:08 monitors and found uh that people have
8:10 um very different blood glucose response
8:12 to food as you can see by the four
8:15 different curves here of uh the two
8:17 hours after eating four slices of bread
8:19 very different between individuals but
8:21 uh lines of the same color correspond to
8:23 the same person on different days
8:25 showing very consistent uh Behavior
8:28 within a person different between people
8:30 and using clinical data and gut
8:32 microbiome data we showed that we could
8:35 fairly accurately predict um the actual
8:39 personalized responses every dot is um
8:42 an individual meal uh where the x-axis
8:44 is the area under the glucose curve
8:46 that's actually measured after uh eating
8:48 that meal and the y- axis is the
8:51 predicted one uh according to the model
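The quantity on both axes, the area under the glucose curve in the two hours after a meal, can be sketched as follows. This is a minimal illustration and not the study's actual pipeline: the function name `incremental_auc`, the choice of the first reading as baseline, the clipping of dips below baseline, and the example readings are all assumptions made here. Only the 15-minute sampling interval and the two-hour window come from the talk.

```python
from typing import Sequence


def incremental_auc(glucose_mg_dl: Sequence[float],
                    minutes_between_readings: float = 15.0) -> float:
    """Area under the glucose curve above the pre-meal baseline.

    Assumptions (not from the talk): the first reading is the baseline,
    and readings below baseline contribute zero area.
    """
    if len(glucose_mg_dl) < 2:
        raise ValueError("need at least two readings")
    baseline = glucose_mg_dl[0]
    # Excess glucose above baseline at each reading, clipped at zero.
    excess = [max(g - baseline, 0.0) for g in glucose_mg_dl]
    area = 0.0
    # Trapezoid rule over consecutive pairs of readings.
    for left, right in zip(excess, excess[1:]):
        area += 0.5 * (left + right) * minutes_between_readings
    return area


# Hypothetical readings every 15 minutes for two hours after the same meal.
flat_responder = [90, 95, 100, 100, 98, 95, 92, 90, 90]
spiky_responder = [90, 120, 150, 160, 140, 120, 105, 95, 90]

print(incremental_auc(flat_responder))   # mg/dL x minutes
print(incremental_auc(spiky_responder))
```

Each dot in a measured-versus-predicted plot would then pair one such measured value with the model's prediction for the same meal.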
8:53 and then we could show that you
8:55 could take such an algorithm
8:58 and take individuals who are
9:00 pre-diabetic they have many spikes in
9:01 their blood glucose levels over the
9:04 course of a week and with a diet that's
9:06 equal in the amount of calories for
9:08 every single meal we could uh prescribe
9:10 a diet that fully balances uh their
9:13 blood sugar levels uh after publishing
9:15 this on a short-term intervention we
9:18 went and did a full-blown uh randomized
9:20 clinical trial on 200 people with
9:22 pre-diabetes randomized them into either
9:26 a standard of care Mediterranean diet or
9:29 uh the algorithm diet uh and you can see
9:32 anecdotally one participant before the
9:33 intervention many spikes in glucose
9:35 levels at the top and at the bottom in
9:38 the one month of CGM tracking after the
9:40 intervention on the algorithm diet
9:43 virtually eliminating these
9:45 spikes in glucose levels and
9:47 statistically looking at the primary
9:49 outcomes in the center panel uh
9:51 hemoglobin A1c which is the average
9:53 glucose levels in the past 3 months in
9:55 the red curve you can see an improvement
9:57 on the Mediterranean diet after 6 months
9:59 of intervention but we get double the
10:02 Improvement on the algorithm diet and on
10:04 the right hand side we have multiple
10:07 metabolic parameters that also
10:10 improved in people following
10:11 this diet some of them
10:13 improving more significantly on the
10:15 algorithm diet compared to uh the
10:18 Mediterranean diet so uh after we
10:20 published this uh we were also
10:22 interested in going a little bit
10:25 deeper into gaining insights on
10:27 the mechanism by which this dietary
10:30 intervention actually uh had an effect
10:32 on various metabolic parameters and I'll
10:35 show you um a few analyses that we did
10:37 that convinced us that at least part of
10:39 the effects uh part of the way in which
10:42 the diet induced beneficial metabolic
10:45 effects was mediated by the uh by the
10:46 microbiome namely by the changes that
10:49 the diet had on the microbiome and we
10:51 showed this in two different ways uh
10:54 first just by looking at the
10:56 whole intervention and what happened
10:59 when you compare samples at Baseline
11:01 samples after 6 months of intervention
11:04 we saw many changes at the center in um
11:06 gut bacterial species and at the right
11:08 hand side on different metabolites that
11:11 we measure with mass spec through
11:13 metabolomics and then uh in previous
11:16 work done on a different cohort uh we
11:18 showed actually that uh quite a lot of
11:21 the metabolites several hundreds of the
11:24 metabolites could be very well predicted
11:27 by gut microbiome composition so here's
11:29 an example of four metabolites that on
11:32 held-out samples are predicted quite well
11:34 from the microbiome composition of
11:36 individuals and so what we did was we
11:38 took these models developed um prior to
11:41 this work and on a different cohort and
11:43 we applied them to the samples on the
11:44 dietary intervention both to Baseline
11:48 samples and samples taken after 6 months
11:50 and because there were changes in the
11:52 microbiome there were also changes in
11:54 the predicted levels of certain
11:56 metabolites according to the model and
11:57 we then compared them to the actual
11:59 changes in the same metabolites as
12:02 measured by metabolomics on the dietary
12:04 intervention between Baseline and uh
12:07 post intervention 6 months samples and
12:10 we saw a fairly good uh correspondence
12:12 meaning that uh we believe that some of
12:15 the metabolite changes that we saw in
12:17 people are in fact mediated by the
12:19 changes that the diet induced in the
12:20 microbiome because we could predict
12:22 these changes in the metabolites based
12:25 on the changes in uh the gut
12:28 microbiome um another way in which we
12:31 looked at this was by mediation analysis
12:33 and before doing the mediation analysis
12:35 we just looked at a direct
12:39 relationship between how diet uh affects
12:42 uh changes in the gut microbiome and so
12:44 for each gut bacterial species we
12:48 predicted its levels on the
12:50 big cohort that we have of 12,000
12:52 people predicted it based on dietary
12:55 data that we had and uh quite a lot of
12:57 uh the gut bacteria we can actually
12:59 predict from uh what people are eating
13:01 not with perfect accuracy but with
13:03 definitely significant and
13:07 fairly good Pearson correlations
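One way to picture how such a Pearson correlation is obtained is to predict a species' abundance for each person from a model fitted on everyone else, then correlate the held-out predictions with the measured values. The sketch below uses a single hypothetical dietary feature, simulated data, and a one-variable least-squares line; the talk does not say which model or features were used, so all of these are assumptions for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def leave_one_out_predictions(x, y):
    """Predict each sample from a line fitted on all the other samples."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return preds


# Simulated cohort: daily fiber intake (hypothetical feature) and the
# log abundance of one hypothetical bacterial species.
rng = random.Random(0)
fiber_grams = [rng.uniform(5, 40) for _ in range(200)]
log_abundance = [0.05 * f + rng.gauss(0, 0.5) for f in fiber_grams]

held_out = leave_one_out_predictions(fiber_grams, log_abundance)
r = pearson(held_out, log_abundance)
print(f"held-out Pearson r = {r:.2f}")
```

Evaluating on held-out people is what keeps the reported correlation from simply reflecting overfitting to the training data.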
13:10 and uh if you look at some of the uh uh
13:12 best predicted species you can see that
13:14 just by diet alone you could predict um
13:17 quite robustly what the levels of uh the
13:18 bacterial species would be in
13:21 individuals and so then we could go and
13:24 do what's called a mediation analysis to
13:26 try and get at causality where in
13:29 mediation analysis you're trying to
13:33 assess whether
13:34 the relationship between the X in
13:37 this case diet and the Y in this case
13:39 various metabolic improvements that we
13:42 saw are uh also mediated in part by some
13:44 other factor in this case uh the
13:47 microbiome so to do this analysis you uh
13:50 construct models that try to explain uh
13:52 in this case the metabolic improvements
13:56 from uh from the diet um and uh you know
13:59 that you have the microbiome which also
14:01 uh is affected by the diet and you ask
14:03 whether the microbiome also contributes
14:06 to explaining these beneficial metabolic
14:09 outputs beyond what you can explain uh
14:11 with the diet and and we found quite a
14:13 few of these cases that they're all
14:15 summarized um in in this work I won't go
14:17 into the details but several of the
14:19 metabolic improvements like for example
14:23 the time spent above 140 on
14:25 the CGM so how many spikes of glucose
14:27 levels were mediated by particular
14:30 bacterial uh species and a particular uh
14:33 dietary component so overall by these
14:36 two different uh analyses uh we believe
14:38 that at least some of the beneficial
14:40 effects of the diet are actually
14:42 mediated by the changes that the diet
14:45 induces uh to uh the
14:47 microbiome uh continuing on uh the
14:50 thread of the microbiome we also have um
14:53 efforts to uh try and develop direct
14:55 Therapeutics based uh on the
14:59 microbiome and uh for that what we uh
15:01 did and and recently published is uh
15:04 very analogous to what people do in
15:06 human genetics where the main method of
15:08 analysis is called GWAS genome-wide
15:11 association studies where you associate
15:14 a genetic variation across individuals
15:17 with a human trait of Interest so uh
15:19 what we did was uh to develop uh and
15:21 adapt that methodology to the microbiome
15:24 whereby we look at genetic variation not
15:25 on the human genome but genetic
15:28 variation on bacterial species at the
15:30 single-nucleotide level and associate
15:34 that with um uh human traits uh and so
15:36 here is a result of that this is a
15:39 standard Manhattan plot that many of you
15:41 probably saw from Human genetics except
15:44 that instead of the x-axis being
15:46 human chromosomes the x-axis is now
15:49 bacterial species every dot corresponds
15:52 to uh one particular position within a
15:55 specific bacterial genome and the y-axis
15:57 corresponds to the strength of the
16:00 association between genetic variation at
16:02 that position and a human trait of
16:04 interest in this case we're looking at
16:07 BMI um and so overall we found over a
16:10 thousand of these SNPs
16:11 many of them actually come in
16:14 blocks so after you remove
16:16 SNPs that are correlated to each other
16:19 we're left with about 40
16:22 independent positions in 27
16:25 different bacterial species that showed
16:27 strong association interestingly in
16:29 about 12 of these cases if you look at
16:31 the abundance of the species that
16:34 contains the SNP
16:36 the abundance of that species is not
16:38 correlated to BMI but when you look at
16:41 the genetic variation uh it
16:44 is and uh uh I think even more
16:47 interesting than uh just um the strength
16:50 of the association is actually the uh
16:52 effect size that uh this could explain
16:55 so now I'm plotting exactly the same
16:58 data the y-axis is still the minus log p
16:59 value the strength of the association
17:02 but now the x-axis is actually how many
17:04 points of BMI are explained by each of
17:07 these different SNPs and you can
17:09 see that there's quite a few SNPs that
17:11 explain one sometimes even approaching
17:14 two points of BMI that are explained by
17:18 just a genetic variation at a single
17:22 bacterial position and
17:24 the size of
17:27 these effects are stronger than some of
17:30 the strongest effects that people have
17:33 uh seen on the human genome uh and the
17:35 way I rationalize that is because I
17:37 believe that the microbiome is an
17:39 environmental factor and we know that
17:41 environmental factors have big effects
17:43 on many different traits definitely on
17:46 BMI and the reason we think it's an
17:48 environmental factor is that we did work
17:51 several years ago where we showed that
17:54 genetics explains very little of our
17:56 microbiome composition so on the cohort
17:59 in Israel uh when we do just a simple
18:02 PCA plot U on the left hand side of the
18:04 genetics we can see a very clear
18:06 partitioning of different populations in
18:08 Israel coming from different locations
18:10 in the world that are known to be
18:12 genetically uh distinct but when you
18:15 look at the same microbiome composition
18:17 uh all of these populations that are
18:19 genetically uh different uh they're
18:21 actually completely mixed in terms of uh
18:22 the microbiome and there were other
18:24 analyses in this work and follow-up
18:27 works also by others to suggest that
18:29 human genetics
18:31 explains a very small portion we
18:34 estimate 2 to 4% of the microbiome
18:36 composition and the rest is
18:38 environmental factors you get your
18:41 microbes from uh the
18:44 environment uh we also validated uh
18:46 these findings on BMI the
18:49 particular SNPs on a Dutch cohort of
18:53 a similar size and we showed that the
18:54 majority of these SNPs were also
18:57 significant in the Dutch cohort and
18:58 almost all of them with the exception of
19:01 one uh even if they were not significant
19:03 the direction of the association whether
19:05 it's a positive or negative association
19:08 to BMI was maintained also in the
19:09 Dutch
19:12 cohort if we look at some of these
19:15 SNPs individually we can see for
19:17 example here one of the strongest
19:19 associations if you look at uh the
19:21 center here we have about 2,000
19:23 individuals that in this particular
19:26 species had the major allele for this SNP
19:29 they have an average BMI of 25 and about
19:31 400 individuals on the right that had the
19:34 minor allele that on average have a BMI of
19:38 almost 27 1.9 points of BMI explained
19:40 only when you partition people based on
19:43 this SNP so a very strong
19:45 Association uh what's also nice about
19:48 the association at the resolution of
19:50 SNPs is that you can look at not
19:51 just the bacteria but the individual
19:54 genes that are being affected so here's
19:56 an example of another SNP explaining
19:59 1.3 points of BMI
20:01 if you look at this region it
20:03 contains uh various energy uh production
20:05 genes lipid metabolism genes that you
20:07 might rationalize could be
20:09 mechanistically involved in the way in
20:12 which bacteria would have an effect on a
20:15 trait like BMI and another example
20:18 in an inflammatory pathway also makes
20:21 sense that it could be involved
20:24 in having an effect on
20:25 a trait like
20:29 BMI um using exactly the same pipeline
20:31 I'll just briefly mention we're also
20:34 looking at Association not to human
20:37 traits but to um abundance of other
20:39 bacteria the reason is that this may
20:41 allow us to find genetic variation in
20:44 some bacteria that affect the presence
20:46 of other bacteria namely they could be
20:49 some novel antibiotics we also
20:52 look at associations to the same
20:55 bacteria that harbors the SNP that
20:57 can allow us to identify genetic
21:00 variation that can
21:02 induce or inhibit the growth of the same
21:05 bacteria and quite
21:07 interestingly just from a systems-level
21:11 perspective we see that
21:13 genetic variation within the same
21:15 bacteria typically has positive
21:18 associations to abundance of the same
21:20 bacteria namely evolutionarily as you
21:23 might expect bacteria evolve their
21:25 genomes such that uh the mutations
21:28 support the growth of the organism where
21:31 as if we look at associations with other
21:33 bacteria most of these are actually
21:35 negative supporting kind of the Warfare
21:37 that we know uh is going on between uh
21:40 between bacteria and also the actual
21:42 genes are very different genes in the
21:44 same bacteria that uh are associated are
21:47 mainly housekeeping genes genes having
21:50 to do with uh with growth whereas genes
21:52 uh involved in the association with
21:55 other bacteria are mainly um uh related
21:57 to metabolism
22:00 genes uh maybe in the interest of time
22:05 I'll skip this uh short uh piece um and
22:07 tell you now uh a little bit about the
22:09 uh sleep data that I mentioned so I
22:12 think this is also uh quite unique data
22:14 that we are collecting here so uh
22:16 there's a a company in Israel who
22:19 developed a device uh FDA approved for
22:22 diagnosing sleep apnea it has several
22:25 sensors like actigraph
22:28 accelerometer and another device
22:29 on the finger that measures blood
22:32 saturation so overall uh by tracking
22:34 three nights of sleep from each
22:36 participant we get several different uh
22:38 analog channels throughout the night
22:41 from which we can uh extract multiple
22:43 features and and something like this was
22:46 never measured on a scale of thousands
22:48 of people definitely not when it's
22:50 coupled with all the other parameters
22:52 that that we're measuring so for example
22:55 we can see uh differences between males
22:57 and females and how these
22:59 measures
23:01 of sleep apnea actually change with
23:03 age we see that they change with age
23:06 some known Association we also see that
23:09 if you try to diagnose sleep apnea based
23:11 on these which is what the device is
23:13 used for there's a difference in
23:15 the diagnosis people would get if they
23:16 use it for one night or three nights
23:19 supporting use over
23:21 multiple nights in order to get
23:23 robustness and also we see
23:26 different associations between all of
23:28 these different uh sleep features and
23:31 and the multiple uh body systems that we
23:33 have such that we can use the Sleep data
23:37 to predict with some decent
23:40 accuracy several uh measures about
23:42 people having to do with their sometimes
23:44 their blood test sometimes uh food that
23:46 they eat dietary patterns uh and so on
23:50 and also various associations with
23:53 disease um and then the final project I
23:55 want to tell you about before a short last
23:58 segment on AI models that we're
24:00 building with this is looking
24:03 at uh human aging uh so um because we
24:05 have so many different types of data
24:07 there there's an opportunity to see how
24:10 this data changes uh in people uh with
24:12 age uh and so first we took just the
24:16 metabolomic data and we used the
24:18 metabolomic data to predict the
24:22 chronological age of uh of people uh
24:23 which you can see here uh the
24:26 predictions are working fairly well the
24:28 y- axis is uh the predicted
24:30 chronological age of a person using only
24:33 the metabolomic data the x-axis is the
24:36 actual chronological age of a person but
24:38 you can also see that these predictions
24:41 are far from perfect people at the top
24:44 uh have a predicted age which uh
24:47 according to the model um is is older so
24:49 uh when the model looks at their
24:51 metabolomic data it seems to be
24:53 more similar to somebody who's older and
24:55 conversely people at the bottom
24:58 according to the model uh they have
25:01 metabolomic data that resembles more somebody
25:04 who's younger and so we ask whether
25:06 these errors in the model are
25:09 actually just errors in the model or
25:10 whether they actually carry some meaning
25:12 meaning that uh people at the top maybe
25:16 are actually aging faster and indeed
25:17 when we compare people at the top to
25:20 people at the bottom and look at other
25:22 independent parameters that were not
25:24 used as part of the model we see that
25:27 people at the top are in worse health
25:29 condition they have higher BMI on
25:31 average higher triglycerides hemoglobin
25:34 A1c blood pressure cholesterol and liver
25:39 enzymes so this
25:42 model which we call a biological
25:44 clock in this case a metabolomic
25:47 biological clock is also informing us
25:50 of the clinical condition of
25:53 people and it might allow us to identify
25:56 biomarkers that are
25:59 indicative of accelerated aging at
26:02 least according to the metabolomics data
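The logic of the clock, fitting a model that predicts chronological age, then asking whether people whose predicted age is higher than their real age look less healthy on a marker the model never saw, can be sketched on simulated data. Everything in the simulation is hypothetical: a single made-up metabolite, a hidden "pace of aging" variable, and BMI as the independent marker. The real clock is built from the full metabolomic profile and its model is not described in the talk.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


rng = random.Random(1)
n = 1000
age = [rng.uniform(40, 70) for _ in range(n)]
# Hidden "extra years" of aging per person (simulation assumption).
pace = [rng.gauss(0, 5) for _ in range(n)]
# One made-up metabolite that tracks age plus the hidden pace.
metabolite = [0.1 * (a + p) + rng.gauss(0, 0.3) for a, p in zip(age, pace)]
# BMI is influenced by the hidden pace but is NOT used to fit the clock.
bmi = [26 + 0.2 * p + rng.gauss(0, 2) for p in pace]

# The "clock": predict chronological age from the metabolite.
slope, intercept = fit_line(metabolite, age)
predicted_age = [slope * m + intercept for m in metabolite]
age_gap = [p - a for p, a in zip(predicted_age, age)]

# Compare the quarter with the largest gap ("at the top") against the
# quarter with the smallest gap ("at the bottom").
order = sorted(range(n), key=lambda i: age_gap[i])
quarter = n // 4
bmi_bottom = mean(bmi[i] for i in order[:quarter])
bmi_top = mean(bmi[i] for i in order[-quarter:])
print(f"mean BMI, predicted older than actual:   {bmi_top:.1f}")
print(f"mean BMI, predicted younger than actual: {bmi_bottom:.1f}")
```

In this toy setup the prediction errors carry information only because the simulated metabolite depends on the hidden pace; whether that holds for real data is exactly the empirical question the comparison answers.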
26:03 uh but we have many different types of
26:06 data so now looking at just the retina
26:08 scans the original image is this one uh
26:10 the other images are what a deep
26:12 learning algorithm applied to
26:15 these images sees in terms of blood
26:18 vessels and and various uh features
26:19 having to do with their uh their widths
26:22 their curvature their fractal Dimension
26:24 and so on that that you can extract and
26:27 we can also look at how these parameters
26:29 change with age and some of them
26:32 actually uh change with age and um more
26:34 broadly we can look at all different
26:36 systems of the body that we measure uh
26:39 in this cohort that I showed you
26:40 before and for each of these body
26:42 systems we can develop this this model
26:46 that tries to predict chronological age
26:48 uh all of these models um that we
26:50 developed had a significant predictive
26:53 power for chronological age some with
26:55 better accuracy uh some with worse uh
26:57 accuracy but they were all
26:58 significant
27:01 interestingly we found that these Clocks
27:03 Were by and large independent from each
27:05 other meaning that if um in one
27:08 individual one body system is aging
27:10 faster it doesn't necessarily mean that
27:12 another body system will age faster this
27:15 is a result that somewhat surprised us
27:17 uh the second uh key result that we
27:19 found is that these clocks are
27:21 clinically meaningful like I showed you
27:24 in the case of the metabolomics clock uh
27:27 that uh people with a higher age were in
27:29 worse clinical condition typically
27:32 people with a higher age in one body
27:34 system were in worse clinical condition
27:36 but not just worse clinical condition
27:38 they were at higher risk typically for
27:41 developing a disease related to the body
27:43 system on which the clock was developed
27:46 so for example a clock developed for
27:48 insulin resistance parameters people who
27:50 had a higher biological age were at
27:53 higher risk for developing diabetes and
27:56 uh a final result that we saw was a
27:58 marked difference between males and
28:01 females with males uh kind of Aging
28:04 pretty much linearly but females having
28:06 this abrupt change right between the age
28:10 of 50 to 55 uh which we could link to
28:12 menopause because when we looked at
28:15 women of the same chronological age but
28:18 comparing pre to postmenopausal women we
28:21 saw several biological clocks that were
28:24 higher in uh postmenopausal women of the
28:27 same chronological age as compared to uh
28:28 premenopausal women and this
28:31 might be important for
28:34 example for uh identifying um the
28:36 transition uh into menopause because
28:38 that's a time where uh some treatments
28:41 could be given to delay the onset that
28:44 has been shown uh to have um health
28:47 benefits um so in the final uh few
28:50 minutes I want to tell you
28:54 about more recent efforts that
28:56 we have on trying to take all of
28:58 this data and work towards
29:01 eventually integrating it all together
29:04 to hopefully get a more holistic view of
29:06 uh of human health um and so so far I
29:09 showed you various projects in which we
29:11 focused on uh one particular question
29:14 addressing uh typically uh one
29:16 particular data type or developing models for
29:19 a particular task and in recent years uh
29:23 we're seeing uh kind of um a change in
29:25 or or new types of approaches being
29:28 developed in in the field of of AI where
29:31 by you can take um just the data itself
29:33 and without labels work on what is
29:35 called a foundation model it's called a
29:37 foundation model because it's not
29:39 developed for any particular task it's
29:40 developed to kind of learn the
29:43 statistics and the patterns uh in the
29:45 data and then you could fine-tune it and
29:48 use it for specific downstream tasks
29:51 for example uh a large language model
29:54 LLM is just trained on a big corpus of
29:57 uh textual uh data it learns a model of
29:59 language and then one application of
30:02 that could be a chatbot like ChatGPT
30:04 so uh the idea is to try and take all of
30:08 our data and try to uh uh integrate them
30:10 together into into a single model that
30:12 would have uh many different uh
30:15 applications and um just to give you an
30:18 idea of of of how this works so uh
30:21 rather than as in the past taking
30:24 labeled data like for example taking
30:26 digit data and saying that the first two
30:28 digits are zero the next one is seven
30:30 the next one is two and
30:32 having those labels and trying to
30:34 predict those what the new types of
30:37 approaches in AI try to do is a
30:40 framework called contrastive learning
30:42 whereby you would take for example the
30:44 two top images and without saying that
30:46 these are zero you just say that these
30:48 are similar to each other these should
30:51 be the same and the two digits at the
30:53 bottom seven and two are different so
30:55 they should be different so what you
30:57 ask the model to do is you
31:00 run a model and you tell it that
31:03 you want the model to embed the
31:06 images at the top in a space such that
31:07 they would be close to each other and
31:10 the two other images to be far from
31:12 each other there's a mathematical
31:14 framework to that there are other
31:15 approaches in which you can do that for
31:17 example you can take a picture uh of a
31:19 dog or a picture of a chair you can
31:21 break it down into uh you can take
31:23 different slices of the dog it's still a
31:25 dog and you tell the model that these
31:27 are similar to each other
31:29 similarly different patches of the chair
31:31 are similar to each other but the chair
31:34 and the dog should be different so by
31:37 doing that and doing it also for text
31:40 and images in what's called DALL-E that
31:43 you probably played with from
31:47 OpenAI what you do is you
31:49 take the legend
31:51 the caption and you take the
31:54 image and basically everything on the
31:56 diagonal are what's called positive
31:58 examples the text corresponds to the
32:00 image and everything on the off
32:03 diagonal should be
32:05 different because it's an image and a
32:07 caption that do not correspond to each
32:09 other and just by doing that you're
32:12 basically aligning the text to an image
32:14 such that then you could just give a
32:16 text and you would get back and generate
32:19 an image that pretty much corresponds
32:22 quite well to the text so the
32:24 idea that we're trying to follow is
32:26 kind of the same to try and take all
32:28 of these different data modalities
32:30 that we have on a person they are each
32:32 related to each other in
32:35 the sense that they each inform us of a
32:38 different aspect of of human health but
32:39 they might be measuring uh different
32:42 body systems and still we want to uh put
32:44 them together so the one example that
32:46 I'll give is a recent work that we
32:49 did on looking at the cardiovascular
32:51 system and looking at two different
32:54 data modalities that we have the retina
32:56 scans on the one hand that I showed you
32:57 before that show all the intricacies
33:00 of the blood vessels and are
33:02 the only non-invasive way in which you
33:05 can see blood vessels in
33:07 the body and on the right hand side an
33:10 ultrasound of the carotids also informing
33:13 us of cardiovascular health but they do
33:16 so in two very different ways the
33:17 measurements are done from
33:19 different parts of the body obviously
33:22 and also by different methodologies
33:25 different Technologies so the idea here
33:27 is is to build a model where the only
33:29 thing that we do using this
33:32 contrastive learning framework is that
33:34 when we have a retina scan and a
33:36 carotid ultrasound that come from the same
33:39 person we tell the model these should be
33:41 embedded to be close to each other and
33:44 then we take retina scans and carotid
33:46 ultrasounds of different people and we
33:48 tell the model these actually should be
33:50 distinct from each other we can also
33:53 do that not just on
33:56 the images from the
33:58 same person at the same time point but
34:00 we can also do it across different time
34:02 points so we can take the retina scan of
34:04 a person done at baseline and the retina
34:07 scan of the same person profiled two
34:09 or four years later and tell the
34:11 model these should be similar to each
34:15 other whereas again
34:16 different images from different people
34:19 should be different from each other
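[Editor's sketch] The pairing rule described above (same-person retina and carotid images pulled together, different-person pairs pushed apart) can be illustrated with a symmetric, CLIP-style contrastive loss. This is a minimal sketch under stated assumptions, not the group's actual implementation: the function names, the temperature value, the embedding size, and the random stand-in embeddings are all hypothetical, and the talk does not say which exact loss was used.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(retina_emb, carotid_emb, temperature=0.1):
    # Row i of each matrix is assumed to come from the same person.
    # Diagonal entries of the similarity matrix are the positive pairs;
    # off-diagonal entries (different people) act as negatives.
    a = l2_normalize(retina_emb)
    b = l2_normalize(carotid_emb)
    logits = a @ b.T / temperature
    idx = np.arange(len(a))
    retina_to_carotid = -log_softmax(logits, axis=1)[idx, idx].mean()
    carotid_to_retina = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (retina_to_carotid + carotid_to_retina)

# Hypothetical stand-in embeddings for 8 people (16 dimensions each).
rng = np.random.default_rng(0)
retina = rng.normal(size=(8, 16))
carotid = retina + 0.05 * rng.normal(size=(8, 16))

print("matched pairs:   ", symmetric_contrastive_loss(retina, carotid))
print("mismatched pairs:", symmetric_contrastive_loss(retina, carotid[::-1]))
```

In a real setup the two embedding matrices would come from trainable image encoders and this loss would be minimized by gradient descent; the same function would also cover the longitudinal case by passing baseline and follow-up retina embeddings of the same people.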
34:21 and when we do that similar to
34:24 kind of the magic that happens with
34:26 these models like DALL-E with text and
34:28 images and so on
34:30 we find that we can do tasks that
34:33 medical doctors or humans probably
34:35 can't do which is to take just the retina
34:38 scan of a person and then have it rank
34:40 all the carotid ultrasounds all the 10,000
34:43 images that we have and we see that the
34:45 accuracy is not perfect but if you look
34:48 at the blue curve for example
34:51 here at the top then
34:55 out of the 10,000 images
34:58 in 70% of the cases it will put
35:01 the corresponding carotid ultrasound
35:04 image of the person
35:07 within the top 100 images so this
35:10 is something that probably
35:12 you can't see and what the model is
35:14 doing is probably picking up on some
35:17 particular patterns in that person
35:19 and somehow aligning those so the
35:21 idea is that these types of
35:23 representations could be useful for
35:25 many downstream tasks we show that it
35:28 helps us to predict future
35:31 cardiovascular events and also
35:33 to predict responders and non-responders
35:37 in clinical trials so with that
35:40 I'll just kind of go back to the
35:44 high level which is after seeing
35:47 what I think is a great utility in
35:49 being able to learn many things from
35:51 these types of population
35:54 level cohorts we're trying to now
35:56 significantly expand this cohort on
35:58 multiple different dimensions
36:00 growing the number of people to
36:03 hopefully reach 100,000 looking at more
36:05 geographies more countries to look at
36:07 more cultural and ethnic and genetic
36:10 diversity also profile more phenotypes
36:12 that we are not profiling
36:16 today and looking at more diseases
36:18 and I believe that these types of
36:21 models also when applied to these
36:24 types of next generation AI
36:26 models will really take us to the
36:29 next level in terms of precision
36:32 medicine and with that I'll just put the
36:34 acknowledgement slide of all the people
36:37 in my group who contributed and led
36:39 this research and thanks for your
36:41 attention happy to take any questions
36:42 thank you [Applause]
36:51 thank you Eran very nice and
36:55 inspiring talk so let's open the
36:57 session for questions from the
36:59 public there's a question over
37:04 here hi thank you for the talk Monica
37:05 linsa from
37:08 gobra I have a quick question about the
37:10 cause and consequence of your
37:12 bacterial SNPs and
37:14 BMI where is the egg and where is the
37:17 chicken is it dietary choices that
37:20 led to high BMI that then
37:23 preferentially caused the growth of
37:27 those SNP-harboring bacteria or the
37:29 other way around and whether you did
37:31 some antibiotic therapies to prove that
37:34 or fecal matter transfer or something yeah
37:36 so that's a really great
37:38 question and that's the
37:40 million dollar question in this
37:42 field the cause and the effect unlike in
37:44 genetics when we do these associations
37:45 in the microbiome because it's an
37:47 environmental factor maybe it's
37:49 actually affected by BMI so
37:52 ultimately the way to show that is
37:55 with experiments which we are in
37:56 the process of doing this will take time
37:58 but we're now in the process of
38:00 isolating the relevant bacteria from
38:03 people who harbor the strains and the
38:06 SNPs that we think help with weight
38:08 loss after isolating these bacteria
38:11 we want to give them in clinical trials
38:14 as probiotics to see if they indeed
38:16 induce weight loss and this is
38:19 ultimately what we have to do and
38:22 are working to do I think that all
38:24 the evidence points to the fact that a
38:26 lot of these actually uh probably would
38:28 be causal uh which is why we're putting
38:30 the effort in doing these experiments
38:32 and the reason I think that is
38:34 because I think it's otherwise very
38:38 hard to imagine how a trait like
38:41 BMI or other traits would actually
38:44 cause the bacteria to evolve
38:47 genetically such that there
38:49 would be concordant changes in
38:52 multiple people at the genetic level
38:53 these are things that we only see and
38:55 actually we do see when you give
38:57 antibiotics and and we know that that
38:59 happens uh that's a very strong
39:01 selective pressure but I don't believe
39:03 that uh there is such a selective
39:07 pressure in a trait like BMI and
39:08 the mechanism I think that's going on
39:11 is more that you kind of randomly get
39:14 your bacteria from the environment some
39:17 bacteria have a propensity for obesity
39:18 and some people you know got a few of
39:21 those unlucky ones but then they got a
39:23 few other ones that may be good and it
39:25 all kind of balances out to give an
39:27 overall propensity for uh weight loss or
39:29 weight gain by the microbiome very
39:31 similar to how we think about genetics
39:34 that you kind of randomly got your
39:37 different variants and some
39:39 increase and some decrease your
39:41 propensity for some trait like obesity
39:43 and overall there's some net effect I
39:46 think that's the simpler explanation but
39:48 that's a long answer to say in the
39:50 end that we still have to prove that
39:51 and we're working on proving it and
39:53 maybe just to follow up have you
39:56 considered including in that modeling
39:59 whether the patients or
40:01 subjects live together in the same
40:03 household therefore exposed to the same or
40:08 similar say dietary causes I
40:11 don't know preservatives dyes
40:13 whatnot that could be environmental or
40:16 coming from the same household yeah so
40:18 so we do try to look at
40:21 other confounding variables that we
40:22 have and sometimes we have people living
40:25 in the same household
40:26 sometimes not but we try to collect all
40:29 of these factors and the SNPs
40:30 that I showed are the SNPs that we
40:32 found beyond the confounding factors
40:35 so we can't explain them by
40:37 anything else that we
40:39 measured there's another question here
40:42 in the first row tuna has from the
40:44 University of Copenhagen I was
40:45 interested or curious about the
40:48 foundation models are you building them
40:50 based on larger models that you're
40:52 fine-tuning or do you have
40:54 enough data to just build them from
40:56 scratch to sort of get what you are sort
40:58 of a good question so
41:03 we typically try to take pre-trained
41:05 models as a starting point and we find
41:07 that that often actually helps
41:10 so for example with the imaging
41:12 data we use DINOv2 as a starting
41:14 point of a pre-trained model and then we
41:17 fine-tune it on the images that we have
41:18 uh with other types of data like the
41:20 continuous glucose monitor or the
41:22 microbiome where we're also trying to
41:25 build these models our data is the
41:26 largest so there's no other data
41:30 so we're just using that
41:32 and it's working I can't
41:33 tell you if we had 10 times
41:36 or 100 times more data whether that
41:39 would get us to the next big leap
41:42 probably it would help so yeah
41:44 whenever possible we're trying to
41:46 take pre-trained models from the
41:48 literature there's another question over
41:52 here hi uh very interesting talk um just
41:54 building on this topic of AI models and
41:56 in this context and of course in
42:00 personalized medicine I'm kind of not
42:01 worried but a little bit apy when it
42:03 comes to these models because at the end
42:06 of the day it's what you're putting into
42:08 it what you're asking it to do that's
42:10 the output that you're getting so are we
42:12 asking the right questions are we
42:13 training it the way that we want it to
42:16 get the answers that we're getting and
42:17 is it getting a little bit out of hand
42:20 in terms of we're feeding data to it but
42:21 there's going to be a point where we
42:23 don't know what we're getting out of it
42:26 and maybe statistics with it the
42:27 relevance of it what scientific
42:29 questions we also want to ask the
42:31 model is not really thinking for us
42:33 right so of course it's a valuable tool
42:35 but do we have to be careful moving
42:37 forward and what is it actually telling
42:40 us uh what do you think about this yeah
42:42 I think it's a very fair
42:46 question so you know I think
42:48 the models that are not
42:50 these foundation models are typically
42:51 developed for one particular task and
42:53 then you know exactly what you're asking
42:55 you have a hypothesis you're asking a
42:57 specific question I think with these
42:59 foundation models initially you're
43:01 trying to build a model that just
43:04 captures the patterns in the data but
43:06 then you're right that I mean you
43:08 don't stop there but then you go and you
43:11 ask okay so what is this good for so
43:13 these are the types of things that
43:14 we're trying to work on to
43:18 show that in various downstream tasks
43:19 you can use these models so I
43:21 showed a little bit that the model I
43:24 showed for the carotid ultrasound and
43:26 the retina scans could help us
43:29 better predict future cardiovascular
43:31 events so that would be something it's
43:34 relevant for but ultimately
43:35 you know like with the large language
43:38 models you want to show that
43:40 they don't just learn a model of
43:41 language but then you can do something
43:44 useful with it and in language models we
43:45 know that you can do many useful things
43:47 with it I think here we're still in
43:49 early days we're starting to see some
43:51 signs but I think this
43:53 is what the field will be working on to
43:56 try and show what applications these
44:02 thanks so there is another question here
44:05 in the first row
44:19 thank you very much absolutely
44:22 fascinating so um your organ specific
44:24 clocks I just wonder in terms of you
44:26 know people don't care about you know
44:29 individual my kidney is older uh or my
44:31 brain is older so ultimately do you have
44:33 an insight into what is the kind of
44:34 clinical or longevity or healthy
44:36 longevity relevance of these different
44:38 systems for your overall kind of health
44:41 and survival I mean it's yeah so I
44:44 hinted a little bit at that
44:46 typically when we find that these
44:48 clocks are disrupted in people
44:50 meaning that you're
44:53 older compared to your peers of the same
44:55 chronological age in one particular
44:58 system we show that that has clinical
45:01 relevance in the sense that then those
45:03 people are at higher risk for disease
45:05 but not just any disease typically it
45:07 would be a disease related to the body
45:09 system on which the clock was developed
45:12 so in that sense the clock for
45:15 the liver is telling us a
45:17 little bit about the risk for developing
45:19 liver disease fatty liver disease
45:22 and so on so that's where we see
45:23 the relevance yeah sorry if I wasn't
45:25 clear I meant more the relative
45:26 importance of each of the organ systems
45:29 for overall survival so oh so yeah
45:32 overall survival
45:34 we don't have data on overall
45:36 survival for this population but I
45:39 would say that I think for any of
45:41 these diseases that we're finding you
45:44 have elevated risk for then first of
45:47 all you are at higher risk
45:48 of suffering from that disease and
45:50 probably many of these diseases also
45:52 that's known in the literature lead
45:55 to a reduction in overall survival
45:59 perfect thanks Eran one more time for a