YouTube Transcript:
AI and cybersecurity: penetration tester reveals key dangers
Skip watching entire videos - get the full transcript, search for keywords, and copy with one click.
Share:
Video Transcript
This is In the Black, a leadership,
strategy and business podcast brought to
Australia. Welcome to In the Black. I'm
Gareth Hanley and in today's show we're
talking with Miranda about the world of
artificial intelligence and its
implications for cyber security.
Miranda is an AI vulnerability
researcher and a trainer with
Maliva and she's the offensive security
team manager at malware security. At
malware security, she conducts
penetration testing for various sectors
including government and private
industry. Miranda has also worked on the
chips team within the ASD's
ACSC. Welcome to In the Black, Miranda.
Thank you. Thank you so much for having me.
me.
Look, we've got some questions lined up,
but before we start, I' I've just
rattled off a few acronyms there. The
CHIPS CHIPS team and the ASD and the
ACSC. Can you maybe explain for our
listeners who have no idea what I'm
talking about, what those are? Yeah,
absolutely. So ASD stands for the
Australian Signals Directorate and
they're an organization
organization
who work with foreign signals
intelligence, cyber security and
offensive cyber operations. So within
the ASD there's the ACSC which is the
Australian cyber security center and
specifically with the ACSC there's the
CHIPS team which was the cyber hygiene
improvement programs team and this team
is fairly public so what they do is is
allowed to be known. They're in charge
of performing enumeration and scanning
of government and critical
infrastructures attack surface and then
they provide quarterly reports to these
agencies on where their cyber security
posture is lacking. They have a really
really important role actually in making
sure that government's attack surface is
reduced as much as possible from the
internet facing aspects. So no potato
chips. No, no. Although they love the
chips acronym and they have a a section
called hot chips which is high priority
operational tasking and this is a
section where whenever a critical
vulnerability is notified they do
immediate scanning of government and
critical infrastructure to then notify
people who are exposed to the CVAs the
critical vulnerabilities. Going back to
the hot topic, you're involved in what's
known as adversarial machine learning.
Does that mean that you hack AI systems?
And how does that compare with
traditional cyber security like like
firewalls and penetration testing? Being
an AI hacker is is a really cool way to
put it. I would say I'm more of a
vulnerability researcher though I love
yeah performing AI hacks and and
learning about them too. So let's talk
about adversarial machine learning
quickly and then I'll compare AI systems
to IT systems so we kind of get a gist
of of what the difference in my work is
there. So adversarial machine learning
and I'm just going to call it AML from
now on because the whole acronym is hard
to say on and on. So, it's the study of
attacks on machine learning algorithms
designed to disrupt models, preventing
them from doing what they're meant to,
or deceive models into performing tasks
they're not meant to, or making models
disclose information that they aren't
meant to. So, at Maleva, we call these
the 3Ds. Disrupt, disclose, and deceive.
And we've made them as a sort of AI
equivalent to the CIA triad, which might
be familiar to listens. It's listeners.
It's a um a framework that is used to I
guess evaluate the impacts of
vulnerabilities through confidentiality,
integrity or availability. That's the
CIA and that's of people's data on
computer systems. Yeah. So that that's
used to measure the impact of
vulnerabilities on information security
systems. So the triple D disclose,
disrupt and deceive are a way to measure
the impact of AI or adversarial machine
learning attacks and
vulnerabilities. And in terms of how AI
an AI system is different to an IT
system, a few things that make it
different and which make it necessary to
differentiate AI security from the field
of cyber security and why risk
mitigation is really different for both
of them as well. So for example, IT
systems, they're deterministic and and
rule-based. They follow really strict
predefined and explicit logic or code.
And if an error occurs in one of these
types of systems, it can typically be
traced back to a specific line of code.
And for vulnerability management, that
means that you can often directly find
where a cyber security vulnerability
occurred and you can fix it with a
onetoone direct patch at the source of
the problem. And that might be just
through updating the code, configuring
settings or applying some other sort of
fix. But AI is is quite different from
that. The AI systems are inherently
probabilistic. And this comes down to
the underlying architecture that is
built off of mathematical and
statistical models. And that's that's a
whole talk for another time. But because
of that nature, there's rarely a
onetoone direct cause because AI systems
don't follow rigid rules or hard-coded
instructions. They generate outputs
based on these statistical
likelihoods. And that uncertainty is
what makes AI so powerful and so good at
what it does. Because of this
uncertainty, it can adapt and it can
infer and it can make generalizations
and really work with diverse data. But
it's also what makes it really
vulnerable because the uncertainty also
leads it being prone to errors and also
being prone to being biased,
unpredictable and
manipulatable. So yeah, AI vulnerability
management is really really difficult
because unlike cyber security and and
traditional software where you can just
patch it, with AI you can try and
optimize the architecture as much as
possible. You can try and fine-tune
models, which means align them and train
them closer to the purpose in which you
want them to perform. And you can add in
all these layers of internal and
external defenses. But because of this
likelihood in its output, there's always
a level left over where you just have to
accept that the model might be eronous
and produce mistakes. That's one aspect
and that was a lot. The second which is
I guess more simple to understand is
that it I it systems don't take
undefined inputs. They're really
structured. They're programmed to accept
one kind of input and output one type of
input. It might only intake database
queries or it might only intake language
when you're putting in your name in an
input field on a website, right? And if
you get it wrong, it will send you an
error. And errors around this are
usually due to people not having the
right protections in the backend code of
what's happening to that input. But AI
and people who have used chatbt for
example will know that you can you can
give it almost anything. You can give it
files, you can give it code, you can
give it mathematical questions, you can
give it language based questions. And
other systems also take in things like
sensory data from IoT devices and
things. And that just means that it's so
hard to secure that input because now
all of a sudden you have this
multimodal input and this huge attack
surface. It's it's really difficult to
secure. And what were those three Ds
again? Disrupt, deceive, and
disclose. Disrupt is denial of service,
preventing it from doing what it's meant
to. Deceive is about tricking the model
into doing something that it's not
usually allowed to do. For example, you
might have seen a lot of things called
jailbreaking or prompt injecting or
prompt engineering related to chatbt.
So where you might get it to talk about
a topic that it's not supposed to talk
about. Yeah, 100%. So that's deception.
You're deceiving the model into doing
that. And disclosure. So that would be
about getting the model to release
sensitive information, for example,
about other users. Is that because if
I'm using an AI system, what I'm typing
into the system is held somewhere in
memory and so somebody else might be
able to extract that from the memory. So
it could either be disclosing sensitive
user data if there's some sort of
problem where the AI can access data
about multiple users and then someone
might be able to pull your data across
into their session. Or it could be
disclosure of the proprietary
information from whoever has deployed
the AI and what the model has been
trained on. Yeah, absolutely. Or things
like the system prompt as well. So these
are this is a set of instructions that
is I guess a very fundamental piece of
how the AI knows how to perform its
task. And if you disclose that that
again is is a bit of a PI loss for the
company. So are all AI systems the same?
There's a few popular ones that are out
there that people will know of. Are they
all the same? So, all AI systems aren't
the same in terms of their purpose or
capability or even in their
architecture, but the processes that
that underpin them are the same. So, by
this I mean where they're different
could be in that the models can undergo
a variety of training types such as
supervised, unsupervised or
reinforcement learning. not worth
getting into those unless you're
actually wanting to design an AI system.
But those learning techniques can lead
to vastly different performance
outcomes. So people will choose one that
is is most optimal for their scenario.
Then models can also be fine-tuned which
means like I talked about earlier
aligning them to make them particularly
adept and good at doing one specific
thing. Or they could have entirely
different infrastructures. So you know
one that you will know of and probably
use day-to-day is a large language model
or LLM for example like chatbt or
deepseek or claude bard some of the
other ones and these reconstruct text
from human language or other inputs and
another type could be for example a
convolutional neural network or a CNN
and this provides computers with
vision-like abilities so it's referred
to as computer vision and it allows them
to be able to see differences in images
as a human would. So you would find
these types in um facial recognition
systems. But even though there are all
those differences, what is the same is
the underlying process which adversarial
machine learning exploits or AML
exploits used to target. So machine
learning models, whether they're an LLM
or a CNN or something else, they follow
the same life cycle of starting with
data gathering, data
prep-processing, model training, and
then finally deployment of the model and
inference, which is where it makes its
outputs. And all of these systems can
most definitely be exploited to access
sensitive data throughout any of the
stages in that life cycle. What type of
things have you encountered? If you've
got real world examples without
identifying anyone, of course, in my own
experience, there aren't many I can
share of disclosure processes that are,
you know, ongoing, etc. But one that I
can is a prompt injection and this is a
pretty accessible attack that's also
relatively easy to perform. So there's
lots of news about this. So it involves
targeting that deployment and inference
stage that I talked about where the
model is making its decisions. And
prompt injection involves crafting a
malicious prompt or a malicious input
that then elicits a dangerous response
from the model, bypassing their security
guard rails. So through these types of
attacks, people can confuse the model
into sharing data that shouldn't be
included either because it's malicious
or because it is sensitive information.
So it's either that deceive or disclose
or a mixture of both. The one that I can
share is I performed prompt injection on
a website to find some proprietary
technologies that an organization had in
use which would have been important PI
for them. So they had a chatbot on their
website which had too much access to
information about its own programming.
And after a few hours of me trying
various prompt injection techniques, I
could find out a system instructions or
the system prompts which I mentioned
before are important PI for the company
as it is the basis for their chatbot,
how it acts and how it performs as well
as being able to find information about
the model's architecture which is yeah
it was pretty huge. So unfortunately
it's very easy to achieve with most
language models. Chat bots are probably
an openf facing tool that a lot of
businesses might think are useful for
AI. Yeah, exactly. And they're often
Yeah, we'll talk about this later in the
pitfalls, but everyone wants one and no
one really thinks of the consequences.
But there are some really fun examples
that I've come across in my research of
very very interesting attacks if you
want to hear about them. I'm sure our
listeners would love to hear that, too.
Yeah. Awesome. So, this is a personal
favorite of mine, and it's about the
model deception stage. So, between 2020
and 2021, this guy called Eric
Jacklitch, I'm not sure how to say his
name, he successfully bypassed an
AIdriven identity verification system
and it allowed him to file fraudulent
unemployment claims in the state of
California. So basically this AI powered
facial recognition and document
verification system, it was used to
validate identities in government benefit
benefit
applications. So what it did was it
matched the image of someone's face in a
selfie that they took with their face on
their driver's license, but it missed
like a really crucial step where it
didn't correspond with any other sort of
database at all. So all it was doing was
matching that the driver's license
matched the selfie that was sent in, but
not any government records of what that
person actually looked like. So this
bloke, he went and he took a bunch of
stolen identities, like stolen names,
dates of birth, and social security
numbers, and he went and forged driver's
license with all these people, but then
replaced the real individual's photos
with his own, wearing a wig or some
other sort of disguise. And then he went
and created accounts on this system and
then uploaded the ID photo with the
photo of himself wearing a wig. And then
when he needed to do the confirmation of
identity, he put the wig on again and he
took a selfie. And the AI powered system
incorrectly was like, "Yeah, that's
that's the guy that's or the girl, I
don't know, that's Sarah." Because it
didn't check any other sort of database.
And with that identity verification
complete, he then filed fraudulent unemployment
unemployment
claims, directed the payments to his
account, and he just went to an ATM and
took them out. And I'm sure he got
himself in a lot of trouble for doing
this. Just a little bit. So that's an
error in the testing phase. Yeah. So I
definitely think there are a few
takeaways from that. A that in any sort
of critical decision-m system or any
system that has financial repercussions, etc.,
etc.,
humans should be involved in the process
of validating the AI outputs on mass. I
think AI is a really good use case
there. But a you need to make sure that
it's actually checking against some
other value that isn't based on a user's
input cuz that's where all problems
occur in every system AI or it is user
input, right? And having some sort of
human verifying that process whether
just tabbing through all of the
decisions that the AI made on mass or
picking a subset is important. And yeah,
of course that system could have
benefited from testing as well just
because knowing my own team of
pentesters like that's one of the first
scenarios we would have tested. It would
have been so fun. Are there any other
examples that might have some really
good takeaways? So there was this one
called the Morris 2 worm. So the Morris
worm was the first internet worm that
spread without user interaction. Right?
So last year researchers developed
Morris 2 which is a zeroclick worm
meaning it's a type of malware that
spreads automatically without requiring
user interaction. But this worm targeted
generative AI. So it used this technique
called adversarial self-replicating
prompt injection. So that prompt
injection that I talked about before,
but it was
self-perpetuating. And what they did was
they demonstrated a proof of concept of
this by attacking an AI powered email
assistant. It can send auto replies to
people. It can interpret emails that are
coming in. It can summarize to you
what's happening. All of that. But it
has access to your emails, which always
has security issues. So what they did
was they had attacker send an email to
users who used an AI powered assistant
and the incoming email would
automatically be processed and stored by
that AI assistant in its memory and then
the AI assistant would use it in the
reference with all the other emails like
within the context of all the other
emails to build its responses. that this
this adversarial email that they sent
first it included malicious instructions
for data leakage so such that the AI
assistant would respond to the original
email leaking sensitive information from
the target systems emails and then it
would also include this self-replication
aspect right where it's telling that AI
assistant to reinsert this malicious
prompt in future emails to all other
users so then in the case where any
other person you're emailing uses an AI
emailed assistant, they would receive
that malicious prompt. It would again be
stored in their AI assistance system,
cause leakage in their replies, like
data stealing in their replies, and then
it would self-replicate in their new
emails, and then it would just spread
from there. It's pretty scary prospect.
This Morris worm demonstration was was
really good to see like a again how
susceptible language models are to
surprise surprise language with all that
prompt injection. B how AI how different
AI systems can chain together to
perpetuate attacks. So that attack just
got carried on by AI to AI to AI. It
involves no human interaction. They they
did it themselves. And lastly, how AI
systems that store context and memory,
they introduce really bad persistent
risks because attackers can manipulate
the memory to achieve long-term effects
because now that that malicious email
prompt is stored in that person's email,
that the email assistant's memory, it
will continue being inserted into their
emails until they realize it's there.
So, what you're talking about is these
email assistants that might help you
rewrite your emails and reply to people
or might also be a system where a
business has an automatic reply system
that's using AI to reply to incoming
emails in inbox. Is that right? Yeah,
absolutely. That could target a system
that has any AI powered automation.
Yeah, it's a dangerous thought. So, you
mentioned that AI is being used for
fishing or vicing. I think some people
say as well. Can you maybe explain what
it's being used for in that context
because I think that that's something
that might be ending up in a lot of
inboxes. Yeah, absolutely. So to start
with fishing, how AI is being leveraged
there. Traditionally, fishing emails
were sort of sort of obvious if you knew
what to look for. You know, they were
really emotive in their language, trying
to get you to click on a link or
download something and respond to the
email immediately, interact with it in
some way. And also there are really
often I guess language barrier
differences, so spelling errors or bad
grammar. But with
LLMs, all of that is is reduced. People
can get these emails automatically
written in perfect English, so they
don't look too strange. And they're also
automating what we call staged fishing
campaigns. So instead of sending one
email to you with a link and being like,
"Please click now." They send you a
perfectly formatted email with no
dangerous link or attachment, they're
just trying to seek your engagement. And
then once you start talking to them as
if it's a normal conversation with a
human being, they automate replies from
an LLM to build rapport with you. And
then finally down the line, maybe in
your fifth correspondence, they'll send
the actual fishing attack, right? And by
this point, you think you're talking
with a legitimate client or customer or
someone from another organization, and
it's all been powered by an LLM in the
background. It's much harder to detect
than what people are used to looking for
in fishing emails. And then there's also
voice fishing, which is called vishing
for short. And people extract some level
of people's voices online, maybe your
voice or mine from the podcast, and then
they create a model that can mimic your
voice and get it to say whatever they
want it to say. And it might be CEO of
an organization, for example. And they
then call an employee playing back to
them the CEO's voice saying, "Hey, I
need you to make a transfer of this
amount to this bank account." And the
employee is like, "Yeah, that's Ben, my
CEO. No worries." And often times it
would be people in the financial
position that are being targeted.
Absolutely. So what about in the hiring
process? If I'm hiring somebody, is
there a chance that I could be duped by
some of these AI deep fakes? And is
there any examples where people have
been duped by this? Yeah, 100%. So an
example last year was that actually a
security company hired a North Korean
because as they were interviewing they
used an AI powered face changer and they
also used you know a deep fake generator
for all of their other photos on their
resume and things like that. So when
they were going through the interview
process and all stages of application,
they seemed like an American citizen and
they were successful in getting the role
because no one ever knew their real
identity. Going back to those pitfalls
that you mentioned, what are the
pitfalls in AI security that you're
seeing at the moment and what should
people who are listening to this podcast
think about if they want to mitigate the
risk of tools that they might be
planning on using?
Yeah, I guess the most common one that
has been around since AI became a
buzzword was around the AI hype and the
business use case often outruns the
security considerations because everyone
wants to capitalize on this AI hype,
right? So they they rush to pushing some
sort of AI system for their customers to
production, but they don't consider the
security implications of that or they
don't have the right security engineers
on the team. and they just have ML and
AI engineers who are wonderful at what
they do but they might not necessarily
be specialized in AI or ML security or
or cyber security in which there are a
lot of effects on AI systems as well. So
I think anyone looking to implement that
either internal use in their
organization or a chatbot on their
website etc you need to do the due
diligence that you would with any other
system. You need to get it tested and
assessed. You need to do risk profiling.
You need to make sure that the design is
secure from the outset, the coding of it
and that you're practicing dev sec ops
development security operations in your
processes. That's the core irk that we
have when we just see people being like,
"Ah, we pushed some AI model." The
second is relying blindly on the output
of AI models. So not validating their
outputs in decision-m contexts because
you know like we talked about earlier
they're statistical models and they're
prone to errors and bias and what this
could look like and is very commonly
happening is in a coding scenario lots
of junior developers are asking chatbt
to write code for them and they're just
copy pasting the output into
applications without performing due
diligence or understanding what the code
is saying. And in the case of say the
trading data of that model being
poisoned up in the supply chain, if the
poisoning included malicious malware to
be put in the code generation output,
then that developer might have just copy
pasted malware into their organization's
application and let it execute on a
sensitive system. That's a attack that
occurs in that data gathering and
pre-processing and training phase.
That's already hard baked into the type
of AI you've decided to use. Absolutely.
Yeah. It's also really hard to identify,
right? Because if you think about how
much training data goes into these
models, you only really need to affect a
small amount of that data to have huge
implications. So what we're expecting in
general is that there has been a lot of
training data poisoning in models that
are online today, but we just haven't
seen the effects of them yet. These
attacks might be waiting dormantly. It's
a bit of a concern, but it could be like
I said in that coding scenario, they
just copy paste some bad code and they
don't look at it and now the whole
system is compromised. Or it could be
things like relying blindly on the
output of that identity verification
system in that California case we talked
about where you don't check it. You just
assume that the AI is making the right
decision and you go from there. People
just think that AI systems are perfect
and they they do what humans can't do
and don't make mistakes. So they take
what they say for granted as well.
People often use AI systems to help them
with things that they're not sure about
themselves, right? They use it for
searching. So if you're not sure, you
often can't validate that what it's
saying to you is correct. You'll just
take it for granted. It's very dangerous
to do that. I guess the last one I would
warn about as well is sharing sensitive
information on publicly hosted models.
So if you know your organization's
running an internal only software, then
there's a bit of a lessened risk because
that information isn't going to the
cloud and it's not potentially publicly
accessible in some sort of data breach.
But in terms of just using chatbt
online, etc., you should definitely be
watching what you put in there in terms
of sensitive information.
So would you say that the first two
things would be policies for businesses
and then maybe professional advice if
any businesses are thinking about
implementing this type of technology?
Yeah, it never hurts to get some AI
subject matter expert advice on the case
and organizational policies for the use
of AI are really important, but they're
only as effective as how well people
understand them. So getting training for
your organization and your employees on
understanding AI risks is very
important. As these technologies evolve,
what emerging trends in AI security
should businesses and professionals keep
on their radar? So in terms of
vulnerabilities in AI systems, which is
what we've been talking about today, I
guess staying ahead of the trends and
understanding, you know, you don't need
to get technically deep into things. you
just just kept up to date with the news
in terms of what what's happening in AI
systems where they're at risk
particularly laws and regulations that
are coming out around AI use and
deployment because that will have
intense you know policy and governance
implications for organizations and
without doing a sales pitch part of my
work at Maleva is producing a
fortnightly newsletter which goes out to
whoever wants to subscribe there's a
TLDDR that's easy to understand and a
more technical explanation for those who
are really interested in the most recent
AI security news, vulnerabilities and
research with with implications. So
that's a good thing. We also do a
monthly industry briefing where security
professionals or their executives, they
can come in for like an hour and we'll
just talk about the takeaways of the
month. So yeah, I think staying up to
date is your best tool there. You also
mentioned regulatory risk there. Yeah.
So currently things like the the GDPR in
Europe, that's what's being used to
govern the use of AI and coming into
this year as well as the EU AI act. And
they're going to start finding people
for misuse and deployment that doesn't
have the privacy and information
security aspects that they're expecting
of organizations. So whilst Australia,
for example, hasn't implemented
something like that, they might seek to.
So it's important to stay up to date on
that. What they have said though is that
no one can use deepseek or organizations
and government etc. can't use deepseek.
So knowing what is coming into play at
what times it will very much help
organizations move through that space.
But I think people should really be
aware of how AI is being used by
adversaries potentially against them as
well. So, we've talked a lot about
vulnerabilities in AI systems, but it's
important to know and keep up to date
with how AI might be used to target you.
By that, I mean like fishing campaigns
or voice fishing campaigns or deep fakes
against CEOs and public figures in your
organization. So, staying up to date is
your best tool at the moment. Thanks for
your time and insights today. It's been
incredibly valuable and I'm sure that
our listeners know a lot more about AI
now than they did at the beginning of
our chat. So, thanks for joining us.
It's been great having you on the show.
Thank you for the opportunity. It's been
great speaking with you. And thank you
for listening to In the Black. Don't
forget to check our show notes for links
and resources from CBA Australia, as
well as other material from Miranda and
her teams at Maliva and Malware
Security. If you've enjoyed this show,
please share with your friends and
colleagues and hit the subscribe button
so you don't miss future episodes. Until
next time, thanks for listening.
If you've enjoyed this episode, help
others discover in the black by leaving
us a review and sharing this episode
with colleagues, clients, or anyone else
interested in leadership strategy and
business. To find out more about our
other podcasts, check out the show notes
for this episode. And we hope you can
join us again next time for another
episode of In the Black. [Music]
Click on any text or timestamp to jump to that moment in the video
Share:
Most transcripts ready in under 5 seconds
One-Click Copy125+ LanguagesSearch ContentJump to Timestamps
Paste YouTube URL
Enter any YouTube video link to get the full transcript
Transcript Extraction Form
Most transcripts ready in under 5 seconds
Get Our Chrome Extension
Get transcripts instantly without leaving YouTube. Install our Chrome extension for one-click access to any video's transcript directly on the watch page.
Works with YouTube, Coursera, Udemy and more educational platforms
Get Instant Transcripts: Just Edit the Domain in Your Address Bar!
YouTube
←
→
↻
https://www.youtube.com/watch?v=UF8uR6Z6KLc
YoutubeToText
←
→
↻
https://youtubetotext.net/watch?v=UF8uR6Z6KLc