YouTube Transcript:
AI and cybersecurity: penetration tester reveals key dangers

Skip watching entire videos - get the full transcript, search for keywords, and copy with one click.

Video Transcript

This is In the Black, a leadership,

strategy and business podcast brought to

Australia. Welcome to In the Black. I'm

Gareth Hanley and in today's show we're

talking with Miranda about the world of

artificial intelligence and its

implications for cyber security.

Miranda is an AI vulnerability

researcher and a trainer with

Maliva and she's the offensive security

team manager at malware security. At

malware security, she conducts

penetration testing for various sectors

including government and private

industry. Miranda has also worked on the

chips team within the ASD's

ACSC. Welcome to In the Black, Miranda.

Thank you. Thank you so much for having me.

me.

Look, we've got some questions lined up,

but before we start, I' I've just

rattled off a few acronyms there. The

CHIPS CHIPS team and the ASD and the

ACSC. Can you maybe explain for our

listeners who have no idea what I'm

talking about, what those are? Yeah,

absolutely. So ASD stands for the

Australian Signals Directorate and

they're an organization

organization

who work with foreign signals

intelligence, cyber security and

offensive cyber operations. So within

the ASD there's the ACSC which is the

Australian cyber security center and

specifically with the ACSC there's the

CHIPS team which was the cyber hygiene

improvement programs team and this team

is fairly public so what they do is is

allowed to be known. They're in charge

of performing enumeration and scanning

of government and critical

infrastructures attack surface and then

they provide quarterly reports to these

agencies on where their cyber security

posture is lacking. They have a really

really important role actually in making

sure that government's attack surface is

reduced as much as possible from the

internet facing aspects. So no potato

chips. No, no. Although they love the

chips acronym and they have a a section

called hot chips which is high priority

operational tasking and this is a

section where whenever a critical

vulnerability is notified they do

immediate scanning of government and

critical infrastructure to then notify

people who are exposed to the CVAs the

critical vulnerabilities. Going back to

the hot topic, you're involved in what's

known as adversarial machine learning.

Does that mean that you hack AI systems?

And how does that compare with

traditional cyber security like like

firewalls and penetration testing? Being

an AI hacker is is a really cool way to

put it. I would say I'm more of a

vulnerability researcher though I love

yeah performing AI hacks and and

learning about them too. So let's talk

about adversarial machine learning

quickly and then I'll compare AI systems

to IT systems so we kind of get a gist

of of what the difference in my work is

there. So adversarial machine learning

and I'm just going to call it AML from

now on because the whole acronym is hard

to say on and on. So, it's the study of

attacks on machine learning algorithms

designed to disrupt models, preventing

them from doing what they're meant to,

or deceive models into performing tasks

they're not meant to, or making models

disclose information that they aren't

meant to. So, at Maleva, we call these

the 3Ds. Disrupt, disclose, and deceive.

And we've made them as a sort of AI

equivalent to the CIA triad, which might

be familiar to listens. It's listeners.

It's a um a framework that is used to I

guess evaluate the impacts of

vulnerabilities through confidentiality,

integrity or availability. That's the

CIA and that's of people's data on

computer systems. Yeah. So that that's

used to measure the impact of

vulnerabilities on information security

systems. So the triple D disclose,

disrupt and deceive are a way to measure

the impact of AI or adversarial machine

learning attacks and

vulnerabilities. And in terms of how AI

an AI system is different to an IT

system, a few things that make it

different and which make it necessary to

differentiate AI security from the field

of cyber security and why risk

mitigation is really different for both

of them as well. So for example, IT

systems, they're deterministic and and

rule-based. They follow really strict

predefined and explicit logic or code.

And if an error occurs in one of these

types of systems, it can typically be

traced back to a specific line of code.

And for vulnerability management, that

means that you can often directly find

where a cyber security vulnerability

occurred and you can fix it with a

onetoone direct patch at the source of

the problem. And that might be just

through updating the code, configuring

settings or applying some other sort of

fix. But AI is is quite different from

that. The AI systems are inherently

probabilistic. And this comes down to

the underlying architecture that is

built off of mathematical and

statistical models. And that's that's a

whole talk for another time. But because

of that nature, there's rarely a

onetoone direct cause because AI systems

don't follow rigid rules or hard-coded

instructions. They generate outputs

based on these statistical

likelihoods. And that uncertainty is

what makes AI so powerful and so good at

what it does. Because of this

uncertainty, it can adapt and it can

infer and it can make generalizations

and really work with diverse data. But

it's also what makes it really

vulnerable because the uncertainty also

leads it being prone to errors and also

being prone to being biased,

unpredictable and

manipulatable. So yeah, AI vulnerability

management is really really difficult

because unlike cyber security and and

traditional software where you can just

patch it, with AI you can try and

optimize the architecture as much as

possible. You can try and fine-tune

models, which means align them and train

them closer to the purpose in which you

want them to perform. And you can add in

all these layers of internal and

external defenses. But because of this

likelihood in its output, there's always

a level left over where you just have to

accept that the model might be eronous

and produce mistakes. That's one aspect

and that was a lot. The second which is

I guess more simple to understand is

that it I it systems don't take

undefined inputs. They're really

structured. They're programmed to accept

one kind of input and output one type of

input. It might only intake database

queries or it might only intake language

when you're putting in your name in an

input field on a website, right? And if

you get it wrong, it will send you an

error. And errors around this are

usually due to people not having the

right protections in the backend code of

what's happening to that input. But AI

and people who have used chatbt for

example will know that you can you can

give it almost anything. You can give it

files, you can give it code, you can

give it mathematical questions, you can

give it language based questions. And

other systems also take in things like

sensory data from IoT devices and

things. And that just means that it's so

hard to secure that input because now

all of a sudden you have this

multimodal input and this huge attack

surface. It's it's really difficult to

secure. And what were those three Ds

again? Disrupt, deceive, and

disclose. Disrupt is denial of service,

preventing it from doing what it's meant

to. Deceive is about tricking the model

into doing something that it's not

usually allowed to do. For example, you

might have seen a lot of things called

jailbreaking or prompt injecting or

prompt engineering related to chatbt.

So where you might get it to talk about

a topic that it's not supposed to talk

about. Yeah, 100%. So that's deception.

You're deceiving the model into doing

that. And disclosure. So that would be

about getting the model to release

sensitive information, for example,

about other users. Is that because if

I'm using an AI system, what I'm typing

into the system is held somewhere in

memory and so somebody else might be

able to extract that from the memory. So

it could either be disclosing sensitive

user data if there's some sort of

problem where the AI can access data

about multiple users and then someone

might be able to pull your data across

into their session. Or it could be

disclosure of the proprietary

information from whoever has deployed

the AI and what the model has been

trained on. Yeah, absolutely. Or things

like the system prompt as well. So these

are this is a set of instructions that

is I guess a very fundamental piece of

how the AI knows how to perform its

task. And if you disclose that that

again is is a bit of a PI loss for the

company. So are all AI systems the same?

There's a few popular ones that are out

there that people will know of. Are they

all the same? So, all AI systems aren't

the same in terms of their purpose or

capability or even in their

architecture, but the processes that

that underpin them are the same. So, by

this I mean where they're different

could be in that the models can undergo

a variety of training types such as

supervised, unsupervised or

reinforcement learning. not worth

getting into those unless you're

actually wanting to design an AI system.

But those learning techniques can lead

to vastly different performance

outcomes. So people will choose one that

is is most optimal for their scenario.

Then models can also be fine-tuned which

means like I talked about earlier

aligning them to make them particularly

adept and good at doing one specific

thing. Or they could have entirely

different infrastructures. So you know

one that you will know of and probably

use day-to-day is a large language model

or LLM for example like chatbt or

deepseek or claude bard some of the

other ones and these reconstruct text

from human language or other inputs and

another type could be for example a

convolutional neural network or a CNN

and this provides computers with

vision-like abilities so it's referred

to as computer vision and it allows them

to be able to see differences in images

as a human would. So you would find

these types in um facial recognition

systems. But even though there are all

those differences, what is the same is

the underlying process which adversarial

machine learning exploits or AML

exploits used to target. So machine

learning models, whether they're an LLM

or a CNN or something else, they follow

the same life cycle of starting with

data gathering, data

prep-processing, model training, and

then finally deployment of the model and

inference, which is where it makes its

outputs. And all of these systems can

most definitely be exploited to access

sensitive data throughout any of the

stages in that life cycle. What type of

things have you encountered? If you've

got real world examples without

identifying anyone, of course, in my own

experience, there aren't many I can

share of disclosure processes that are,

you know, ongoing, etc. But one that I

can is a prompt injection and this is a

pretty accessible attack that's also

relatively easy to perform. So there's

lots of news about this. So it involves

targeting that deployment and inference

stage that I talked about where the

model is making its decisions. And

prompt injection involves crafting a

malicious prompt or a malicious input

that then elicits a dangerous response

from the model, bypassing their security

guard rails. So through these types of

attacks, people can confuse the model

into sharing data that shouldn't be

included either because it's malicious

or because it is sensitive information.

So it's either that deceive or disclose

or a mixture of both. The one that I can

share is I performed prompt injection on

a website to find some proprietary

technologies that an organization had in

use which would have been important PI

for them. So they had a chatbot on their

website which had too much access to

information about its own programming.

And after a few hours of me trying

various prompt injection techniques, I

could find out a system instructions or

the system prompts which I mentioned

before are important PI for the company

as it is the basis for their chatbot,

how it acts and how it performs as well

as being able to find information about

the model's architecture which is yeah

it was pretty huge. So unfortunately

it's very easy to achieve with most

language models. Chat bots are probably

an openf facing tool that a lot of

businesses might think are useful for

AI. Yeah, exactly. And they're often

Yeah, we'll talk about this later in the

pitfalls, but everyone wants one and no

one really thinks of the consequences.

But there are some really fun examples

that I've come across in my research of

very very interesting attacks if you

want to hear about them. I'm sure our

listeners would love to hear that, too.

Yeah. Awesome. So, this is a personal

favorite of mine, and it's about the

model deception stage. So, between 2020

and 2021, this guy called Eric

Jacklitch, I'm not sure how to say his

name, he successfully bypassed an

AIdriven identity verification system

and it allowed him to file fraudulent

unemployment claims in the state of

California. So basically this AI powered

facial recognition and document

verification system, it was used to

validate identities in government benefit

benefit

applications. So what it did was it

matched the image of someone's face in a

selfie that they took with their face on

their driver's license, but it missed

like a really crucial step where it

didn't correspond with any other sort of

database at all. So all it was doing was

matching that the driver's license

matched the selfie that was sent in, but

not any government records of what that

person actually looked like. So this

bloke, he went and he took a bunch of

stolen identities, like stolen names,

dates of birth, and social security

numbers, and he went and forged driver's

license with all these people, but then

replaced the real individual's photos

with his own, wearing a wig or some

other sort of disguise. And then he went

and created accounts on this system and

then uploaded the ID photo with the

photo of himself wearing a wig. And then

when he needed to do the confirmation of

identity, he put the wig on again and he

took a selfie. And the AI powered system

incorrectly was like, "Yeah, that's

that's the guy that's or the girl, I

don't know, that's Sarah." Because it

didn't check any other sort of database.

And with that identity verification

complete, he then filed fraudulent unemployment

unemployment

claims, directed the payments to his

account, and he just went to an ATM and

took them out. And I'm sure he got

himself in a lot of trouble for doing

this. Just a little bit. So that's an

error in the testing phase. Yeah. So I

definitely think there are a few

takeaways from that. A that in any sort

of critical decision-m system or any

system that has financial repercussions, etc.,

etc.,

humans should be involved in the process

of validating the AI outputs on mass. I

think AI is a really good use case

there. But a you need to make sure that

it's actually checking against some

other value that isn't based on a user's

input cuz that's where all problems

occur in every system AI or it is user

input, right? And having some sort of

human verifying that process whether

just tabbing through all of the

decisions that the AI made on mass or

picking a subset is important. And yeah,

of course that system could have

benefited from testing as well just

because knowing my own team of

pentesters like that's one of the first

scenarios we would have tested. It would

have been so fun. Are there any other

examples that might have some really

good takeaways? So there was this one

called the Morris 2 worm. So the Morris

worm was the first internet worm that

spread without user interaction. Right?

So last year researchers developed

Morris 2 which is a zeroclick worm

meaning it's a type of malware that

spreads automatically without requiring

user interaction. But this worm targeted

generative AI. So it used this technique

called adversarial self-replicating

prompt injection. So that prompt

injection that I talked about before,

but it was

self-perpetuating. And what they did was

they demonstrated a proof of concept of

this by attacking an AI powered email

assistant. It can send auto replies to

people. It can interpret emails that are

coming in. It can summarize to you

what's happening. All of that. But it

has access to your emails, which always

has security issues. So what they did

was they had attacker send an email to

users who used an AI powered assistant

and the incoming email would

automatically be processed and stored by

that AI assistant in its memory and then

the AI assistant would use it in the

reference with all the other emails like

within the context of all the other

emails to build its responses. that this

this adversarial email that they sent

first it included malicious instructions

for data leakage so such that the AI

assistant would respond to the original

email leaking sensitive information from

the target systems emails and then it

would also include this self-replication

aspect right where it's telling that AI

assistant to reinsert this malicious

prompt in future emails to all other

users so then in the case where any

other person you're emailing uses an AI

emailed assistant, they would receive

that malicious prompt. It would again be

stored in their AI assistance system,

cause leakage in their replies, like

data stealing in their replies, and then

it would self-replicate in their new

emails, and then it would just spread

from there. It's pretty scary prospect.

This Morris worm demonstration was was

really good to see like a again how

susceptible language models are to

surprise surprise language with all that

prompt injection. B how AI how different

AI systems can chain together to

perpetuate attacks. So that attack just

got carried on by AI to AI to AI. It

involves no human interaction. They they

did it themselves. And lastly, how AI

systems that store context and memory,

they introduce really bad persistent

risks because attackers can manipulate

the memory to achieve long-term effects

because now that that malicious email

prompt is stored in that person's email,

that the email assistant's memory, it

will continue being inserted into their

emails until they realize it's there.

So, what you're talking about is these

email assistants that might help you

rewrite your emails and reply to people

or might also be a system where a

business has an automatic reply system

that's using AI to reply to incoming

emails in inbox. Is that right? Yeah,

absolutely. That could target a system

that has any AI powered automation.

Yeah, it's a dangerous thought. So, you

mentioned that AI is being used for

fishing or vicing. I think some people

say as well. Can you maybe explain what

it's being used for in that context

because I think that that's something

that might be ending up in a lot of

inboxes. Yeah, absolutely. So to start

with fishing, how AI is being leveraged

there. Traditionally, fishing emails

were sort of sort of obvious if you knew

what to look for. You know, they were

really emotive in their language, trying

to get you to click on a link or

download something and respond to the

email immediately, interact with it in

some way. And also there are really

often I guess language barrier

differences, so spelling errors or bad

grammar. But with

LLMs, all of that is is reduced. People

can get these emails automatically

written in perfect English, so they

don't look too strange. And they're also

automating what we call staged fishing

campaigns. So instead of sending one

email to you with a link and being like,

"Please click now." They send you a

perfectly formatted email with no

dangerous link or attachment, they're

just trying to seek your engagement. And

then once you start talking to them as

if it's a normal conversation with a

human being, they automate replies from

an LLM to build rapport with you. And

then finally down the line, maybe in

your fifth correspondence, they'll send

the actual fishing attack, right? And by

this point, you think you're talking

with a legitimate client or customer or

someone from another organization, and

it's all been powered by an LLM in the

background. It's much harder to detect

than what people are used to looking for

in fishing emails. And then there's also

voice fishing, which is called vishing

for short. And people extract some level

of people's voices online, maybe your

voice or mine from the podcast, and then

they create a model that can mimic your

voice and get it to say whatever they

want it to say. And it might be CEO of

an organization, for example. And they

then call an employee playing back to

them the CEO's voice saying, "Hey, I

need you to make a transfer of this

amount to this bank account." And the

employee is like, "Yeah, that's Ben, my

CEO. No worries." And often times it

would be people in the financial

position that are being targeted.

Absolutely. So what about in the hiring

process? If I'm hiring somebody, is

there a chance that I could be duped by

some of these AI deep fakes? And is

there any examples where people have

been duped by this? Yeah, 100%. So an

example last year was that actually a

security company hired a North Korean

because as they were interviewing they

used an AI powered face changer and they

also used you know a deep fake generator

for all of their other photos on their

resume and things like that. So when

they were going through the interview

process and all stages of application,

they seemed like an American citizen and

they were successful in getting the role

because no one ever knew their real

identity. Going back to those pitfalls

that you mentioned, what are the

pitfalls in AI security that you're

seeing at the moment and what should

people who are listening to this podcast

think about if they want to mitigate the

risk of tools that they might be

planning on using?

Yeah, I guess the most common one that

has been around since AI became a

buzzword was around the AI hype and the

business use case often outruns the

security considerations because everyone

wants to capitalize on this AI hype,

right? So they they rush to pushing some

sort of AI system for their customers to

production, but they don't consider the

security implications of that or they

don't have the right security engineers

on the team. and they just have ML and

AI engineers who are wonderful at what

they do but they might not necessarily

be specialized in AI or ML security or

or cyber security in which there are a

lot of effects on AI systems as well. So

I think anyone looking to implement that

either internal use in their

organization or a chatbot on their

website etc you need to do the due

diligence that you would with any other

system. You need to get it tested and

assessed. You need to do risk profiling.

You need to make sure that the design is

secure from the outset, the coding of it

and that you're practicing dev sec ops

development security operations in your

processes. That's the core irk that we

have when we just see people being like,

"Ah, we pushed some AI model." The

second is relying blindly on the output

of AI models. So not validating their

outputs in decision-m contexts because

you know like we talked about earlier

they're statistical models and they're

prone to errors and bias and what this

could look like and is very commonly

happening is in a coding scenario lots

of junior developers are asking chatbt

to write code for them and they're just

copy pasting the output into

applications without performing due

diligence or understanding what the code

is saying. And in the case of say the

trading data of that model being

poisoned up in the supply chain, if the

poisoning included malicious malware to

be put in the code generation output,

then that developer might have just copy

pasted malware into their organization's

application and let it execute on a

sensitive system. That's a attack that

occurs in that data gathering and

pre-processing and training phase.

That's already hard baked into the type

of AI you've decided to use. Absolutely.

Yeah. It's also really hard to identify,

right? Because if you think about how

much training data goes into these

models, you only really need to affect a

small amount of that data to have huge

implications. So what we're expecting in

general is that there has been a lot of

training data poisoning in models that

are online today, but we just haven't

seen the effects of them yet. These

attacks might be waiting dormantly. It's

a bit of a concern, but it could be like

I said in that coding scenario, they

just copy paste some bad code and they

don't look at it and now the whole

system is compromised. Or it could be

things like relying blindly on the

output of that identity verification

system in that California case we talked

about where you don't check it. You just

assume that the AI is making the right

decision and you go from there. People

just think that AI systems are perfect

and they they do what humans can't do

and don't make mistakes. So they take

what they say for granted as well.

People often use AI systems to help them

with things that they're not sure about

themselves, right? They use it for

searching. So if you're not sure, you

often can't validate that what it's

saying to you is correct. You'll just

take it for granted. It's very dangerous

to do that. I guess the last one I would

warn about as well is sharing sensitive

information on publicly hosted models.

So if you know your organization's

running an internal only software, then

there's a bit of a lessened risk because

that information isn't going to the

cloud and it's not potentially publicly

accessible in some sort of data breach.

But in terms of just using chatbt

online, etc., you should definitely be

watching what you put in there in terms

of sensitive information.

So would you say that the first two

things would be policies for businesses

and then maybe professional advice if

any businesses are thinking about

implementing this type of technology?

Yeah, it never hurts to get some AI

subject matter expert advice on the case

and organizational policies for the use

of AI are really important, but they're

only as effective as how well people

understand them. So getting training for

your organization and your employees on

understanding AI risks is very

important. As these technologies evolve,

what emerging trends in AI security

should businesses and professionals keep

on their radar? So in terms of

vulnerabilities in AI systems, which is

what we've been talking about today, I

guess staying ahead of the trends and

understanding, you know, you don't need

to get technically deep into things. you

just just kept up to date with the news

in terms of what what's happening in AI

systems where they're at risk

particularly laws and regulations that

are coming out around AI use and

deployment because that will have

intense you know policy and governance

implications for organizations and

without doing a sales pitch part of my

work at Maleva is producing a

fortnightly newsletter which goes out to

whoever wants to subscribe there's a

TLDDR that's easy to understand and a

more technical explanation for those who

are really interested in the most recent

AI security news, vulnerabilities and

research with with implications. So

that's a good thing. We also do a

monthly industry briefing where security

professionals or their executives, they

can come in for like an hour and we'll

just talk about the takeaways of the

month. So yeah, I think staying up to

date is your best tool there. You also

mentioned regulatory risk there. Yeah.

So currently things like the the GDPR in

Europe, that's what's being used to

govern the use of AI and coming into

this year as well as the EU AI act. And

they're going to start finding people

for misuse and deployment that doesn't

have the privacy and information

security aspects that they're expecting

of organizations. So whilst Australia,

for example, hasn't implemented

something like that, they might seek to.

So it's important to stay up to date on

that. What they have said though is that

no one can use deepseek or organizations

and government etc. can't use deepseek.

So knowing what is coming into play at

what times it will very much help

organizations move through that space.

But I think people should really be

aware of how AI is being used by

adversaries potentially against them as

well. So, we've talked a lot about

vulnerabilities in AI systems, but it's

important to know and keep up to date

with how AI might be used to target you.

By that, I mean like fishing campaigns

or voice fishing campaigns or deep fakes

against CEOs and public figures in your

organization. So, staying up to date is

your best tool at the moment. Thanks for

your time and insights today. It's been

incredibly valuable and I'm sure that

our listeners know a lot more about AI

now than they did at the beginning of

our chat. So, thanks for joining us.

It's been great having you on the show.

Thank you for the opportunity. It's been

great speaking with you. And thank you

for listening to In the Black. Don't

forget to check our show notes for links

and resources from CBA Australia, as

well as other material from Miranda and

her teams at Maliva and Malware

Security. If you've enjoyed this show,

please share with your friends and

colleagues and hit the subscribe button

so you don't miss future episodes. Until

next time, thanks for listening.

If you've enjoyed this episode, help

others discover in the black by leaving

us a review and sharing this episode

with colleagues, clients, or anyone else

interested in leadership strategy and

business. To find out more about our

other podcasts, check out the show notes

for this episode. And we hope you can

join us again next time for another

episode of In the Black. [Music]

Click on any text or timestamp to jump to that moment in the video

Most transcripts ready in under 5 seconds

One-Click Copy125+ LanguagesSearch ContentJump to Timestamps

Paste YouTube URL

Enter any YouTube video link to get the full transcript

Most transcripts ready in under 5 seconds

Get Our Chrome Extension

Get transcripts instantly without leaving YouTube. Install our Chrome extension for one-click access to any video's transcript directly on the watch page.

Add to Chrome — Free

Works with YouTube, Coursera, Udemy and more educational platforms

Get Instant Transcripts: Just Edit the Domain in Your Address Bar!

YouTube

←

→

↻

https://www.youtube.com/watch?v=UF8uR6Z6KLc

YoutubeToText

←

→

↻

https://youtubetotext.net/watch?v=UF8uR6Z6KLc

YouTube TranscriptPreparing your results…

YouTube Transcript:AI and cybersecurity: penetration tester reveals key dangers

Video Transcript

Paste YouTube URL

Transcript Extraction Form

Get Our Chrome Extension

Get Instant Transcripts: Just Edit the Domain in Your Address Bar!

YouTube Transcript:
AI and cybersecurity: penetration tester reveals key dangers