Hack the Plant

Preparing for the potential worst day

Episode Summary

“From an architecture standpoint, from a resilience standpoint, from a capabilities standpoint, I think everybody's kind of facing the same problems, and I think there's not enough resiliency baked into these systems,” says Paul Shaver. Paul Shaver is Global OT Security Practice Lead at Mandiant / Google Cloud. In this episode, Bryson and Paul discuss Paul’s military background, the difference between persistent threats and regular threats, and more.

Episode Notes

In this episode, Bryson Bort is joined by Paul Shaver, Global OT Security Practice Lead at Mandiant / Google Cloud, to discuss the cyber threat landscape. How did Paul’s military background play a role in his decision to start working with control systems? What is the difference between an advanced persistent threat and a regular threat? What does Paul think is the best way to protect against documented threats from nation-state actors?

“I think if we're not doing a better job of protecting critical infrastructure, protecting our assets, any one of the nation state actors could cause that level of mass scale outage or destruction of capability. It comes down to being better prepared to protect these environments,” Paul said.

Join us for this and more on this episode of Hack the Plant.

Hack the Plant is brought to you by ICS Village and the Institute for Security and Technology.

Episode Transcription

Bryson: I'm Bryson Bort, and this is Hack the Plant, season four. Electricity, finance, transportation, our water supply. We take these critical infrastructure systems for granted. But they're all becoming increasingly dependent on computers to function. We walk through the world of hackers working on the front lines of cybersecurity and public safety to protect the systems you rely upon every day.

From the ransomware threats of Colonial Pipeline to the failure of the Texas power grid, it is clear our interconnectivity is also a significant source of risk. This season, we will continue to bring you a panoply of different insights across all of the different things happening in critical infrastructure.

In my day job, I'm the CEO and founder of Scythe and the co-founder with Tom VanNorman of the nonprofit ICS Village, where we educate people on critical infrastructure security with hands-on examples, not just nerd stuff. I founded Grimm in 2013, a consultancy that works at the front lines of these problems every day for clients all over the world.

I'm also an adjunct senior advisor at the Institute for Security and Technology, a 501(c)(3) think tank dedicated to tackling technology-driven emerging security threats. This is Hack the Plant, brought to you by the Institute for Security and Technology and ICS Village. Subscribe wherever you find podcasts to get each episode when it drops.

Bryson: I’m Bryson Bort and this is Hack the Plant.

For today’s episode, I’m joined by Paul Shaver, the Global Lead for Mandiant / Google Cloud ICS/OT Security Consulting Function and Practice. Mandiant provides its customers and partners with subject matter expertise in ICS security and works to prevent disruption of critical infrastructure through an understanding of relevant cyber threats, rigorous security testing, and threat detection and response capabilities.

We discuss the challenges that he thinks organizations are facing around architecture and resilience.

Paul: …So as far as architecture and resilience, I think the story is the same across the globe, right? I think everybody's in very similar situations. You look at some of the areas where there's, obviously, there's conflict, geopolitical conflict, and those have a different threat landscape as far as who's actually like knocking on the doors every single day. But from an architecture standpoint, from a resilience standpoint, from a capabilities standpoint, I think everybody's kind of facing the same problems, and I think there's not enough resiliency baked into these systems. There's not enough organizations that I would say are adequately prepared to detect and respond to a potentially bad OT incident. Resiliency and the ability to detect and respond are the things that pretty much everybody needs to work on.

Bryson: And why he thinks attribution is so important.

Paul: …I think attribution is important because I think if we can't hold people responsible, there's no teeth to stopping this kind of cybercrime, especially if we're critical infrastructure. And if we don't understand who their threat actors are and who they're targeting and why they're targeting them, that's only serving to better inform our defenses.

Bryson: How did Paul’s military background lead him to start working with control systems? What is the difference between an advanced persistent threat and a regular threat? What does Paul think is the best way to protect against documented threats from nation-state actors? And if he could wave a magic, non-internet-connected wand, what is one thing he would change? Join us for this and more on this episode of Hack the Plant.

Bryson: Paul, welcome! Please introduce yourself to our listeners.

Paul: I am leading OT security consulting at Mandiant, which is now Google Cloud. So fun times, interesting world. I have no formal education. I left home and joined the military at 18 years old because I was a terrible student and I was getting in trouble with the law and I was going to end up doing probably worse things than I was doing. And the military really was the only option. I joined the Navy. My family is a Navy family and I wanted to go out and serve and be working as quickly as possible.

So I went to the fleet in an apprenticeship program where I didn't have any particular job and I eventually became a Gunner's Mate working on naval weapon systems and learned all kinds of stuff about pneumatics and hydraulics and servo motor control, fire prevention systems in ammunition magazines.

So, rewind just a little bit. When I first joined, I was kind of that first generation coming out of high school and joining the military that had had a computer at home in the mid ’90s. And the ship that I was on, they had very basic computer capabilities at that time, and they were rolling out NT workstations to some of the work centers on the ship, and they sent a bunch of us that had used a computer before to A+ and Network+ classes to help them install networking on board a U.S. Navy ship. And so fast forward to getting out of the service in 2002. I thought, well, computers is a good thing to do, so I'm going to go do that. And I took those skills that I had and I found a trade school program for computer networking in Louisiana. I did that for about a year and a half. About four months into that and being out of the Navy, I realized that I missed the military and I went and joined the local National Guard unit.

And I ended up getting called up to deploy in 2004. So I didn't finish that program, but I got an MCSA, a CCNA, you know, basic certs that you get from coming out of one of those tech school programs. And then nine days before getting on the bus to go to the Middle East, I failed some medical tests and got left behind, and as a platoon sergeant, watching your entire platoon get on a bus and go overseas without you, it put me in a dark place for a while. And I ended up in an IT support job doing break-fix stuff. I did that for about three years or so. And I was bored to tears doing the same set of things over and over again.

I needed something different. My father-in-law worked in the oil and gas industry. He was a toolpusher and OIM offshore. And he said, ‘Hey, we're getting all these computer systems offshore. We're doing this telemetry for our drilling systems and our production systems. Man, nobody knows how these computers work. You should look into those jobs.’

And I started working for a drilling systems instrumentation company out of Broussard, Louisiana. And that was my introduction to control systems. And it just escalated from there. I was an instrumentation tech, I was a controls electrician, I was a PLC programmer, I was a systems designer.

I finally realized that my programming skills were terrible and the IT skills that I had were better. And so I spent the bulk of my career kind of doing OT architecture stuff, design building, the server systems, the networks, the firewalls, and all of that kind of stuff. My last few years before joining Mandiant, I was an automation engineer and then the director of technology for an oil and gas company here in California.

And then I got to the point where, being in the oil and gas industry, there was a little bit of turmoil in the markets, and I wanted to go back to consulting and do something different, and I thought jumping to full-time security would be a lot of fun.

And I found a CISSP-based kind of certificate program at UC Irvine. I did that program and started looking for places to move to security full time. Funny story is that the first time I applied for a job at Mandiant, I got the all-important ‘thanks, but no thanks’ email within just a couple of hours of submitting the application. That was July timeframe of 2019. In August, I had been connected with a veterans community group in cybersecurity, VetSec.

And a bunch of those folks were going to DEFCON. And I was like, ‘Oh, that sounds like fun. I'll go to DEFCON.’ And I met a bunch of folks who introduced me to more folks and I got pulled to some wild parties and met other people. And that eventually somehow got me connected to Chris Sistrunk. And when Chris posted the Americas manager role, I sent him a message. I was like, look, man, I really think I could do this job. I meet all of these qualifications, but I got, like, thanks, but no thanks for a senior consultant role four months ago. So I was just asking, like, how could my resume look better? What could I do to improve my capability for this role?

And Chris's response was like the 10 digits of his phone number and ‘call me.’ And I spent about an hour on the phone with him. He convinced me to apply. In December of 2019, I started as the North American lead on this team. And in August of ’22, when my predecessor moved on to another role, I got asked to take on the global leadership role for the team. And a month and a half later, Google closed the acquisition and we all became Google Cloud employees. So that's the short, short of how I got here.

Bryson: I mean, I think I could summarize all of that: at one point in your life, you learned NT 4.0 and basically said, ‘I don't want to learn another operating system, and I'm going to stay in industrial control systems.’

Paul: That's accurate. Yeah, I can agree with that for sure.

Bryson: The speaker poster, which you can't see that's on my wall says, ‘Why is it running Windows XP?’ That's my standard industrial control system talk I put together about a year ago. So what exactly do you do at Mandiant, now Mandiant, made Google?

Paul: Security consulting team, right? So we do all of the consulting: incident response, incident response preparedness, technical assurance testing, penetration testing, vulnerability assessment, some of the regulatory stuff like the NERC CIP assessments and that kind of stuff. Really, Mandiant grew out of incident response. And so that's the core. We take everything that we learned from incident response, whether it's in the enterprise side or the OT side, and we help apply that to, you know, our client base to help them better prepare for cyber compromise in OT environments and enterprise environments. And for my team, that's core focused on ICS and OT environments. Small team and spread all over the globe, but get to work with some great customers, get to work with some interesting projects and do some really cool things and stay involved with sometimes the bleeding edge of ICS security that was the bleeding edge of IT security 10 years ago.

Bryson: Industrial control systems. So, what have you learned doing this job? Like walk us through what is like, what is the scope? What kinds of services, the clientele? Do you, are you just Americas now? Are you also global?

Paul: Yeah. So I'm global. The customers are as far apart as you can imagine, right? Local municipal water companies that have very minor budgets to Fortune 100 global auto manufacturers, we'll say. That's the amazing thing about being in a position like this is we're supporting both ends of the spectrum.

We're supporting food and beverage, rail and transportation, oil and gas, data centers, energy, you name it, like we're supporting those sectors. And for most of our customers, I think the bulk of what we hear is ‘we've had an IT security program and enterprise security program for X number of years and we're just starting to look at OT security and what we can do there.’ Or it's the fun topic of IT/OT convergence, right? These organizations are starting to plug their OT environments into some enterprise connectivity, to plug into some ERP or manufacturing system or management systems for data purposes of some kind, and they connect it and they go, ‘Oh, there's all this stuff in our environment that we didn't know about,’ or they're taking proactive steps to make sure there's no bad stuff in there before they connect it.

So really, the bulk of what we do is helping customers better prepare for the potential worst day. I would say, I get asked a lot of times, ‘how many, you know, how many consultants do you have for OT incident response?’ I've got a handful. I don't want to live in a world where I have OT incident responders sitting on a bench waiting for OT incidents to come in.

Like, thankfully that's not the world we live in, where we've got mass conflag, OT incidents day in and day out. We respond to two or three a month. Most of the time they're pretty straightforward and they're primarily IT-related. IT systems that support OT environments: they're Windows operating systems, your Windows XP, Windows 7, Windows 2000 that are still supporting these systems.

Every once in a while, we get something that affects some other device down in an OT environment, but it's so rare. And when they are there, they're highly targeted. And, you know, it takes a different skill set to kind of go in and investigate that, but thankfully, we're not seeing that all the time, right?

Most of what we see is IT-side compromise that bleeds over into an OT environment or OT environments that are hanging out on the Internet in some way. They've got a modem, an LTE modem, or they're protected by some firewall that has a zero-day vulnerability on it. And some bad guy, whether they're nation state-sponsored or a script kiddie in their mom's basement, found this thing hanging out on the Internet and went and poked around in it and something weird happens and then somebody notices it. The whole gamut of that, I guess.

Bryson: So breaking down that into two different ways, one U.S. versus global, are there differences that you see with what we have here at home versus the kinds of problems, whether that is threat, whether that is, more, I don't want to say mundane, but the challenges around architecture and resilience? And then do you see differences between the different verticals?

Paul: So as far as architecture and resilience, I think the story is the same across the globe, right? I think everybody's in very similar situations. You look at some of the areas where there's, obviously, there's conflict, geopolitical conflict, and those have a different threat landscape as far as who's actually like knocking on the doors every single day. But from an architecture standpoint, from a resilience standpoint, from a capabilities standpoint, I think everybody's kind of facing the same problems, and I think there's not enough resiliency baked into these systems.

There's not enough organizations that I would say are adequately prepared to detect and respond to a potentially bad OT incident. From an overall capability standpoint, I think pretty much the same answer. The globe looks like everybody's kind of the same. Obviously, where you have those geopolitical situations, there's infrastructure that has been built up and is a little bit more robust in the resiliency capability.

You've got failover capability. You've got redundancies in place where these systems are able to stay up and operating. But a lot of that comes from the fact that they've been beat on for a long time. And the organizations that run them and manage them have had to build in that level of redundancy to be able to keep their customer base up and operational.

So I think for the most part, the globe's riding the same train towards the same destination. And resiliency and the ability to detect and respond are the things that pretty much everybody needs to work on.

Bryson: The folks who are doing all of this, your team, how do you manage that culture? How do you train these folks? How do you assure them? How do you support them? How are they able to deliver these kinds of services across such a large scope?

Paul: One, we hire folks with hands-on experience in these environments, right? A good majority of them have engineering backgrounds. Some of them are, you know, more similar to me where they don't have the engineering degree, but they've got the hands-on experience in these environments and in run and maintain, design, build.

And then it's just like anything else in cybersecurity, right? You have to stay up to date with current certifications. I don't put a whole lot of stock in certifications, right? It's the education that comes with preparing for the certification process. But current threat landscape, we get a solid outlook on current threat landscape based on what we see as Mandiant and responding to the things that we respond to on the enterprise side and on the OT side.

Current trends as far as what emerging technologies are out there, we try to stay up with that as much as possible and staying connected with our vendors and meeting the new vendors in the space at conferences like S4 or like ICS Summit, learning what's out there and what the capabilities of those vendors or platforms or applications are to stay ahead of the emerging technologies to help try to solve some of these problems. And then we continue advancement and staying connected with the teams that are helping drive. We've got folks that support the [ISA/IEC] 62443 initiatives. We stay connected with our friends in some government agencies that help drive the updates to different regulations and compliance issues, new NIST revisions come out and we review those to make sure that we're leveraging those correctly for our customers and help our customers to evolve to those frameworks. So it's just like anything else. You have to stay, you can't be complacent in being willing to learn.

I always make the joke. I think you probably heard me say this. If we sleep for eight hours, there's a chance we gotta learn two new things, right? I can't wait for the day where I can just go be a bartender and learn some recipes and just have normal interaction with customers because I don't want to have to keep learning every single day.

Not that I don't enjoy it, but it's nonstop in this, in cyber, you can't take a day off sometimes because you'll get behind pretty quickly.

Bryson: Let's talk threats. So you already mentioned how the most common access vector is typically IT lateral movement, crossing over into OT. But I like to point out to folks that that's common because what is the purpose of IT? It's the Internet connected. It's regular users who are doing official business functions and then sometimes not official business functions. I've got a web browser. I'm going to use it.

So let's talk more again. Mandiant is credited with coming out with the first official APT1 report, which was what, about 12 years ago? And so one of the things I really like to talk about is attribution, not because everybody needs to be concerned about the, in quotes, “advanced persistent threat,” but because I think, like any case study, it makes it easier to understand motive to activity for why some of these folks are doing some of those things. So, your view of the global and/or domestic threat landscape?

Paul: I have two points on this, and the first one I will say, as far as being prepared and stepping up your defenses for one particular nation state threat versus another, and who we should be worried about, I don't think that's important, right? I don't think that Sandworm is more important than Volt Typhoon or that one particular sector is more vulnerable than another sector.

I think that, you know, as I said before, the state of resiliency and the ability to detect and respond across these organizations, any organization, defend yourself against all enemies, foreign and domestic. Now, when you look at attribution, I have a little bit of a different stance than some folks on this, and I think it kind of aligns with what you said.

Attribution is important for two reasons. One, we need to know the motives behind why these nation states or why these threat actors are doing the things that they're doing. So we need to understand where they're coming from, and so having a good understanding, in things like the APT report, of why those things are happening and what the motivations behind them are is really important, because it helps drive where specific sectors should pay more attention to something.

The other part of attribution that I believe is important is we have to hold people responsible. And while we're probably never going to see some of these people actually brought to justice, the federal agencies aligning and getting the indictments brought down so that these people that have done these horrible things and, and perpetrated these attacks, they should be held accountable for that.

And if we don't think about collecting the forensic evidence and doing the investigation and being able to attribute something to a particular threat actor, we can't hold people responsible for it. Now, you have to weigh a lot of things when you're talking about recovery of an OT environment.

And sometimes you may not have the luxury of spending all the time that you need to collect all the evidence and all of the forensic data that you need to be able to provide that to make attribution happen. I fully understand that, right.

In a lot of cases, we want to get these systems back up and operational as quickly as possible. If we're talking about a power grid that's supplying a particular region or water or wastewater, if one of those systems is down, we want that back up and running as quickly as possible. There may not be the luxury of getting the time to collect everything that we need to collect. However, if we have that capability or we have the capability to pull out the affected systems and maintain those for forensic research later down the road, and we put new stuff in and we get it up and running, that's always a benefit.

I think attribution is important because I think if we can't hold people responsible, there's no teeth to stopping this kind of cybercrime, especially if we're critical infrastructure. And if we don't understand who their threat actors are and who they're targeting and why they're targeting them, that's only serving to better inform our defenses.

Bryson: So part of the threat landscape is the, in quotes, advanced persistent threat. What is the difference between an advanced persistent threat and just a regular threat?

Paul: I would say that the advanced persistent threat is the ones that hypothetically, potentially, I hate the FUD part of this. So I try to refrain from talking about fear, uncertainty, and doubt and that part of what we look at in critical infrastructure. But when you talk about advanced persistent threat, these are the people that have the means and the capability to be inside of critical infrastructure systems and be undetected.

They're living off the land. They have persistence. They're potentially there and we don't know they're there yet. That, to me, is a threat group that is worth identifying and collectively pulling together as an APT. There's a lot of other threat groups out there, ransomware groups, and yes, in some of those cases, they present a pretty persistent threat.

But a lot of those are, I attribute it to like smash and grab, right? A ransomware gang, they want the easy access. They want to drop encryptors and encrypt a system, get their payout and get back out, where the APTs that are potentially there to cause substantial damage to critical infrastructure are potentially the ones that are there.

And we don't know they're there. They've got that dwell time. They're living off the land. They're using techniques that are staying under the detection capabilities of an organization, that's where they, I would say they, we classify the advanced part of that. They're the ones that have that capability.

Bryson: If you were to make one recommendation to asset owners, what would it be?

Paul: Know what you have and figure out the best way to protect it. I think too many asset owners, they just don't have a good understanding of what all they have in their environments. What is connected? I would say in 90 percent of the assessments that we do, when we're doing an architecture assessment, we find some dual-homed PC, we find a rogue access point, we find some connection to the Internet that somebody didn't know was there, or we find an Internet connection that they knew was there because it supports the third-party service agreement to monitor some piece of equipment, but it's running on an LTE modem that is seven years old and has never been patched or updated and has multiple vulnerabilities in it, right?

So get your arms around what you have so that you can better define a way to protect it and isolate it and all of the things that need to be done to build in that resiliency. But if you don't, you got to start with knowing what you have.

Bryson: So closing out the threat part: in the news, the U.S. government has been very forward on Chinese activity, which we know has been documented going back to 2010. And what has changed now is that the U.S. government is saying that there is imminent concern around potential disruption, denial or destruction on critical infrastructure.

What do you think?

Paul: I think we should act like that's a possibility to protect these systems, I think, day in and day out, and I think that should be the cause for building in the resiliency and the detect and respond capability, right? And I don't think it needs to be focused on any one threat actor. I do know that there's documented threat.

There's lots of documented threats. And I think if we're not doing a better job of protecting critical infrastructure, protecting our assets, any one of the nation state actors could cause that level of mass scale outage or destruction of capability. And it comes down to being better prepared to protect these environments.

And I think there's a lot of inadequacy in that. There's a lot of organizations that are doing a great job. And that's where the added diligence of knowing what these threat actors are capable of, what their motives are, what their TTPs are, to better threat hunt in those environments. But for the organizations that don't have that capability and really don't have much capability at all, they should be acting like anybody could take their systems down, much less some of these advanced threat actors.

Bryson: Is there anything we haven't covered yet that you want to cover?

Paul: I don't think so.

Bryson: All right. Puts us into the lightning round. This is how we end every podcast. If you could wave a magic, non-Internet connected wand, what is one thing you would change?

Paul: I'm going to go back to being a veteran. We got a really high veteran mental health crisis across the globe, not just in the U.S. And let's make that go away. Our service members, regardless of where they've served and what they've done, a lot of them end up in some pretty bad situations. And it breaks my heart to see that happening to people and let's do something better there.

Bryson: Paul and I are both military veterans, and this is a topic that we have both spoken about, and I will just also say thank you because Paul has also been there for me on certain occasions. And so I wanted to publicly express my appreciation.

Paul: Friends are rare in our line of work sometimes, right? It's hard to build connection and build relationships with folks. And aside from just being fellow veterans, everybody needs a battle buddy and everybody needs somebody to reach out to. And I think it's important that we're there for one another because we don't have a magic wand to wave.

Bryson: Nice tagline. We don't have magic wands. We just have each other.

You waved your magic wand. Now looking into the crystal ball for a five-year prediction: one good thing and one bad thing that is going to happen.

Paul: I believe that we are on a pretty good upward trend in really improving critical infrastructure security. And I think that in five years, I think we'll be in a much, much better place than we are now. I think the technologies will be better. We used to be in this, like, it takes 10 to 15 years to improve critical infrastructure, to make changes, to update things, you know, lifecycle of these components is much, much longer.

And I think we're starting to get to a place where the life cycle is narrowing and the OEMs, the vendors are doing a much better job of building security in. So I think from an overall global security posture and critical infrastructure, I think five years we'll be in a much, much better place.

The unfortunate side of that is again, not to go towards FUD. But I still think that we've got a significant impact event on the horizon in the next five years. And it's going to take something on that scale to wake some folks up and

Bryson: This is Hack the Plant, a podcast from the ICS Village. Catch us at an event near you. Subscribe wherever you find podcasts to get episodes as soon as they're released. Thanks for listening.