Rockset Podcast Episode 6: Real Time Analytics in Cybersecurity

Dive into the world of real-time analytics in security with the director of data engineering and data science at Picnic Score. Glean insights into extended detection and response, streaming analytics, social engineering and more.

About This Podcast

Gio Tropeano:

Welcome to another episode of Why Wait? The Real-Time Analytics Podcast by Rockset. We invite business leaders, app development thought leaders and analytics specialists to share their stories with the world, providing insights into your peers' application analytics improvement strategies. My name is Gio Tropeano. I'm with Rockset. Thank you for being with us today. Before I kick it off, if you're listening to this and have a question or want to comment on what you've heard, please do so in our community Slack channel at rockset-community.slack.com or feel free to tweet at us @RocksetCloud on Twitter. With me is my cohost Dhruba Borthakur, Rockset co-founder and CTO. Thank you sir for being with us and taking the time to chat. In this episode, we're taking the time to talk about cybersecurity. Our guest is a cybersecurity technical lead focused on AI and ML with experience in data science and engineering at FireEye. He was a principal research engineer at ZeroFOX and is now Director of Engineering and Data Science at Picnic. I want to welcome you, Matt Price, to the Rockset podcast.

Matt Price:

Thanks. I'm glad to be here.

Gio Tropeano:

Pleasure is all ours. Thank you so much for being here. We're excited to have this conversation. I'm going to jump right into it. You have a really wide set of experience in the security space, starting with FireEye and now Picnic Score. In the security domain, data and analytics reign supreme right now. So the recent trend has been to detect and find security issues as soon as they occur. Tell us about some of these trends and how data processing in real time is affecting them.

Matt Price:

Yeah, actually one of the trends I'm the most interested in, and that's really been accelerated due to COVID, is the trend of workers working from home now. And what this has done in the cybersecurity realm is you now have this blurring of not just your work systems, but now your employees' home systems. The surface of attack for a threat actor has expanded from essentially the perimeter that a company controls to where they now have to start thinking about how to handle the employees' perimeter as well. So being able to detect those security incursions that are happening and identifying them, not just at work, but now at an employee's house, is becoming critical within the cybersecurity industry. Something else that we're seeing is that due to all the various hardening measures that companies have been putting into place, and really the efficacy of the tools that defenders can use today, we're shrinking the amount of time that it takes to detect an actual threat actor within a company's systems or their network.

Matt Price:

Threat actors are realizing this and they are now moving more quickly. So once they compromise a company's systems, they're immediately starting to exfiltrate data or execute their ransomware attack or whatever it may be. So this means that defenders need to be reacting quicker. Before, there'd be a recon phase where threat actors would take their time. Now, essentially they get in and execute their attack. Another huge shift that we're seeing is the move to cloud and not just cloud, but in particular serverless architecture. And this is very hard from a security perspective to track because you have all these different systems and things coming online and then shutting down. And because of that, it's very hard to understand the current state of your system at any time.

Matt Price:

And from a security perspective, you need to have an accurate state of your system and your network and your infrastructure in order to defend it and to know where the threat actors possibly could be. Something that I'm quite interested in, there's a couple of different names for it. The one I like the best is called extended detection and response. What this has to do with is correlating disparate events that you're seeing. So companies have done a really good job at securing, say, their network or their servers or the persistent storage, but what they haven't done a great job of so far is looking at the different events that are occurring in these different domains, pulling that information together, and then giving that holistic picture of, hey, this is what's happening. A really simple example of this could be an attacker connecting out to some other server. What you're going to see is there is probably a binary that got dropped onto a server that is unexpected, but that's not... Server admins do that all the time.

Matt Price:

But then you also see this network connection out to maybe this host in Russia and this company doesn't have any infrastructure in Russia. Well, those two signals pretty much told you, hey, we've been compromised. Most likely a threat actor installed a back door and is now trying to connect in from Russia. What's even more important, in my opinion, about extended detection and response is it's also addressing the alert fatigue problem that we have in cybersecurity. I would say a lot of security vendors are very good at throwing alerts. The number of alerts that are actually actionable, where there's something that a defender actually needs to take action on, is very low compared to the number of alerts getting thrown. So joining up these different events and essentially having another alerting layer on top of these lower-level alerts is one way to reduce that alert fatigue and really throw higher-fidelity alerts.
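The correlation Matt describes, where an unexpected binary on a server is joined with an outbound connection to a country the company has no infrastructure in, can be sketched roughly as follows. This is a minimal illustration, not any vendor's implementation: the `Event` shape, the `EXPECTED_COUNTRIES` set and the 30-minute `WINDOW` are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    host: str
    kind: str            # hypothetical kinds: "binary_dropped", "outbound_connection"
    timestamp: datetime
    country: str = ""    # destination country, only set for outbound connections


# Assumed for illustration: countries where the company actually runs infrastructure.
EXPECTED_COUNTRIES = {"US"}
# Assumed correlation window between the two low-level signals.
WINDOW = timedelta(minutes=30)


def correlate(events):
    """Return (host, country) alerts where a binary drop is followed, within
    WINDOW, by a connection from the same host to an unexpected country."""
    alerts = []
    last_drop = {}  # host -> timestamp of the most recent binary drop
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == "binary_dropped":
            last_drop[ev.host] = ev.timestamp
        elif ev.kind == "outbound_connection" and ev.country not in EXPECTED_COUNTRIES:
            dropped_at = last_drop.get(ev.host)
            if dropped_at is not None and ev.timestamp - dropped_at <= WINDOW:
                alerts.append((ev.host, ev.country))
    return alerts
```

Neither signal raises an alert on its own here; only the combination does, which is the alert-fatigue point being made.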

Matt Price:

And then really the last trend that we're seeing is automation. And this is for a couple of different reasons. One is that there's just not enough cybersecurity professionals out there to deal with all the attacks that are happening. So we need to really remove that human in the loop. So this is being able to throw the correct analytics and alerts that you can then take action on, to have other tools that can go in and take that automated remediation response. So those are just a couple of the trends that we're seeing today.

Gio Tropeano:

As cybersecurity has gotten more and more efficient, or I should say, as cyber attacks have gotten more and more effective, we have to get more and more efficient at understanding what's happening more quickly in order to make better decisions. And the last thing we need to do is deal with alert fatigue, right? Where you're getting all this information coming at you and you don't know which one you should be acting upon. So let's just automate it all and have the computers handle it. One of the quotes that we had on one of our recent episodes, Matt, is that real-time analytics is hard. In one of your stints, specifically at FireEye, you were building out ingestion software, which you mentioned received hundreds of millions of events per day. So tell me, why is that hard?

Matt Price:

Yeah, this problem in particular is extremely difficult. The data that we were ingesting were all of the various logs that our very large enterprise clients were generating from their security tools. This actually gets back to extended detection and response, but what we were trying to do was take all these different logs, analyze them holistically and point out, with very high-fidelity alerts, where our clients' SOC or our internal SOC needed to go look to find the actual threat actor and remediate the issue.

Matt Price:

So the first problem we had here is that there is no consistent format across security tools for communicating data, and some of the logs were just completely unstructured. It'd just be a string of text. Some of the tools were nicer to work with and it was more of a semi-structured data problem, dealing with JSON blobs for example. But just taking all of these different logs and then turning them into a common structured format that we could then work with downstream was one of the biggest issues that we had to deal with.
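As a rough sketch of that normalization step, the snippet below maps two kinds of input, a JSON record and a free-text firewall line, onto one common schema. The field names (`ts`, `event`, `src`, `dst`), the firewall line layout and the output schema are all hypothetical, chosen only to illustrate the idea; real security tools each have their own formats.

```python
import json
import re

# Hypothetical unstructured firewall line:
#   "2021-03-01T12:00:00Z DENY 10.0.0.5 -> 203.0.113.7"
FIREWALL_LINE = re.compile(
    r"(?P<timestamp>\S+)\s+(?P<action>ALLOW|DENY)\s+(?P<src_ip>\S+)\s+->\s+(?P<dst_ip>\S+)"
)


def normalize(raw: str) -> dict:
    """Map one raw log line onto a common schema.

    Lines in an unrecognized format are kept, but flagged as "unknown",
    so nothing is silently dropped.
    """
    raw = raw.strip()
    if raw.startswith("{"):
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            record = None
        if isinstance(record, dict):
            return {
                "source": "json",
                "timestamp": record.get("ts"),
                "action": str(record.get("event", "")).lower(),
                "src_ip": record.get("src"),
                "dst_ip": record.get("dst"),
            }
    match = FIREWALL_LINE.match(raw)
    if match:
        fields = match.groupdict()
        fields["action"] = fields["action"].lower()
        return {"source": "firewall_text", **fields}
    return {"source": "unknown", "raw": raw}
```

Downstream analytics then only need to understand the one output shape, regardless of which tool produced the line.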

Matt Price:

Beyond that, in cybersecurity, with all these logs being generated by these large enterprise clients, it is very much a needle in the haystack problem. We might see hundreds of millions of events in a day and honestly, most likely all of them are benign. So maybe over the course of a month or two, maybe three or four events out of, at that point, billions of events would actually be what we cared about. And when we found those, we needed to let our clients know immediately, because at that point, if we were seeing it from the different security tools that our clients own, that meant the attacker was within their system. And once they were within their system, they were probably moving laterally, dropping in back doors and so on, and then remediation becomes much more difficult.

Matt Price:

The other big issue with this particular problem was the time aspect of these logs, which is absolutely critical. And because we were ingesting these logs from clients' sites, there was a latency between some of the different tools. So for example, the firewall logs tended to come in very quickly, but then the server logs tended to be a little bit delayed. And what we were trying to do was correlate the network logs happening at the same time as the server logs, so now you've got this time series problem where the events are arriving out of order. So how do you resolve that issue? That was one of the major issues that we had to deal with.
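One common way to cope with events arriving out of order, sketched here as an illustration rather than as what was built at FireEye, is to hold events in a small buffer and release them in timestamp order once slower sources have had a chance to catch up. The `ReorderBuffer` class and its `max_delay` parameter are hypothetical names for this example.

```python
import heapq


class ReorderBuffer:
    """Hold events briefly and release them in timestamp order.

    An event is released once the newest timestamp seen so far is at least
    `max_delay` seconds ahead of it, which gives delayed sources (for
    example server logs) time to arrive before faster ones are processed.
    """

    def __init__(self, max_delay: float):
        self.max_delay = max_delay
        self._heap = []                 # min-heap ordered by timestamp
        self._newest = float("-inf")    # newest timestamp seen so far
        self._seq = 0                   # tie-breaker so payloads are never compared

    def push(self, timestamp: float, payload):
        """Add one event and return the events that are now safe to emit."""
        self._newest = max(self._newest, timestamp)
        heapq.heappush(self._heap, (timestamp, self._seq, payload))
        self._seq += 1
        watermark = self._newest - self.max_delay
        ready = []
        while self._heap and self._heap[0][0] <= watermark:
            ts, _, item = heapq.heappop(self._heap)
            ready.append((ts, item))
        return ready

    def flush(self):
        """Emit everything still buffered, in timestamp order."""
        ready = []
        while self._heap:
            ts, _, item = heapq.heappop(self._heap)
            ready.append((ts, item))
        return ready
```

The trade-off is latency against completeness: a larger `max_delay` tolerates slower sources but delays every alert by that much, and an event that arrives later than the delay allows is emitted immediately and may still appear out of order.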

Matt Price:

And then lastly, these were all streaming analytics. Just because this was fairly sensitive information, our clients didn't want us actually persisting any of this data. So as data was coming in, we're needing to analyze it, determine what kind of action needed to happen, which is usually throwing an alert to a client and then that was it. We were done. We couldn't retain those logs. So there are quite a few issues going on there and just the volume of events coming in along with those other problems made it a very difficult problem.

Dhruba Borthakur:

Yep. Thanks Matt for explaining that. Real-time analytics seems to be hard for a lot of reasons: the semi-structured data you mentioned, and then the volume of data is very high. I understand. So let me take a slightly different take on this. There's yet another problem that could affect some of these real-time analytics setups: the problem of finding good people who can actually build these systems, who can run these systems at scale, who can make sure the system is up and running 24 by 7, because like you said, when data is coming in as a high-volume stream, if you lose your service for a few minutes, that could be a black hole for the user. So how difficult is it to find engineers to run the systems and build the systems for you? And what percentage of time and effort do you spend, or did you use to spend, getting the team right, and what is the human cost of getting these real-time analytics systems in place?

Matt Price:

That's a great question. In my experience, it's very hard to find the right people to build out these, I call them data pipelines, but essentially these streaming analytics platforms, especially at scale. I mean, the first problem is you need someone that has that experience, especially working at scale. And there's just not that many people that have it especially at a large scale. That's just kind of the nature of the industry that we work in. The other thing that gets really tricky is you need a multidisciplinary team. You can't just have one or two software engineers, you're going to need data engineers, you're going to need DevOps. You're going to need analysts at the end of the day because they're going to be the ones usually consuming the results of the system.

Matt Price:

So you need to be able to pull together this disparate team and have them all working towards the same end goal. So finding the right people especially with the kind of experience that you need is extremely difficult. And these people know it. And because they are so limited, they're expensive at the end of the day. So because of that, just overall the total cost of ownership when you talk about the human element of these platforms I have found to be fairly large.

Dhruba Borthakur:

Yeah. Some of this human capital you might also need to invest just because your data is coming in various different forms and shapes, right? So, like you said, the data pipelines need to be set up to flatten the data structure or clean some data before you can actually run analytics on it. For most of these security use cases, data comes from different sources like you mentioned, right? It could be coming from firewalls, it could be coming from other event-capturing systems that you have in place. How important is it for your analytics systems to be able to handle all this semi-structured data, the fluid nature of the data? Do you use any tools or special software to be able to handle these kinds of events? Do you put them in a relational database? Do you run a lot of ETL queries to be able to put it in a NoSQL database? What kind of backend platforms have you used at FireEye or at other places you have worked? Can you share any insights into that?

Matt Price:

Absolutely. So one of the things that we did in the past, and this was actually my first foray into real-time analytics... We had, at the time, a batch-based system, ETLs into a data warehouse. The business obviously didn't like that because we were running it once an hour and they wanted to have it immediately, on demand. They wanted to know the answer to their questions. So what we ended up doing there is implementing a basic streaming system that was all custom built at the time. This was almost 10 years ago now, or seven years ago, something like that, but it was a while back and this wasn't something that was really being done then. So everything we ended up building was custom and it all ended up in a relational database after we did a lot of transformations, but all this was happening in real time so we were able to give up-to-date answers.

Matt Price:

Moving forward in time, the next place that we ended up was starting to use data lakes. Data lakes were really useful because we were essentially able to just take all this raw data and shove it off into an AWS S3 or GCP Cloud Storage type of system and then build tools on top of that to, at query time, go and pull up that data and analyze it. The problem with that is that running those queries took a long time. We had all the data so we were able to do it, but you also have to process all that data.

Matt Price:

So now you're building out this huge distributed system on top of the data lake in order to answer these queries. Where I've been moving to recently is still kind of following that data lake paradigm, but using it essentially just as cold storage. What we do now is use a serverless architecture to stream these events through and then use various backends, usually some kind of time series database, sometimes NoSQL, sometimes a relational database, to store the results as they stream through.

Gio Tropeano:

So Matt, before Picnic, which is the project that you're on now, you were at ZeroFOX, which is also a cybersecurity company. What real-time analytics work did you experience while you were there?

Matt Price:

Yeah, so ZeroFOX is a fascinating company. What ZeroFOX is focused on is identifying threats on social media. So this could be physical threats or phishing. Some of our clients are very interested in scams that are being conducted in their name, impersonations, malware and so on. And then we looked at a bunch of other sources as well, both public and private. So this would be something like Pastebin or Shodan as public sources, or an internal source could be that company's email. So we had a number of different data sources coming in. What was interesting about ZeroFOX is the way that they structured their contracts. And this is actually very similar to the FireEye problem. What they required is that we could not persist any of that data.

Matt Price:

So again, it was coming in, we got one chance to look at it and after that it was gone. It was never coming back to us. So while I was there, I ran the data science team and we were using AI and machine learning to essentially augment the data that was coming through. So the way that we were approaching this problem is that we used some kind of model to add various analytics on top of the raw data that we had coming in. And then ZeroFOX had built up this sophisticated rule engine that would look at this data as it came in and then, based on the various analytics that we had put on those events, would cause alerts to be generated. And the reason that we were using machine learning for this is that in social media, and in these other sources as well such as Pastebin, context matters.

Matt Price:

So take a physical threat example. Maybe there's two friends talking on Twitter, and one of them says, hey, I'm going to kill you tonight, but it's in reference to a previous tweet that this friend had made that said, hey, do you want to play some video game like Call of Duty. Yes, the word kill is in there and if you're blindly trying to just analyze this text, you would say, oh, they used the word kill, this is a physical threat against this individual. Well, put it in context, it's not a threat, it's just two people talking. Versus something a little bit more sinister, which is where maybe a CEO of a company is out traveling and there's possibly a disgruntled customer that makes the same threat. At that point, the context is completely different and you need to understand that. But in order to do that in a real-time manner, you need to be looking at both. You need to be looking at all that context at the time that event arrives and then making a decision.

Dhruba Borthakur:

That's a great example of using social media or some kind of context around each of these events. And at Picnic now, it looks like you are building a platform that uses some of these social engineering signals to build a better model and predict some of this. At Picnic, can you tell me a little bit more about how useful it is for you to get these signals in real time? How important is real time for you? How important is it for the platform to be able to react quickly rather than react late to the signals that you are analyzing in your software?

Matt Price:

Yeah. So social engineering is an interesting problem area that really hasn't been tackled yet in cybersecurity. The way that we're approaching it is that ultimately this is a data problem, because in order for a social engineering attack to be successfully conducted against an individual, you need to know something about that individual and have some kind of information, because really what you're trying to do in social engineering is gain someone's trust and then execute an attack. So for us, at least from a real-time perspective, it's extremely critical to catch leaks or compromising information before it gets out there and is publicly available and people have copied it and it's all over the internet, in which case it's too late. So just quick identification of any kind of compromising information with regards to the company is critical. That time to discovery and then remediation is imperative if you want to be successful in this realm.

Matt Price:

The other thing that we need to consider too is the type of information that people are communicating. And we've seen these attacks over the past couple of years where a CEO or a CFO will mention they're going on vacation or they're going to some conference or something like that. Attackers have taken advantage of this several times and there's plenty of public examples of it, but essentially what will happen is that an attacker will see this and then that's when they will choose to conduct their attacks.

Matt Price:

So they'll contact, say, the controller of a company, acting as the CEO or the CFO. Well, the real CEO or CFO is out on vacation or at a conference, so they have no way to rebut it, and if the attackers can relay that information and gain that person's trust, well, guess what, there's going to be a few million dollars missing when they come back. So being able to identify that and tell the CEO, hey, you should not be mentioning that you're going to this conference. Being able to catch that in near real time and have that get taken down, that prevents the attackers from seeing that information and hopefully prevents them from conducting their attack.

Dhruba Borthakur:

Makes sense.

Gio Tropeano:

You mind telling us a little bit about just Picnic in depth and how Picnic helps?

Matt Price:

Yeah, so the main way that Picnic helps out corporations is that we focus on the individuals. Picnic actually stands for problem in chair not in computer. As I was mentioning a bit earlier, the cybersecurity industry has done a very good job hardening the technology that we all use, but the cybersecurity industry has not done a very good job so far of hardening humans. So humans at the end of the day are the weakest link for any organization and I'm sure most people listening to this have had the little security training you get, which is a couple of slides talking about the different attacks that can get conducted against you. Well, I think we can all pretty much admit that's not really helpful for an actual employee. So one of the things that Picnic tries to do is protect employees of our clients, both at home and at work, with the theory being that if you're safer at home, you'll be safer at work and vice versa.

Matt Price:

So we do that through a combination of really targeted education and remediation capabilities. So a great example of this would be looking at someone's LinkedIn profile and we see there that they mentioned, hey, I'm in charge of wiring funds for this corporation. We'll actually pull that snippet out of LinkedIn and be like, hey, you're mentioning that you wire funds, attackers will then start targeting you because of this. Financially motivated threat actors will target you because they will want you to be wiring funds to them. So we would recommend taking the snippet and rewording it so it doesn't mention wire transfers. Similarly, on the remediation capabilities side, one of the features that some of our clients are most excited about right now is removing people from data brokers. So in the United States, there's a lot of public information out there about all of us as individuals. That's just kind of the way that our society has been built, which is both a blessing and a curse.

Matt Price:

From a security perspective, it's very much a curse because this makes it really easy for threat actors to go out and discover information on people. And these data brokers, what they do is they make the attacker's job really easy. These data brokers go out to all these different public sources, aggregate all this information up into a nice little package and then sell it to you. You can get this on anybody in the United States. So what we've been doing is removing all that information from the data brokers. By US law, they are required to remove your information if you ask them to. So one of the ways we're helping out our clients is that, for all of their employees, we'll go out to all these different data brokers and remove their information for them, thus reducing their attack surface and making it that much harder for the attackers to find information they can leverage in a social engineering attack.

Gio Tropeano:

As humans in the security realm, humans are the weakest link.

Matt Price:

Always helping.

Gio Tropeano:

Yeah, it's true. It's true. Okay, cool. So speed round coming up. I'm going to fire off a couple of questions to you Matt and we'll go from there. So can you share data and AI trends that you're excited about in the security analytics space?

Matt Price:

Yeah. One of my favorite trends is explainability of AI and ML models. Before, you used to be able to just say, hey, my model said this, and the client would be happy. Now the client wants to know why the model is saying something. It's not good enough just to throw an alert. They want to know why it was thrown. We touched on this already, but extended detection and response I think is only going to continue to grow. And related to that again, like I mentioned, high-fidelity alerts. Another thing that I'm really interested in, and this space is growing quite rapidly, is something called user behavior analytics. So this is analyzing the behavior of users and determining, based on that behavior, if there's a potential threat. And then finally, just real-time analysis. The whole industry is moving towards a real-time standpoint because you need to know the state of your system right now. You need to know where that potential threat is right now. You don't need to know it an hour from now because an hour from now is too late.

Gio Tropeano:

But the damage is already done at that point. Next question, what's causing these trends to take shape now? What enablers have you seen on the technology front to usher these trends in?

Matt Price:

Yeah, just in general, like I mentioned before, it's the industry getting very good at certain domains, so network, email, server and so on, but with all the noise being generated. And this is where extended detection and response really comes in, just reducing that amount of noise, because the defenders don't have 30 minutes to go and triage an alert, especially when there's 500 alerts sitting in their queue. And we touched on this, just humans being the weakest link. I think there's going to be more and more attention on how we continue to secure humans themselves.

Matt Price:

Insider threats are also becoming more and more of an issue. Again, we've done a very good job in the industry of hardening the perimeter, but a disgruntled employee is inside the perimeter. And at that point, there's a lot of damage that they can do. And again, on the real-time part, telling a client about a potential security threat 24 hours down the road, because you're doing some kind of batch analysis over the day's worth of events, is too late at that point, especially if we're thinking about ransomware attacks, which are huge these days. At that point, they're already locking down servers and you've lost that data.

Gio Tropeano:

Indeed. What are the challenges that engineering and data teams then are facing when implementing or expanding security analytics features?

Matt Price:

Just the wide variety of data formats across these security products and the lack of any kind of standard communication method is a huge issue. So once we start getting into extended detection and response, how do you structure this data and put it into a common format that you can actually work with across all these different products? Also just the sheer amount of data being generated. The amount of data being generated from a single firewall product is staggering. And then that's just one security product that a company has. They've got tens of others. Then there's dealing with the time aspect, especially across disparate pieces of data. The time series problem is very difficult in cybersecurity, especially trying to correlate the different events across time. Understanding what is actually worth tracking is another critical thing. It's impossible to track everything, so you need to be able to identify what is actually critical, what you need to be tracking and what you need to be communicating to your customer. And then just storing and managing all of this raw data. It's mind-boggling how much information you can gather from just a single client in a day.

Gio Tropeano:

Awesome. So then how do you see real-time analytics benefiting the security analytics space?

Matt Price:

Yes. So because these threat actors are executing their attacks so much quicker now, because they know that their time to discovery is being reduced, we need to step up as a cybersecurity industry and identify and stop threat actors immediately. So this is all about, in my opinion, reducing that time to discovery, which will reduce that time to remediation. Being able to do that in near real time, I think, is going to become more and more critical as we continue to move forward. And most importantly, I think from a vendor standpoint, they need to know right now what the current state of their infrastructure and their various systems is. They don't want to know what the state of the system was an hour ago. That's not useful when they're trying to chase a threat actor through their network.

Gio Tropeano:

Okay. And then what would you say? Because you've worked at a variety of companies, varying in both size and scope. So what's the biggest difference that you've seen when building real-time analytics at a larger, more established organization versus a startup in the early days?

Matt Price:

Yeah, that's a great question. The biggest difference I have found is the pressure, in particular around time, because as a startup, you don't have time to spare in delivering a solution. You have to deliver a solution now that's valuable to your clients, because you need to be generating revenue for the business. In larger organizations, they can afford to let a team work for a year or two to develop the right product. In a startup, you have to be developing that right product and it needs to be functional right now. So because of that, you really need to understand where and when it's okay to make compromises, either in what kind of tools you're purchasing or what you're building out, versus understanding what you can actually deliver.

Gio Tropeano:

Awesome. And last question there and thank you so much for your time obviously. Last question, if you could give one piece of advice to data and engineering leaders and builders on real-time analytics, what would that be and why?

Matt Price:

Yeah, my advice would be start small. Real-time analytics is hard. It is a massive challenge and if you try to boil the ocean and take something that you're running maybe right now in batch and implement it in a real-time analytics solution, it's likely to fail if you try to do it all at once. My advice would be to start small and especially understand the data that you're pulling in and really understand the analytics that you want to generate from it. So starting small will help you start to really understand that data, understand where there's possible bottlenecks with your solution and then really force you to think about, okay, what analytics do I need to actually be delivering usually to the business or to your clients.

Gio Tropeano:

Sage advice for sure. That'll do it for this episode. This has been extremely informative and having come from the security industry myself, I see this as being hugely helpful for the folks that are in the industry. So I'm looking forward to sharing this with the folks out there. Thank you Matt. Check out picnicscore.com. I know you guys are sharing the company and the news of the company now so looking forward to being part of that. Thank you Dhruba for being part of today's discussion and I wish you luck, Matt. If you found this discussion insightful, please share it. The Why Wait Podcast is brought to you by Rockset. We at Rockset have built a real-time analytics cloud-based platform that can add value to the use cases that we were discussing today. Check us out at rockset.com. You can try us for free for two weeks, where you'll receive $300 in trial credits for checking us out and giving us a shot. Please subscribe and comment. Thanks once again for joining us and stay tuned for our next episode. Cheers everyone.

Dhruba Borthakur:

Bye guys.
