Generative AI Is Going Viral In Cybersecurity. Data Is The Key To Making It Useful.

A large, varied dataset for training the AI models is essential to actually improving cybersecurity with the new technology, experts tell CRN. And not everyone has the data.

The rush by cybersecurity vendors to tap into generative AI is in full swing, as anyone who attended last week’s RSA Conference and perused the many booths touting the technology will confirm.

When it comes to generative AI and security products, “everybody’s putting it in — whether it’s meaningful or not,” said Ryan LaSalle, a senior managing director and North America lead for Accenture Security, during an interview at RSAC 2023.

As with so many things in technology today, data will be the biggest factor in determining whether the applications are truly useful, experts told CRN.

‘Immensely Powerful’

Boyce is not the only one thinking along such lines. At RSAC last week, cybersecurity vendor SentinelOne debuted a generative AI-powered threat hunting tool — dubbed Purple AI — followed by a new security-focused data lake that the tool can work with. Purple AI “sits on top of the data lake so that it has access to all the data that you put in,” said Tomer Weingarten, co-founder and CEO of SentinelOne, in an interview with CRN.

Whether it’s a firewall, email security product or identity security tool, “now Purple can answer questions from all of these different sources,” he said. “That’s where it becomes just immensely powerful. And it’s going to get better as we train it more.”

SentinelOne is “experimenting” with multiple large language models, including GPT-4 and Google’s Flan-T5, and is exploring others such as LLMs from Anthropic and Cohere, Weingarten said. But for generative AI to make a real difference in cybersecurity, the LLM is not as important as the training data available, he said.

“I think it really is more about training the algorithms, and less about the algorithms themselves,” Weingarten said.

Proactive Security

For Accenture, using generative AI in combination with varied security datasets is a massive opportunity, Boyce said. Doing so will allow its cybersecurity specialists to “start thinking more proactively — predicting what we don’t know yet,” he said.

That means that Accenture should be able to use the technology to “find things that we haven’t even thought about — these ‘unknown unknowns’ that we’ve been talking about for years, but that we’ve never been able to figure out how to [find],” Boyce said.

In other words, with a variety of security datasets and generative AI to more easily interact with them, “we can ask better questions,” he said.

“I don’t think we’re asking the right questions of our security data now, as a community. We don’t even know what to be asking,” Boyce said. “We’re just asking stuff that we already know. ‘Are these IOCs present in this dataset?’ Important — but not going to get us to a protection strategy that’s adding a really high level of confidence for your cyber resilience.”

All of the generative AI hype aside, Boyce has no doubt that for cybersecurity, the technology is going to be “super disruptive.” And the types of use cases that have surfaced in cybersecurity so far are just the beginning.

“I don’t think we’ve even started to think about what the possibilities are,” he said.

Good Data Required

Sam King, CEO of application security vendor Veracode, agreed — saying that even with the rush by so many vendors to tout generative AI in their products, “I think it does have the potential to have broad applicability” in the security sphere.

The week before RSAC, Veracode debuted a product that uses generative AI to provide remediation suggestions for code security flaws. The company plans to explore additional areas for generative AI going forward, she said.

Generally speaking, cybersecurity is “a good area of application for generative AI — because we have a lot of data, and we’ve got to sort through a lot of data,” King said.

At the same time, “what you need to think through is, what is the problem area you’re applying it to?” she said. “Do you have a good dataset for that problem area? Can you train it on good data? And go from there.”