Current LLMs are just not mature enough for high-level tasks
Mention the term ‘cyberthreat intelligence’ (CTI) to cybersecurity teams of medium to large companies and the response is often ‘we are starting to investigate the opportunity’. These are the same companies that may be suffering from a lack of experienced, quality cybersecurity professionals.
At Black Hat this week, two members of the Google Cloud team presented on how the capabilities of large language models (LLMs), such as GPT-4 and PaLM, may play a role in cybersecurity, specifically within the field of CTI, potentially resolving some of the resourcing issues. For many cybersecurity teams this may seem like a future concept, as they are still exploring how to implement a threat intelligence program in the first place.
The core elements of threat intelligence
There are three core elements that a threat intelligence program needs in order to succeed: threat visibility, processing capability, and interpretation capability. An LLM could significantly assist with processing and interpretation. For example, it could allow additional data, such as log data, to be analyzed that would otherwise have to be overlooked because of its volume. The ability to then automate output to answer questions from the business removes a significant task from the cybersecurity team.
The presentation put forward the idea that LLM technology may not be suitable in every case and suggested it should be focused on tasks that require less critical thinking and involve large volumes of data, leaving the tasks that require more critical thinking firmly in the hands of human experts. One example given was the translation of documents for the purposes of attribution, an important point because inaccurate attribution could cause significant problems for the business.
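The split described above can be sketched as a simple routing rule. This is only an illustration of the idea, not something shown in the presentation: the `CtiTask` type, the `route` function, the three criticality levels, and the volume threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class CtiTask:
    """A hypothetical unit of CTI work."""
    name: str
    criticality: Criticality
    record_count: int  # rough volume of data the task has to process


# Assumed cut-off for "large volume"; a real team would tune this.
HIGH_VOLUME_THRESHOLD = 10_000


def route(task: CtiTask) -> str:
    """Decide who handles a task: 'human', 'llm', or 'llm_with_review'."""
    # Tasks needing the most critical thinking stay with human experts.
    if task.criticality is Criticality.HIGH:
        return "human"
    # Low-criticality, high-volume work is where an LLM helps most.
    if (task.criticality is Criticality.LOW
            and task.record_count >= HIGH_VOLUME_THRESHOLD):
        return "llm"
    # Everything in between: the LLM drafts, a human checks the result.
    return "llm_with_review"


if __name__ == "__main__":
    tasks = [
        CtiTask("summarize firewall logs", Criticality.LOW, 2_000_000),
        CtiTask("attribute a campaign", Criticality.HIGH, 40),
        CtiTask("triage phishing reports", Criticality.MEDIUM, 500),
    ]
    for t in tasks:
        print(f"{t.name}: {route(t)}")
```

Keeping a middle "LLM drafts, human reviews" lane is a design choice made for this sketch; it reflects the point that the output cannot yet be fully trusted on its own.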
As with other tasks that cybersecurity teams are responsible for, automation should, at present, be used for the lower priority and least critical tasks. This is not a criticism of the underlying technology but a statement of where LLM technology is in its evolution. It was clear from the presentation that the technology has a place in the CTI workflow but at this point in time cannot be fully trusted to return correct results, and in more critical circumstances a false or inaccurate response could cause a significant issue. This seems to be the consensus on the use of LLMs generally; there are numerous examples where the generated output is somewhat questionable. A keynote presenter at Black Hat put it perfectly, describing AI, in its present form, as “like a teenager, it makes things up, it lies, and makes mistakes”.
I am certain that in just a few years' time, we will be handing off tasks to AI that will automate some of the decision-making, for example, changing firewall rules, prioritizing and patching vulnerabilities, automating the disabling of systems due to a threat, and the like. For now, though, we need to rely on the expertise of humans to make these decisions, and it's imperative that teams do not rush ahead and implement technology that is in its infancy into such critical roles as cybersecurity decision-making.