Researchers probed the black market for AI services that are designed to facilitate cybercrime.
What’s new: Zilong Lin and colleagues at Indiana University Bloomington studied how large language models (LLMs) are used to provide harmful services, specifically generating malicious code, phishing emails, and phishing websites. They weren’t very effective, by and large (though a high success rate may not be necessary to support a thriving market in automated criminal activity).
Risky business: Providers base such services on either uncensored LLMs — that is, those that weren’t fine-tuned to reflect human preferences or don’t employ input/output filters — or publicly available models that they prompt using jailbreak techniques that circumvent built-in guardrails. They sell their services in hacker’s marketplaces and forums, charging far less than typical traditional malware vendors, but services based on models that have been fine-tuned to deliver malicious output command a premium. The authors found that one service generated revenue of more than $28,000 in two months.
Sprawling market: The authors identified 212 harmful services. Of those, 125 were hosted on the Poe AI platform, 73 were on FlowGPT, and the remaining 14 resided on unique servers. Of those, the authors were unable to access five because either the provider blocked them, or the service was fraudulent. They identified 11 LLMs used by these services including Claude-2-100k, GPT-4, and Pygmalion-13B (a variant of LLaMA-13B).
Testing output quality: The authors prompted more than 200 services using over 30 prompts to generate malicious code, phishing emails, or phishing websites. They evaluated the responses according to:
- Format: How often they followed the expected format (as defined by regular expressions)
- Compilability: How often generated Python, C, or C++ code was able to compile
- Validity: How often generated HTML and CSS ran successfully in both Chrome and Firefox
- Readability: How often generated phishing emails were fluent and coherent according to the Gunning fog Index of reading difficulty
- Evasiveness, or how often generated text both succeeded in all previous checks and evaded detection by VirusTotal (for malicious code and phishing sites) or OOPSpam (for phishing emails).
In all three tasks, at least one service achieved evasiveness of 67 percent or higher, while the majority of services achieved an evasiveness of less than 30 percent.
Testing real-world effectiveness: In addition, the authors ran practical tests to see how well the output worked in real-world situations. They prompted nine services to generate code that would target three specific vulnerabilities that relate to buffer overflow and SQL injection. In these tests, the models were markedly less successful.
- The authors tested generated code for two vulnerabilities on VICIdial, a call-center system known to be vulnerable to such issues. Of 22 generated programs that were able to compile, none changed VICIdial’s databases or disclosed system data.
- They tested generated code further on OWASP WebGoat 7.1, a website that provides code with known security flaws. Of 39 generated programs that were able to compile, seven launched successful attacks. However, these attacks did not target the specific vulnerabilities requested by the authors.
Why it matters: Previous work showed that LLMs-based services could generate misinformation and other malicious output, but little research has probed their actual use in cybercrime. This work evaluates their quality and effectiveness. In addition, the authors released the prompts they used to circumvent guardrails and generate malicious output — a resource for further research that aims to fix such issues in future models.
We’re thinking: It’s encouraging to see that harmful services didn’t get far in real-world tests, and the authors' findings should put a damper on alarmist scenarios of AI-enabled cybercrime. That doesn’t mean we don’t need to worry about harmful applications of AI technology. The AI community has a responsibility to design its products to be beneficial and evaluate them thoroughly for safety.