Pen Testers Need to Hack AI, but Also Question Its Existence

Samsung has banned some uses of ChatGPT, Ford Motor and Volkswagen shuttered their self-driving car firm, and a letter calling for a pause in training more powerful AI systems has garnered more than 25,000 signatures.

Overreactions? No, says Davi Ottenheimer, the vice president of trust and digital ethics at Inrupt, a startup creating digital identity and security solutions. A pause is needed to develop better approaches to testing, not just of the security, but the safety of machine-learning and artificial-intelligence models. These include ChatGPT, self-driving vehicles, and autonomous drones.

A steady stream of security researchers and technologists have already found ways to circumvent protections placed on AI systems, but society needs to have broader discussions about how to test and improve safety, say Ottenheimer, who will give a presentation on the topic at the RSA Conference in San Francisco next week.

“Especially from the context of a pentest, I’m supposed to go in and basically assess [an AI system] for safety, but what’s missing is that we’re not making a decision about whether it is safe, whether the application is acceptable,” he says. A server’s security, for example, does not speak to whether the system is safe “if you are running the server in a way that’s unacceptable … and we need to get to that level with AI.”

With the introduction of ChatGPT in November, interest in artificial intelligence and machine learning — already surging due to applications in the data science field — took off. The eerie capabilities of the large language model (LLM) to seemingly understand human language and to synthesize coherent responses has led to a surge in proposed applications based on the technology and other forms of AI. ChatGPT has already been used to triage security incidents, and a more advanced LLM forms the core of Microsoft’s Security Copilot.

Yet the generative pre-trained transformer (GPT) is just one form of AI model, and all of them can have significant problems with bias, false positives, and other issues.

Exploiting Robots Is Easy

These shortcomings, and a general lack of explainability in AI models, means that any model can be attacked in ways that the…

Source…