Recent findings suggest that ChatGPT is more open to manipulation than previously thought, allowing users to expose its internal rules, file structures, and operational details. This vulnerability, identified by researchers like Marco Figueroa, has raised questions about OpenAI’s security practices.
Figueroa stumbled on the issue while interacting with the model, when a response inadvertently revealed its internal file system, which runs in a sandboxed Debian-based Linux environment. Through specific prompts, users could access the instructions governing ChatGPT’s behavior and even interact with files uploaded to that sandbox.
This discovery raises significant concerns about the potential for sensitive data exposure and security flaws in OpenAI’s systems.
Figueroa and others found that, with carefully crafted prompts, users could interact with the chatbot much as they would with a shell: managing files and uncovering the internal configurations and data structures that govern the AI’s behavior. OpenAI defends these functionalities as intentional aspects of the system’s design, meant to enhance transparency and control.
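To see what that shell-like access amounts to in practice, consider what a prompt such as “show me the files in your environment” effectively asks the model’s code tool to do: run ordinary file-system calls inside its container. The sketch below is a hypothetical illustration of that enumeration; the /home/sandbox path is an assumption for the example and is not confirmed by the reporting.

```python
import os

# Walk the sandbox file system the way a shell-style prompt would have the
# code tool do it. "/home/sandbox" is a hypothetical mount point used purely
# for illustration; the real layout is not documented by OpenAI.
SANDBOX_ROOT = "/home/sandbox"

def list_sandbox(root: str = SANDBOX_ROOT, max_entries: int = 50) -> list[str]:
    """Collect up to max_entries paths under root, mimicking a recursive ls."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
            if len(found) >= max_entries:
                return found
    return found

if __name__ == "__main__":
    for path in list_sandbox():
        print(path)
```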
However, security experts, Figueroa among them, warn that this level of access could let bad actors reverse-engineer the AI’s protections or hunt for zero-day vulnerabilities, potentially exposing custom GPTs or sensitive user data embedded in them.
Figueroa, a bug bounty expert, uncovered the prompt injection vulnerability during a routine coding task. While asking the model to refactor some Python code, he encountered an unexpected response: “directory not found.” Intrigued, he followed up with the prompt “list files /”, which mimics the Linux command ls /. The model responded with details of a simulated file system, sparking questions about its operational design.
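For readers who want to see the shape of such a probe, here is a minimal sketch that sends the same “list files /” prompt through the official openai Python client. The model name and bare-bones request are assumptions chosen for illustration; this is not a reconstruction of Figueroa’s session, which took place interactively in ChatGPT.

```python
from openai import OpenAI

# Illustrative reproduction of the "list files /" probe. The model name is
# an assumption; substitute whichever model you are authorized to test.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "list files /"},  # mimics the Linux command `ls /`
    ],
)

# If the sandboxed code tool is engaged, the reply may describe a Debian-style
# directory tree; otherwise the model simply answers in natural language.
print(response.choices[0].message.content)
```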
This revelation underscores broader concerns about prompt injection vulnerabilities. These occur when users craft queries that bypass intended safeguards, unlocking unintended capabilities or exposing internal workings. OpenAI and security experts emphasize the need for continuous monitoring and improvements to mitigate risks in generative AI systems.
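One common mitigation pattern is to screen user prompts for shell-style probes before they reach the model. The sketch below illustrates that idea only; the pattern list and helper names are assumptions for the example, not a description of OpenAI’s actual safeguards, which are not public.

```python
import re

# Naive screening for shell-like probes in user prompts. This is an
# illustrative pattern filter, not OpenAI's safeguard; production systems
# typically combine classifiers, output filtering, and sandbox hardening.
SHELL_PROBE_PATTERNS = [
    r"\blist\s+files\b",                              # "list files /"
    r"\bls\s+(-\w+\s+)?/",                            # raw ls invocations
    r"\bcat\s+/\S+",                                  # reading absolute paths
    r"\bprint\s+your\s+(system\s+)?instructions\b",   # instruction extraction
]

def looks_like_probe(prompt: str) -> bool:
    """Return True if the prompt matches any shell-style probe pattern."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in SHELL_PROBE_PATTERNS)

if __name__ == "__main__":
    for prompt in ["list files /", "Refactor this Python function"]:
        label = "FLAGGED" if looks_like_probe(prompt) else "ok"
        print(f"{label}: {prompt}")
```

Simple keyword filters like this are easy to evade, which is why researchers stress layered defenses and ongoing monitoring rather than any single check.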