Artificial intelligence startup Scale AI has been accused of storing highly confidential client information, including material from Meta, Google, and xAI, in publicly accessible Google Docs, raising concerns shortly after Meta's $14.8 billion investment in the company.
The alleged security lapse is especially sensitive given Meta's stake in the company: the tech giant acquired 49% of Scale AI and plans to have its CEO, Alexandr Wang, lead a new “superintelligence” laboratory. According to a report by Business Insider, Scale AI left sensitive client data, details of top-secret projects, email addresses, and employee pay information exposed in Google Docs accessible to anyone with a direct link.
A spokesperson for Scale AI responded to the allegations, stating, “We are conducting a thorough investigation and have disabled any user’s ability to publicly share documents from Scale-managed systems.” The spokesperson added that the company maintains “robust technical and policy safeguards to protect confidential information” and is “always working to strengthen our practices.” While there is currently no indication that the publicly accessible files led to a data breach, cybersecurity experts warn that such exposure could leave the company highly susceptible to future hacks.
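For context on the class of misconfiguration described here, files shared “with anyone with the link” are directly enumerable through Google Drive’s own API, which is how an administrator would typically audit for this kind of exposure. The following is a minimal, illustrative sketch only, not a description of Scale AI’s systems or remediation: it assumes a hypothetical service-account key file (“sa-key.json”) with domain-wide delegation and a placeholder impersonated user, and it simply lists any link-shared files in that user’s Drive.

```python
# Minimal sketch: audit a Google Drive account for files shared
# via "anyone with the link", using the Drive v3 API.
# Assumptions (all hypothetical placeholders): a service-account
# key "sa-key.json" with domain-wide delegation, and the user
# "audited-user@example.com" being audited.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]

creds = service_account.Credentials.from_service_account_file(
    "sa-key.json", scopes=SCOPES
).with_subject("audited-user@example.com")  # impersonated user (placeholder)

drive = build("drive", "v3", credentials=creds)

page_token = None
while True:
    # The 'visibility' search term filters for link-shared files.
    resp = drive.files().list(
        q="visibility = 'anyoneWithLink'",
        fields="nextPageToken, files(id, name, webViewLink)",
        pageToken=page_token,
    ).execute()
    for f in resp.get("files", []):
        print(f"PUBLIC LINK: {f['name']} -> {f['webViewLink']}")
    page_token = resp.get("nextPageToken")
    if page_token is None:
        break
```

In practice, a Workspace administrator can also disable link sharing outside the organization entirely via admin policy, which appears closer to the blanket measure Scale AI’s spokesperson described.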
Five current and former Scale AI contractors, speaking to Business Insider, corroborated how widespread the practice was. One contractor described the entire Google Docs system as “incredibly janky,” pointing to a perceived lack of stringent security protocols within the company. Neither Google nor xAI immediately responded to requests for comment on the report, while Meta declined to comment.
Business Insider’s investigation reportedly uncovered thousands of pages of project documents spread across 85 distinct Google Docs, detailing Scale AI’s sensitive work for major tech clients. Among the exposed material were internal Google training manuals, at least seven of them marked “confidential,” including recommendations for improving Google’s chatbot, then known as Bard. These documents, outlining strategies for fine-tuning the chatbot, were left publicly accessible, offering a glimpse into Google’s internal development processes and its efforts to compete with OpenAI’s ChatGPT.
The public Google Doc files also contained detailed information about “Project Xylophone,” a codenamed effort tied to Elon Musk’s xAI, including training documents with approximately 700 conversation prompts designed to sharpen an AI chatbot’s conversational skills. Similarly, “confidential” Meta training documents containing audio clips of “good” and “bad” speech prompts, used to train Meta’s own AI products, were found to be publicly available. Even when secret projects were assigned codenames, several Scale AI contractors said it was often straightforward to work out the actual client.
In some cases, documents associated with these codenamed projects mistakenly included the client company’s logo, such as a presentation that inadvertently displayed a Google logo. Compounding the concern, contractors noted that when interacting with certain AI products, the chatbot itself would reveal the client’s identity if directly prompted. The exposure extended beyond project details to highly sensitive employee information: publicly accessible Google Docs spreadsheets contained the names and private email addresses of thousands of workers.
One particularly alarming spreadsheet, titled “Good and Bad Folks,” categorized dozens of workers as either “high quality” or “cheating.” Another document flagged individuals for “suspicious behavior,” raising questions about Scale AI’s internal performance tracking and ethical practices. Public documents also detailed individual contractors’ pay rates, along with extensive notes on pay disputes and discrepancies. This exposure of sensitive financial and performance data compounds the severity of the lapse, potentially affecting the privacy and livelihoods of numerous contractors.
The incident underscores a critical need for robust data security practices within the rapidly evolving artificial intelligence sector, particularly as companies like Scale AI handle vast amounts of proprietary and confidential information for leading tech giants. The fallout from this “incredibly janky” security system could have significant implications for Scale AI’s reputation and its future partnerships, even as it integrates more deeply into Meta’s AI ambitions.