AI/ML, Generative AI, Data Security, Privacy

New GenAI ‘upload file’ options spur data risk fears


As the use of generative artificial intelligence (GenAI) tools booms, a new "upload file" option introduced on platforms such as OpenAI's ChatGPT is being blamed for a massive uptick in attempts to share sensitive data outside company private networks.

Researchers at Menlo Security tracked an 80% increase in attempted file uploads to GenAI sites from July to December 2023.

"[Incidents] of data loss through file uploads are on the rise. Previously, most solutions did not natively allow file uploads, but as new versions of generative AI platforms are released, new features are added, such as the ability to upload a file," wrote Menlo Security researchers in a study on data loss prevention (DLP) risks released Wednesday.

The study highlights the security risks posed by GenAI tools and their platform owners, which collect massive amounts of user data that can end up exposed publicly via the platforms' own large language model datasets. Menlo Security cited a March 2023 OpenAI data breach involving account data (not data uploaded or cut and pasted as a user query) that spilled 1.2 million subscriber records.

"These generative AI uses present the largest impact on data loss due [to] the ease and speed at which data could be uploaded and input, such as source code, customer lists, roadmap plans or personally identifiable information," researchers wrote.

Attempts to input personally identifiable information (PII) into GenAI platforms represent over half (55%) of DLP events, according to the report (registration required) released Feb. 14. Confidential documents were the next most common type of data users attempted to share with GenAI platforms (40%).

Plugging the GenAI data hole

The evolution of GenAI is outpacing organizations' efforts to train employees on DLP risks, wrote Pejman Roshan, chief marketing officer at Menlo Security. "While we've seen a commendable reduction in copy and paste attempts in the last six months, the dramatic rise of file uploads poses a new and significant risk."

There has been a 26% increase in security policies restricting access to GenAI tools. DLP efforts for GenAI platforms fall into two camps: domain-based blocking of GenAI websites, and a user-based permissions approach.

"For security and IT teams that apply policies on a domain-by-domain basis, they must revisit that list frequently to ensure that users are not accessing and, potentially, exposing sensitive data on a more obscure platform," researchers wrote. "This process can be time consuming and ultimately will not scale. Organizations need to adopt security technology that enables policy management on a generative AI group level, providing protection against a broader cross-section of generative AI sites."

Stephen Weigand

Stephen Weigand is managing editor and production manager for SC Media. He has worked for news media in Washington, D.C., covering military and defense issues, as well as federal IT. He is based in the Seattle area.
