Two security researchers have uncovered four billion records covering 1.2 billion people on an unsecured Elasticsearch server, a leak estimated to affect hundreds of millions of people.
The data comes from the data aggregation and enrichment companies People Data Labs (PDL) and OxyData.Io, Vinny Troia reported. It contains basic personal information, such as names, home and mobile phone numbers, and email addresses, along with what may be information scraped from LinkedIn, Facebook and other social media sources.
According to its website, PDL's product can be used to search:
- Over 1.5 billion unique people, including close to 260 million in the U.S.
- Over 1 billion personal email addresses. Work email for 70%+ decision makers in the US, UK, and Canada.
- Over 420 million LinkedIn URLs.
- Over 1 billion Facebook URLs and IDs.
- 400 million plus phone numbers with more than 200 million U.S.-based valid cell phone numbers.
As part of their due diligence, the researchers contacted both firms, and each denied owning the server. The denial appeared credible: the information attributed to PDL was stored on Google Cloud, while the PDL API appears to use Amazon Web Services.
A test and additional research were then conducted to pin down where the data originated.
“In order to test whether or not the data belonged to PDL, we created a free account on their website which provides users with 1,000 free people lookups per month. The data discovered on the open Elasticsearch server was almost a complete match to the data being returned by the People Data Labs API,” Troia wrote.
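The comparison Troia describes, checking leaked records against what the API returns for the same person, can be illustrated with a minimal sketch. This is not the researchers' actual tooling. The function names, field names, and sample records below are assumptions made up for illustration, and no real API is called.

```python
def normalize(value):
    """Lowercase and trim strings so formatting differences are not counted as mismatches."""
    return value.strip().lower() if isinstance(value, str) else value


def field_match_rate(leaked_record, api_record):
    """Return the fraction of fields present in both records whose values agree."""
    shared = set(leaked_record) & set(api_record)
    if not shared:
        return 0.0
    matches = sum(
        1
        for key in shared
        if normalize(leaked_record[key]) == normalize(api_record[key])
    )
    return matches / len(shared)


# Invented sample records (hypothetical field names and values).
leaked = {"name": "Jane Doe", "email": "JANE@example.com", "city": "Austin"}
api = {"name": "jane doe", "email": "jane@example.com", "city": "Dallas"}
rate = field_match_rate(leaked, api)  # two of the three shared fields agree
```

A rate close to 1.0 across many sampled people would be consistent with the "almost a complete match" Troia reported, though it would not by itself prove where the data came from.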
OxyData.Io's role was confirmed when the company turned over the records it held on Troia, who found that they matched the content of his LinkedIn profile.
While the data undoubtedly came from the two companies, it is not understood how it came to reside on the server.
“This is an incredibly tricky and unusual situation. The lion’s share of the data is marked as ‘PDL’, indicating that it originated from People Data Labs. However, as far as we can tell, the server that leaked the data is not associated with PDL. This raises a number of other questions. First, how did this mystery organization get the data? Are they a current or former customer? If so, the data discovered on the server indicates that this company is a customer of both People Data Labs and OxyData,” Troia wrote.
Troia also noted the victims in this case have not been notified because nobody knows who was operating the server in question.
“Because of obvious privacy concerns cloud providers will not share any information on their customers, making this a dead end. One could argue that because PDL’s data was mis-used, it is up to them to notify their customers. One could also argue that the owner of 18.104.22.168 is responsible and liable for any potential damages. But legally, we have no way of knowing who that is without a court order,” Troia wrote.
The danger to those whose information was exposed is well known: each person could now be targeted by a variety of attacks.
“Given that email addresses are by definition unique identifiers, separate data sets that include email addresses can be easily combined to provide comprehensive digital maps of blocks of millions of people. And then this data can be used very effectively in targeted phishing attacks, as fodder for social engineering, in business email compromises and financially focused identity theft. Unfortunately, it seems that there is no end in sight for these types of massive data leaks,” said Matthew Gardiner, director of enterprise security at Mimecast.
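Gardiner's point that email addresses let separate data sets be combined can be shown with a short sketch. It assumes each data set is a list of dictionaries with an `email` field. The function name `merge_by_email` and the sample records are hypothetical, not taken from the leaked data.

```python
from collections import defaultdict


def merge_by_email(*datasets):
    """Combine records from several data sets into one profile per email address."""
    profiles = defaultdict(dict)
    for dataset in datasets:
        for record in dataset:
            email = record.get("email")
            if not email:
                continue  # records without an email cannot be linked this way
            key = email.strip().lower()
            for field, value in record.items():
                # Keep the first value seen for each field.
                profiles[key].setdefault(field, value)
    return dict(profiles)


# Invented sample data, for illustration only.
set_a = [{"email": "Jane@example.com", "name": "Jane Doe"}]
set_b = [
    {"email": "jane@example.com", "phone": "555-0100"},
    {"phone": "555-0199"},
]
merged = merge_by_email(set_a, set_b)
```

Here the two records for the same address collapse into a single profile holding a name and a phone number, which is why a leak containing email addresses makes every other data set keyed on email more revealing.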