The National Public Data breach has drawn a lot of media attention. It made headlines in early August, months after the background check company was breached in December 2023.
The nearly 2.7 billion leaked records, which notably include hundreds of millions of Americans’ Social Security numbers, were first offered for sale in criminal communities in April by a threat actor called USDoD. The dataset resurfaced on August 6, this time posted to Breach Forums for anyone to download for free.
As a quick aside, there has been some mixed reporting on the content of this breach, as there are a few different datasets floating around. This blog primarily focuses on the breach posted in the screenshot above. However, SpyCloud has collected data from multiple alleged ‘NPD’ breaches, as well as some partial breaches associated with the original breach, and we have updated this story to reflect that new information.
It’s certainly never good news when sensitive personal information is leaked, especially not in this quantity, but at SpyCloud we focus on if and how stolen data can be used by criminals looking to profit from it – so we can all better protect ourselves. With that in mind, let’s take a look at what our researchers found inside the breach.
What’s included in the National Public Data Breach
Our team at SpyCloud Labs ingested the data recaptured from the National Public Data (NPD) breach and was able to confirm that it includes the following.
2.7 billion total records
It’s a lot of data, but as you’ll see in our analysis, there are some caveats with regard to the integrity of the full body of data.
Sensitive personally identifiable information (PII) on individuals including:
- Full names
- Dates of birth
- 420 million distinct addresses
- 272 million distinct US Social Security Numbers (SSNs)
- Over 161 million distinct phone numbers
Historical data
The breach also contains a large amount of historical data on individuals, including old addresses and phone numbers. So if you lived in “Fakesville, Indiana” in 2002, the database might hold that old address as well as your current one. This introduces risk because credit checks and other identity verification systems commonly ask about places where you have previously lived before granting access to certain resources. As a result, this breach will likely make certain types of verification systems easier to bypass.
Alternative names
In addition to firstname, lastname, and middlename, this database appears to have included three ‘aka’ fields which contain alternative names for individuals, such as nicknames and former/maiden names.
Alternative dates of birth
Similar to the ‘aka’ fields, there are also three ‘altDOB’ fields. While these fields are not populated in many of the records, their existence illustrates the level of inaccuracy tolerated in the database.
‘StartDat’
There is also a StartDat field in the database that is populated with date values. While it is unclear exactly what this field is meant to signify, our best guess is that this either corresponds to the date that the data was created by a data provider to NPD, the date that NPD obtained the data, or the date that NPD added the row to the database. There do not appear to be dates present in this field after 2021, possibly illustrating lag time in data being introduced into this database from its original source or from when NPD first acquired the data. Alternatively, it could mean that the data was obtained earlier than 2023.
Duplicative, inaccurate, and sanitized data
While the data appears relatively clean, there are some instances of redacted SSNs, invalid dates, and other data anomalies. Redacted information could be an artifact of individuals submitting data deletion requests to NPD. Based on our pivoting within the dataset, we also found many instances where inaccurate data appears to have made it into the database, such as individuals’ SSNs being mapped to the full names of relatives or unrelated individuals with similar names. If the main use case for NPD’s data is to inform background checks, this may have been done purposefully; NPD likely wanted to cast a wide net in order to improve its likelihood of identifying any individual US person within its database.
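To make these anomaly types concrete, here is a minimal Python sketch of the kind of checks that could surface them. It is not SpyCloud's actual tooling. The `firstname` and `lastname` field names come from the dataset as described above, while the `ssn` and `dob` keys, the `YYYYMMDD` date format, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Nine digits, with or without dashes. Redacted values such as "XXX-XX-1234" fail this.
SSN_PATTERN = re.compile(r"\d{3}-?\d{2}-?\d{4}")


def is_plausible_ssn(value):
    """Return True if value is shaped like an SSN that could have been issued."""
    if not value or not SSN_PATTERN.fullmatch(value):
        return False
    digits = value.replace("-", "")
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Area numbers 000, 666, and 900-999 are never assigned,
    # nor are group 00 or serial 0000.
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


def is_valid_date(value, fmt="%Y%m%d"):
    """Return True if value parses as a real calendar date (format is an assumption)."""
    try:
        datetime.strptime(value, fmt)
        return True
    except (TypeError, ValueError):
        return False


def flag_anomalies(records):
    """Return (index, reasons) pairs for records with a bad SSN or date of birth."""
    flagged = []
    for i, rec in enumerate(records):
        reasons = []
        if not is_plausible_ssn(rec.get("ssn")):
            reasons.append("redacted_or_invalid_ssn")
        if not is_valid_date(rec.get("dob")):
            reasons.append("invalid_dob")
        if reasons:
            flagged.append((i, reasons))
    return flagged


def ssns_with_multiple_names(records):
    """Map each plausible SSN to its set of names, keeping SSNs tied to more than one name."""
    names_by_ssn = defaultdict(set)
    for rec in records:
        ssn = rec.get("ssn")
        if is_plausible_ssn(ssn):
            name = (rec.get("firstname", "").lower(), rec.get("lastname", "").lower())
            names_by_ssn[ssn.replace("-", "")].add(name)
    return {ssn: names for ssn, names in names_by_ssn.items() if len(names) > 1}


# Entirely made-up sample rows for illustration.
sample = [
    {"firstname": "Jane", "lastname": "Doe", "ssn": "123-45-6789", "dob": "19850214"},
    {"firstname": "Janet", "lastname": "Doe", "ssn": "123456789", "dob": "19850230"},
    {"firstname": "John", "lastname": "Roe", "ssn": "XXX-XX-1234", "dob": "19700101"},
]
anomalies = flag_anomalies(sample)
shared = ssns_with_multiple_names(sample)
```

On the sample rows, the second record is flagged for an impossible date (February 30) and the third for a redacted SSN, while the first two records share one SSN under two similar names. A check like this only catches structural problems; it cannot tell which of two names an SSN actually belongs to.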
UPDATE: Additional data relating to the National Public Data breach
How cybercriminals can use stolen data from the NPD breach
New account fraud
The data in the NPD breach includes plenty of sensitive PII that cybercriminals can use for new account fraud. With social security numbers, dates of birth, and addresses, bad actors can apply for credit cards or loans, or open bank accounts in victims’ names.
Phishing and smishing
While it’s true that cybercriminals will seek to use the data to steal identities and conduct new account fraud, our analysis suggests that the risks extend even further. This stolen information is likely to be traded and sold in fragments, potentially enabling a surge of phishing and smishing (phishing over SMS) attacks. Using PII made available through this breach, bad actors can make their phishing and smishing attempts more convincing and therefore more successful. For example, the alternative names in this dataset might allow phishers and smishers to enhance their social engineering attempts by referring to targeted individuals by their preferred nicknames.
Authentication bypass and account takeover
Even seemingly less significant data included in the breach, like street addresses or family member names, is often used in answers to security questions. Exposing it could make it easier for bad actors to bypass authentication methods, jeopardizing account integrity.
Synthetic identity creation
Additionally, PII like names and addresses is a foundational piece in the creation of synthetic identities. Criminals blend real and fabricated information to create fraudulent accounts or transactions that are harder for fraud teams to detect.
How to protect your organization following the NPD breach
This is another situation that demonstrates the need for organizations to take a proactive stance to protect their data as well as their customers, employees, and user base. Implementing strong, layered security measures is essential, including continuous monitoring for compromised credentials and automated tools to detect and respond to threats in real time. Educating employees about the risks of social engineering and phishing is also critical, as human error is often the weakest link in cybersecurity.
By investing in technologies that help you stay ahead of cybercriminals and prioritizing a culture of security, businesses can mitigate the impact of data breaches and better protect their brands and customers.
How to protect yourself following the NPD breach
- Freeze your credit: If you're not planning a big purchase in the near future, consider freezing your credit. Don't pay a company a fee to do this; it's easy enough to do yourself and (for most people) can be done entirely online. Check out this guide from usa.gov.
- Get a copy of your current credit report: Obtain a free copy of your credit report as a baseline. Sign up for free weekly credit reports and make sure nothing new appears on them that you didn't authorize.
- Stay alert: Be extra cautious about emails, text messages, and phone calls you receive in the coming months. Scammers can use personal details like those in the NPD breach to craft more believable scams. If you have elderly parents or relatives, make sure you talk to them about the types of scams – like tech support and fake IRS audits – that are rampant these days, and ensure they never transfer money without verifying the claims with a trusted party. You can also set up transaction monitoring and threshold alerts to receive notifications about suspicious credit card and bank account activity.
SpyCloud Labs is a focused cybercrime research group dedicated to uncovering and analyzing the most intricate patterns from the criminal underground. See other recent research from our team.