Analyzing 136,000 New Domains with COVID-19 Themes

New domains related to COVID-19 are proliferating rapidly. Although many of the new domains are legitimate websites, such as charity pages set up to support response efforts, the sheer volume of new content makes it challenging for security professionals and end-users alike to distinguish which ones are legitimate, questionable, or outright malicious.

To gain a better understanding of this newly-created content, SpyCloud researchers have compiled and analyzed a list of over 136,000 hostnames and fully qualified domain names with COVID-19 or coronavirus themes from a variety of open-source feeds.

| Source | Description |
| --- | --- |
| Certificate Transparency logs | Open dataset for exploring SSL certificates to identify potentially abusive hostnames. |
| RiskIQ’s COVID-19 feed | Public feed of COVID-19 themed domains sponsored by RiskIQ. |
| DomainTools’ COVID-19 threat list | Public feed of COVID-19 themed domains sponsored by DomainTools. |
| Rapid7 Project Sonar | Open data of internet-wide surveys conducted by Rapid7 Labs. |


We then parsed, deduplicated, and enriched the data with HTTP responses, additional DNS analysis, and WHOIS records. Some of this data was collected manually by SpyCloud researchers, and some by proprietary systems that automate collection from those open sources. Based on this analysis, we determined how many domains have been flagged as malicious in public threat intelligence feeds, which registrars have registered the most COVID-19 themed domains over the last four months, and which keyword trends appear among these hostnames.
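The parse-and-deduplicate step can be illustrated with a minimal sketch. This is not SpyCloud's pipeline; the function names and the normalization rules (lowercasing, dropping a leading wildcard label and a trailing dot) are assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional


def normalize_hostname(raw: str) -> Optional[str]:
    """Lowercase a raw feed entry and strip whitespace, a trailing dot,
    and a leading wildcard label such as "*." (common in certificate data)."""
    host = raw.strip().lower().rstrip(".")
    if host.startswith("*."):
        host = host[2:]
    # Reject entries that cannot be hostnames: empty, spaced, or dot-less.
    if not host or " " in host or "." not in host:
        return None
    return host


def deduplicate(feeds: Iterable[Iterable[str]]) -> List[str]:
    """Merge several feeds into one list of unique hostnames,
    keeping the order in which each hostname was first seen."""
    seen = set()
    unique = []
    for feed in feeds:
        for raw in feed:
            host = normalize_hostname(raw)
            if host is not None and host not in seen:
                seen.add(host)
                unique.append(host)
    return unique
```

Normalizing before comparing matters because the same hostname can appear with different casing or wildcard prefixes across feeds.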

More than 136,000 new COVID-19 themed domains were observed between 12/1 and 3/27.

For the purposes of this analysis, we examined 130,138 hostnames related to COVID-19, including 68,965 subdomains and 61,173 fully qualified domains. Our raw dataset included 136,886 domains, but 6,748 of them failed WHOIS analysis and were excluded because the WHOIS servers for their top-level domains were broken or unsupported.
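As a rough sketch of how hostnames might be separated into subdomains and registered domains, the snippet below uses a tiny hard-coded suffix set. That set and the function name are hypothetical; a real analysis would rely on the full Public Suffix List. The closing assertion simply checks that the totals quoted above are consistent with each other.

```python
# Hypothetical, deliberately tiny suffix list used only for illustration.
# A real analysis would load the full Public Suffix List instead.
KNOWN_SUFFIXES = {"com", "org", "net", "gov", "shop", "store", "co.uk"}


def split_hostname(host):
    """Return (registered_domain, is_subdomain), or None when no known
    suffix matches the hostname."""
    labels = host.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in KNOWN_SUFFIXES:
            registered = ".".join(labels[i - 1:])
            # More than one label in front of the suffix means a subdomain.
            return registered, i > 1
    return None


# Totals reported in the text above.
RAW_TOTAL = 136_886
FAILED_WHOIS = 6_748
SUBDOMAINS = 68_965
FULLY_QUALIFIED_DOMAINS = 61_173

assert RAW_TOTAL - FAILED_WHOIS == SUBDOMAINS + FULLY_QUALIFIED_DOMAINS == 130_138
```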

Figure 1: Graph showing domains with coronavirus or COVID-19 themes registered each day from December 1, 2019 through March 27, 2020.

The graph above shows new domain registrations by day. The first small spike, on February 11 (829 new domain registrations), coincides with the date the World Health Organization named the disease COVID-19. The graph also shows registrations increasing in the last few days of February, when cases in Europe started to accelerate and newspapers began reporting initial infections in new countries around the world. The chart’s most dramatic spike came on March 12, the day before President Trump declared a national emergency in the United States (from 1,715 on March 11 to 3,305 on March 12). The New York Times has summarized the progression of events within this timeline.
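A per-day view like this can be derived from WHOIS creation dates. The sketch below is one way to do it, assuming each domain's creation date is already available as a `datetime.date`; the function names are illustrative and not taken from the original analysis.

```python
from collections import Counter
from datetime import date, timedelta


def registrations_per_day(creation_dates):
    """Count how many domains were registered on each calendar day."""
    return Counter(creation_dates)


def largest_daily_jump(daily):
    """Return (day, increase) for the biggest rise over the previous
    calendar day, or None if fewer than two days are covered.
    Days with no registrations are treated as zero."""
    if not daily:
        return None
    first, last = min(daily), max(daily)
    best = None
    day = first + timedelta(days=1)
    while day <= last:
        rise = daily.get(day, 0) - daily.get(day - timedelta(days=1), 0)
        if best is None or rise > best[1]:
            best = (day, rise)
        day += timedelta(days=1)
    return best
```

Given only the two daily figures quoted above (1,715 and 3,305), read here as registration counts, `largest_daily_jump` reports an increase of 1,590 on March 12.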

Some registrars are more popular than others—though several providers have promised to crack down on abuse.

The chart below shows the most popular domain registrars used by the COVID-19 themed sites in our dataset.

Figure 2: Top 25 domain registrars used to register COVID-19 themed domains. 

Some hosting providers and domain registrars have started to crack down on coronavirus-themed abuse, an unprecedented step. Recently, GoDaddy, NameCheap, and Tucows (three of the five most-used registrars for COVID-19 themed domains in our dataset) made statements about their efforts to hinder abuse, either by preventing registrations that contain certain keywords or by actively taking down fraudulent sites. We found this interesting in part because providers have previously resisted this type of action, arguing that it would affect free speech.

This commitment from these providers may help curtail new COVID-19 themed domain registrations, and additional registrars may soon follow suit. However, the tremendous volume of recently-created domains may pose an obstacle as providers investigate potential fraud, which can be a highly manual process. In addition, we speculate that blocking specific keywords will cause bad actors to look for creative ways around these restrictions rather than abandoning their efforts. 

Many of the domains have active web content.

The majority of the domains we analyzed are accepting HTTP and HTTPS requests and have active web content. In other words, typing the domain name into your browser will direct you to a live website. That distinction matters because it means there’s activity happening; someone has taken the time to upload content.
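A check of this kind could look roughly like the sketch below, which uses only Python's standard library. The `probe` function, its `fetch` parameter, and the five-second timeout are assumptions for illustration, not a description of the tooling used in this analysis. Note that `urlopen` follows redirects, so the recorded status is that of the final response.

```python
import urllib.error
import urllib.request


def probe(host, fetch=urllib.request.urlopen, timeout=5):
    """Return {"http": status_or_None, "https": status_or_None}.
    `fetch` is injectable so the function can be exercised offline."""
    results = {}
    for scheme in ("http", "https"):
        try:
            with fetch(f"{scheme}://{host}/", timeout=timeout) as response:
                results[scheme] = response.status
        except urllib.error.HTTPError as err:
            # The server answered, just with an error status such as 404.
            results[scheme] = err.code
        except (OSError, ValueError):
            # DNS failure, refused connection, timeout, TLS error, bad URL.
            results[scheme] = None
    return results


def accepts_requests(result):
    """True if the host answered over HTTP or HTTPS at all."""
    return any(status is not None for status in result.values())
```

A status code alone does not show that real content has been uploaded, so a fuller analysis would also inspect the response body.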

Figure 3: Top 25 hosting providers for COVID-19 themed domains in our dataset.

Figure 4: Pie chart showing percentages of domains using HTTPS vs HTTP.

Figure 5: Top 25 webservers hosting COVID-19 themed content.

Some of the domains merely display basic content to show that they have been purchased, indicating that they have been parked at the registrar while the owner waits to either use the domain for their own purposes or sell it. Domain scalping may account for some of these purchases; for example, someone might purchase domains related to COVID-19 cures or vaccines with the hope of eventually selling them to a pharmaceutical company.

Figure 6: Screenshot showing a domain with placeholder content indicating that it has been registered.

The most popular keywords we observed tie back to coronavirus response efforts. 

To understand keyword trends across the domains, we examined a combination of singular keywords (COVID, vaccine, donation) and keywords that represent a set of related tags (mask, N95).

A very large number of the domains (28,282) contain the word “virus,” which is unsurprising. “Medical” terms like nurse and doctor also make sense; the medical content could be legitimate, but may also include scams such as ecommerce sites selling fraudulent protective equipment or phishing lures that promise medical information. (We were surprised to find only 15 domains related to toilet paper.)

Given the confusion and uncertainty surrounding testing, it’s also unsurprising to see that “test” ranks as the third most popular keyword. Our researchers noted a small spike in domains using this keyword after President Trump gave a speech promising additional testing.

Note that “.gov website” refers to the 220 new domains in our dataset that have .gov TLDs, indicating that those sites are likely legitimate government websites.

Figure 7: Keyword trends across the domains in our database.

| Tag | Keywords (case insensitive) | Count |
| --- | --- | --- |
| Virus | virus | 28,282 |
| Medical | mask, n95, cure, doctor, nurse, vaccine, medicine, drug, hospital, clinic, test, respirator, care | 6,934 |
| Test | test | 3,265 |
| Media | media, news, coverage, podcast, blog, stream, picture, video | 1,992 |
| Map | map, dashboard, tracker, tracking, app, android, iphone | 1,775 |
| Mask | mask, n95, respirator | 1,461 |
| Ecommerce | shop, checkout, supplies, store | 1,129 |
| Cure | cure | 982 |
| Economic | economy, economic, stock, business, market | 903 |
| Vaccine | vaccine | 760 |
| COVID | covid | 689 |
| Donation | donate, donation, donor, charity, charitable, fund | 571 |
| Corona | corona | 387 |
| Pandemic | pandemic | 366 |
| .gov website | ends with .gov | 220 |
| China | china, chinese, wuhan | 186 |
| Memorial | memorial | 111 |

Figure 8: Chart displaying how we categorized domains by keyword for this analysis, with exact counts. 
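The tagging scheme in the table can be sketched as a case-insensitive substring match. The snippet below uses a subset of the tags listed above; the substring approach is an assumption, and it will over-match in places (for example, "test" also matches "latest").

```python
from collections import Counter

# A subset of the tags and keywords from the table above.
TAGS = {
    "Virus": ["virus"],
    "Test": ["test"],
    "Mask": ["mask", "n95", "respirator"],
    "Donation": ["donate", "donation", "donor", "charity", "charitable", "fund"],
}


def tags_for(hostname):
    """Return the set of tags whose keywords appear in the hostname."""
    host = hostname.lower().rstrip(".")
    found = {
        tag
        for tag, keywords in TAGS.items()
        if any(keyword in host for keyword in keywords)
    }
    # The ".gov website" tag is based on the TLD rather than a keyword.
    if host.endswith(".gov"):
        found.add(".gov website")
    return found


def count_tags(hostnames):
    """Count how many hostnames carry each tag (one hostname may carry several)."""
    counts = Counter()
    for hostname in hostnames:
        counts.update(tags_for(hostname))
    return counts
```

Because one hostname can match several tags, the per-tag counts are not expected to sum to the size of the dataset.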

Some TLDs may provide benefits to threat actors.

The vast majority of the domains we identified use the .com TLD. However, we also noticed many other generic top-level domains (gTLDs), which can be popular with criminals because they help malicious links seem more credible to users. For example, users may interpret gTLDs such as .shop and .store as signs that they are looking at legitimate ecommerce providers.

Figure 9: Top 25 registered domain TLDs.
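A ranking like the one in Figure 9 can be produced by counting the last label of each hostname. This is a minimal sketch with hypothetical function names; it treats the final label as the TLD and does not handle multi-label public suffixes such as .co.uk.

```python
from collections import Counter


def tld_of(hostname):
    """Return the last dot-separated label of a hostname, lowercased."""
    return hostname.lower().rstrip(".").rsplit(".", 1)[-1]


def top_tlds(hostnames, n=25):
    """Return the n most common TLDs as (tld, count) pairs."""
    return Counter(tld_of(h) for h in hostnames).most_common(n)
```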

At the time of analysis, the majority of the domains in our dataset were hosted in the United States.

Servers hosting COVID-19 themed content are located all around the world, with the majority based in the United States.

Figure 10: Heat map showing locations of servers hosting COVID-19 themed content.

Within the dataset, we observed numerous examples of scams and phishing pages using domains related to COVID-19.

As we described in our recent post about common COVID-19 scams, threat actors are finding many ways to take advantage of people’s emotions about the global pandemic. As expected, we observed numerous phishing and scam domains when we examined the domains manually.

In the example below, SpyCloud researchers identified a phishing domain that a threat actor has set up to mimic a Chase Bank login. Most likely, the threat actor was sending phishing messages “from” Chase with some form of messaging about the bank’s COVID-19 response, making it seem plausible to users that their bank may have set up a dedicated page related to the virus. (Note that SpyCloud reported this domain to the hosting provider and it has since been removed.)

Figure 11: Screenshot of a phishing domain masquerading as a Chase Bank login page.

Querying the WHOIS record shows that the domain is clearly fraudulent: it was registered using a throwaway YOPmail email account. YOPmail provides temporary, disposable email accounts to users interested in privacy, which can unfortunately include cybercriminals seeking anonymity.

Figure 12: Screenshot of Whois information for the Chase Bank phishing page. 
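The same signal can be checked in bulk once registrant emails have been extracted from WHOIS records. The sketch below assumes that extraction has already happened; the seed list contains only the provider named above, and a practical list would be much longer.

```python
# Hypothetical seed list containing only the provider mentioned above.
DISPOSABLE_EMAIL_DOMAINS = {"yopmail.com"}


def uses_disposable_email(registrant_email):
    """True if the registrant email belongs to a known disposable provider."""
    email = registrant_email.strip().lower()
    if "@" not in email:
        return False
    # Everything after the last "@" is the mail domain.
    domain = email.rpartition("@")[2]
    return domain in DISPOSABLE_EMAIL_DOMAINS
```

A disposable address is a useful triage signal rather than proof of fraud, since privacy-conscious registrants may use one too.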

Relatively few of the domains we analyzed were included in lists of malicious feeds. 

We expected to accelerate our efforts to find malicious domains within our dataset by drawing on public threat intelligence feeds. What we found instead surprised us. 

When we ran all 136k hostnames in our dataset through 12 different community feeds, we were alarmed to find how few of the domains were identified as malicious. Even the Google Safe Browsing API, which draws on a robust dataset because Google is one of the largest email providers in the world, flagged only 195 domains. Of the domains that Google Safe Browsing did identify, 84.6 percent were flagged for phishing.

We believe that far more of the COVID-19 domains in our dataset are malicious than are reflected in the community feeds cited above. One potential reason is that the feeds we used focus on threat intelligence specific to phishing and malware, not necessarily scam sites. In addition, these feeds are sometimes automatically ingested into security products, where false positives could cause service disruptions in corporate and private networks, so feed maintainers have reason to be conservative about what they list.

 

Figure 13: Domains identified as malicious by nine community feeds. Three additional feeds did not identify any of the domains as malicious.

| Feed | COVID-19 Domain Count | Link |
| --- | --- | --- |
| urlhaus abuse.ch | 209 | https://urlhaus.abuse.ch/downloads/text/ |
| Google Safe Browsing | 195 | https://safebrowsing.google.com/ |
| dshield suspicious domains | 110 | https://secure.dshield.org/feeds/suspiciousdomains_Low.txt |
| Threats Hub | 106 | https://www.threatshub.org/download/ |
| openphish | 34 | https://openphish.com/feed.txt |
| phishtank | 19 | https://www.phishtank.com/ |
| CoinBlocker | 5 | https://zerodot1.gitlab.io/CoinBlockerLists/list.txt |
| domainsblacklist | 2 | http://www.joewein.net/dl/bl/dom-bl.txt |
| cybercrime tracker | 1 | http://cybercrime-tracker.net/ |
| c2domains | 0 | http://osint.bambenekconsulting.com/feeds/c2-dommasterlist.txt |
| Malware Domains List | 0 | http://www.malwaredomainlist.com/mdlcsv.php |
| Botvrij.eu – domains | 0 | http://www.botvrij.eu/data/ioclist.domain.raw |

Figure 14: Chart with details about the community threat intelligence feeds we used to identify malicious domains.
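For the plain-text feeds in this table, the comparison can be approximated as a set intersection. The sketch below assumes a feed with one domain or URL per line and `#` comments, which will not fit every feed listed (Google Safe Browsing, for instance, is queried through an API). The function names are illustrative.

```python
from urllib.parse import urlsplit


def load_feed_domains(lines):
    """Extract lowercase hostnames from a feed with one domain or URL per line."""
    domains = set()
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        # Prefix bare domains with "//" so urlsplit treats them as a host.
        candidate = entry if "//" in entry else "//" + entry
        try:
            host = urlsplit(candidate).hostname
        except ValueError:
            continue  # Skip malformed entries.
        if host:
            domains.add(host.lower())
    return domains


def flagged_counts(hostnames, feeds):
    """Return {feed_name: number of our hostnames that appear in that feed}."""
    ours = {h.lower() for h in hostnames}
    return {name: len(ours & domains) for name, domains in feeds.items()}
```

Exact matching like this misses cases where a feed lists only a parent domain or a full URL path, so counts produced this way are a lower bound.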

Conclusion: Here’s how the security research community can help.

The huge volume of new COVID-19 themed content represents a security challenge for users, corporate IT teams, and security professionals alike. Distinguishing between legitimate and malicious content has always been challenging. Criminals know that everyone is trying their best to stay on top of the latest news related to the COVID-19 crisis and will continue to exploit fear for profit.

As researchers, we are encouraged by the contributions we have already seen from the security community. Here are a few of the ways you can get involved, either to contribute research or to keep your own users safer:

Educate users about phishing and internet safety

It’s important to educate your users about identifying phishing attempts and avoiding risky actions like downloading files from unknown senders. Staysafeonline.org, powered by the National Cybersecurity Alliance, provides a thorough list of resources. Users can even play a role in helping combat coronavirus-related scams by reporting suspicious messages to email providers and corporate IT. Though flagging a phishing message within your inbox may not feel like a big deal, that action helps providers identify malicious content and flag it for other users. 

Identify IoCs within your own environment

Security professionals can use the COVID-19 Cyberthreat Coalition’s vetted list to look for COVID-19 themed indicators in their networks. Although the list is currently small, it is one of the most comprehensive collections of confirmed threats. Researchers from many different companies manually vet every item on this list, making false positives unlikely.
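One way to apply such a list is to compare it against hostnames seen in DNS or proxy logs. The sketch below assumes the indicators and the observed hostnames are already available as plain strings; it also matches subdomains of a listed domain, which is a design choice rather than something the list itself prescribes.

```python
def matches_indicator(hostname, indicators):
    """True if the hostname or any of its parent domains is an indicator."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in indicators for i in range(len(labels)))


def find_hits(observed_hostnames, indicators):
    """Return the sorted, de-duplicated observed hostnames that match."""
    normalized = {i.lower().rstrip(".") for i in indicators}
    return sorted({h for h in observed_hostnames if matches_indicator(h, normalized)})
```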

Contribute to community feeds

As a security community, it’s important that we contribute back to public feeds. Identifying domains that are being used for phishing, malware, and other scams will require community effort because of the enormous quantities of content that are coming online. Manual verification is essential, particularly because a lot of threat intelligence feeds are tied into automated equipment like firewalls and other security products. When false positives are introduced, they can interfere with the performance of networking devices and other security products. 

Connect with other researchers

Several online communities have come together to discuss threats related to COVID-19 and share information:

Build on this research

We conducted the analysis in this blog using publicly-available feeds, and you can too. The sources we used are free for security researchers to use. 

We hope that this effort provides a foundation for other researchers to build on. To that end, you can access our full dataset here. Happy hunting.
