Properly Cleaning and Gutting Your Phish: How Cybercriminals Are Vetting Victim Data


At SpyCloud Labs, we frequently deal with the problem of “junk” data: data that is clearly falsified, fabricated, or improperly entered, but is mixed into a database with high-quality data. As it turns out, cybercriminals face a similar challenge. Over the past few months, we’ve observed threat actors using some interesting strategies to filter out “junk” data. Most recently, we’ve observed phishing operators using a simple technique to ensure they are collecting legitimate phished data: a phishing gateway page.

This gateway page sits on a separate host from the full phishing site. It checks that a user-submitted email address is well formed and (optionally) matches a phishing targeting list before passing the victim on to input their sensitive information.

By analyzing these gateway pages, it is sometimes possible to obtain the full targeting lists for different phishing campaigns. These lists can be periodically updated as new phishing attacks are started, and can be used across multiple phishing sites.

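In this analysis, we’ll break down our observations of this technique, including the email validation methods operators use, the gateway pages themselves, and the obfuscation tactics layered on top of them.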

Read on to see how bad actors are leveraging gateway pages to filter out unwanted and junk data from their phishing collections, as well as using them as an obfuscation tool to avoid being tracked by defenders.

Techniques Used in Phishing Attacks to Validate Stolen Data

Advanced email validation

There are multiple versions of this type of page, which appear to have been customized by operators over time, starting as early as July 2024. The versions include a basic regex validation that checks whether an email follows a valid email address format, as well as a more advanced validation that checks the email against the targeting list for a phishing campaign.

Sometimes phishing actors use encoding or encryption to hide the URL of the targeting list. In rare cases, we have even observed operators deploy the gateway page with an API to check whether an inputted email address matches a targeted user.

Some pages include a toggle switch that determines how the email address should be validated.

If the email address fails the configured checks on this initial phishing gateway page, the user is not forwarded to the next phishing page. Different versions of the page have different redirect logic. For example, in Image 2, you can see a snippet of the redirect logic from one deployment of the page showing what the page does if the user inputs an undesired email address. Each time a user inputs an email that does not match the target list, they see an error message stating that the email address they entered is invalid and prompting them to try again with a valid email. After four unsuccessful attempts, the user is redirected to a random Wikipedia page.

When a user inputs an email that matches the required conditions (generally, either conforming to the correct email format or matching against a specified targeting list), they are redirected to the next stage in the phishing process. This next stage varies widely depending on which phishers have set up the page. We have observed phishing operators place this gateway page in front of a variety of phishing pages, including custom phishing pages as well as pages that appear to have been set up using popular phishing kits like Tycoon and Mamba (based on indicators such as URL structure, textual content, and behavioral analysis).

PhaaS Kit Page Patterns

Phishing pages set up by popular Phishing as a Service (PhaaS) kits often have recognizable patterns that allow us to identify them and associate them with the kit that was used to set them up.

Sekoia researchers noted in their blog on Mamba 2FA that Mamba phishing pages use the following structure:

And Validin references Tycoon pages as having a specific structure and motivational text:

Some phishers also set up additional filtering logic in redirects between the gateway page and the main phishing page. In one example, we observed phishers set up IP-based filtering so that even if you passed the email check on the gateway page, some IP addresses were blocked from accessing the main phishing page and were instead redirected to a random Wikipedia page.

IP-based filtering might be based on an IP’s geolocation so that only victims coming from targeted countries get through to the main phishing page, or to block known datacenter IP address ranges which are often used by bots, internet scanning services, or security researchers.
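For defenders, it can help to see how little logic such a filter needs, since it explains why scanners running from cloud hosts may only ever see the Wikipedia redirect. The following is a minimal sketch, assuming a simple range blocklist; the function name and the networks are hypothetical (the ranges are reserved documentation ranges), and nothing here is taken from an observed page.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges (RFC 5737)
# standing in for the datacenter ranges an operator might block.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(ip: str) -> bool:
    """Return True when the address falls inside any blocked network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

A geolocation-based filter would follow the same shape, with a country lookup in place of the range check.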

Basic validation

In its simplest form, the gateway page appears to contain a basic validation step that checks whether the email address entered on the initial phishing landing page follows the standard email format.

In the example in Image 3, the gateway page uses a regular expression written in JavaScript to determine whether the user-inputted data follows the correct format of an email address. If the validateEmail function returns true, the user is redirected to the phishing page.
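To make the format check concrete, here is a minimal sketch of this kind of validation. The observed pages use JavaScript; this illustration uses Python, and the pattern is an assumption that approximates a typical client-side check rather than the expression from any real gateway page.

```python
import re

# Hypothetical pattern: a local part, "@", and a domain containing a dot.
# It approximates a common client-side format check and is not a full
# RFC 5322 validator.
EMAIL_PATTERN = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")


def validate_email(address: str) -> bool:
    """Return True when the string looks like an email address."""
    return EMAIL_PATTERN.fullmatch(address.strip()) is not None
```

A check like this only confirms the shape of the input, which is why a format-only gateway filters out bots and careless entries but not untargeted users.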

Validation against email lists

Bad actors also use this gateway page to validate an inputted email against a target list. The target list is always linked somewhere in the source code for the gateway page. Most commonly, we have seen lists hosted as external text files that simply contain a newline-separated list of email addresses. We have observed phishing operators update these text files periodically with new data as they send out additional phishing emails.

In many cases, we found that the targeting lists for the gateway pages could be easily downloaded from the URLs exposed in the phishing pages’ source code, providing a full list of target email addresses for the associated campaign. SpyCloud has recaptured 40 unique phishing campaign targeting lists from these gateway pages containing over 930,000 targeted email addresses from over 250,000 domains.
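Once a newline-separated list has been recovered, counting unique addresses and domains is straightforward. The sketch below assumes the plain-text format described above; the function name and sample data are hypothetical and use reserved example domains.

```python
from collections import Counter


def summarize_target_list(text: str) -> tuple[int, Counter]:
    """Return (unique address count, per-domain counts) for a newline-separated list."""
    addresses = set()
    for line in text.splitlines():
        candidate = line.strip().lower()
        local, sep, domain = candidate.partition("@")
        # Skip blank lines and anything that is not a single local@domain pair.
        if not sep or not local or not domain or "@" in domain:
            continue
        addresses.add(candidate)
    domains = Counter(address.split("@", 1)[1] for address in addresses)
    return len(addresses), domains


# Placeholder data, not taken from a real targeting list.
sample = "alice@example.com\nBob@example.com\n\nalice@example.com\ncarol@example.org\nnot-an-address\n"
total, domains = summarize_target_list(sample)
print(total, len(domains))  # 3 2
```

Lowercasing before deduplication is a design choice here: it treats addresses that differ only by case as the same target.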

Validation against an API

In one instance, we have even observed phishers create a simple custom API to handle validation requests against a list of target email addresses. The phishing gateway page simply submits an API request containing a user-inputted email address and the API responds with a true or false response.

The Phishing Gateway Page

These gateway pages use Microsoft branding and have a simple user interface that prompts a user to input an email address. Many of the gateway pages that we have observed deployed in the wild also use various obfuscation and anti-bot techniques designed to make them harder to track and detect as malicious.

We have observed variations of this gateway page deployed by a wide range of phishing actors in front of many kinds of phishing pages, including custom phishing pages and those set up by multiple PhaaS kits. Because the gateway pages are so simple, many use easy-to-stand-up hosting services such as GitHub Pages, AWS, Cloudflare Workers, and Glitch.

Phishing landing pages

The phishing gateway pages all appear very similar; they include the Microsoft logo and text prompting the user to enter an email address to “confirm your identity” so that they can access some sort of sensitive document or message. Image 6 contains a generic example of one of these pages. Sometimes the phishers also include an additional image as a background behind the user prompt. In Image 7, you can see what appears to be a blurred-out image of a Microsoft Outlook email inbox, and in Image 8, you can see an image containing branding for Cisco Hypershield, a security product for protecting hyperscale data centers.

Many of the pages also have a function to randomize the page title each time the page is loaded from a short list of options like Secure Your Access, Verify Your Credentials, and Identity Verification Needed.

The initial landing pages only ask the user to input their email address, which allows them to straightforwardly gate user input based on their email validation logic. As we mentioned above, when users input an email address that fails the email validation logic, the pages will generally re-prompt the user to input a correct email address or redirect them to a different page such as a random Wikipedia page. This way, the phishing operators only receive additional user-inputted data, such as passwords or credit card information, for the specific victims that they are interested in targeting.

Obfuscation

These email validation pages are an obfuscation method in and of themselves because they add a step between the phishing email and the final phishing site. We have also observed many phishers use additional obfuscation tactics to make it more difficult for defenders to detect, scrape, and track these pages.

Often the phishers apply some basic obfuscation to hide the link to the email target list. The most basic version of this is simply base64 encoding the link to the email targeting list. We have also seen some phishers encrypt the target list URL, but include the decryption key and function to decrypt the URL in the source code for the gateway page, as you can see in Image 10. In both cases, it is trivial to obtain the raw URL, either by decoding or decrypting it manually using a tool like CyberChef or by capturing the outgoing request using a tool like Chrome DevTools.
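The base64 case can also be handled in a few lines of script. This is a minimal sketch for analysts, assuming the encoded value has already been copied out of the page source; the function name and the URL are placeholders, not values from a real page.

```python
import base64


def decode_base64_url(encoded: str) -> str:
    """Decode a base64 string into text, tolerating missing padding."""
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.b64decode(padded).decode("utf-8")


# Placeholder value on a reserved domain, encoded here so the example round-trips.
encoded = base64.b64encode(b"https://example.invalid/list.txt").decode("ascii")
print(decode_base64_url(encoded))  # https://example.invalid/list.txt
```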

We have also observed phishers go a step further and use base64 to encode the script containing most of the email validation logic. Within that encoded function, the URL for the email target list is also base64 encoded a second time.
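The two layers can be unwrapped in sequence: decode the outer script, then look for base64-looking string literals inside it. The sketch below is a heuristic under stated assumptions; the function name, the literal-matching pattern, and the minimum length of 16 characters are all hypothetical choices, and real pages may need a different pattern.

```python
import base64
import binascii
import re

# Heuristic: quoted runs of 16 or more base64 characters. This is an
# assumption for illustration, not a pattern taken from an observed page.
B64_LITERAL = re.compile(r"""["']([A-Za-z0-9+/]{16,}={0,2})["']""")


def extract_nested_urls(encoded_script: str) -> list[str]:
    """Decode an outer base64 script, then decode any inner base64 URLs."""
    script = base64.b64decode(encoded_script).decode("utf-8", errors="replace")
    urls = []
    for literal in B64_LITERAL.findall(script):
        try:
            decoded = base64.b64decode(literal, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # Not valid base64 text, so ignore this literal.
        if decoded.startswith(("http://", "https://")):
            urls.append(decoded)
    return urls
```

Keeping only decoded values that start with an HTTP scheme reduces false positives from other long strings in the script.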

In other instances, we have observed different phishing operators deploy versions of this phishing gateway page where they have used a JavaScript obfuscation tool to make their source code more difficult to understand. We have also seen versions with additional anti-bot functionality, including requiring users to click a button before being redirected to the box to input their email address, mouse movement detection (which we show in Image 12), and requiring a wait time between email validation attempts.

How These Insights About Phishing Can Help Defenders

If you’re a defender, knowing an employee or consumer email is on a phishing campaign’s email list offers several benefits:

Awareness of targeting: If you know an individual is on a threat actor’s radar, you can be extra vigilant against phishing attempts.
Proactive defense: You can also implement step-up security measures, such as enforcing multi-factor authentication (MFA), improving email filtering, and/or adding extra training to help employees to recognize phishing attempts.
Incident response: If your organization has already been targeted, you can proactively monitor for potential compromises, such as credential theft or malware infections.
Cross-organizational protection: If you recognize partner companies on the list, you can collaborate with them to mitigate risks collectively.
Pattern recognition: With this information in hand, you can also analyze trends in the targeting, such as specific industries, job roles, or geographic locations being affected.
Threat actor attribution and tactics analysis: Last but not least, seeing multiple affected entities may help you determine the attacker's motives, techniques, or infrastructure.
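As a starting point for the awareness and cross-organizational steps above, a defender could check a recaptured list against the domains they care about. This is a minimal sketch, assuming the list has already been parsed into individual addresses; the function name and the sample values are hypothetical.

```python
from collections.abc import Iterable


def find_exposed_addresses(
    target_list: Iterable[str], monitored_domains: Iterable[str]
) -> dict[str, list[str]]:
    """Group targeted addresses by the monitored domain they belong to."""
    exposed: dict[str, list[str]] = {d.lower(): [] for d in monitored_domains}
    for address in target_list:
        normalized = address.strip().lower()
        domain = normalized.rsplit("@", 1)[-1]
        if domain in exposed:
            exposed[domain].append(normalized)
    # Drop domains with no hits and sort for stable reporting.
    return {domain: sorted(found) for domain, found in exposed.items() if found}
```

Passing both your own domains and partner domains as monitored_domains gives one report that covers internal exposure and partners who may need to be notified.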

Key Takeaways

This phishing gateway page is a simple but effective tool that we have seen phishing operators use in order to filter out unwanted users, bots, and security scanning tools from reaching their phishing pages.

Phishing gateway pages sit on a separate host from the full phishing site
The pages are used to validate that user-submitted email addresses are legitimate and can also determine if they match a phishing targeting list
If validated, the victim is directed to input their sensitive information on the full phishing site

At SpyCloud, we specialize in recapturing stolen data from phishing campaigns, malware infections, and breaches to empower organizations with proactive defense levers. Phishing remains one of the most pervasive cyber threats to businesses, but understanding its mechanics and impact can help organizations stay ahead. By adopting SpyCloud’s proactive defenses, you can reduce your exposure to identity-based attacks that use phished data.

SpyCloud is at the forefront of phishing mitigation – helping businesses neutralize threats and protect employee and customer identities.
