How to validate, verify and scrub your email list

doublethinker · Jun 28, 2017

@CCarter stressed the importance of collecting leads right from the get-go. I set up Mautic, a free marketing automation tool, and rigged it up to a simple newsletter form.

Without knowing it, I have collected 1570 subscribers from three sites in their infancy.

I have no doubt that many of these are bots. Telltale signs include obscure email addresses and domain names that lead to nowhere worth mentioning.

But I am also getting subscribers with corporate email addresses that are relevant to the niche of the site. I also have on my list the corporate email of a partner at a large law firm in New York. I'm skeptical. Maybe some joker thought it would be funny to put him on the list, or maybe he's there to eyeball what I'm doing with the site. Maybe he's a genuine reader.

I also have many gmail and yahoo accounts. Some appear to be real or at least have left a human footprint on a forum or someplace else. Others just look plain bot-like.

Here's the thing. I now need to:
a) Scrub the existing list; and
b) Validate future signups to the list

a) Scrub the existing list

Would you even bother and if you would, how? I am thinking of doing it by sending a "verify your subscription to this list" email where they have the option of opting out if they want to. No response means they stay on the list. This doesn't sweep the bots out though.

b) Validate future sign-ups.

The sign-up form at the front is simple, with basic format validation to check that the input looks like an email address. It does not verify whether the email address exists. It just collects. I might be susceptible to script kiddies spamming a gazillion email addresses that go nowhere or, worse, that belong to people who never signed up.

What do builders do to responsibly lock down their list?

---
Update: I just had an idea while typing this: use tracking pixels in emails and set a custom trigger to remove a subscriber if 0 emails were opened after X emails have been sent to them. But this might be non-trivial to set up.

doublethinker · Jun 28, 2017

Just found out that the sign-up form tracks the IP address of whoever submits the email. So if the same user puts in a different email address, it just overwrites the existing one, which prevents multiple entries (at least if not proxied). Cool AF and makes the list appear slightly more legit.

Robin · Jun 28, 2017

I am currently in the same position as you. I started collecting my first emails about two months ago; I also have some clearly fake email addresses like you mentioned.

Double opt-in seems like a great way to build a high-quality list of active users. However, it weeds out a lot of "lazy people" as well (right now I collect and store both lists).

I really like your idea about using a tracking pixel to see who's "active" and then cleaning the list on an ongoing basis. It seems more effective than double opt-in.

CCarter · Jun 28, 2017

I am not sure why you guys aren't double opting-in these emails. Do you really want people on your mailing list that are too lazy to confirm their subscription? Aren't those going to be the people that hit the "spam" or "report" button in the future when they forgot they signed up to your list?

Again, with no double opt-in I can input scully@fbi.gov and now you'll be spamming that email address. And now you realize the situation with bots. All these things impact your spam score with each individual ESP (email service provider) like Gmail, Hotmail, AOL, and Yahoo. It's more difficult to get your domain whitelisted with those ESPs than to get it off Spamhaus's blacklist.

It's just overall safer for you and gets you a better quality list of people that genuinely want to get emails from you. All of this could have been solved by double opt-in.

Now you have to take your chances emailing a batch of people at different ESPs, which WILL raise a flag, since you are sending 10 to 100+ emails in a small timeframe to people that may not exist in their system, and therefore you will be labeled a spammer. Sending one-off confirmation emails to non-existent accounts at the moment they sign up is a lot safer than testing large chunks now.

Anyone suggesting not to double opt-in hasn't dealt with the headache of going to all the major ESPs, and even obscure ones, and trying to get your domain whitelisted again because of laziness. It's a headache not worth pursuing.
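For anyone who hasn't wired one up, here is a minimal sketch of the double opt-in flow in Node.js. The function names (`startSignup`, `confirmSignup`) and the in-memory stores are made up for illustration; sending the confirmation email itself is left out, and a real setup would keep this in a database or let the mailing software handle it.

```javascript
const { randomBytes } = require('crypto');

// In-memory stores for illustration only (hypothetical, not how Mautic does it).
const pending = new Map();   // token -> email awaiting confirmation
const confirmed = new Set(); // emails whose owner clicked the confirmation link

// Step 1: on sign-up, park the address as pending and return the token
// that would be embedded in the confirmation link.
function startSignup(email) {
  const token = randomBytes(16).toString('hex');
  pending.set(token, email.trim().toLowerCase());
  return token;
}

// Step 2: when the link is clicked, move the address to the confirmed list.
// Returns the confirmed email, or null if the token is unknown or already used.
function confirmSignup(token) {
  const email = pending.get(token);
  if (email === undefined) return null;
  pending.delete(token);
  confirmed.add(email);
  return email;
}
```

Only addresses in `confirmed` ever get newsletters, so an address typed in by a bot or a joker receives at most one confirmation email.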

First thing I would do is create a script and run each email address on your list against the StopForumSpam.com API (https://stopforumspam.com/usage). If it comes back with a spam confidence, remove the email address. Those "10-minute disposable email addresses" like @sharklasers.com and @mailinator.com will be on those lists, so you'll be able to quickly filter your list and not waste time and resources on them. That should reduce your list considerably.

Next, "slowly" send pseudo-confirmation emails, probably 10 at a time over the course of several DAYS (not hours - DAYS), and immediately remove the bounces which come back. You can do this from a random Gmail account if you don't want to burn your domain with other ESPs.

Now you should be left with real emails on your list. (Also, turn on your double opt-in.)
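A rough sketch of such a script in JavaScript. The endpoint, the `email` and `json` query parameters, and the response fields (`appears`, `confidence`) are written from memory of the usage page linked above, so treat them as assumptions and check them there. The 50 threshold and the function names are arbitrary choices for illustration.

```javascript
// Assumed endpoint; verify against https://stopforumspam.com/usage
const SFS_ENDPOINT = 'https://api.stopforumspam.org/api';

// Build the lookup URL for one email address.
function buildSfsUrl(email) {
  return SFS_ENDPOINT + '?email=' + encodeURIComponent(email) + '&json';
}

// Decide from a parsed response whether to drop the address.
// Assumes a shape like { success: 1, email: { appears: 1, confidence: 97.5 } }.
function shouldRemove(response, minConfidence) {
  const record = response && response.email;
  if (!record || !record.appears) return false;
  return Number(record.confidence || 0) >= minConfidence;
}

// Check addresses one at a time and return the ones worth keeping.
// Requires a runtime with a global fetch (modern browsers, Node 18+).
async function scrubList(emails, minConfidence = 50) {
  const kept = [];
  for (const email of emails) {
    const res = await fetch(buildSfsUrl(email));
    const data = await res.json();
    if (!shouldRemove(data, minConfidence)) kept.push(email);
  }
  return kept;
}
```

Calling `scrubList(list)` walks the list sequentially, which also keeps the request rate low on a list of this size.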
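The drip schedule can be planned up front. This is only a sketch: `planBatches` and `removeBounces` are hypothetical helper names, the actual sending and bounce collection are left to whatever mailer is in use, and one batch of 10 per day is just one reading of "10 at a time over several days".

```javascript
// Split the list into small batches and assign each one a send day.
// batchSize and batchesPerDay are illustrative knobs, not recommended values.
function planBatches(emails, batchSize = 10, batchesPerDay = 1) {
  const plan = [];
  for (let i = 0; i < emails.length; i += batchSize) {
    const batchIndex = i / batchSize;
    plan.push({
      day: Math.floor(batchIndex / batchesPerDay) + 1,
      recipients: emails.slice(i, i + batchSize),
    });
  }
  return plan;
}

// After each batch, drop every address that bounced.
function removeBounces(emails, bounced) {
  const bad = new Set(bounced);
  return emails.filter((email) => !bad.has(email));
}
```

At one batch of 10 per day, a list of 1570 would take 157 days, so the list is worth shrinking as far as possible before this step.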

doublethinker · Jun 29, 2017

Fantastic, that's exactly the kind of advice I wanted to hear, and it sounds like email... optimization? is more cutthroat than SEO.

@Robin the tracking pixel idea turned out to be pretty easy to implement. On sign up they get X number of points. Every email sent expends a point. Every email opened gets them 2 points. If they get to 0 or low, it'll be easy to decide whether to take them off the list or not.

Mostly automatic except for the removal at the end.
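The bookkeeping boils down to something like the sketch below. The starting value of 5 is a placeholder for X, and the function names are invented for illustration; in practice the marketing automation software keeps the score.

```javascript
const SIGNUP_POINTS = 5; // hypothetical stand-in for "X"

// New subscribers start with a fixed balance.
function newSubscriber(email) {
  return { email, points: SIGNUP_POINTS };
}

// Every email sent costs one point.
function recordSend(sub) {
  sub.points -= 1;
  return sub;
}

// Every open (detected via the tracking pixel) earns two points.
function recordOpen(sub) {
  sub.points += 2;
  return sub;
}

// Flag subscribers at or below the threshold for manual review.
function needsReview(sub, threshold = 0) {
  return sub.points <= threshold;
}
```

A subscriber who never opens anything hits zero after X sends, while anyone who opens even every other email keeps a steady or growing balance.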

doublethinker · Aug 21, 2017

Over the weekend I rediscovered reCAPTCHA in its invisible form. I figured: what if I programmatically invoked a reCAPTCHA challenge on page load to decide whether I should insert the newsletter sign-up form?

It seems to work. When I'm on VPN the reCAPTCHA challenge pops up. When I'm not on my VPN, the reCAPTCHA test passes and the newsletter sign-up form is inserted into the page.

On deployment, I have the reCAPTCHA pop-up hidden completely on my live site. I don't need to issue the challenge, only to know when a challenge would be issued. I've also set the security preference to the hardest level.

I don't know whether spambots would be able to detect a newsletter sign-up form that's stored in a JavaScript variable. They would have to be pretty smart, but that's easy to guard against by encoding the string, or just obfuscating the variable by breaking it up into a few parts.

Code:

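var str = '<form id="newsletter"></form>'; // placeholder for the sign-up form markup
var enc = window.btoa(str); // Base64-encode the markup before embedding it in the page
var dec = window.atob(enc); // decode it at runtime, just before inserting the form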

I figured this is a pretty flexible solution to stop bots from spamming your site whilst allowing real humans who use VPNs or shared IP addresses to visit your site without annoying the shit out of them with anti-bot measures. It can be used for other user interactions like comments.

doublethinker · Aug 23, 2017

Nope, it didn't work. In fact, it doubled the number of fake signups. Back to the drawing board.

doublethinker · Sep 4, 2017

Update: Combination of Recaptcha, honeypot (extra email field) and obfuscation (btoa, atob) did the trick.
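The honeypot part is the simplest piece. A sketch of the check, assuming the real input is named `email` and the decoy is a second field named `email_confirm` that is hidden from humans with CSS (both field names are made up here):

```javascript
// A human never sees the decoy field, so any value in it means a bot filled the form.
function isLikelyBot(fields) {
  return typeof fields.email_confirm === 'string' && fields.email_confirm.trim() !== '';
}

// Returns the email to subscribe, or null if the submission should be dropped.
function acceptSignup(fields) {
  if (isLikelyBot(fields)) return null;
  const email = (fields.email || '').trim();
  return email === '' ? null : email;
}
```

Dropping the submission silently, rather than showing an error, gives the bot no hint about which field tripped it.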
