Google Analytics Hacks for Combatting Spam
Picture this: you’re diving into Google Analytics to check out just how much your marketing efforts have paid off this month.
Conversions are up and goals are doing well, but wait…what’s with that weird spike in traffic?
You know it’s not seasonal. In fact, as you dig a bit deeper, you realize you’re looking at sharebuttons.xyz, a traffic source you’re in no way familiar with. Which is odd, considering it drove over 60k visits last month.
Congratulations. You’ve just been hit with referral spam.
Referral spam is a type of fake traffic that hits your website through either spam bots or ghost spam. The fake traffic is then recorded in your Analytics, driving your reporting out of whack.
Now, at Nebo, we’re not in the business of getting clients increased visits just for the hell of it. We focus on tactics that drive real people to your site—people who want to be there and can benefit from your service.
But we’re not alone. Any good marketer should be saying adios to spam bots and black hat SEO tactics and instead should be focusing on determining which visits are real visits and how these analytics numbers translate into something meaningful for their clients.
The first step? Knowing what to look for.
How do you identify spam?
Luckily, referral spam is usually pretty easy to spot. Go into your Analytics and navigate to Acquisition > All Traffic > Referrals.
A dead giveaway for referral spam is a 100% bounce rate.
Referral spam doesn’t actually navigate through your site; instead, it leaves a fake hit and bounces immediately. This is done by generating random Google Analytics tracking codes and forcing them to fire without actually interacting with the website. The referring site’s sketchy title tag and meta description might also give it away.
You can double-check suspected referral spam with a quick Google search of the source name, and you’ll likely find SEO forums discussing the latest offenders. (Warning: don’t actually visit the spammer’s website. Referral spam is designed to lure you to a site that pushes bad SEO services and sometimes even a virus.)
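If you export your referral report, the bounce rate check above can be roughed out in a few lines of Python. This is only a sketch: the row layout, the `min_sessions` threshold and every source except sharebuttons.xyz are made up for illustration, and a 100% bounce rate is a strong hint rather than proof.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str
    sessions: int
    bounce_rate: float  # percentage, 0-100


def flag_suspicious(rows, min_sessions=50):
    """Return sources that have a 100% bounce rate and enough sessions to matter.

    min_sessions is an assumed cutoff so that tiny, legitimate referrers
    with one or two bounced visits are not flagged.
    """
    return [
        row.source
        for row in rows
        if row.sessions >= min_sessions and row.bounce_rate >= 100.0
    ]


# Hypothetical report rows; only sharebuttons.xyz comes from the story above.
rows = [
    ReferralRow("sharebuttons.xyz", 60000, 100.0),
    ReferralRow("partner-blog.example", 420, 48.5),
    ReferralRow("tiny-site.example", 3, 100.0),
]

print(flag_suspicious(rows))  # ['sharebuttons.xyz']
```

Anything this flags still deserves a quick look before you filter it out.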
What are the different types of spam?
Crawler spam comes from an Internet bot that crawls your website, much like Googlebot does for indexing. It leaves a mark in Google Analytics in the hope that, while investigating, you will pay a return visit to its website.
Ghost spam differs from crawler spam in that it doesn’t have to interact with your site at all. It sends hits straight to Google Analytics through the Measurement Protocol, using randomly generated tracking codes, and leaves a fake visit in your data. Because tracking codes are so easy to generate, ghost spam is more prevalent than crawler spam.
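To see why ghost spam is so easy to fake, it helps to look at what a hit actually is: a short string of key/value parameters. The sketch below parses two made-up payloads that use Universal Analytics Measurement Protocol parameter names (`tid` for the tracking ID, `dh` for the hostname, `dr` for the referrer). The tracking ID is a placeholder and the payloads are illustrative assumptions, not captured traffic.

```python
from urllib.parse import parse_qs


def hit_hostname(payload):
    """Return the hostname ('dh') a hit claims, or None if it has none."""
    params = parse_qs(payload)
    values = params.get("dh")
    return values[0] if values else None


# Hypothetical payloads. A real page view reports the site's own hostname;
# a ghost hit never touched the site, so the field is often missing or forged.
real_hit = "v=1&tid=UA-XXXXX-Y&cid=555&t=pageview&dh=www.example.com&dp=%2Fblog"
ghost_hit = "v=1&tid=UA-XXXXX-Y&cid=123&t=pageview&dr=http%3A%2F%2Fsharebuttons.xyz"

print(hit_hostname(real_hit))   # www.example.com
print(hit_hostname(ghost_hit))  # None
```

Nothing in the payload proves the sender ever loaded your pages, which is the whole trick.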
So how do we combat spam?
Before we can combat spam, it’s important to ensure we’re following Analytics best practices with our account structure. You should always create three independent profiles: an unfiltered profile, a testing profile and a main profile.
An unfiltered profile is necessary because it will allow access to raw data without any segments or filters applied. You’ll want a testing profile to try out segments and filters to see if they are properly functioning before pushing live to your main view.
New views can be set up easily by going to Admin > View > Create a New View.
Next, name your new view, for example "Testing Profile."
Now that you have Analytics best practices set up, let’s take a look at combating ghost spam. Ghost spam can be removed from Analytics in a few different ways. The first way is by creating a segment that you can apply to any view. Go to Reporting > Add Segment.
Next you’ll want to create a segment that includes only valid hostnames. The hostname is the domain on which your tracking code fired, which is typically your own domain name.
Ghost spam never loads your pages, so its hits typically arrive with a missing or fake hostname. That makes including only valid hostnames an efficient way to remove it. Valid hostnames can be found by going to Acquisition > Source/Medium > Secondary Dimension > Hostname.
From there, you will sort through all hostnames to find only valid hostnames to include in the new “block ghost spam” segment.
Once you have compiled all valid hostnames, you can add them to your new testing segment using a regular expression. A regular expression, or RegEx, is a sequence of characters that defines a search pattern and allows for a more precise segment in Google Analytics. If you have more than one valid hostname, separate them with a vertical bar.
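Here is a rough sketch of how such a pattern could be built and sanity-checked before you paste it into Analytics. The hostnames are placeholders for your own, and `re.search` is used on the assumption that Analytics matches the pattern anywhere in the hostname. Dots are escaped so they match a literal dot instead of any character.

```python
import re


def build_hostname_pattern(hostnames):
    """Join hostnames with a vertical bar, escaping dots so they match literally."""
    return "|".join(h.replace(".", r"\.") for h in hostnames)


def is_valid_hostname(hostname, pattern):
    """Mimic a partial-match include filter (assumed behavior)."""
    return re.search(pattern, hostname) is not None


# Placeholder hostnames; swap in the valid ones from your own report.
pattern = build_hostname_pattern(["www.example.com", "blog.example.com"])
print(pattern)  # www\.example\.com|blog\.example\.com

print(is_valid_hostname("www.example.com", pattern))  # True
print(is_valid_hostname("(not set)", pattern))        # False
print(is_valid_hostname("sharebuttons.xyz", pattern)) # False
```

Running a handful of known good and known bad hostnames through the pattern first is cheaper than finding out later that a segment excluded real traffic.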
Did the segment work? Let’s see!
There should be a drop in numbers across Analytics, including all sessions, new users and bounce rate because this segment removes bogus traffic data. Nebo does not have a huge referral spam problem, so the difference is minimal.
Now that we know our segment works, let’s apply the same information to a filter.
Using a filter rather than a segment will allow for easier reporting. You will not have to remember to add a segment each time; instead it will automatically filter out referral spam once it’s applied. (Note: Filters do not work retroactively.)
Go to Admin > Select the view you’d like to apply the filter to > Filter
Add a new filter…
Then create a custom filter that includes valid hostnames using the RegEx pattern we created for our segment.
And voilà! You’ve done it. Ghost spam will no longer hit the Analytics profile.
What about crawler spam?
Crawler spam is a bit more difficult to get rid of because this spam is actually hitting your website. The only way to filter it out is by excluding spam sources. This can easily become a game of whack-a-mole, as crawler spam is constantly changing and evolving, but that doesn’t mean you can’t filter out some of the big guys.
The first method is blocking crawler spam through the .htaccess file.
Blocking known spammy sources at the server level refuses their requests before a page is ever served, so your tracking code never fires. This reduces the load on your server, and it keeps those referrals out of your Analytics.
One issue with this method is that it continuously needs to be updated because new spammers are constantly being created. Additionally, many developers advise against adding “unnecessary” code to the .htaccess file, preferring to reserve it for critical directives such as those that control access to the website. If you’d rather leave your .htaccess file alone, you can instead add a filter in Analytics.
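As a rough idea of what those blocking rules can look like, the sketch below generates Apache mod_rewrite directives from a list of spam domains. It assumes your server runs Apache with mod_rewrite enabled; the second domain is a placeholder, and you should test the output on a staging site before touching a live .htaccess file.

```python
def htaccess_block_rules(spam_domains):
    """Build mod_rewrite rules that answer 403 Forbidden to listed referrers."""
    if not spam_domains:
        # Without any conditions the RewriteRule would block every visitor.
        return ""
    lines = ["RewriteEngine On"]
    last = len(spam_domains) - 1
    for i, domain in enumerate(spam_domains):
        escaped = domain.replace(".", r"\.")  # match dots literally
        flags = "[NC]" if i == last else "[NC,OR]"  # OR-chain all but the last
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {escaped} {flags}")
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


# sharebuttons.xyz is from the example above; the other domain is hypothetical.
print(htaccess_block_rules(["sharebuttons.xyz", "spam-domain.example"]))
```

Keeping the domains in a list and regenerating the rules makes the inevitable updates less painful than hand-editing the file each time.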
Go to Admin > Select the view you’d like to apply the filter to > Filter
Here you will exclude spam sources via RegEx, rather than including valid hostnames as in the ghost spam filter.
Again, if you add this filter you will constantly need to update it as new spam shows up.
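Before saving an exclude filter, you may want to check what it would throw away. This sketch runs a hypothetical exclude pattern against a list of referral sources. All sources other than sharebuttons.xyz are invented, and the case-insensitive partial match is an assumption about how the filter behaves.

```python
import re


def build_exclude_pattern(spam_sources):
    """Join spam sources with a vertical bar, escaping dots so they match literally."""
    return "|".join(s.replace(".", r"\.") for s in spam_sources)


def kept_sources(sources, pattern):
    """Return the sources an exclude filter with this pattern would leave in place."""
    return [s for s in sources if re.search(pattern, s, re.IGNORECASE) is None]


# Hypothetical spam list and report sources.
exclude = build_exclude_pattern(["sharebuttons.xyz", "spam-domain.example"])
sources = ["google", "sharebuttons.xyz", "partner-blog.example", "Spam-Domain.example"]

print(kept_sources(sources, exclude))  # ['google', 'partner-blog.example']
```

If a legitimate source disappears from the result, tighten the pattern before it reaches your main view.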
Will it ever end?
Google has apparently been working on fixing the issue for what seems like an eternity. Because Google isn’t completely heartless, Google Analytics does have a built-in “Bot Filtering” option.
Go to Admin > Account > View, select “View Settings” and check the Bot Filtering box to exclude hits from known bots and spiders.
This certainly will not fix your spam issue on its own, but it will assist you in fighting this battle for clean, pure metrics.
Combatting spam is an arduous task for marketers and site owners alike. Keep this guide handy, and don’t let the scary spam get in the way of your hard work.
Nebo believes that accurate data leads to better business, and we want to empower you to fight spam on your own. But if it’s too scary, give us a call.