Removing Spam From Your Analytics Data

If you’ve ever looked around a Google Analytics reporting dashboard, chances are you’ve seen data that left you a bit puzzled. Maybe one of your top 10 viewed pages is “/sharebutton.to” (aka NOT a page on your site), or you have a lot of referrals coming from “abc.xyz” (sounds legit, right?). Unfortunately, this is all the result of spam. No, not the delicious (or disgusting) canned meat, but rather fake traffic generated by sites trying to gain views and rankings.

Analytics is a very important tool for measuring the SEO success of your website, but if the data is inaccurate due to spam, it won’t be as helpful as it should be. The best way to make sure you are using analytics to its full advantage is to get rid of the spam that is affecting your data.

One of the biggest ways that analytics data becomes skewed is through ghost spam. These are fake visits, recorded without anyone actually loading your site, that are meant to entice you to go to the spammer’s URL. To see ghost spam, open the Network report under Audience > Technology and choose Hostname as the primary dimension. Everything listed that is not a valid hostname for your site is ghost spam. You can remove this traffic going forward by writing a regex that matches all of your valid hostnames and then creating a custom view filter that includes only hostnames matching it. View filters do not change data that has already been collected, so to keep this spam out of historical reports, create a custom segment that applies the same condition.
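As a rough illustration of the hostname regex described above, here is a minimal Python sketch. The hostnames (example.com and its subdomains) and the helper names are placeholders invented for this example, not values from any real account. The sketch builds an anchored pattern you could paste into the filter and checks it against a few sample hostnames before you rely on it.

```python
import re

# Hypothetical hostnames: replace these with the hostnames that
# legitimately serve your tracking code.
VALID_HOSTNAMES = ["example.com", "www.example.com", "shop.example.com"]


def build_hostname_pattern(hostnames):
    """Join escaped hostnames into one anchored alternation."""
    escaped = [re.escape(name) for name in hostnames]
    return "^(" + "|".join(escaped) + ")$"


HOSTNAME_PATTERN = build_hostname_pattern(VALID_HOSTNAMES)


def is_valid_hostname(hostname):
    """Return True if the hostname would pass the include filter."""
    return re.search(HOSTNAME_PATTERN, hostname.lower()) is not None


if __name__ == "__main__":
    print(HOSTNAME_PATTERN)
    # Sample values like those you might see in the Hostname dimension.
    for host in ["www.example.com", "sharebutton.to", "(not set)"]:
        label = "keep" if is_valid_hostname(host) else "ghost spam"
        print(host, "->", label)
```

The anchors (`^` and `$`) are a deliberate choice in this sketch: without them, a spam hostname that merely contains one of your hostnames would slip through the include filter.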

Along with ghost spam, most sites will have issues with crawler spam, also known as referral spam. This happens when crawler programs, or bots, show up as referral traffic in your data. While most bots are trying to increase the search engine ranking of the website they are promoting, not all referral spam is malicious. For example, “googlebot” may show up when Google crawls and indexes the pages on your site. Even though this isn’t a bad thing, it still isn’t relevant data for your analytics.

Because crawler spam uses a valid hostname, it is harder to detect and requires a different filter from the one used for ghost spam. To remove it, create a filter that excludes Campaign Source values matching the crawler spam names. You can find these names by looking at the Referrals report under Acquisition. It can be fairly time-consuming to build a regex of all of these names; however, you can typically search online for an updated list of the most common crawler spam. Like ghost spam, this can also be excluded from historical data with a segment.
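The exclude pattern for crawler spam can be assembled the same way. The sketch below is only an illustration: the two spam domains are made-up placeholders (substitute the names you actually find in your Referrals report), and the function names are hypothetical. It joins the names into a single alternation and shows which referral sources the pattern would drop.

```python
import re

# Made-up spam sources for illustration only; use the names from
# your own Referrals report or a maintained list.
SPAM_SOURCES = ["spammysite.example", "freebuttons.example"]


def build_exclude_pattern(sources):
    """Join escaped source names into one alternation (unanchored,
    so subdomains of a spam source also match)."""
    return "|".join(re.escape(name) for name in sources)


EXCLUDE_PATTERN = build_exclude_pattern(SPAM_SOURCES)


def is_spam_source(source):
    """Return True if the source would be removed by the exclude filter."""
    return re.search(EXCLUDE_PATTERN, source.lower()) is not None


def drop_spam(sources):
    """Keep only the sources that do not match the exclude pattern."""
    return [s for s in sources if not is_spam_source(s)]


if __name__ == "__main__":
    referrals = ["news.example.org", "www.spammysite.example", "freebuttons.example"]
    print(EXCLUDE_PATTERN)
    print(drop_spam(referrals))
```

Leaving the pattern unanchored is a trade-off: it catches subdomains of a spam source, but a very short name could also match a legitimate referrer, so it is worth checking the pattern against your real referral list first.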

Google Analytics also provides a quick and simple way to reduce traffic from bots under View Settings. Towards the bottom of the screen there is a check box under Bot Filtering that states, “Exclude all hits from known bots and spiders”. Check this box and click Save.

It’s likely that applying all of these filters will reduce your total number of sessions and users, but you’ll probably also see improvements in bounce rate and session duration, since a lot of spam traffic has a 100% bounce rate and a 0:00 session duration. Even if filtering out spam has a less-than-desirable effect on your numbers, it’s still important to remove it. Google Analytics can only help you make the best decisions for your website if the information is accurate and relevant.
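To see why bounce rate tends to improve, here is a small worked example in Python. Every number in it is invented purely for illustration: 1,000 total sessions, 200 of them spam that all bounce, and a 40% bounce rate among the 800 real sessions.

```python
def bounce_rate(bounces, sessions):
    """Bounce rate as a fraction of sessions."""
    return bounces / sessions


# Invented figures for illustration only.
REAL_SESSIONS = 800
REAL_BOUNCES = 320   # 40% of real sessions bounce
SPAM_SESSIONS = 200
SPAM_BOUNCES = 200   # spam sessions all bounce

# Before filtering: spam is mixed in with real traffic.
rate_before = bounce_rate(REAL_BOUNCES + SPAM_BOUNCES,
                          REAL_SESSIONS + SPAM_SESSIONS)

# After filtering: only real sessions remain.
rate_after = bounce_rate(REAL_BOUNCES, REAL_SESSIONS)

if __name__ == "__main__":
    print(f"Sessions: {REAL_SESSIONS + SPAM_SESSIONS} -> {REAL_SESSIONS}")
    print(f"Bounce rate: {rate_before:.0%} -> {rate_after:.0%}")
```

In this made-up case the session count drops by 200, while the bounce rate falls from 52% to 40%, which is the pattern described above: fewer sessions, but numbers that reflect real visitors.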

If you need help with cleaning up your analytics, or any part of your SEO, fill out our Get Started form today!


Pixel Perfect Creative | 501.537.2246 | info@pixelperfectcreative.com