What do you do when your site is leaking user information into search results? Well, let’s take a step back and look at what kind of personal information this could be:
- Form information that contains first name, last name, email address
- Actual address information, including zip
- Unsubscribe confirmation URLs that contain email addresses and/or names
- Online ordering information, as in the recent Panera incident
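If you have a list of indexed URLs (from a site: search or a crawl export, for example), a quick script can help flag the ones that look like they carry this kind of information. The following is a minimal sketch, not a complete PII scanner: the function name `find_pii_in_url` and the parameter names in `SUSPECT_PARAMS` are assumptions for illustration, so swap in whatever your own forms actually use.

```python
import re
from urllib.parse import urlsplit, parse_qsl, unquote

# Deliberately loose email pattern: fine for flagging, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical parameter names that often carry personal data.
# Adjust this set to match the fields your own forms submit.
SUSPECT_PARAMS = {
    "email", "first_name", "last_name", "fname", "lname",
    "name", "address", "zip",
}


def find_pii_in_url(url):
    """Return a list of (reason, value) pairs describing possible PII in a URL."""
    findings = []
    parts = urlsplit(url)

    # Look for email addresses anywhere in the decoded path or query string.
    decoded = unquote(parts.path + "?" + parts.query)
    for match in EMAIL_RE.findall(decoded):
        findings.append(("email", match))

    # Flag non-empty query parameters whose names suggest personal data.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.lower() in SUSPECT_PARAMS and value:
            findings.append(("param:" + key.lower(), value))

    return findings


if __name__ == "__main__":
    sample = "https://example.com/unsubscribe?email=jane.doe%40example.com&first_name=Jane"
    for reason, value in find_pii_in_url(sample):
        print(reason, value)
```

Anything this flags still needs a human look, and a clean result does not prove a URL is safe, since names and addresses can hide in path segments that no simple pattern will catch.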
There are a number of ways this information could be seeping into Google’s index. The good news is that we’ve got you covered on how to fix most cases of this below. The bad news? If you don’t take care of it and your customers find out:
Best case: You lose some customers.
Worst case: Your customers are scammed using the exposed information, you face a lawsuit, and Google removes your analytics data because it contains personally identifiable information (PII). You also get dinged under the EU’s new GDPR rules, pay a hefty penalty, and have to deal with the negative press.
At this point, if you want Seer to take a look at this, we’re happy to. Drop us your info and we’ll get in touch with you within 24 hours.
Google provides some best practices on how to keep this from happening, but if you’re reading this, you might be beyond the prevention phase and need the removal solution first. Our very talented Analytics Account Manager Paige Flanagan wrote a post on how to find and address these in Google Analytics. Even if you remove PII from SERPs, PII in your GA data is a violation of GA’s Terms of Service and puts you at risk of having your analytics data deleted, so don’t gloss over this one if you use GA! Her second point is spot on for this, so don’t just take my word for it. Go read it and come back here. (We’ll even open the link in a new window for you.)
Scenario 1: Under 20 URLs indexed
You’ve found that a handful of URLs containing personally identifiable information have slipped through. No sweat (for the moment). The easiest way to get those out of search results is to submit them through Google Search Console, which hides them for at least 90 days while you plug the hole.
Step 1: Go to Google Search Console
Step 2: Click on the Google Index dropdown, then click “Remove URLs”
Step 3: Click “Temporarily Hide” and add the URLs that are indexed
WARNING: In this step, entering only “/” will deindex your entire site.
DO NOT DO THIS 🙂
Scenario 2: Hundreds or thousands of pages are indexed
If you’ve found that hundreds or even thousands of live pages contain user information, don’t panic just yet. Here are a few ways to make sure they’re removed.
Template Level Removal
It’s very likely the pages you’re looking to remove are part of a template where a meta robots “noindex” tag can be added. Adding the tag to the template applies it to every page built from that template.
Full instructions for adding a meta robots “noindex” tag to a page can be found at this Google support page.
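The tag itself is a single line in the page’s head: `<meta name="robots" content="noindex">`. Once it’s in the template, it’s worth confirming that the rendered pages actually carry it. Here is a minimal sketch of such a check using only Python’s standard library; the names `RobotsMetaParser` and `has_noindex` are made up for illustration, and the check only looks at the generic “robots” meta tag, not crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a meta robots tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = (
        '<html><head><meta name="robots" content="noindex"></head>'
        "<body>Order confirmation</body></html>"
    )
    print(has_noindex(page))
```

You would feed this the HTML of a few pages from each affected template. Keep in mind that Google has to recrawl a page before it sees the tag, so the page also must not be blocked in robots.txt.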
If there is no pattern to how these pages are appearing, there’s the very manual process of hardcoding a noindex tag into each page. While tedious, it’s worth it: just one live page could result in a GDPR penalty or users being phished. How much would Panera pay to have their story go away? Likely far more than the cost of an intern spending two days adding these tags.