Robots Tags: The Secret to Helping Google Index Your Content Properly
Ok, so Robots Tags aren’t really a secret… but they are often overlooked or, worse, misused. In January Google released a new Robots Tag aimed especially at media producers, so if your ministry has a podcast, audio sermons, etc., you might want to take a closer look. Since there is a new tag, we’re going to take a few minutes to give you the rundown, along with a few other easy ways to leverage and troubleshoot robots tags.
Let’s start with what’s new. After all, who doesn’t love shiny new robots tags? The new tag is “indexifembedded” (yes, that’s a mouthful for sure). It’s a little more readable as “index if embedded,” and it does exactly what it sounds like it does. It tells Google that it may index your content when that content is embedded in another page through an iframe (or a similar HTML tag), even if the content’s own URL is marked noindex.
Where this might come in useful.
One fairly common SEO issue for media producers is that media content is often hosted in one location and then embedded in another. This is how most podcast hosting works: when you add an episode to your page using an embed code, that is most likely done through an iframe. Previously you had two options. You could noindex the original content, in which case it would also not be indexed when embedded on a page. Or you could leave it indexable and hope it didn’t cause duplicate content issues, because Google may index both the original content and the page it’s embedded within.
In general, when you embed media on a specific page, it’s because that page offers a richer experience for your users. As such, you want that page to be indexed and shown to your users and, because we like to have it all, you want your media content indexed on that page as well.
With the indexifembedded tag, you can block indexing of the original source through the noindex tag and still allow it to be indexed when it’s embedded in a page. Note that indexifembedded only has an effect when it is paired with noindex. Google’s documentation outlines several different ways this could be implemented, but the simplest would be to add the following to the meta tags on your original content:
<meta name="googlebot" content="noindex,indexifembedded" />
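If you want to double-check what a page is actually telling Google, here is a minimal Python sketch (our own illustration, not an official Google tool) that reads the robots meta tags out of a page’s HTML. The class and function names are made up for this example. It only looks at meta tags named robots or googlebot; it does not check X-Robots-Tag HTTP headers or reproduce how Google resolves conflicting rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr_map = {key.lower(): (value or "") for key, value in attrs}
        if attr_map.get("name", "").strip().lower() in ("robots", "googlebot"):
            for item in attr_map.get("content", "").split(","):
                item = item.strip().lower()
                if item:
                    self.directives.add(item)


def robots_directives(html):
    """Return the set of robots directives found in the page's meta tags."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def embed_only_indexing(html):
    """True when the page is noindex on its own but opts in to indexing when embedded."""
    found = robots_directives(html)
    return "noindex" in found and "indexifembedded" in found
```

Feeding it the HTML of your original media page should give you a quick yes or no on whether both directives made it into the markup.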
This is a very new tag from Google, so there is not a lot of data available yet as to how well it works or whether there are any “gotchas” with implementing it. If you do decide to implement this tag, let us know. We’d be excited to see how it works for you.
Even if you aren’t ready to give this one a try, below are a few quick tips on Robots Tags and the ways we see them cause issues.
1. Blocking your development or staging site, then forgetting to remove the block when you publish the site to production. We know it sounds silly but this happens more often than you’d think. If you’ve recently made a change to your website and seen your traffic tank, this is the first place you should look.
2. Pages that are blocked by robots.txt getting indexed. This is a fairly nuanced issue, which makes it one of the most common that we see. A page can be blocked from Google either through the robots.txt file or using the noindex meta tag we demonstrated above. Blocking a page in robots.txt, however, only stops Google from crawling it. If Google finds a link to that page, for example from an external source, it can still add the URL to its index without the content of the page. You will see this come up as a warning in your Search Console (“Indexed, though blocked by robots.txt”). If you want to ensure a URL does not appear in the search index, use noindex (the meta tag or the equivalent X-Robots-Tag HTTP header) and make sure the page is not also blocked in robots.txt, because Google has to crawl the page in order to see the noindex.
3. Submitting blocked pages in your sitemap. One thing we always recommend is having a very clean sitemap. Google gets to decide how they want to discover your content and index your pages. The fastest way (and the one that gives you the most input) is to leverage your sitemap. However, if your sitemap is full of errors or points search crawlers to pages they are not allowed to crawl, Google may give less weight to your sitemap. So to get the best results from your sitemap, make sure you are not including blocked pages.
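As a rough way to spot the sitemap problem from tip 3, the Python sketch below compares the URLs in a sitemap against the rules in a robots.txt file and lists any that Googlebot would not be allowed to crawl. This is our own illustration under a few assumptions: the function name is hypothetical, you paste in the robots.txt and sitemap contents as text, and it relies on Python’s built-in robots.txt parser, which does not understand every pattern Google supports (such as wildcards). It also says nothing about noindex tags, only robots.txt blocks.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(robots_txt, sitemap_xml, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for the given user agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if not rules.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Example data only; example.org and these paths are placeholders.
    example_robots = "User-agent: *\nDisallow: /staging/\n"
    example_sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.org/sermons/</loc></url>"
        "<url><loc>https://example.org/staging/draft</loc></url>"
        "</urlset>"
    )
    for url in blocked_sitemap_urls(example_robots, example_sitemap):
        print("Blocked but listed in sitemap:", url)
```

Any URL it reports is worth either removing from the sitemap or unblocking in robots.txt, depending on whether you actually want that page in search results.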
There’s a lot further down the indexing rabbit hole that you could dive, but for now this should give you quite a bit to think about. As always, we love learning alongside you, so if you’ve seen strange issues with Robots Tags, or seen success from implementing others, let us know. That’s just the type of thing we like to geek out over.
Happy indexing!