Gary Illyes from Google shared some interesting insights into how Google handles sites’ internal search results pages: Google’s algorithms can automatically detect some results pages generated by a site’s own internal search engine and remove them from Google’s search results.
Getting a site’s internal search results pages indexed in Google’s search results is a tricky subject. Google’s webmaster guidelines state that these types of pages should not be indexed. But some site owners are convinced their results pages are useful and want them indexed. Google could be thwarting those attempts.
“We generally frown on getting search pages indexed,” Illyes said. “Generally, they are not that useful for the users. And we do have some algorithms which try to get rid of them, essentially not getting crawled, but sometimes that fails. And in those cases, we either need to manually intervene and create new search rules for those pages.”
Mariya Moeva added that site owners can also use the URL parameter tool in cases where the search results URLs follow a pattern.
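To make "follows a pattern" concrete, here is a minimal sketch of how a site owner might check whether their own URLs look like internal search results. The path `/search` and the parameter names `q`, `s`, and `query` are assumptions chosen for illustration, not a list Google has published; swap in whatever your own platform uses.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical conventions: adjust these to match your own site's search URLs.
SEARCH_PATHS = ("/search",)
SEARCH_PARAMS = ("q", "s", "query")


def looks_like_internal_search(url: str) -> bool:
    """Return True if the URL matches the assumed internal-search pattern."""
    parsed = urlparse(url)
    # Treat "/search" and "/search/" as the same path.
    path = parsed.path.rstrip("/") or "/"
    if path in SEARCH_PATHS:
        return True
    # Otherwise, look for a search-style query parameter such as ?s=shoes.
    params = parse_qs(parsed.query, keep_blank_values=True)
    return any(name in params for name in SEARCH_PARAMS)
```

If most of your internal search URLs match one simple rule like this, they are following a consistent pattern that can be described by a parameter name or a path.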
Not all indexed internal search results pages are useless. But with so many of them delivering either a poor experience (landing on a page showing only similar products) or a non-existent one (no products found), it is not surprising that Google does this algorithmically. And for many sites, the need these internal search results pages are meant to fill is served just as well, if not better, through the use of categories.
These types of pages can also waste crawl budget, and for larger sites, that crawl budget can usually be put to better use crawling actual pages instead.
And lastly, Googlebot can sometimes end up in an infinite loop, crawling more and more search results pages, which are often empty, by following “did you mean” or “suggested searches” links that lead to more empty results pages. This will often trigger an alert in Google Search Console if Google recognizes it as a likely error.
As for which types of internal search results Google is able to algorithmically detect and remove, it is likely those generated by the most popular CMS and ecommerce platforms.
So if you are trying to get some of your internal search results pages indexed, despite Google’s guidelines recommending that you do not, and are having no luck, it could be that Google’s algorithms are detecting those search results and removing them.
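Both the crawl budget concern and the looping problem come down to crawlers fetching internal search URLs in the first place. One common way to keep crawlers out is a robots.txt rule, and the sketch below checks such a rule with Python's standard-library `urllib.robotparser`. The `/search` path and the `example.com` URLs are assumptions for illustration, and Python's parser is a simplified one, so treat the result as a sanity check rather than a guarantee of how Googlebot will behave.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the /search path is a hypothetical example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the assumed robots.txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Internal search URLs are blocked; ordinary category pages are not.
print(is_crawlable("https://example.com/search?q=red+shoes"))
print(is_crawlable("https://example.com/category/shoes"))
```

Keep in mind that a robots.txt rule controls crawling rather than indexing, so it addresses the crawl budget and looping issues more directly than it addresses whether a URL can appear in results.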
— Jennifer Slegg (@jenstar) March 23, 2017