Do sites really need a Sitemap.xml for good SEO?

It’s taken as received wisdom that good SEO requires a sitemap.xml file. This is so well established that every “SEO audit” tool checks for the existence of a sitemap.xml file and complains bitterly if it’s not there. There is no shortage of SEOs saying a sitemap.xml is practically mandatory. But does SEO really benefit from a sitemap file?

What problem does sitemap.xml solve?

Google created the sitemap.xml in 2005 to fix a problem – it exposed content that the search engines could not find otherwise.

This was a big deal – that was still in the era of sites which heavily used Flash, Silverlight, Java, and JavaScript for navigation and content. These practices were starting to change, but it was a slow process.

Content that can’t be found by clicking around on a website is a problem sitemaps help to solve. To a human it is trivial to type what they’re looking for into a search field and press “Submit”. To a machine, that’s complex. Humans can infer from context and are thinking about what they want; machines require definite instructions.

Sitemaps also allow very large websites with poor internal linking to make sure that content is found by search spiders.
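
For readers who have never opened one, a sitemap is just an XML list of URLs with optional metadata such as a last-modified date. Here is a minimal sketch in Python that reads one using only the standard library. The sample URLs and the list_entries helper are made up for illustration; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# A tiny, made-up sitemap. Real ones are usually served at /sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-03-01</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/archive/report-2019</loc>
  </url>
</urlset>"""


def list_entries(xml_text):
    """Return (url, lastmod) pairs; lastmod is None when the tag is absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        entries.append((loc, lastmod))
    return entries


for loc, lastmod in list_entries(SAMPLE):
    print(loc, lastmod)
```

The second entry is the kind of deep page a spider might never reach by clicking around, which is the gap a sitemap was meant to cover.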

But… A few things have changed in the 14 years since the creation of the sitemap:

  • Google can run/read JavaScript (imperfectly, but enough to read your content)

  • Google can run/read Flash (besides which, Flash is walking dead: Even Adobe has finally given up on the security dumpster fire known as Flash Player)

  • Silverlight is dead

  • Java in browsers is on life support (I recommend the CheerpJ extension for the occasional legacy page)

  • Web developers have learned how to make deep content and database-created pages easy to browse and to spider

Most of the problems that sitemaps were meant to fix are no longer problems. If your website still has Flash content or Java applets, it’s time to adopt HTML5. If your content is available only by searching, you should make it browsable.

But what about Google? sitemap.xml is required, right? It helps your PageRank, right? Link juice?

No.

There is no Google “bonus” for having a sitemap.xml.

I repeat: There is no Google “bonus” for having a sitemap.xml.

Sitemaps don’t even guarantee that Google will index those pages. As Michael Cottam wrote in his excellent blog on sitemaps:

“Probably the most common misconception is that the XML sitemap helps get your pages indexed. The first thing we’ve got to get straight is this: Google does not index your pages just because you asked nicely. Google indexes pages because (a) they found them and crawled them, and (b) they consider them good enough quality to be worth indexing. Pointing Google at a page and asking them to index it doesn’t really factor into it.”

For an informational site, whether it’s small or large, no sitemap is required.

Still, most SEO audit tools will flag a “missing” sitemap. As with other elements of a technical SEO audit, you should apply your knowledge and human judgment.

So are sitemaps useless?

Actually, no! There are several jobs that sitemap.xml does quite well.

Use sitemap.xml to Manage Crawl Budget

Google is huge, but it still needs to use its resources wisely. Google allocates a certain “budget” of its Googlebot time to each site. It doesn’t necessarily crawl every page of your site at once. It often splits the site crawl jobs into batches; I’ve seen Google take weeks to finish crawling a site. If your site has thousands of pages, and you recently updated a dozen, you want Google to focus on the updates. Update the modification dates for them in sitemap.xml, which signals to Google that it should focus its effort there.

Caution: Don’t update the modification date on every page. That tells Google that they’re all the same priority, and you won’t actually manage that crawl budget.
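
As a rough sketch of what that looks like in practice, the Python below writes a sitemap in which each page carries the date its content actually changed, rather than stamping every page with today's date. The build_sitemap helper and the example URLs and dates are assumptions for illustration; in a real site the dates would come from your CMS or database.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # Use the date the content really changed, not "now" for every page.
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: only the pricing page was updated recently.
pages = [
    ("https://www.example.com/", date(2023, 1, 15)),
    ("https://www.example.com/about", date(2022, 6, 2)),
    ("https://www.example.com/pricing", date(2024, 2, 28)),
]
print(build_sitemap(pages))
```

Only the pricing page carries a recent date here, so it stands out from the rest, which is the whole point of the caution above.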

Use sitemap.xml to Trigger a Re-crawl after major changes

If you relaunch a site and change many pages’ URLs, a sitemap can tell Google “Hey, I added a bunch of new links, would you please crawl them?” Google may or may not choose to do something with that – it’s a request, not a guaranteed SLA. But usually it does.

If you use Google as the back end site search service, doing this will (usually) help kick off a re-spider and refresh search results to point to the new content and pages. (Technically, this isn’t an SEO function.)

Google Search Console (GSC) no longer allows you to re-crawl entire sites by submitting a single changed URL. The new GSC only indexes the single page you asked it to look at. So, if you need to have Google re-spider the whole site, it’s sitemap time.

Use sitemap.xml to Temporarily Fix Internal Linking Problems

Is Google not finding some pages on your site? Is it promoting utility pages over the important content? Does the site have lots of broken links, or poor link structure?

Sure, you should fix the problems, but sometimes you can’t do that quickly. Maybe you don’t have source code access, or maybe the site templating system is limited or poorly structured, or maybe every minor edit must be approved by an overloaded reviewer. (I’ve run into all of these over the years.)

Whatever the cause, you need a quick fix. You can use a sitemap to help Google find important-yet-lightly-linked pages, tell Google which pages are most important to you, which are new, and which can be ignored or de-prioritized.

Quick Fixes:

  • Problem: Page not found by Google

    • Quick Fix: Add it to sitemap.xml

    • Real Fix: Make sure it is linked from within page content of other pages on the site. Make sure those are real <a> links, not buttons and not JavaScript.

  • Problem: New pages are not found

    • Quick Fix: Set modification date to a recent date

    • Real Fix: Make sure it is linked from within page content of other pages on the site. Make sure those are real <a> links, not buttons and not JavaScript.

  • Problem: Unimportant pages are surfaced in SERP

    • Quick Fix: Remove from sitemap.xml. (Many sitemap generators include everything on your site. Prune the sitemap before publishing.)

    • Real Fix: If it should never appear in search results, add a noindex meta tag to the page. Otherwise, set its priority to a low value.

  • Problem: Page is found & indexed, but doesn’t appear in SERP

    • Quick Fix: I have bad news for you…

    • Real Fix: This is commonly a content quality issue. Google found the page, but the page isn’t very valuable compared to other pages Google knows about. Check your own site for similar content that might be crowding it out. Check other sites that appear in the SERP; maybe their content is better.
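
The “real <a> links” advice in the list above can be spot-checked with a few lines of Python. This is a minimal sketch using the standard library's HTML parser; the AnchorCollector class, the crawlable_links helper, and the sample markup are made up for illustration. It only approximates what a spider sees, since Google also renders some JavaScript.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values from <a> tags that point at real URLs."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # buttons, divs with click handlers, etc. are ignored
        href = dict(attrs).get("href")
        # Skip anchors that only run a script or jump within the page.
        if href and not href.lower().startswith(("javascript:", "#")):
            self.hrefs.append(href)


def crawlable_links(html):
    parser = AnchorCollector()
    parser.feed(html)
    return parser.hrefs


# Made-up page fragment with one real link and two that a spider may miss.
PAGE = """
<a href="/products/widget">Widget</a>
<button onclick="location.href='/products/gadget'">Gadget</button>
<a href="javascript:openPage('/products/gizmo')">Gizmo</a>
"""
print(crawlable_links(PAGE))
```

If a page you care about never shows up in the output for any other page on the site, it is probably only reachable through buttons or scripts.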

A sitemap can buy you time to fix the site’s underlying problems. But tell your stakeholders that this is a temporary fix, not a permanent one. Google will still, in the end, assess your site on its own quality standards. If your site stays broken, eventually Google will down-rank it.

Use sitemap.xml to Show a Quick Hit (maybe)

There is some empirical evidence that submitting a sitemap causes Google to crawl sooner and push pages into search results faster. But the most recent evidence I’ve found for this dates back to 2011, and Google has made major changes in the past eight years. So take this one with a grain of salt.

If you must have a sitemap…

Sometimes you just have a stakeholder who wants a sitemap, and you can’t say no. Luckily, it’s fast and easy to install a Craft, WordPress, or Drupal plug-in that generates a sitemap.xml. It won’t hurt anything, and it might help your stakeholder feel more confident.
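
As noted in the quick fixes earlier, many generators include everything on the site, so it is worth pruning the output before publishing. Here is a minimal sketch of that step in Python. The prune_sitemap helper and the excluded path prefixes are hypothetical; substitute whatever utility pages your own site has.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Keep the sitemap namespace as the default one when writing the XML back out.
ET.register_namespace("", NS)

# Hypothetical utility paths that should not be advertised to search engines.
EXCLUDE_PREFIXES = ("/search", "/login", "/cart")


def prune_sitemap(xml_text, exclude_prefixes=EXCLUDE_PREFIXES):
    """Return sitemap XML with entries under the excluded paths removed."""
    root = ET.fromstring(xml_text)
    for entry in list(root.findall(f"{{{NS}}}url")):
        loc = entry.findtext(f"{{{NS}}}loc", default="").strip()
        if urlparse(loc).path.startswith(tuple(exclude_prefixes)):
            root.remove(entry)
    return ET.tostring(root, encoding="unicode")


# Made-up generator output containing one content page and one utility page.
GENERATED = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/about</loc></url>"
    "<url><loc>https://www.example.com/login</loc></url>"
    "</urlset>"
)
print(prune_sitemap(GENERATED))
```

Pruning only changes what the sitemap advertises. A page that must never appear in search results still needs a noindex meta tag.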

So: Do sites really need a sitemap.xml?

Yes, if:

  • Your site has thousands of unique pages

  • Your site has non-spiderable content – a common failure mode of dynamic pagination libraries – but you should fix this; it’s a bug!

  • Your site has terrible internal linking

  • You need to submit a bunch of changed pages for indexing by Google

  • You just re-launched your website

In any of those cases, sure, upload a sitemap.xml. Otherwise, it’s a harmless peace-of-mind prophylactic.

Originally written for and published on the Imarc.com blog at www.imarc.com/blog/do-sites-really-need-a-sitemap-for-seo. Updated here in 2022, 2023, and most recently on March 1 2024.