Sitemaps inform search engines which pages on a website should be crawled, and may help search engines discover and index those pages.
While sitemaps can be a simple text file listing the URLs of all of the pages you’d like to have indexed, they can also be an XML document carrying more information.
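As a rough illustration of the plain-text variety, the Python sketch below writes one absolute URL per line. The function name build_text_sitemap and the page list are hypothetical, chosen only for this example.

```python
def build_text_sitemap(urls):
    """Return the body of a plain-text sitemap: one absolute URL per line."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        # Skip blank entries and duplicates while keeping the original order.
        if url and url not in seen:
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"


# Hypothetical pages, for illustration only.
pages = [
    "https://example.com/",
    "https://example.com/mens-hats",
    "https://example.com/ladies-hats",
]
sitemap_txt = build_text_sitemap(pages)
print(sitemap_txt, end="")
```

The resulting text would typically be saved as a UTF-8 file, such as sitemap.txt, at the root of the site.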
Are Sitemaps Required?
No, your ecommerce site doesn’t require a sitemap. That is the short answer. If your site is well built, with a navigable hierarchy and proper links, search engine crawlers should be able to discover your pages and index them.
There are several cases where Google and other search engines do, however, recommend a sitemap. For example, a sitemap can aid discovery on a large website with many pages. It can help a site with lots of content pages connected by only a few links, such as product detail pages. And a sitemap can help a new site that may not have many inbound links.
As Google explained, “Using a sitemap doesn’t guarantee that all the items in your sitemap will be crawled and indexed, as Google processes rely on complex algorithms to schedule crawling. However, in most cases, your site will benefit from having a sitemap, and you’ll never be penalized for having one.”
Automatically Generated
A good ecommerce platform or content management system will typically generate a sitemap automatically. What’s more, with a little help from a developer you can define how those sitemaps are created.
If your ecommerce platform doesn’t do it, there are also third-party sitemap generators or sitemap generation code libraries.
In short, you should not have to create a sitemap manually for your ecommerce business. Nonetheless, understanding how sitemap markup works and what it communicates may help your company’s search engine optimization efforts.
XML Sitemap Format
XML sitemaps are the most popular format for sharing link information with search engines.
The XML schema for the sitemaps protocol allows your site to communicate a page’s URL, when it was last updated, and how often it is updated.
An XML sitemap begins with the XML declaration. This declaration states the XML version and the character encoding that the document uses. It is worth mentioning that XML sitemaps must be UTF-8 encoded, which is a method of converting letters, numbers, and other characters into a universal format.
<?xml version="1.0" encoding="UTF-8"?>
All of the specific pages described in the sitemap should be wrapped in a urlset tag, opened and closed. This tag should include an xmlns attribute that references the current version of the sitemap XML schema, which at the time of writing was version 0.9.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> … </urlset>
For each page listed in the XML sitemap, there should be a url tag. This tag is the parent, and all of the other tags that describe a page are this tag’s children.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>...</url>
  <url>...</url>
  <url>...</url>
</urlset>
There are at least two possible child tags to describe a page listed on an XML sitemap. The example below describes two category pages.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/mens-hats</loc>
    <lastmod>2018-09-30</lastmod>
  </url>
  <url>
    <loc>https://example.com/ladies-hats</loc>
    <lastmod>2018-09-14</lastmod>
  </url>
</urlset>
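For readers curious how a platform might produce markup like this, here is a minimal Python sketch using the standard library’s xml.etree.ElementTree module. The build_sitemap function and its list of (loc, lastmod) pairs are assumptions made for illustration, not the way any particular ecommerce platform works.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (loc, lastmod) pairs and return it as bytes."""
    # The xmlns attribute on urlset names the sitemap schema version.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    # xml_declaration=True adds the opening <?xml ...?> line (Python 3.8+).
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# The two category pages from the example above.
pages = [
    ("https://example.com/mens-hats", "2018-09-30"),
    ("https://example.com/ladies-hats", "2018-09-14"),
]
print(build_sitemap(pages).decode("utf-8"))
```

ElementTree entity-escapes text such as ampersands when it serializes, which is one reason to prefer an XML library over assembling the tags by hand.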
I’ll explain each of these tags individually.
<loc>https://example.com/mens-hats</loc>
First, the loc (for location) tag provides the canonical page URL. This is the official version of the page. This link should include the preferred version of your site’s fully qualified domain name.
The URL must also be URL encoded according to the RFC-3986 standard, and characters that are special in XML, such as the ampersand, must be entity escaped (for example, & becomes &amp;). This is something that can be done programmatically. Finally, don’t include session IDs or parameters.
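Here is one way the encoding and escaping might be done programmatically, sketched in Python with the standard library. The function name sitemap_safe_url is hypothetical. The advice above is to leave parameters out, but the sketch still handles a query string so that the entity-escaping step has something to show.

```python
from urllib.parse import quote, urlsplit, urlunsplit
from xml.sax.saxutils import escape


def sitemap_safe_url(url):
    """Percent-encode the path and query, then entity-escape for XML."""
    parts = urlsplit(url)
    # Characters listed in safe= are left alone; "%" stays so that
    # sequences that are already encoded are not encoded a second time.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    # The empty last item drops any #fragment from the URL.
    encoded = urlunsplit((parts.scheme, parts.netloc, path, query, ""))
    # escape() handles &, < and >; the extra map covers both quote marks.
    return escape(encoded, {"'": "&apos;", '"': "&quot;"})


print(sitemap_safe_url("https://example.com/hats/\u00e9t\u00e9?color=red&size=l"))
```

The escaped string is meant for XML that is written by hand or with a template. An XML library usually performs the entity escaping itself, so passing it an already-escaped string would escape the ampersand twice.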
<lastmod>2018-09-30</lastmod>
The lastmod tag simply tells the search engine the last time the page in question was changed. The date should be in YYYY-MM-DD format: a four-digit year, a two-digit month, and a two-digit day.
The lastmod tag may also include the time of day (hours, minutes, and seconds) along with a time zone, following the World Wide Web Consortium date and time format.
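Both forms of the lastmod value can be produced with Python’s datetime module, as in this short sketch. The helper names lastmod_date and lastmod_datetime are made up for illustration.

```python
from datetime import date, datetime, timezone


def lastmod_date(d):
    """Format a date as YYYY-MM-DD."""
    return d.isoformat()


def lastmod_datetime(dt):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss+hh:mm."""
    # The W3C format expects a time zone whenever a time is given.
    if dt.tzinfo is None:
        raise ValueError("lastmod timestamps need a time zone offset")
    return dt.isoformat(timespec="seconds")


print(lastmod_date(date(2018, 9, 30)))
# prints 2018-09-30
print(lastmod_datetime(datetime(2018, 9, 30, 14, 5, 0, tzinfo=timezone.utc)))
# prints 2018-09-30T14:05:00+00:00
```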
The XML schema also supports two other child tags: changefreq and priority. But Google has indicated that it does not use these tags when it reads your sitemap.
Submit a Sitemap
Once created, your sitemap should be submitted to Google, Bing, and other target search engines. You have a few options.
First, include a link in your robots.txt file. Simply include the path to your sitemap prefaced with the word “sitemap” and a colon.
Sitemap: https://example.com/sitemap.xml
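To check that the directive reads the way a crawler would expect, you could parse the file with Python’s standard urllib.robotparser module. This is only a sketch: the robots.txt content below is a made-up example, and the site_maps method requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, for illustration only.
robots_txt = """\
User-agent: *
Disallow: /cart
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
# prints ['https://example.com/sitemap.xml']
```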
Next, you can submit the sitemap directly to a search engine. For Google, open the sitemaps report in the Google Search Console. Then enter the relative URL for the sitemap (for example, /sitemap.xml) and click submit.
For Bing, navigate to the Bing Webmaster Tools, open the “Configure My Site” section, and select “sitemaps.” Then enter and submit the sitemap URL.
Finally, you may also submit the sitemap with an HTTP GET request or ping. Essentially, when you visit a specific URL (see the examples for Google and Bing below) and provide your sitemap address as a parameter, the search engine will capture that address. This can be done programmatically, but even pasting the link in a browser’s address bar will work.
If your sitemap were at https://example.com/sitemap.xml, you would use that address at the end of the GET request URL. Notice how the sitemap URL fits into the ping addresses below.
To Google.
http://www.google.com/ping?sitemap=https://example.com/sitemap.xml
To Bing.
http://www.bing.com/ping?sitemap=https://example.com/sitemap.xml
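Building these ping addresses programmatically might look like the Python sketch below. The ping_url function is a hypothetical helper, and the endpoints are the ones listed above. Whether each search engine still accepts such pings is worth confirming in its current documentation before relying on them.

```python
from urllib.parse import urlencode

# Ping endpoints as given in this article.
PING_ENDPOINTS = {
    "google": "http://www.google.com/ping",
    "bing": "http://www.bing.com/ping",
}


def ping_url(endpoint, sitemap_url):
    """Return the ping address with the sitemap URL as a query parameter."""
    # urlencode percent-encodes the sitemap address so it can safely
    # travel as the value of the "sitemap" parameter.
    return endpoint + "?" + urlencode({"sitemap": sitemap_url})


for name, endpoint in PING_ENDPOINTS.items():
    print(name, ping_url(endpoint, "https://example.com/sitemap.xml"))
```

The sketch only builds the addresses. Sending the GET request itself could be done with urllib.request.urlopen or by pasting the printed address into a browser.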