What are Sitemaps?

HTML Sitemaps - help humans navigate your website

HTML sitemaps can be:
  • Viewed by all browsers including Firefox, Internet Explorer and Opera.
  • Crawled by all search engines including Google, Yahoo, MSN and Ask.

Some HTML sitemap tips and tricks:

  • HTML sitemaps can be generated by PHP, ASP or any other server-side technology. Only the output format matters.
  • Limit yourself to a few hundred links per page for best results. This makes it easier for visitors to find your important pages.
  • You can read our article about creating HTML sitemaps for more detailed information.

Code example of an HTML sitemap:

<html lang="en">
<head><title>This is a site map</title></head>
<body>
<h1>header of HTML site map</h1>
<p>site map paragraph with links
</body>
</html>
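
The tips above can be sketched in code. This is a minimal example in Python, used here only for illustration (PHP or ASP would work just as well, since only the output matters). The function name build_html_sitemap_pages, the limit of 200 links per page and the example.com links are all assumptions made for this sketch.

```python
from html import escape


def build_html_sitemap_pages(links, links_per_page=200):
    """Return a list of HTML documents, each listing at most links_per_page links.

    links is a list of (url, title) tuples.
    """
    pages = []
    for start in range(0, len(links), links_per_page):
        chunk = links[start:start + links_per_page]
        # Escape URLs and titles so special characters cannot break the markup.
        items = "\n".join(
            '<li><a href="{}">{}</a></li>'.format(escape(url), escape(title))
            for url, title in chunk
        )
        pages.append(
            '<html lang="en">\n'
            "<head><title>Site map</title></head>\n"
            "<body>\n"
            "<h1>Site map</h1>\n"
            "<ul>\n" + items + "\n</ul>\n"
            "</body>\n"
            "</html>\n"
        )
    return pages


# Hypothetical input: 450 pages on example.com
links = [("http://www.example.com/page%d.html" % i, "Page %d" % i) for i in range(1, 451)]
pages = build_html_sitemap_pages(links)
print(len(pages))  # 3 pages: 200 + 200 + 50 links
```

Each returned string can be saved as its own page, e.g. sitemap-1.html, sitemap-2.html and so on (hypothetical file names).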


XHTML Sitemaps - HTML sitemaps as XML

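XHTML is HTML reformulated as an XML application, so an XHTML sitemap must be well-formed XML.

Sitemap file in XHTML. Compared with the HTML example, it adds the XML declaration, the DOCTYPE, the xmlns and xml:lang attributes, and the closing </p> tag: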

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>This is a site map</title></head>
<body>
<h1>header of XHTML site map</h1>
<p>site map paragraph with links</p>
</body>
</html>
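
One practical consequence of XHTML being XML is that a standard XML parser can read it. The sketch below uses Python's xml.etree.ElementTree for this. The helper names extract_links and is_well_formed and the sample link are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Elements in an XHTML document live in the XHTML namespace.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

xhtml_sitemap = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>This is a site map</title></head>
<body>
<h1>header of XHTML site map</h1>
<p><a href="http://www.example.com/">Home</a></p>
</body>
</html>"""

# Plain HTML may leave <p> unclosed, which is not well-formed XML.
html_sitemap = b"<html><body><p>site map paragraph with links</body></html>"


def extract_links(document):
    """Return the href of every <a> element in an XHTML document."""
    root = ET.fromstring(document)
    return [a.get("href") for a in root.iter(XHTML_NS + "a")]


def is_well_formed(document):
    """Return True if the document parses as XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print(extract_links(xhtml_sitemap))
print(is_well_formed(xhtml_sitemap), is_well_formed(html_sitemap))
```

The HTML variant fails the check only because of the unclosed <p> tag. A browser would still display it without complaint.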


Text Sitemaps - simple sitemap

Text sitemaps contain one website URL per line. Many search engines, including Google and Yahoo, can read text sitemaps.

Improve compatibility between text sitemaps and search engines:

  • For Yahoo, name the primary text sitemap file urllist.txt.
  • Save text sitemap files as UTF-8 documents, especially if your website URLs contain non-English characters.
  • Each text sitemap file should contain no more than 50,000 URLs.

Example of text sitemap file:

http://www.example.com/
http://www.example.com/some-directory/
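
The three tips above can be combined in a few lines of code. The following Python sketch writes UTF-8 text sitemaps with at most 50,000 URLs each. The function name write_text_sitemaps and the names of the second and later files (urllist-2.txt etc.) are assumptions for this example, as only urllist.txt is a name mentioned above.

```python
import os
import tempfile

MAX_URLS_PER_FILE = 50000  # limit recommended above


def write_text_sitemaps(urls, directory, max_urls=MAX_URLS_PER_FILE):
    """Write urls to one or more UTF-8 text files, one URL per line."""
    paths = []
    for index, start in enumerate(range(0, len(urls), max_urls)):
        # The first file gets the name Yahoo looks for; later names are made up.
        name = "urllist.txt" if index == 0 else "urllist-%d.txt" % (index + 1)
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8", newline="\n") as handle:
            for url in urls[start:start + max_urls]:
                handle.write(url + "\n")
        paths.append(path)
    return paths


# Hypothetical website with 120,000 pages
urls = ["http://www.example.com/page%d.html" % i for i in range(1, 120001)]
with tempfile.TemporaryDirectory() as directory:
    paths = write_text_sitemaps(urls, directory)
    names = [os.path.basename(path) for path in paths]
print(names)  # ['urllist.txt', 'urllist-2.txt', 'urllist-3.txt']
```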


RSS Feeds as Sitemaps - RSS 0.9, RSS 1.0 and RSS 2.0

The RSS format is often used for feed files of blogs, forums etc. RSS uses XML and has evolved over multiple versions and names, all fairly compatible with each other:
  • Really Simple Syndication (RSS 2.0)
  • RDF Site Summary (RSS 1.0 and RSS 0.90)
  • Rich Site Summary (RSS 0.91)

After Google and Yahoo adopted RSS feeds as a kind of website sitemap, more search engines have followed.

Note: There is no official standard for splitting RSS feed sitemaps into multiple files. If your RSS sitemap feed becomes too large, consider creating one RSS feed file per website category instead of an ordinary sitemap file split. (If you use a sitemap generator tool, try its include/exclude filters.)

Example of an RSS feed sitemap file:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Website title</title>
<link>http://www.example.com</link>
<generator>A1 Sitemap Generator</generator>
<lastBuildDate>Tue, 13 Mar 2007 22:28:20 GMT</lastBuildDate>
<item>
<title>Page 1</title>
<link>http://www.example.com/page1.html</link>
</item>
<item>
<title>Page 2</title>
<link>http://www.example.com/page2.html</link>
</item>
</channel>
</rss>
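
The idea of one RSS feed per website category, mentioned in the note above, could look roughly like the Python sketch below. It treats the first directory of each URL as the category, which is purely an assumption of this example. The function names and sample pages are also made up. A description element is included because RSS 2.0 expects one in each channel.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse


def category_of(url):
    """Use the first directory in the URL path as category, 'root' otherwise."""
    path = urlparse(url).path.lstrip("/")
    first, separator, _rest = path.partition("/")
    return first if separator else "root"


def build_rss_feed(site_title, site_url, pages):
    """Return an RSS 2.0 document (as a string) with one <item> per page."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Sitemap feed for " + site_title
    for title, url in pages:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")


def build_feeds_by_category(site_title, site_url, pages):
    """Group pages by category and return {category: RSS document}."""
    groups = defaultdict(list)
    for title, url in pages:
        groups[category_of(url)].append((title, url))
    return {
        category: build_rss_feed("%s - %s" % (site_title, category), site_url, group)
        for category, group in groups.items()
    }


# Hypothetical pages
pages = [
    ("Page 1", "http://www.example.com/page1.html"),
    ("Post 1", "http://www.example.com/blog/post1.html"),
    ("Post 2", "http://www.example.com/blog/post2.html"),
]
feeds = build_feeds_by_category("Website title", "http://www.example.com", pages)
print(sorted(feeds))  # ['blog', 'root']
```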


ROR Sitemaps - extends RSS sitemaps

ROR extends the RSS protocol with its own namespace extensions. The standard file extension for ROR files is .ror. All search engines that understand RSS sitemap files continue to understand the RSS parts of ROR files. However, no major search engine, if any at all, currently supports the ROR sitemap extensions. Currently, neither Yahoo Site Explorer nor Google Webmaster Tools mentions ROR sitemap support. If you know of any major search engine that states support for ROR sitemaps, please write to me.

ROR sitemap file with the ROR namespace extensions of RSS (the xmlns:ror attribute and the ror: elements):

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:ror="http://rorweb.com/0.1/">
<channel>
<title>Website title</title>
<link>http://www.example.com</link>
<generator>A1 Sitemap Generator</generator>
<lastBuildDate>Tue, 13 Mar 2007 22:28:20 GMT</lastBuildDate>
<item>
<title>Page 1</title>
<link>http://www.example.com/page1.html</link>
<ror:keywords>page1-keyword1, page1-keyword2, page1-keyword3</ror:keywords>
<ror:updatePeriod>day</ror:updatePeriod>
</item>
<item>
<title>Page 2</title>
<link>http://www.example.com/page2.html</link>
<ror:keywords>page2-keyword1, page2-keyword2, page2-keyword3</ror:keywords>
<ror:updatePeriod>day</ror:updatePeriod>
</item>
</channel>
</rss>
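
The following Python sketch illustrates why RSS-only readers keep working with ROR files: the plain title and link elements can be read without knowing anything about the ror: namespace, while a ROR-aware reader can additionally pick up the extension elements. The function name read_items and the shortened sample document are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI as declared in the ROR example above.
ROR_NS = {"ror": "http://rorweb.com/0.1/"}

ror_document = """<rss version="2.0" xmlns:ror="http://rorweb.com/0.1/">
<channel>
<title>Website title</title>
<link>http://www.example.com</link>
<item>
<title>Page 1</title>
<link>http://www.example.com/page1.html</link>
<ror:keywords>page1-keyword1, page1-keyword2</ror:keywords>
<ror:updatePeriod>day</ror:updatePeriod>
</item>
</channel>
</rss>"""


def read_items(document, with_ror=False):
    """Return one dict per <item>; ROR fields are added only on request."""
    items = []
    for item in ET.fromstring(document).iter("item"):
        entry = {"title": item.findtext("title"), "link": item.findtext("link")}
        if with_ror:
            keywords = item.findtext("ror:keywords", default="", namespaces=ROR_NS)
            entry["keywords"] = [k.strip() for k in keywords.split(",") if k.strip()]
            entry["updatePeriod"] = item.findtext("ror:updatePeriod", namespaces=ROR_NS)
        items.append(entry)
    return items


print(read_items(ror_document))                 # RSS parts only
print(read_items(ror_document, with_ror=True))  # RSS parts plus ROR extensions
```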


XML Sitemaps Protocol - also called Google Sitemaps

In 2005 Google started its own sitemaps protocol based on XML, called Google Sitemaps. Google later convinced other search engines to follow, and the standard was renamed to the XML sitemaps protocol. Currently Google, Yahoo, Microsoft MSN Search, Ask, IBM and possibly others support XML sitemaps. It is likely that more search engines will implement support for XML sitemaps.

The XML sitemaps protocol also defines autodiscovery, i.e. how search engines can automatically discover a website's XML sitemaps. The answer is to link to the XML sitemap, e.g. sitemap.xml, from robots.txt:

User-agent: *
Sitemap: http://www.example.com/sitemap.xml

Instead of pointing to just one XML sitemap file for autodiscovery, you can list multiple sitemaps:

Sitemap: http://www.example.com/sitemap-1.xml
Sitemap: http://www.example.com/sitemap-2.xml

Or point to an XML sitemap index file:

Sitemap: http://www.example.com/sitemap-index.xml
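
To illustrate how a crawler might perform autodiscovery, here is a minimal Python sketch that collects all Sitemap lines from the text of a robots.txt file. The function name find_sitemaps is an assumption, and real crawlers may parse robots.txt differently.

```python
def find_sitemaps(robots_txt):
    """Return the URLs of all Sitemap lines in the text of a robots.txt file."""
    sitemaps = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        # Split at the first colon only, so the colon in "http://" is kept.
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


robots_txt = """User-agent: *
Sitemap: http://www.example.com/sitemap-1.xml
Sitemap: http://www.example.com/sitemap-2.xml
"""
print(find_sitemaps(robots_txt))
```

The field name is compared case-insensitively here, so a line written as "sitemap:" is found as well.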

Information about the XML sitemaps protocol:

  • Each XML sitemap file can contain at most 50,000 URLs and be at most 10 MB in size.
  • A sitemap index file can link up to 1,000 XML sitemaps.
  • You can read our article about page priorities in XML sitemaps.
  • XML sitemap files and sitemap index files have to be stored as UTF-8 documents.
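
The limits above suggest a simple splitting strategy: write at most 50,000 URLs per sitemap file and list the resulting files in a sitemap index. The Python sketch below is one possible way to do this, using the limits as stated in the list. The function names, the file names and the example.com URLs are assumptions, and the xml_declaration argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000               # limit stated above
MAX_BYTES_PER_SITEMAP = 10 * 1024 * 1024   # 10 MB, limit stated above
MAX_SITEMAPS_PER_INDEX = 1000              # limit stated above


def build_document(root_tag, entry_tag, locations):
    """Build a <urlset> or <sitemapindex> with one <loc> per location."""
    root = ET.Element(root_tag, xmlns=SITEMAP_NS)
    for location in locations:
        entry = ET.SubElement(root, entry_tag)
        ET.SubElement(entry, "loc").text = location
    # UTF-8 bytes including the XML declaration
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def split_into_sitemaps(urls, base_url):
    """Return {file name: UTF-8 bytes} for all sitemap files plus one index."""
    files = {}
    starts = range(0, len(urls), MAX_URLS_PER_SITEMAP)
    for number, start in enumerate(starts, start=1):
        chunk = urls[start:start + MAX_URLS_PER_SITEMAP]
        data = build_document("urlset", "url", chunk)
        if len(data) > MAX_BYTES_PER_SITEMAP:
            raise ValueError("sitemap %d is larger than 10 MB" % number)
        files["sitemap-%d.xml" % number] = data
    if len(files) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("too many sitemaps for one sitemap index file")
    index_locations = [base_url + name for name in files]
    files["sitemap-index.xml"] = build_document("sitemapindex", "sitemap", index_locations)
    return files


# Hypothetical website with 120,000 pages
urls = ["http://www.example.com/page%d.html" % i for i in range(1, 120001)]
files = split_into_sitemaps(urls, "http://www.example.com/")
print(sorted(files))
```

In this sketch a file that exceeds the size limit simply raises an error. A real generator would more likely start a new file earlier.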

Example of an XML sitemap file:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<priority>1.0</priority>
<changefreq>weekly</changefreq>
<lastmod>2007-06-18</lastmod>
</url>
<url>
<loc>http://www.example.com/blogs/</loc>
<priority>0.8</priority>
<changefreq>weekly</changefreq>
<lastmod>2007-06-21</lastmod>
</url>
</urlset>
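
Before submitting an XML sitemap, it can help to check a few basic rules of the protocol: every loc must be a full URL, priority must lie between 0.0 and 1.0, and changefreq must be one of the allowed values. The Python sketch below performs these checks. The function name check_sitemap and the deliberately faulty sample are assumptions for illustration, and the check is far from a full schema validation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
CHANGEFREQS = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_sitemap(document):
    """Return a list of human-readable problems found in an XML sitemap."""
    problems = []
    root = ET.fromstring(document)
    for position, url in enumerate(root.findall("sm:url", NS), start=1):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        parts = urlparse(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append("url %d: <loc> must be a full URL, got %r" % (position, loc))

        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append("url %d: bad <priority> %r" % (position, priority))

        changefreq = url.findtext("sm:changefreq", namespaces=NS)
        if changefreq is not None and changefreq not in CHANGEFREQS:
            problems.append("url %d: bad <changefreq> %r" % (position, changefreq))
    return problems


# The second entry is faulty on purpose.
sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://www.example.com/</loc><priority>1.0</priority><changefreq>weekly</changefreq></url>
<url><loc>blogs/</loc><priority>1.5</priority><changefreq>sometimes</changefreq></url>
</urlset>"""
for problem in check_sitemap(sitemap):
    print(problem)
```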