If you find yourself here, chances are that you are a website owner or developer looking for how to add a sitemap to your robots.txt file. You are probably among the many people who want to do all they can to build their website's authority and get it appearing as high up the Google search results as possible.
No doubt, one extremely important part of improving SEO and ranking high in Google search is allowing your website pages to be crawled and indexed by search engine bots (robots).
Behind the scenes, there are two different files that help to give these bots the information they need to quickly and effectively read the content on your website:
- Robots.txt file
- XML sitemap
What is a robots.txt file?
Before we move on, you may be wondering what a robots.txt file actually is. In the simplest terms, a robots.txt file is a small text file holding crawl settings for your website. You may not have come across it before, because it sits in the root directory of your website.
The contents of the file tell search engine robots which pages to crawl and which not to crawl, and ultimately which search engines have permission to crawl your site. When a search engine bot arrives at your site, the first thing it does is look for your robots.txt file before doing anything else. Even if you want bots to crawl all of your pages, it is still good practice to have a default robots.txt file.
What is in the robots.txt file?
Basic format:
User-agent: [user-agent name e.g. ‘Googlebot’]
Disallow: [URL path not to be crawled e.g. /non-public]
The two simple lines above represent a complete robots.txt file, so you can now see why it is described as a small text file. Although this one is only two lines long, many more user-agent lines and directives can be written to give specific instructions to each bot.
If you want your robots file to allow all user-agents to crawl all pages, your file would look like this:
User-agent: *
Disallow:
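To see how a crawler might interpret rules like these, here is a minimal sketch using Python's standard-library urllib.robotparser module. The /non-public path and the example.com URLs are placeholders, the is_allowed helper is a name made up for this illustration, and real search engine bots may apply matching rules beyond what this parser implements.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Blocks every bot from one (placeholder) directory.
blocking_rules = "User-agent: *\nDisallow: /non-public"
# An empty Disallow value blocks nothing.
allow_all_rules = "User-agent: *\nDisallow:"

print(is_allowed(blocking_rules, "Googlebot", "http://www.example.com/non-public/page.html"))
print(is_allowed(blocking_rules, "Googlebot", "http://www.example.com/blog/"))
print(is_allowed(allow_all_rules, "Googlebot", "http://www.example.com/non-public/page.html"))
```

The first call should report that the page is blocked, while the other two should report that crawling is allowed.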
What is an XML Sitemap?
A sitemap is an XML file which contains a list of all webpages on your site as well as metadata (information that relates to each URL). Even though it says web pages, it does not really mean just the pages. It includes your posts, categories, tags and even location. Like a robots.txt file, a sitemap is read by search engine bots: it allows them to crawl through an index of all the webpages on your site in one place.
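For readers who have never opened one, the following sketch builds a tiny sitemap with Python's standard library and prints the resulting XML. The element names (urlset, url, loc, lastmod) and the namespace come from the sitemaps.org protocol; the page URLs, the dates and the build_sitemap helper are placeholders made up for this illustration, and a real site would normally rely on its CMS or a generator instead.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from a list of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # the page address
        ET.SubElement(url, "lastmod").text = lastmod  # metadata about that page
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages; replace with the real URLs of your site.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2024-01-01"),
    ("http://www.example.com/blog/first-post", "2024-01-15"),
])
print(sitemap_xml)
```

Each url entry pairs one address (loc) with its metadata, here just the last modification date.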
How to add sitemap to robots.txt file
Creating a robots.txt file which includes your sitemap location can be achieved in the three steps of this guide. Remember that the robots.txt file is the instruction, whilst the sitemap is the reference to the content on your website. Follow these steps to add your sitemap to your robots.txt file.
Step 1: Locate your sitemap URL
If you or your developer have already created a sitemap then it is likely that it will be located at http://www.example.com/sitemap.xml, where ‘example’ is your domain name.
To check if your sitemap is located here, simply type that URL into a browser and you will either see the sitemap or a 404 error, which means it does not exist in that location.
Alternatively, you can use Google to locate your sitemap using search operators. Simply type site:example.com filetype:xml in Google’s search bar to see if Google finds it.
If you can’t find your sitemap, it may not exist. In this case you can generate a sitemap yourself or ask a developer to produce one for you.
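The browser check described above can also be scripted. The sketch below is one possible way to do it with Python's standard library; url_exists and its opener parameter are names made up for this illustration, and it treats only a 404 response as "not found", leaving any other error to surface as an exception.

```python
import urllib.error
import urllib.request


def url_exists(url, opener=urllib.request.urlopen):
    """Return True if url answers with 200, False if it answers with 404."""
    try:
        with opener(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The file does not exist at this location.
            return False
        # Other errors (403, 500, ...) are not a clear "missing" signal.
        raise
```

Calling url_exists("http://www.example.com/sitemap.xml") with your own domain in place of example.com would then tell you whether a sitemap is served at the default location. The opener parameter exists only so the function can be tried out without a live network connection.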
Step 2: Locate your robots.txt file
As with your sitemap, you can check whether a robots.txt file is available on your website by simply typing http://www.example.com/robots.txt into a browser, where ‘example’ is your domain name.
If you don’t have a robots.txt file then you will need to create one and ensure that it is placed in the top-level directory (root) of your web server.
Simply create a .txt file and include the following text:
User-agent: *
Disallow:
The above text allows all bots to crawl all your content. If your site runs on WordPress, an easier way is to use a plugin like Yoast, which can create a robots.txt file for your site.
Step 3: Add sitemap location to file
Finally, you need to add your sitemap location to your robots.txt file.
To do so, you need to edit your robots.txt file and add a directive with the URL of your sitemap, as shown below:
Sitemap: http://www.example.com/sitemap.xml
And now your robots file should look like this:
Sitemap: http://www.example.com/sitemap.xml
User-agent: *
Disallow:
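If you would rather script this edit than make it by hand, the following is a minimal sketch of the idea. The add_sitemap helper is hypothetical, it works on the file's text rather than on a file on disk, and it places the directive at the top purely to match the layout shown above (the Sitemap line is not tied to any User-agent group).

```python
def add_sitemap(robots_text, sitemap_url):
    """Return robots_text with a Sitemap directive for sitemap_url added."""
    lines = robots_text.splitlines()
    for line in lines:
        name, _, value = line.partition(":")
        # Directive names are matched case-insensitively.
        if name.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text  # already present, nothing to do
    directive = f"Sitemap: {sitemap_url}"
    return "\n".join([directive, "", *lines]) + "\n"


original = "User-agent: *\nDisallow:\n"
updated = add_sitemap(original, "http://www.example.com/sitemap.xml")
print(updated)
```

Running the helper a second time on its own output leaves the text unchanged, so the directive is never duplicated.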
How to create a robots.txt file with multiple sitemap locations
Some larger websites will have more than one sitemap to index all of their pages, or a site may have multiple sub-sections and group its pages into multiple sitemaps to make things more manageable. In this case, you can create a “sitemap of sitemaps”, known as a sitemap index file.
The formatting of this file is similar to a standard XML sitemap file.
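As a rough illustration of that format, the sketch below builds a sitemap index with Python's standard library. The sitemapindex, sitemap and loc element names and the namespace follow the sitemaps.org protocol; the two sitemap URLs and the build_sitemap_index helper are placeholders made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Build sitemap index XML that points at each individual sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


# Placeholder sitemap locations; replace with your own.
index_xml = build_sitemap_index([
    "http://www.example.com/sitemap_1.xml",
    "http://www.example.com/sitemap_2.xml",
])
print(index_xml)
```

Where a standard sitemap lists pages inside url elements, the index lists other sitemap files inside sitemap elements.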
When you have multiple sitemaps, you can either specify your sitemap index file URL within your robots.txt file:
Sitemap: http://www.example.com/sitemap_index.xml
User-agent: *
Disallow:
Or alternatively, you can specify each individual sitemap file URL as a list:
Sitemap: http://www.example.com/sitemap_1.xml
Sitemap: http://www.example.com/sitemap_2.xml
User-agent: *
Disallow:
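To confirm that the sitemap lines in a robots.txt file can be read back, one option is Python's standard-library urllib.robotparser, whose site_maps() method is available from Python 3.8. This is only a sketch: list_sitemaps is a name made up for this illustration, and the file content is the placeholder example with two sitemaps.

```python
from urllib.robotparser import RobotFileParser


def list_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


robots_txt = """\
Sitemap: http://www.example.com/sitemap_1.xml
Sitemap: http://www.example.com/sitemap_2.xml
User-agent: *
Disallow:
"""

found = list_sitemaps(robots_txt)
print(found)
```

The same check works for a file that names a single sitemap index instead of several individual sitemaps.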
We urge anyone looking to seriously improve their SEO to implement both of these files on their website. Without them, you risk lagging behind your competitors.