How To Customise Your Robots.txt File

A Robots.txt file is a text file associated with your website that is used to tell the search engines which of your website's pages you would and would not like them to visit.

How The Robots.txt File Works

The structure of a robots.txt file is very simple. Essentially, it's a note that tells search engines how you want them to crawl your pages. The most basic robots.txt file looks like this:

User-agent: *
Disallow:

"User-agent" names the search engine robot the rules apply to. If it is a *, the rules apply to all search engines. "Disallow:" is then followed by a page or directory you do not want the search engines to crawl. If nothing follows "Disallow:", nothing is blocked.

So the above example is telling all the search engines they can access everything.
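To see how a crawler might read these two lines, here is a minimal sketch using Python's standard-library urllib.robotparser module. The domain www.yourdomain.co.uk and the robot name SomeOtherBot are placeholders, and real search engines may interpret rules slightly differently from this parser, so treat the result as a guide only.

```python
from urllib.robotparser import RobotFileParser

# The most basic robots.txt: every robot, nothing disallowed.
BASIC_ROBOTS = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(BASIC_ROBOTS.splitlines())

# Placeholder robots and URLs; every combination should be allowed.
agents = ("Googlebot", "Bingbot", "SomeOtherBot")
urls = (
    "https://www.yourdomain.co.uk/",
    "https://www.yourdomain.co.uk/shop/",
)

for agent in agents:
    for url in urls:
        print(agent, url, parser.can_fetch(agent, url))
```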

The Default Setting On Your Website

We automatically set up your robots.txt file to be the following:

user-agent: twitterbot
disallow:

user-agent: *
disallow: /include/
disallow: /shop/basket_new.php
disallow: /shop/checkout_process.php
disallow: /account/
Disallow: /cdn-cgi/

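With this, Twitter's robot (twitterbot) can access everything, while all other search engines are disallowed from accessing the CSS and JavaScript files necessary for the structure of your site, as well as certain specific pages such as the basket, checkout and account pages.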
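If you would like to check what the default file does for a particular page, the sketch below runs it through Python's standard-library urllib.robotparser. The helper name is_allowed and the sample paths /account/login and /include/style.css are made up for illustration, and this parser is only an approximation of how real search engines behave.

```python
from urllib.robotparser import RobotFileParser

# The default robots.txt shown above, copied as-is.
DEFAULT_ROBOTS = """\
user-agent: twitterbot
disallow:

user-agent: *
disallow: /include/
disallow: /shop/basket_new.php
disallow: /shop/checkout_process.php
disallow: /account/
Disallow: /cdn-cgi/
"""


def is_allowed(agent: str, path: str) -> bool:
    """Return True if this parser thinks `agent` may crawl `path`."""
    parser = RobotFileParser()
    parser.parse(DEFAULT_ROBOTS.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical paths, used only to exercise the rules.
print(is_allowed("Googlebot", "/account/login"))      # covered by "disallow: /account/"
print(is_allowed("Googlebot", "/include/style.css"))  # covered by "disallow: /include/"
print(is_allowed("Googlebot", "/shop/"))              # no rule matches this path
print(is_allowed("Twitterbot", "/account/login"))     # twitterbot has its own empty rule
```

Directive names are not case sensitive, which is why the mix of "disallow" and "Disallow" in the file makes no difference here.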

How To Upload Your Own Custom Robots.txt

You can fully customise your robots.txt file however you wish by following the instructions below:

  1. Go to your live website and add /robots.txt to the end of your domain name
     (for example: www.yourdomain.co.uk/robots.txt). This shows you the current contents of your robots.txt file
  2. On your computer, open NotePad (or TextEdit on a Mac)
  3. Use this program to write your new robots.txt file in plain text
  4. Save the file with the name "robots.txt".

Then, to upload it and replace the one Create set up for you, please follow the steps below:

  1. Log in to your Create account
  2. Click "Content" from the Top Menu
  3. Click on "Files" from the left-hand menu
  4. Upload your new robots.txt file.

Your robots.txt will now be changed.
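Before uploading, it can be worth checking that your new file does what you expect. The following is a rough sketch of one way to do that with Python's standard-library urllib.robotparser. The function name find_surprises, the sample file contents and the list of checks are all assumptions for illustration; swap in your own rules and pages.

```python
from urllib.robotparser import RobotFileParser


def find_surprises(robots_text, expectations):
    """Return the (agent, path, expected) checks the file does not satisfy."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [
        (agent, path, expected)
        for agent, path, expected in expectations
        if parser.can_fetch(agent, path) != expected
    ]


# Example contents of a new robots.txt file (an assumption for this sketch).
new_file = """\
User-agent: *
Disallow: /account/
"""

# Each check is: robot name, path, and whether crawling should be allowed.
checks = [
    ("Googlebot", "/", True),
    ("Googlebot", "/account/", False),
]

# An empty list means every check passed.
print(find_surprises(new_file, checks))
```

A check that comes back in the list is one where the file behaves differently from what you intended, for example a stray "Disallow: /" that blocks your whole site.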

Real-life Examples And How To Amend Your Robots.txt To Achieve This

Below are some scenarios you may want for your website and how you can amend your robots.txt file to achieve each one:

1. Allow all search engines access to images

To specify all search engines you will need to add a "*" symbol as your user-agent, as this represents all search engines:

User-agent: *
Allow: /siteimages/

2. Disallow all search engines access to images

User-agent: *
Disallow: /siteimages/

3. Allow only some search engines

If you would like to allow only certain search engines you would need to specify these, as below:

User-agent: *
Disallow: /sitefiles/
Disallow: /siteimages/

User-agent: googlebot
Disallow:

User-agent: bingbot
Disallow:

In the example above, all search engines are blocked from crawling your files and images apart from Bing (bingbot) and Google (googlebot).
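As a rough check of this example, the sketch below builds a small table of which robots may crawl which folders, using Python's standard-library urllib.robotparser. The robot name examplebot is a made-up stand-in for "any other search engine", and this parser matches robot names more loosely than some search engines do.

```python
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: *
Disallow: /sitefiles/
Disallow: /siteimages/

User-agent: googlebot
Disallow:

User-agent: bingbot
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

paths = ("/sitefiles/", "/siteimages/", "/")
# examplebot is hypothetical and falls under the "User-agent: *" group.
agents = ("googlebot", "bingbot", "examplebot")

# Map each robot to the folders it may crawl according to this parser.
access = {
    agent: {path: parser.can_fetch(agent, path) for path in paths}
    for agent in agents
}

for agent, result in access.items():
    print(agent, result)
```

Note that the other robots are only kept out of the two listed folders; the rest of the site stays open to them.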

4. Only allow some search engines access to images

If you want only certain search engines to crawl your images you will need to specify these, as below:

User-agent: *
Disallow: /siteimages/

User-agent: googlebot-image
Disallow:

5. Allow your images to be crawled by Google but not appear in Google Images

If you would like Google to crawl your images but for them not to appear in Google Images you will need to specify this by listing the Google Image robot in your robots.txt, as shown below:

User-agent: *
Disallow: /sitefiles/

User-agent: googlebot-image
Disallow: /siteimages/

By specifically listing the Google Image robot you are stopping your images from appearing in a Google Image search. However, because Google's main robot can still crawl them, they may still appear in a Google web search.
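The difference between the two Google robots can be sketched with Python's standard-library urllib.robotparser. The file name photo.jpg is invented for this check, and the parser is only an approximation of Google's own behaviour.

```python
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: *
Disallow: /sitefiles/

User-agent: googlebot-image
Disallow: /siteimages/
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

# photo.jpg is a made-up file name used only for this check.
image = "/siteimages/photo.jpg"

# Googlebot has no group of its own, so it follows "User-agent: *".
web_crawler_ok = parser.can_fetch("Googlebot", image)
# Googlebot-Image is named explicitly, so it follows its own group.
image_crawler_ok = parser.can_fetch("Googlebot-Image", image)

print("Googlebot may crawl the image:", web_crawler_ok)
print("Googlebot-Image may crawl the image:", image_crawler_ok)
```

A robot that has its own group follows only that group, so in this sketch Googlebot-Image is not affected by the "Disallow: /sitefiles/" line listed under "User-agent: *".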

6. Disallow some search engines so they cannot crawl anything

If you would like a specific search engine to not crawl your website at all you will need to add a "/" symbol, as this represents all of your content:

User-agent: *
Disallow: /sitefiles/
Disallow: /siteimages/

User-agent: bingbot
Disallow: /

In the example above, your robots.txt file stops Bing from accessing your website at all, while all other search engines, such as Google, can still access everything except your files and images.

7. Allow all search engines access to all pages on the site

If you would like to allow all search engines access to everything you will need to add the following to your robots.txt file:

User-agent: *
Disallow:

8. Disallow search engines access to some pages on the site

If you would like all search engines to not have access to certain pages you will need to add each page's filename to your robots.txt file, as below:

User-agent: *
Disallow: /guestbook/
Disallow: /onlineshop/

You can find this filename by following these steps:

  1. Go to your "Site Content" screen in the Top Menu
  2. Select "Page Options" next to your page
  3. Here you can see the "Page Filename" field.

Password protected pages can still be crawled by search engine robots, but a site visitor cannot open them without a username and password. If you do not want such a page crawled at all you can add it to your robots.txt file, but this is not essential, as visitors cannot view the page without logging in.

9. Disallow search engines access to private documents

User-agent: *
Disallow: /sitefiles/27/3/6/273678/contact_form.pdf

You can find the filenames for your files by following these steps:

  1. Go to your "Site Content" screen on the Top Menu
  2. Select "Files" from the left-hand menu
  3. Click the "Link" button above your files
  4. This will show your file locations next to your files
  5. Copy this file location and paste it in your robots.txt file as above.
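To confirm that a rule like the one above blocks only the file you pasted in, here is a minimal sketch using Python's standard-library urllib.robotparser. The second file, price_list.pdf, is hypothetical and stands for any other document stored in the same folder.

```python
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: *
Disallow: /sitefiles/27/3/6/273678/contact_form.pdf
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

blocked = "/sitefiles/27/3/6/273678/contact_form.pdf"
# price_list.pdf is a hypothetical second file in the same folder.
other = "/sitefiles/27/3/6/273678/price_list.pdf"

blocked_ok = parser.can_fetch("Googlebot", blocked)
other_ok = parser.can_fetch("Googlebot", other)

print("contact_form.pdf may be crawled:", blocked_ok)
print("price_list.pdf may be crawled:", other_ok)
```

Because the rule names one file rather than a folder, other files in that folder stay crawlable unless you list them as well.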

If you are unsure about amending your file for any reason, please contact your Account Manager who will be happy to update it for you.