
What is Robots.txt File and How to Set it up?

Hi again! As promised, I am covering all the basic SEO tips on Stoogles.com in the series Summarized SEO World, and this is its 8th part, which is about robots.txt. I hope you are enjoying the basic SEO tips here. If you have suggestions, do let me know. I love to hear from you.

In case you missed the first 7 parts of this series, I am listing them here so that you don’t have to hunt for them:

1. Site Analysis
2. Competitor Analysis
3. Keyword Research
4. URL Optimization
5. Writing Title and meta description
6. Content Optimization
7. Image Optimization

Introduction to Robots.txt

Robots.txt is a simple text file through which you can tell search engine robots whether or not they may crawl your website or any particular directory of it.

By default, every website allows search engine robots. If you want to restrict the robots from crawling a certain directory, a file or the full site, you need a robots.txt file in which you write instructions for the search engine bots.

Why should you use a robots.txt file on your site?

There are times when you are developing your site online and dealing with many test pages, files and so on. Search engine bots might crawl them, and during development and design those pages may not have good content. If search engines crawl and index them, it will not be good for your site’s SEO. A lot of garbage from your site might already be indexed by search engines, and you will have to de-index it later, which is cumbersome. Hence, during development, robots.txt can be a great help for restricting the search engine robots.
Secondly, when your site is running fine, you might have some private folders or other content which you don’t want to show up in search engines. Robots.txt plays an important role here: you can disallow the bots from crawling certain files and folders. Keep in mind that robots.txt is a publicly readable file and only a request that well-behaved bots honor, so it should not be the only protection for genuinely private content.

How to Write a Robots.txt File

There is a very simple rule for writing the instructions in a robots.txt file. Let’s have a look at the simplest robots.txt:

User-Agent: *
Disallow: /

Here, the very first line, “User-Agent”, names the agent or bot that the rule applies to, such as Googlebot.
The wildcard “*” stands for all robots.
The second line, “Disallow”, is where you define the directory or file which you don’t want to get crawled. Here “/” says: don’t crawl the whole site.
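
As a quick way to see what these two lines do, the sketch below feeds them to Python’s standard-library urllib.robotparser and asks whether a URL may be fetched. The domain example.com, the bot name SomeOtherBot and the helper name is_allowed are placeholders chosen for illustration, not something from this post. Python’s parser is only one implementation, so real search engine bots may differ in details.

```python
from urllib.robotparser import RobotFileParser

# The simplest "block everything" file shown above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may crawl url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and SomeOtherBot are hypothetical placeholders.
    for bot in ("Googlebot", "SomeOtherBot"):
        print(bot, is_allowed(BLOCK_ALL, bot, "https://example.com/about.html"))
```

Both checks come back False, because the wildcard group applies to every bot and “/” is a prefix of every path on the site.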

If you want to hide only one directory, such as “private”, from all the robots:

User-Agent: *
Disallow: /private

If you want to hide this directory only from Googlebot:

User-Agent: Googlebot
Disallow: /private

If you want to disallow a certain page in a directory, you may write:

User-Agent: Googlebot
Disallow: /private/abc.html

In this way you may write a robots.txt file for your website or blog, but there are also many generators online which can help you create one easily.

Here are the names of a few Google bots for your reference:
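
Before uploading hand-written rules, it can help to test them. The sketch below is one possible check using Python’s standard-library urllib.robotparser. The helper can_crawl, the domain example.com, the bot name SomeOtherBot and the page private-offers.html are assumptions made up for the example. It also shows a detail worth knowing: Disallow values are matched as path prefixes, so “/private” covers more than just the “private” directory, while “/private/” with a trailing slash limits the rule to that directory.

```python
from urllib.robotparser import RobotFileParser

# Rule from the post: hide "private" from Googlebot only.
GOOGLEBOT_ONLY = "User-agent: Googlebot\nDisallow: /private\n"
# Variant with a trailing slash (an illustrative assumption, not from the post).
WITH_SLASH = "User-agent: Googlebot\nDisallow: /private/\n"


def can_crawl(robots_txt: str, user_agent: str, path: str) -> bool:
    """Check a site-relative path against the rules (example.com is a placeholder)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "https://example.com" + path)


if __name__ == "__main__":
    # Googlebot is blocked from the directory; other bots are not.
    print(can_crawl(GOOGLEBOT_ONLY, "Googlebot", "/private/abc.html"))     # False
    print(can_crawl(GOOGLEBOT_ONLY, "SomeOtherBot", "/private/abc.html"))  # True
    # Paths outside the rule stay crawlable.
    print(can_crawl(GOOGLEBOT_ONLY, "Googlebot", "/index.html"))           # True
    # Prefix matching: "/private" also catches "/private-offers.html" ...
    print(can_crawl(GOOGLEBOT_ONLY, "Googlebot", "/private-offers.html"))  # False
    # ... while "/private/" does not.
    print(can_crawl(WITH_SLASH, "Googlebot", "/private-offers.html"))      # True
```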

  • Googlebot
  • Googlebot-Mobile
  • Googlebot-Image
  • Mediapartners-Google
  • AdsBot-Google

Where to upload your robots.txt file?

After writing the file, save it as robots.txt on your system and then upload it to the root directory of your website. The URL should look like yoursite.com/robots.txt

I know this post is nothing new, but I hope it helps many newbies understand the importance of robots.txt. If you like the post, I would like you to share it on your social profiles. Thank you very much for taking the time to read this post.
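
Because the file always sits at the root of the host, its address can be worked out from any page URL on the site. Here is a minimal sketch of that, assuming full URLs that include the scheme. The function name robots_txt_url and the sample page paths are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt URL for the host that serves page_url.

    Assumes page_url includes a scheme such as https://.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # The blog path below is a hypothetical example page.
    print(robots_txt_url("https://yoursite.com/blog/post.html?x=1"))
    # https://yoursite.com/robots.txt
```

A subdomain counts as its own host here, so a page on blog.yoursite.com would map to blog.yoursite.com/robots.txt rather than to the main site’s file.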

About the author

Atish Ranjan

Atish Ranjan is a web enthusiast and blogger who loves blogging. He enjoys the challenges of creativity by providing information from the field of technology, SEO, social media and blogging.

31 Comments


  • Yeah, robots.txt files are useful for keeping pages that you don’t want to show up out of Google.

    If your pages like yoursite.com/page/2 are showing up in search engines, you might get penalized by Google Panda!

    Nice post Atish! 😀

  • This is new information for me!

    I had come across the name “robots file” before, but now I understand why it is necessary.

    I am still a little confused about uploading the robots.txt file, as that part was not clear to me from reading.

    I hope an experiment will give me a better understanding, so I will try one.

    Atish, I love your simple writing style, which is easy for newbies like me to understand.

    No need to think that this is repeated information, no one can write like you 😛

    Thanks for writing and sharing this useful information!
    Nirmala

    • Thanks for the kind words. Is the part you haven’t understood about uploading the robots.txt to your server? Then let me tell you that you can upload this file either through your cPanel or FTP. When you log in to your FTP you can see the files and folders of WordPress; upload the file right there. If you need more help then do let me know and I will try to walk you through it over TeamViewer.

  • I have worked with robots.txt in the past; recently, I have found certain plugins (yoast’s SEO and a security plugin, for example) make specific suggestions on how to deal with the robots.txt file, so I don’t have to rely on my own to get all the facts correct. But it’s good to learn how and why, so you know what the plugin is doing. For example, Dhruv mentioned the second page not getting indexed – you can learn how to read it in your robots.txt and allow the Yoast plugin to disallow the duplicate indexing so you don’t have to do all the manual hand coding but you will understand what is being done.
    Leora Wenger

    • Yes, it’s good if you are dealing with a WordPress site. You don’t need to do everything manually, but if you are dealing with a custom website which is not using any CMS then you might want to know the whole process. Good to get your suggestion, Leora.
      I appreciate your regular visits to stoogles.com. Thank you so much! 🙂

  • Hi Atish – a very helpful post, one I’ll definitely bookmark and share. I did learn a bit about the Robots.txt file on a course I did a while back, but we were advised to use the WP Robots Txt plugin – I haven’t needed to use it yet but probably will in future, and your instructions for doing it manually are very clear. A great share – thanks!

    Sue

    • Thanks for appreciating my work. If you ever need any help with this then please do let me know. Thanks for coming here and taking the time to read and comment. 🙂

      – Atish

  • Hi Atish,

    Thanks for telling us about the robots.txt file. It is a really informative message for people who want to hide pages from Google. I have actually already used a robots.txt file, and I will keep using it. But I must say thanks to you, dear, for reminding me about the robots.txt file. Thanks for sharing this nice post with us.

  • Hi,
    I had heard about the robots.txt file but did not understand what it is or why it is important. Now I understand robots.txt. Thank you for sharing this nice basic information.

  • Hey Atish,

    I learned about this particular file several years back now. You explained it well though for someone just now learning about this. I don’t claim to be an expert in all of this though but I do know what’s important and why we need these in place. Glad that’s all behind me now.

    Great job and have a good week.

    ~Adrienne

    • That’s good, Adrienne, that you already know about this. I hope newbies can get help from this post.

  • Thanks for the post!
    I have only one doubt: the “#” is for comments, right? In all my instructions I used to have the “#” before the “Disallow”; now I have taken it off. Is that OK?

    PS: Sorry for my English!

    • You have to create a robots.txt file and upload it to the root directory of your web server; then you can see it at yoursite.com/robots.txt

  • Antique article. I know about robots.txt and I have already used it. If you don’t want to show a page of your site, then with the help of robots.txt you can hide any page of your site, and most people use it. I am really impressed with your post. Thanks, dude.

  • Dear Atish,
    Very nice and informative post about robots.txt. You have shared very important information here. This file is also very important: you can hide any page of your site or blog.

    Thanks for sharing

    • Yes, just write Disallow and then the page name. See the post, where I have explained it clearly with an example.