What is Robots.txt?

Robots.txt is a plain text file that webmasters create to tell web robots (typically search engine crawlers) how to crawl the pages on their site.

Syntax of a robots.txt file

A basic robots.txt file uses two directives: User-agent and Disallow.

  • User-agent names the search engine robot or web crawler a rule applies to, such as Googlebot (Google) or Slurp (Yahoo).
  • Disallow is a command that blocks the named user-agent from accessing the specified page or file.
  • You can disallow one or many files/pages at a time, as shown in the sketch below.
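For instance, a hypothetical file that blocks every crawler from two paths might look like this (the /drafts/ folder and /old-page.html are placeholders, not paths from the original examples):

User-agent: *
Disallow: /drafts/
Disallow: /old-page.html

Each Disallow line names exactly one path.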

Basic format for blocking all crawlers:

User-agent: *

Disallow: [URL string not to be crawled]

Basic format for blocking specific crawlers:

User-agent: Googlebot

Disallow: /example-subfolder/
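Several groups can be combined in one file, with a blank line starting each new group. A hypothetical sketch (Bingbot and the paths are illustrative choices, not requirements):

User-agent: Googlebot
Disallow: /example-subfolder/

User-agent: Bingbot
Disallow: /private/

User-agent: *
Disallow:

An empty Disallow value means the matching crawler may access everything.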

How does robots.txt work?

Search engines have two main jobs:

  1. Crawling the web to discover content
  2. Indexing that content so that it can be served to searchers looking for information

Before crawling a site, a well-behaved crawler first requests the site's robots.txt file and follows the rules it finds there.
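You can check these rules programmatically as well. Below is a minimal sketch using Python's standard-library urllib.robotparser (www.example.com and the sample path are placeholders); it mirrors what a crawler does: fetch robots.txt once, then test each URL before requesting it.

from urllib import robotparser

# Fetch and parse the site's robots.txt file
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

# Ask whether a given user-agent may crawl a given URL
print(rp.can_fetch("Googlebot", "https://www.example.com/example-subfolder/page.html"))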

Why do you need robots.txt?

  • Robots.txt controls crawler access to specific areas of your site.
  • It can keep duplicate content from appearing in SERPs.
  • It can prevent web crawlers from requesting certain files on your site (see the sketch after this list).
  • If your site has no pages that need to be kept away from crawlers, you may not need a robots.txt file at all.
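For example, a hypothetical file covering the points above (the /print/ path and the PDF pattern are illustrative; the * and $ wildcards are honored by Google and most major engines but were not part of the original robots.txt standard):

User-agent: *
Disallow: /print/
Disallow: /*.pdf$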

You can also give a user-agent access to otherwise blocked pages/files with the Allow directive:

User-agent: *

Allow: [Blocked URL string]
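For example, a hypothetical file that blocks a folder but re-opens one file inside it (all paths are placeholders; Allow is supported by Google and most major crawlers, though it was not in the original standard):

User-agent: *
Disallow: /private/
Allow: /private/overview.html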

How to create and place a robots.txt file?

  • Write the directives in a plain text editor and save the file as a text file named robots.txt.
  • Place the saved file in the top-level (root) directory of your site.
  • Saving the file as plain text helps web crawlers recognize it easily.
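Putting the pieces together, a complete minimal file might look like this (every path and the sitemap URL are placeholders; the Sitemap directive is optional but widely supported):

User-agent: *
Disallow: /drafts/
Allow: /drafts/overview.html

Sitemap: https://www.example.com/sitemap.xml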

How can I find robots.txt on a site?

If you want to know whether a site has a robots.txt file, just do the following:

Type the base URL, then append /robots.txt.

Ex. www.example.com/robots.txt

You will see one of three results:

1) You find a robots.txt file

2) You see an empty file

3) You get a 404 error

Things to keep in mind:

The filename is case sensitive; please make sure to type robots.txt and not Robots.txt.

The file is publicly accessible, so anyone can read it. Do not rely on robots.txt to hide parts of your site; apply proper security (such as authentication) to any area you genuinely need to keep private.

You can write only one Disallow rule per line; to block several URLs, use one Disallow line per URL.

Alongside the robots.txt file, the "noindex, follow" robots meta tag (<meta name="robots" content="noindex, follow">) should also be used on every related page, because a URL that is only disallowed in robots.txt can still end up in the index if other sites link to it.

Conclusion:

In conclusion, a robots.txt file lets you block the pages and files on your site that you do not want search engines to crawl and index. It is a useful crawl-control tool, but it should not be treated as a security mechanism on its own.
