The robots.txt file is your first line of defense in controlling how search engines interact with your website — even if it only contains a few lines of text.
What is Robots.txt?
Robots.txt is a plain text file placed in your website's root directory. It tells search engine crawlers which pages or files they can or cannot request from your site. This is used mainly to avoid overloading your site with requests.
Basic Syntax
User-agent: *
Disallow: /admin/
Allow: /public/

User-agent: Googlebot
Disallow: /private/
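You can verify how crawlers will interpret rules like these with Python's standard-library urllib.robotparser. The sketch below feeds it the same directives shown above; the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the syntax example above (example.com is a placeholder)
robots_txt = """\
User-agent: *
Disallow: /admin/
Allow: /public/

User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A generic crawler is blocked from /admin/ but allowed into /public/
print(parser.can_fetch("*", "https://example.com/admin/settings"))      # False
print(parser.can_fetch("*", "https://example.com/public/page"))         # True

# Googlebot matches its own User-agent group, so only /private/ is off-limits
print(parser.can_fetch("Googlebot", "https://example.com/private/doc")) # False
```

Note that a specific User-agent group (here, Googlebot) replaces the wildcard group rather than adding to it, which is why Googlebot's rules must be listed in full.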
Common Mistakes
- Blocking your entire site: Accidentally using Disallow: / stops all crawling.
- Blocking important assets: Google needs access to CSS and JS files to render your page correctly.
- Thinking it hides content: Robots.txt is not a security measure. Use password protection for sensitive areas.
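The first mistake above is easy to demonstrate: a single Disallow: / under the wildcard agent blocks every URL on the site. This sketch again uses Python's urllib.robotparser with placeholder example.com URLs.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical misconfigured file: one slash blocks the entire site
blocking = RobotFileParser()
blocking.parse(["User-agent: *", "Disallow: /"])

# Every path starts with "/", so nothing is crawlable
print(blocking.can_fetch("*", "https://example.com/"))          # False
print(blocking.can_fetch("*", "https://example.com/any/page"))  # False
```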
Conclusion
A well-configured robots.txt file is a vital part of technical SEO. Regularly check your file to ensure you aren't accidentally blocking critical content.