Robots.txt – Disallow a directory, but allow select files?

By Lyndsay Walker


In the past, I haven’t done anything overly complicated with robots.txt, but now I have a situation I need a little help with.

I have a directory I want to disallow, with the exception of a select few files in that directory. Is there a way I can say “disallow all but these files”?

Your help is appreciated!

2 Comments so far

Michael VanDeMar, posted on 9:12 am - May 23, 2008

Lyndsay,

Unfortunately, no. Straight from Robotstxt.org:

To exclude all files except one
This is currently a bit awkward, as there is no “Allow” field. The easy way is to put all files to be disallowed into a separate directory, say “stuff”, and leave the one file in the level above this directory:

User-agent: *
Disallow: /~joe/stuff/

Alternatively you can explicitly disallow all disallowed pages:

User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html

Sorry. 🙂

nate, posted on 7:03 pm - Jan 15, 2009

Robotstxt.org is way out of date with the features supported by the major engines. This is the best reference on the web: http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx (caveat: I work at Microsoft Live Search, and I helped write that article).

You could use the ‘Allow’ directive, or build a creative rule using the wildcard pattern matching the major engines support.
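As a sketch of what that could look like (the directory and file names here are hypothetical), you list the Allow rules for the files you want crawled alongside a Disallow for the directory:

```
User-agent: *
# Allow the specific files that should stay crawlable
Allow: /mydir/keep-me.html
Allow: /mydir/also-keep.html
# Block everything else in the directory
Disallow: /mydir/
```

Note that ‘Allow’ and wildcards are extensions beyond the original robots.txt standard, so support varies by crawler; Google, for one, resolves conflicts by giving precedence to the most specific (longest) matching rule, which is why the Allow lines win for those two files.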

Make sure to test this using Google’s Webmaster Tools!

nate
