In the past, I haven’t done anything overly complicated with robots.txt, and I have a situation I need a little help with.
I have a directory I want to disallow, with the exception of a select few files in that directory. Is there a way I can say “disallow all but
Your help is appreciated!
Michael VanDeMar says:
Lyndsay,
Unfortunately, no. Straight from Robotstxt.org:
To exclude all files except one
This is currently a bit awkward, as there is no “Allow” field. The easy way is to put all files to be disallowed into a separate directory, say “stuff”, and leave the one file in the level above this directory.
Alternatively, you can explicitly disallow all disallowed pages.
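For illustration only, here is a rough sketch of those two workarounds, checked with Python's standard `urllib.robotparser`. The `/docs/` paths, the file names, and the `allowed` helper are made up for this example and are not part of the quoted text.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Workaround 1: move everything to be hidden into a subdirectory ("stuff")
# and leave the one public file a level above it. Paths are hypothetical.
SEPARATE_DIR = """\
User-agent: *
Disallow: /docs/stuff/
"""

# Workaround 2: keep the layout and list every hidden page explicitly.
EXPLICIT_LIST = """\
User-agent: *
Disallow: /docs/junk.html
Disallow: /docs/foo.html
"""

if __name__ == "__main__":
    checks = [
        ("separate dir", SEPARATE_DIR, "/docs/keep.html"),
        ("separate dir", SEPARATE_DIR, "/docs/stuff/junk.html"),
        ("explicit list", EXPLICIT_LIST, "/docs/keep.html"),
        ("explicit list", EXPLICIT_LIST, "/docs/junk.html"),
    ]
    for name, rules, path in checks:
        print(f"{name}: {path} allowed={allowed(rules, path)}")
```

The second form needs a new `Disallow` line for every file added to the directory, which is why the first is described as the easy way.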
Sorry.
nate says:
Robotstxt.org is well out of date on the features the major engines support. This is the best reference on the web: http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx (caveat: I work at Microsoft Live Search, and I helped write this article).
You could use the ‘Allow’ directive, or build a creative pattern using wildcard pattern matching.
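As a minimal sketch of the ‘Allow’ approach, assuming a hypothetical `/private/` directory with a single file that should stay crawlable, the rules can be sanity-checked locally with Python's standard `urllib.robotparser`. That parser applies the first rule that matches, so the `Allow` line is listed before the `Disallow` line here; real crawlers may resolve conflicts differently (Google, for one, prefers the most specific matching path), and this sketch does not exercise wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block the whole directory except one file.
ALLOW_RULES = """\
User-agent: *
Allow: /private/public-report.html
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ALLOW_RULES.splitlines())


def may_fetch(path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under ALLOW_RULES."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/private/public-report.html",
                 "/private/secret.html",
                 "/index.html"):
        print(path, "allowed" if may_fetch(path) else "blocked")
```

A local check like this only reflects how Python's parser reads the file, not how any particular search engine does.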
Make sure to test this using Google’s Webmaster Tools!
nate