Robots.txt – Disallow a directory, but allow select files?

I haven’t done anything overly complicated with robots.txt in the past, but now I have a situation I need a little help with.

I have a directory I want to disallow, with the exception of a select few files in that directory. Is there a way I can say “disallow all but these files”?

Your help is appreciated!

Categories: Search Engine Optimization
  1. May 23rd, 2008 at 09:12 | #1

    Lyndsay,

    Unfortunately, no. Straight from Robotstxt.org:

    To exclude all files except one
    This is currently a bit awkward, as there is no “Allow” field. The easy way is to put all files to be disallowed into a separate directory, say “stuff”, and leave the one file in the level above this directory:

    User-agent: *
    Disallow: /~joe/stuff/

    Alternatively you can explicitly disallow all disallowed pages:

    User-agent: *
    Disallow: /~joe/junk.html
    Disallow: /~joe/foo.html
    Disallow: /~joe/bar.html

    Sorry. :)

  2. January 15th, 2009 at 19:03 | #2

    Robotstxt.org is way out of date with the features supported by the major engines. This is the best reference on the web: http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx (caveat: I work at Microsoft Live Search, and I helped write this article).

    You could use the ‘Allow’ directive to re-open specific files inside the disallowed directory, or build a creative pattern using the wildcard pattern matching the major engines support.

    Make sure to test this using Google’s Webmaster Tools!
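
    For a quick local sanity check before reaching for an engine's own tester, here is a minimal sketch using Python's standard urllib.robotparser. The /private/ directory and the file names are made up for illustration, and support for Allow still varies by crawler, so treat this as a rough check rather than a guarantee of how any particular engine behaves. The Allow lines are listed before the Disallow line because Python's parser applies the first rule that matches; Google instead picks the most specific (longest) matching rule, and this ordering gives the same answer under both interpretations.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block /private/ but keep two files crawlable.
# Allow lines come first because this parser uses first-match-wins.
ROBOTS_TXT = """\
User-agent: *
Allow: /private/report.html
Allow: /private/summary.html
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    # Print whether a generic crawler ("*") may fetch each sample path.
    for path in ("/private/report.html",
                 "/private/secret.html",
                 "/public/index.html"):
        print(path, parser.can_fetch("*", path))
```

    If a crawler you care about does not honour Allow at all, the separate-directory layout quoted in comment #1 remains the safer fallback.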

    nate