Allow only one file of a directory in robots.txt?
I want to allow one file of the directory /minsc and disallow the rest of the directory.
My robots.txt currently contains:
user-agent: *
crawl-delay: 10
# directories
disallow: /minsc/
The file I want to allow is /minsc/menu-leaf.png.
I'm afraid of breaking something, and I don't know whether I must use:
a)
user-agent: *
crawl-delay: 10
# directories
disallow: /minsc/
allow: /minsc/menu-leaf.png
or
b)
user-agent: *
crawl-delay: 10
# directories
disallow: /minsc/*    # added "*"
allow: /minsc/menu-leaf.png
?
Thanks, and sorry for my English.
According to the robots.txt website:

To exclude all files except one

This is currently a bit awkward, as there is no "Allow" field. The easy way is to put all files to be disallowed into a separate directory, say "stuff", and leave the one file in the level above this directory:
user-agent: *
disallow: /~joe/stuff/
Alternatively, you can explicitly disallow all disallowed pages:
user-agent: *
disallow: /~joe/junk.html
disallow: /~joe/foo.html
disallow: /~joe/bar.html
According to Wikipedia, if you are going to use the Allow directive, it should go before Disallow for maximum compatibility:
allow: /directory1/myfile.html
disallow: /directory1/
Furthermore, you should put Crawl-delay last, according to Yandex:

To maintain compatibility with robots that may deviate from the standard when processing robots.txt, the Crawl-delay directive needs to be added to the group that starts with the User-agent record (right after the Disallow and Allow directives).
So, in the end, your robots.txt file should look like this:
user-agent: *
allow: /minsc/menu-leaf.png
disallow: /minsc/
crawl-delay: 10
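If you want to sanity-check the rules before deploying, here is a small sketch using Python's standard-library urllib.robotparser (the second file name, /minsc/other-file.png, is just a made-up example path). Note that this parser follows the original exclusion standard with Allow support; individual crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The final robots.txt from the answer above
robots_txt = """\
user-agent: *
allow: /minsc/menu-leaf.png
disallow: /minsc/
crawl-delay: 10
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# The one whitelisted file should be fetchable...
print(rp.can_fetch("*", "/minsc/menu-leaf.png"))   # True
# ...while the rest of the directory stays blocked.
print(rp.can_fetch("*", "/minsc/other-file.png"))  # False
# And the crawl delay is still picked up for this group.
print(rp.crawl_delay("*"))                          # 10
```

Because the parser checks rules in order and Allow comes first, the exception for menu-leaf.png wins over the directory-wide Disallow.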