Simon Willison’s Weblog

1 item tagged “andrew-wooster”

2007

robots.txt Adventure. Interesting notes from crawling 4.6 million robots.txt files, including 69 different ways in which the word “disallow” can be misspelled.

# 22nd September 2007, 12:36 am / andrew-wooster, crawling, robots-txt