Trashwiki.org:Robots.txt

From Trashwiki.org

We hide most Trashwiki pages from bots through robots.txt, mostly because of Google and because shop owners and corporations google their own names. See also the related discussion, and also project:tech.

It's still good to have neutral pages that don't mention brands indexed by Google: the more pages indexed, the better the chance that dumpster divers will find this wiki and contribute.

Check the actual robots.txt file.

Let guaka know if you want to add or remove something from robots.txt. Note that guaka's user page is also indexable by Google, since he doesn't mind this (and it adds a little more authority to his blog).

In September 2010 guaka also allowed archive.org's ia_archiver bot to crawl the site, so that Trashwiki will be preserved for eternity ;)
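Allowing one specific crawler in robots.txt is done with a dedicated User-agent group; a minimal sketch of what such a rule looks like (the actual entry in the live file may differ):

User-agent: ia_archiver
Disallow:

An empty Disallow value means nothing is disallowed for that bot, so ia_archiver may crawl the whole site even though the wildcard group below restricts other crawlers.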

Also check Wikipedia's robots.txt for comparison.

Default robots.txt

User-agent: *
Disallow: /w/
Disallow: /w/index.php
Disallow: /w/skins/
Disallow: /en/Special:Random
Disallow: /en/Special%3ARandom
Disallow: /en/Special:Search
Disallow: /en/Special%3ASearch
Disallow: /en/Special:Recentchangeslinked/