
Prevent search engines from indexing all of Phabricator's Transactions or other shit (deletions, changes, etc.)
Closed, Resolved | Public | 1 Point

Description

I noticed that Phabricator is exposing a lot of stuff to search engines, like this:

https://gitpull.it/transactions/detail/PHID-XACT-TASK-mbszbtu6i77llg4/

I don't think every single changed bit is something we want search engines to track, especially because we often strip out email addresses or phone numbers, and we do not want to be like Wikipedia and keep track of every single damn change. We are just here to do some co-working and we should not feed search engines with shit.

To do this I've edited the Apache virtualhost to expose a robots.txt:

User-agent: *
Disallow: /transactions/

/etc/apache2/sites-available/gitpull.it.conf:

Alias /robots.txt /home/www-data/gitpull.it/www-stuff/robots.txt
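
As a quick sanity check of the rule above, here is a minimal Python sketch that parses the proposed robots.txt with the standard library's urllib.robotparser and asks whether a transaction URL would be crawlable. The helper name is_crawlable is hypothetical, and the rules are parsed from a string rather than fetched from the live site, so this only shows how a compliant crawler would read the file, not what the server actually serves.

```python
from urllib.robotparser import RobotFileParser

# The rules proposed in this task, copied verbatim.
PROPOSED_RULES = """\
User-agent: *
Disallow: /transactions/
"""


def is_crawlable(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration only; it parses the rules offline.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    transaction = "https://gitpull.it/transactions/detail/PHID-XACT-TASK-mbszbtu6i77llg4/"
    homepage = "https://gitpull.it/"
    print("transaction crawlable:", is_crawlable(PROPOSED_RULES, transaction))
    print("homepage crawlable:", is_crawlable(PROPOSED_RULES, homepage))
```

Note that robots.txt is only advisory: it asks well-behaved crawlers to stay away, but it does not restrict access to the pages themselves.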

Related to:

https://secure.phabricator.com/T4610

Event Timeline

valerio.bozzolan triaged this task as Normal priority.
valerio.bozzolan created this task.

Reopened because Phabricator already serves its own robots.txt, so the Alias does not work:

User-Agent: *
Disallow: /diffusion/
Disallow: /source/
Crawl-delay: 1

Context:

https://secure.phabricator.com/T4610

Current:

User-agent: *
Disallow: /diffusion/
Disallow: /source/
Disallow: /transactions/
Disallow: /multimeter/
Disallow: /auth/
Disallow: /differential/
Disallow: /policy/
Crawl-delay: 10
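
To see what changed between Phabricator's built-in robots.txt and the current one, the following Python sketch compares the two rule sets offline using urllib.robotparser. The function newly_blocked and the list of sample paths are assumptions made for illustration; only the two rule sets themselves come from this task.

```python
from urllib.robotparser import RobotFileParser

# Phabricator's built-in rules, as quoted in this task.
DEFAULT_RULES = """\
User-Agent: *
Disallow: /diffusion/
Disallow: /source/
Crawl-delay: 1
"""

# The rules currently in place, as quoted in this task.
CURRENT_RULES = """\
User-agent: *
Disallow: /diffusion/
Disallow: /source/
Disallow: /transactions/
Disallow: /multimeter/
Disallow: /auth/
Disallow: /differential/
Disallow: /policy/
Crawl-delay: 10
"""


def load_rules(rules: str) -> RobotFileParser:
    """Parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def newly_blocked(old_rules: str, new_rules: str, paths: list[str]) -> list[str]:
    """Return the paths allowed under old_rules but disallowed under new_rules.

    Hypothetical helper for illustration only.
    """
    old, new = load_rules(old_rules), load_rules(new_rules)
    return [
        path for path in paths
        if old.can_fetch("*", path) and not new.can_fetch("*", path)
    ]


if __name__ == "__main__":
    # Sample paths chosen for illustration.
    sample_paths = [
        "/",
        "/diffusion/",
        "/transactions/detail/PHID-XACT-TASK-mbszbtu6i77llg4/",
        "/auth/",
    ]
    for path in newly_blocked(DEFAULT_RULES, CURRENT_RULES, sample_paths):
        print("newly blocked:", path)
```

Paths such as /diffusion/ do not show up in the result because they were already disallowed by the built-in rules.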