[pcre-dev] Administrivia: TLS; bots and viewvc

Author: Phil Pennock
Date:  
To: pcre-dev
Subject: [pcre-dev] Administrivia: TLS; bots and viewvc
(1)

/var/log/nginx# fgrep -c Bot vcs.pcre.org-access.log
144876
/var/log/nginx# wc -l vcs.pcre.org-access.log
146977 vcs.pcre.org-access.log

That's about 98.6% of logged requests coming from systems identifying
as having 'Bot' in their name. Performance is suffering for everyone.
I am going to take administrative action against "AhrefsBot" and
"YandexBot". Even without a robots.txt I expect some degree of sanity
from crawlers chasing query-parameter links. I believe we've been happy
to have search engines look at the contents of the codebase, but they
need to not be stupid about it. These bots have been stupid.

Well, mostly AhrefsBot. YandexBot we might relent on. We're returning
403 for them both now. The ViewVC access is now responsive for everyone
else.
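
For anyone wanting to see which bots dominate a log like this, here is a
rough sketch of a per-bot tally. It is not what was run on the server.
It assumes nginx's default "combined" log format, where the User-Agent
is the last quoted field. The function name, the regexes and the sample
lines are all made up for illustration. Unlike the fgrep above, it only
looks at the User-Agent field, and only at names ending in "Bot".

```python
import re
from collections import Counter

# Assumption: default nginx "combined" format, where the User-Agent is the
# final double-quoted field on the line. Adjust for a custom log_format.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

# Picks out a word ending in "Bot" (case-sensitive), e.g. "AhrefsBot".
BOT_NAME = re.compile(r'\b(\w*Bot)\b')


def tally_bots(lines):
    """Count requests per bot name; requests with no bot name go under None."""
    counts = Counter()
    for line in lines:
        ua_match = UA_FIELD.search(line)
        user_agent = ua_match.group(1) if ua_match else ""
        bot_match = BOT_NAME.search(user_agent)
        counts[bot_match.group(1) if bot_match else None] += 1
    return counts


# Made-up sample lines (addresses, dates and paths are illustrative only).
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2000:00:00:00 +0000] '
    '"GET /viewvc/?view=log HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; AhrefsBot)"',
    '203.0.113.6 - - [01/Jan/2000:00:00:01 +0000] '
    '"GET /viewvc/?view=annotate HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; AhrefsBot)"',
    '203.0.113.7 - - [01/Jan/2000:00:00:02 +0000] '
    '"GET /viewvc/ HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; YandexBot)"',
    '203.0.113.8 - - [01/Jan/2000:00:00:03 +0000] '
    '"GET /viewvc/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for name, count in tally_bots(SAMPLE).most_common():
        print(name or "(no bot name)", count)
```

To run it against a real log, pass an open file instead of SAMPLE, for
example tally_bots(open("vcs.pcre.org-access.log")).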

(2)

At present, Subversion access is svn: scheme only; a request came in to
consider secure access. We currently use nginx as the webserver, and
Subversion's https support is Apache-only.

To reduce misplaced reports, we are at least now offering https on the
"vcs.exim.org" hostname, which appears to still be in use. The
website is only a redirector, and has been for some time.
https://vcs.pcre.org/ is a ViewVC instance.

Is there a general desire for https-based Subversion access to the
repositories? Under which hostnames?

We _can_ do something like set up Apache behind nginx, but if anyone has
better suggestions, I'd like to hear them.

Please CC me on any responses which require my attention: I'm not a list
subscriber.

Regards,
-Phil