A large number of Tor users report getting hit by infinite CAPTCHA loops when visiting webpages fronted by Cloudflare. This makes them feel punished for using Tor to protect their privacy and prevents them from legitimately accessing websites.
For this project, we would like to measure how often, in practice, Cloudflare-fronted webpages return CAPTCHAs to Tor clients.
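As a rough illustration of what "how often" could mean, the sketch below computes a CAPTCHA rate over a batch of already-collected responses. Everything in it is an assumption made for illustration: the `Observation` record, the helper names, and above all the detection heuristic (an HTTP 403 from a server identifying itself as `cloudflare`, with a body that mentions a CAPTCHA). Real challenge pages may look different and would need to be checked against live responses. Fetching pages over Tor is deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping

# Hypothetical body markers: an assumption, not a verified list of
# strings that Cloudflare challenge pages actually contain.
CAPTCHA_MARKERS = ("captcha", "attention required")


@dataclass
class Observation:
    """One response fetched through a Tor exit (hypothetical record type)."""
    url: str
    status: int
    headers: Mapping[str, str]
    body: str


def served_by_cloudflare(headers: Mapping[str, str]) -> bool:
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return "cloudflare" in lowered.get("server", "").lower()


def looks_like_captcha(obs: Observation) -> bool:
    # Assumed heuristic: Cloudflare-served, status 403, and a CAPTCHA marker
    # somewhere in the body.
    if not served_by_cloudflare(obs.headers):
        return False
    text = obs.body.lower()
    return obs.status == 403 and any(marker in text for marker in CAPTCHA_MARKERS)


def captcha_rate(observations: Iterable[Observation]) -> float:
    # Fraction of Cloudflare-fronted responses classified as CAPTCHAs.
    fronted = [o for o in observations if served_by_cloudflare(o.headers)]
    if not fronted:
        return 0.0
    return sum(looks_like_captcha(o) for o in fronted) / len(fronted)
```

Keeping classification separate from fetching means the heuristic can be refined and re-run over stored responses without repeating the measurements.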
Our proposed approach consists of:
There are two interesting metrics to track over time:
Then there are other interesting patterns to look for:
As this is a new project, to demonstrate your skills and familiarise yourself with it, you may want to:
There is pre-existing research by the Berkeley ICSI group which includes these sorts of checks:
For the original ticket and discussion, please see ticket #33010.