bot_blocker
This module allows site admins to block requests based on:
- Substrings that occur in the user agent of the request.
- Older browser versions.
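To make the two criteria above concrete, here is a minimal sketch in Python of how such checks could work. It is not the module's actual implementation: the names `BLOCKED_SUBSTRINGS`, `MIN_VERSIONS`, and `should_block` are hypothetical, and the substrings and version thresholds are placeholder values, not recommendations.

```python
import re

# Hypothetical settings for illustration only; these are NOT the module's
# real configuration keys or recommended values.
BLOCKED_SUBSTRINGS = ["examplebot", "badcrawler"]
MIN_VERSIONS = {"Chrome": 100, "Firefox": 100}

# Matches "Chrome/<major>" or "Firefox/<major>" anywhere in the user agent.
_VERSION_RE = re.compile(r"(Chrome|Firefox)/(\d+)")


def should_block(user_agent: str) -> bool:
    """Return True if the user agent matches either blocking criterion."""
    # Criterion 1: case-insensitive substring match against the user agent.
    ua_lower = user_agent.lower()
    if any(sub in ua_lower for sub in BLOCKED_SUBSTRINGS):
        return True

    # Criterion 2: major browser version below the configured minimum.
    match = _VERSION_RE.search(user_agent)
    if match:
        browser, major = match.group(1), int(match.group(2))
        return major < MIN_VERSIONS[browser]

    # Unknown user agents are allowed in this sketch.
    return False
```

In this sketch, a user agent containing `ExampleBot` is blocked by the substring rule, and one reporting `Chrome/60` is blocked by the version rule, while `Chrome/120` passes. Allowing unrecognized user agents by default is a design choice that favors fewer false positives over stricter blocking.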
If you have a site that is being excessively crawled on search pages that include facets, see the Facet Bot Blocker module.
Wondering what versions of browsers to block? There is no "one size fits all" answer. The decision is a matter of your organization's policy: how much tolerance you have for false positives versus how much you need to avoid increased hosting costs from bot traffic. The maintainer of this module has generated an analysis of historic traffic by browser version, and its general conclusion is more nuanced than you may expect. There are legitimate reasons why a request may have an older user agent, such as Safari versions pinned to the OS, or the last browser versions supported on Windows 7, which may still be in use despite its end of support.
As the disclaimer in the Facet Bot Blocker module mentions, this module should be considered a last line of defense, for use when no other options are available to you. It is generally better to implement bot mitigation tools in a CDN/WAF layer instead.
2.0.x version
There is now a 2.0.x branch available, which takes a new "threat detection" approach by looking at more data points to identify bots. Feedback is welcome on this branch, and the plan is to tag a release after the community has had time to review it.