Are there any apache server wizards in here? I'm l...
# water-cooler
p
Are there any Apache server wizards in here? I'm looking for a way to limit bots/scrapers from intermittently hammering our sites. I see a lot of talk about using mod_limitipconn and mod_evasive, amongst other things, but neither of those Apache modules looks to have been updated for many years, and a lot of the guides/tutorials are dated as well. Is everyone using things like Cloudflare and hardware routers to handle rate limiting now?
I did consider doing it at application level, and found this post from Charlie Arehart which suggests getting the server or load balancer etc. to do it: https://www.carehart.org/blog/2010/5/21/throttling_by_ip_address
So far the best I've come up with is significantly improved caching and page load speeds to satisfy the hungry bots
(which is not a bad thing at all)
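For anyone curious what the application-level option mentioned above might look like, here is a minimal sketch of per-IP throttling using a token bucket. It is not taken from the linked post; the class name `PerIpRateLimiter` and the `rate_per_sec`, `burst`, and `clock` parameters are hypothetical, chosen only for illustration.

```python
import time


class PerIpRateLimiter:
    """Token bucket per client IP (hypothetical helper, not from the thread)."""

    def __init__(self, rate_per_sec=1.0, burst=5, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second (assumed value)
        self.burst = burst         # maximum tokens an IP can hold (assumed value)
        self.clock = clock         # injectable clock, handy for testing
        self._buckets = {}         # ip -> (tokens, time of last request)

    def allow(self, ip):
        """Return True if this request may proceed, False if it should be throttled."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (float(self.burst), now))
        # Refill according to the time elapsed, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[ip] = (tokens - 1.0, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = PerIpRateLimiter(rate_per_sec=1.0, burst=2)
    for attempt in range(4):
        # 203.0.113.7 is a documentation-only address, used here as a stand-in.
        print(attempt, limiter.allow("203.0.113.7"))
```

A request that gets `False` would typically be answered with HTTP 429. This sketch keeps every IP in memory forever and only works within a single process, which is part of why doing it at the server, load balancer, or CDN tends to be the easier route.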
c
https://www.imperva.com/ these guys are good but not on the cheap side
👍 1
z
i'd say just go cloudflare
☝🏻 1
m
I've used Cloudflare in the past and they are quite effective.
e
besides cloudflare you have mod_throttle, mod_evasive, with mod_rewrite you can do something like
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (gumgum-bot|postmanruntime|ag_dm_spider|scrapy|chimebot) [NC]
RewriteRule .* - [F,L]
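To see which requests the mod_rewrite rule above would reject, the same pattern can be tried out in Python. This is only a rough sketch for testing the regex against sample user agents, not a replacement for the Apache config; the names `BLOCKED_AGENTS` and `is_blocked` are made up for illustration. `re.IGNORECASE` stands in for the `[NC]` flag, and `search` mirrors the unanchored match that `RewriteCond` performs.

```python
import re

# Same alternation as the RewriteCond pattern; IGNORECASE plays the role of [NC].
BLOCKED_AGENTS = re.compile(
    r"(gumgum-bot|postmanruntime|ag_dm_spider|scrapy|chimebot)",
    re.IGNORECASE,
)


def is_blocked(user_agent):
    """Return True if the user agent contains any of the blocked names."""
    # A missing header is treated as an empty string, which never matches.
    return BLOCKED_AGENTS.search(user_agent or "") is not None


if __name__ == "__main__":
    samples = [
        "Scrapy/2.11 (+https://scrapy.org)",
        "PostmanRuntime/7.32.3",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    ]
    for ua in samples:
        print(ua, "->", "403" if is_blocked(ua) else "pass")
```

The third sample passes through, which is the weakness of any user-agent list: a bot that sends an ordinary browser string is not caught.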
r
cloudflare is top notch. We use it for all our public facing sites
p
cloudflare does seem to be the popular choice, I'm having a look at that
I suppose that mitigates any issues with changing the underlying server stack as well, apache+cf today, nginx+node (pardon my language) tomorrow and anything on cloudflare would remain
thanks all
@Evil Ware I've got a variety of things in place to block certain user agents but there's always a new one around the corner, or malicious bots using older but valid browser agents etc
e
I like bots, but it's all in what you do with them that matters. Another lesser-known solution is DNS Made Easy, and the AAAA and SRV records. If you are really, really bored, or this is really an issue for you, you can set up dynamic routing records for a static instance(s) to just handle the bot(s).
👍 1