Hey everyone,
Well, I had an exciting start to my day! Woke up this morning to find thecrazygm.com completely offline. After a bit of digging into the server logs, it became clear that the Out-Of-Memory (OOM) killer had been hard at work, terminating most of the essential services.
It appears the trigger for the memory exhaustion was fail2ban working overtime due to an absolute barrage of malicious connection attempts and probes.
Hardening the Defenses
To try and prevent this from happening again, I've made a few changes:
More Aggressive fail2ban Rules:
I've tweaked my fail2ban configuration to be much less tolerant of suspicious activity. The goal is to block bad actors more quickly and permanently, rather than letting them repeatedly try their luck and chew up resources.
For those interested in the nitty-gritty, here are the relevant snippets from my jail.local:
For UFW (Uncomplicated Firewall) blocks, I'm now using an aggressive filter:
    [ufw]
    enabled=true
    filter=ufw.aggressive
    action=iptables-allports
    logpath=/var/log/ufw.log
    maxretry=1
    bantime=-1
The key things here are maxretry=1 (a single match is enough to trigger a ban) and bantime=-1 (which means a permanent ban).
My ufw.aggressive.conf is simple:
    [Definition]
    failregex = [UFW BLOCK].+SRC=<HOST> DST
    ignoreregex =
This just looks for UFW block messages and bans the source IP.
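A quick way to sanity-check a filter like this is to try the pattern against sample log lines. The sketch below is only an approximation: it swaps fail2ban's <HOST> tag for a simplified IPv4-only group, and the two log lines are made up for illustration. It also shows a subtlety worth knowing: unescaped square brackets form a regex character class, so the pattern as written matches more than just "[UFW BLOCK]" lines, while an escaped variant matches only that tag.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (assumption: IPv4 only).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Pattern as written in the filter: [UFW BLOCK] is a character class here.
loose = re.compile(r"[UFW BLOCK].+SRC=" + HOST + r" DST")
# Escaped brackets match the literal "[UFW BLOCK]" tag only.
strict = re.compile(r"\[UFW BLOCK\].+SRC=" + HOST + r" DST")

# Made-up sample lines in the usual UFW log layout.
blocked = ("May 20 06:25:01 web kernel: [UFW BLOCK] IN=eth0 OUT= "
           "SRC=203.0.113.7 DST=198.51.100.2 PROTO=TCP DPT=22")
audited = ("May 20 06:25:02 web kernel: [UFW AUDIT] IN=eth0 OUT= "
           "SRC=203.0.113.8 DST=198.51.100.2 PROTO=TCP DPT=443")


def source_ip(pattern, line):
    """Return the captured source IP, or None if the line does not match."""
    match = pattern.search(line)
    return match.group("host") if match else None


print(source_ip(strict, blocked))
print(source_ip(strict, audited))
print(source_ip(loose, audited))
```

If the log only ever contains block entries, the two patterns behave the same in practice; the difference matters only when other UFW entry types (such as audit or allow lines) end up in the same file.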
And for SSH, the sshd jail is also set to be pretty strict:
    [sshd]
    backend=systemd
    enabled=true
    filter=sshd
    mode=normal
    port=22
    protocol=tcp
    # logpath=/var/log/auth.log # Using systemd backend
    maxretry=3
    bantime=-1
    ignoreip = 127.0.0.0/8 ::1
Three strikes and they're out, permanently.
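To make the interplay of maxretry and bantime concrete, here is a toy model of a jail. It is a deliberately simplified sketch, not how fail2ban is implemented: the Jail class and its method names are hypothetical, and it ignores findtime, ignoreip, and the actual firewall actions. It only captures the two rules used above: ban once the failure count reaches maxretry, and treat a negative bantime as "never expires".

```python
from collections import defaultdict


class Jail:
    """Toy jail: count failures per IP and ban at maxretry (hypothetical model)."""

    def __init__(self, maxretry, bantime):
        self.maxretry = maxretry
        self.bantime = bantime            # seconds; negative means permanent
        self.failures = defaultdict(int)  # ip -> number of matched log lines
        self.banned_at = {}               # ip -> timestamp of the ban

    def record_failure(self, ip, now):
        """Register one matched log line; return True if it triggers a ban."""
        if ip in self.banned_at:
            return False
        self.failures[ip] += 1
        if self.failures[ip] >= self.maxretry:
            self.banned_at[ip] = now
            return True
        return False

    def is_banned(self, ip, now):
        """Report whether the IP is still banned at time `now`."""
        if ip not in self.banned_at:
            return False
        if self.bantime < 0:
            return True
        return now - self.banned_at[ip] < self.bantime


ufw = Jail(maxretry=1, bantime=-1)   # first blocked packet bans
sshd = Jail(maxretry=3, bantime=-1)  # third failed login bans
```

With these settings, a single record_failure call on the ufw jail bans the address, while the sshd jail tolerates two failures and bans on the third; in both cases is_banned stays true no matter how much time passes.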
Adding Swap Space:
Since the OOM killer was what actually took the services down, I've also added some swap space to the server. While not a replacement for having enough RAM, swap can act as a temporary buffer if memory pressure gets too high, potentially giving the system a chance to recover, or for fail2ban to deal with the abusive connections, before critical services are killed.
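One way to keep an eye on how much headroom the server has is to read /proc/meminfo, which Linux exposes with fields such as MemAvailable and SwapFree. The following is a minimal sketch of that idea; the function names are hypothetical, and it is not part of the setup described here.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_headroom(info):
    """Return (MemAvailable, SwapFree) in kB; missing fields count as 0."""
    return info.get("MemAvailable", 0), info.get("SwapFree", 0)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            available, swap_free = memory_headroom(parse_meminfo(f.read()))
        print(f"MemAvailable: {available} kB, SwapFree: {swap_free} kB")
    except OSError:
        print("/proc/meminfo is not readable on this system")
```

A check like this could run from cron and raise an alert when both numbers get low, which would give some warning before the OOM killer steps in.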
A Quick Heads-Up
These more aggressive fail2ban rules should primarily affect bots, scanners, and anyone trying to brute-force logins or probe for vulnerabilities. In the last 24 hours alone, there were 6,875 such attempts against the server, which is just wild and illustrates why these measures are needed.
Normal users browsing the site shouldn't be affected at all. However, if you do find that you suddenly can't access thecrazygm.com and you suspect you might have been inadvertently caught by a firewall rule (perhaps you were running a script or tool that made a lot of rapid connections?), please reach out to me (e.g., on Discord or elsewhere), and I can check and remove you from the blacklist if needed.
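For anyone running a similar setup, lifting a ban usually comes down to fail2ban-client's "set <jail> unbanip <ip>" command. The helper below is a hypothetical sketch (not necessarily the workflow used on this server) that validates the address first and then builds the command for one of the jails shown earlier.

```python
import ipaddress


def unban_command(jail, ip):
    """Build the argv for `fail2ban-client set <jail> unbanip <ip>`.

    Raises ValueError if `ip` is not a valid IPv4 or IPv6 address.
    """
    addr = ipaddress.ip_address(ip)  # validates and normalizes the address
    return ["fail2ban-client", "set", jail, "unbanip", str(addr)]


# Example: build (but do not run) the command for the sshd jail.
print(unban_command("sshd", "203.0.113.7"))
```

The resulting list can be handed to subprocess.run(cmd, check=True) by a user with the necessary privileges; validating the address up front avoids passing arbitrary text to the command line.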
Hopefully, these changes will lead to a more stable and secure server environment. It's always an ongoing battle!
As always,
Michael Garcia a.k.a. TheCrazyGM