Implementing fail2ban with 128T

Distributing a dynamic blacklist ACL among all network elements.

Note: this is a repost from my old blog. I'm reposting it now because of my continued interest in network security using fail2ban, and because I've recently cooked up a recipe for integrating fail2ban, 128T, and an SSH honeypot to do more network investigation. Enjoy!

During the holiday break I finally finished up a science project I've been cooking for a while now: getting fail2ban working on my home network in conjunction with 128T. As I've mentioned previously, one of my 128T routers is on the public internet; furthermore, this device needs SSH access so I can administer it when I'm neither at home nor at work. I do have key-based authentication enabled, but this doesn't stop tons of scripted brute-force attacks from trying to log in anyhow.

I came across fail2ban while looking for distributed blacklists that I could import into my 128T. My original thought was that I'd find a list of known bad actors compiled by white hats, and programmatically add the entries to my 128T configuration via cron. In searching for such a blacklist, I came upon blocklist.de, which is a community-sourced list of devices that have been detected as malicious by the app "fail2ban." Well, I thought, why not just take it straight from the source? I decided to get fail2ban running on my own system.

Conceptually, my own fail2ban implementation was to be like a microcosm of blocklist.de's. My routers running fail2ban would contribute their blocked entries to my 128T authority's configuration, and this configuration would get pushed to all routers. Thus, if any one of my routers detected a bad actor, the ban would propagate to all of my routers. Not bad! The challenge was to get it running on my system and interfacing with 128T.

This turned out to be a fun exercise, as I finally got to use some of our published Python libraries to do some NETCONF work. I also learned a bit about the guts of fail2ban, and how to get it to hand over the IPs to my Python script for NETCONFing over to my conductor. Last, I learned how to authenticate a user script on a router that needs to feed information into a conductor. This will be a handy tool in the toolbelt.

In essence, this project consisted of a small number of steps:

  1. Get fail2ban installed on my router. (Made easy thanks to yum.)
  2. Write a Python script for adding/removing IP addresses from a 128T tenant over NETCONF. This was tricky, since it was my first time -- and my scripting skills are quite rusty.
  3. Figure out how to authenticate a NETCONF client running on a 128T router to a server running on the 128T conductor. Most of our code samples assume the NETCONF client is running on localhost, so I needed to futz around with authorized_keys, user names, etc. for a while before I struck upon the right sequence.
  4. Write a fail2ban action.d script that invokes my Python script. Easy peasy.
  5. Change rsyslog.conf on my router to write log messages to /var/log/secure (disabled by 128T, as it typically manages rsyslog for you).
  6. Disable my 128T's native host-service for SSH and create a custom service for SSH on my WAN port. This is because 128T's native host-service uses a KNI interface that forcibly enables source-nat. This hamstrings fail2ban, since the only source IP address it would ever see would be 169.254.127.127.
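
To make steps 2 and 4 a little more concrete, here's a minimal sketch of the kind of script fail2ban could call. This is not the script from my project: the XML namespace (urn:example:128t-config) is a made-up placeholder, and the authority/tenant/member/address nesting is only my assumption based on how the tenant looks in the CLI. The one piece that is standard is the NETCONF "operation" attribute from RFC 6241, which I use with "remove" so that unbanning an address that's already gone doesn't raise an error. The sketch only builds the edit-config payload; sending and committing it is left to whatever NETCONF client you use.

```python
import argparse
import ipaddress
import xml.etree.ElementTree as ET

# NETCONF base namespace (RFC 6241); carries the "operation" attribute.
NC_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
# HYPOTHETICAL placeholder; substitute the namespace of the real data model.
T128_NS = "urn:example:128t-config"

ET.register_namespace("nc", NC_NS)
ET.register_namespace("t128", T128_NS)


def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {ns}tag form."""
    return "{%s}%s" % (ns, tag)


def build_edit_config(action, ip, tenant="blacklist-fail2ban",
                      neighborhood="internet"):
    """Build a <config> payload that adds or removes one tenant address.

    The element hierarchy is an assumption modeled on the CLI output,
    not a documented schema.
    """
    # A bare host address becomes a /32 (IPv4) or /128 (IPv6) prefix.
    prefix = str(ipaddress.ip_network(ip))

    config = ET.Element(q(NC_NS, "config"))
    authority = ET.SubElement(config, q(T128_NS, "authority"))
    tenant_el = ET.SubElement(authority, q(T128_NS, "tenant"))
    ET.SubElement(tenant_el, q(T128_NS, "name")).text = tenant
    member = ET.SubElement(tenant_el, q(T128_NS, "member"))
    ET.SubElement(member, q(T128_NS, "neighborhood")).text = neighborhood
    address = ET.SubElement(member, q(T128_NS, "address"))
    address.text = prefix

    if action == "unban":
        # "remove" is ignored silently if the entry does not exist.
        address.set(q(NC_NS, "operation"), "remove")
    return ET.tostring(config, encoding="unicode")


def main(argv):
    """Parse 'ban <ip>' or 'unban <ip>' and print the resulting payload."""
    parser = argparse.ArgumentParser(description="fail2ban to NETCONF sketch")
    parser.add_argument("action", choices=["ban", "unban"])
    parser.add_argument("ip")
    args = parser.parse_args(argv)
    payload = build_edit_config(args.action, args.ip)
    print(payload)
    return payload
```

On the fail2ban side, the action.d file's actionban and actionunban commands would invoke something like this with fail2ban's <ip> tag as the second argument (the script's name and location are up to you), and the script's entry point would call main(sys.argv[1:]).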

After a bit of fussing, I got it all working on Monday. Now, when fail2ban detects a new attacker, it shows up right away in my configuration. Here's the status from the fail2ban client:

[root@labsystem2 log]# fail2ban-client status sshd
Status for the jail: sshd
|- Filter
|  |- Currently failed:    1
|  |- Total failed:    28
|  `- Journal matches:    _SYSTEMD_UNIT=sshd.service + _COMM=sshd
`- Actions
   |- Currently banned:    3
   |- Total banned:    3
   `- Banned IP list:    174.77.73.152 212.83.138.18 39.111.209.51

And here's the tenant that contains these addresses.

admin@labsystem1.fiedler (tenant[name=blacklist-fail2ban])# show
name         blacklist-fail2ban
description  "Blacklist maintained by fail2ban"   

member       internet
    neighborhood  internet
    address       174.77.73.152/32
    address       212.83.138.18/32
    address       39.111.209.51/32
exit
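
If you want to cross-check the two outputs above, the mapping is simple: each banned IP becomes a /32 address on the tenant. Here's a small sketch that pulls the banned list out of the fail2ban-client status text and produces the prefixes you'd expect to see in the tenant. The function name is mine, and it assumes the "Banned IP list:" label appears exactly as in the output above.

```python
import ipaddress


def banned_prefixes(status_text):
    """Extract banned IPs from 'fail2ban-client status <jail>' output.

    Returns them as host prefixes (e.g. '1.2.3.4/32'), in the order listed.
    """
    label = "Banned IP list:"
    for line in status_text.splitlines():
        if label in line:
            # Everything after the label is a whitespace-separated IP list.
            ips = line.split(label, 1)[1].split()
            return [str(ipaddress.ip_network(ip)) for ip in ips]
    return []


sample = """Status for the jail: sshd
`- Actions
   |- Currently banned:    3
   |- Total banned:    3
   `- Banned IP list:    174.77.73.152 212.83.138.18 39.111.209.51
"""
print(banned_prefixes(sample))
```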

This tenant is denied access to the self-made SSH service I created. But because all of my routers now include this tenant definition, this blacklist can be used anywhere in my network, for any purpose.

While I'm pretty happy with my proof-of-concept (and it's working wonderfully!), there is plenty of room for improvement. First, each time fail2ban adds a new address to the blacklist, the 128T configuration has to be committed. If I'm in the middle of provisioning new services on my router when my NETCONF script executes, the commit could fail due to validation errors. Second, I can apply these same principles to "native" 128T logging by creating custom fail2ban filters, and trap events in highwayManager or fastLane to add bad actors.
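
One cheap way to soften the first problem would be to retry a failed commit after a pause, on the theory that a half-finished provisioning session will eventually be completed or discarded. Here's a minimal sketch of that idea, not something I've wired in. The commit argument is a stand-in for whatever callable your NETCONF client provides, and catching a bare Exception is a simplification since I'm not assuming any particular client's error types.

```python
import time


def commit_with_retry(commit, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call commit(), retrying with exponential backoff when it raises.

    Waits base_delay, then 2x, 4x, ... between tries. The final failure
    is re-raised so the caller (e.g. fail2ban) can log it.
    """
    for attempt in range(attempts):
        try:
            return commit()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing sleep in as a parameter is just there to make the function easy to exercise without actually waiting. A sturdier fix would be to batch several bans into one commit, but that's a bigger change.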

This should be fun :-)