In this part of the series I explain how to protect web servers. The examples use the Apache HTTPd web server; similar configurations are also available for other servers, e.g. nginx. There are various reasons for protecting the web server itself (the web applications installed on it will be described in one of the next posts). Attackers scan the server for vulnerabilities, for example, or try to attack it with invalid input.
HTTPd authorization failures
Overview
When pages are accessed without authorization, web servers can log failures such as HTTP_UNAUTHORIZED (401), HTTP_METHOD_NOT_ALLOWED (405) or HTTP_FORBIDDEN (403). Frequent errors of this kind from the same client can indicate an attacker probing for a way to trick the web server. A typical error message could look like the following:
[auth_basic:error] [pid 1234] [client xxx.xxx.xxx.xxx] AH01617: user test123: authentication failure for "/test/file.html": Password Mismatch, referer: https://example.org/test/file.html
This can be handled using the apache-auth filter.
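To make the idea more concrete, here is a minimal Python sketch of how the client address and user name could be pulled out of such a line. The regular expression is my own simplified assumption for illustration (IPv4 only); it is not the pattern shipped in the apache-auth filter, which covers many more message variants. The address 192.0.2.10 is a documentation placeholder.

```python
import re

# Simplified, hypothetical pattern for illustration only (IPv4 only).
# The real apache-auth filter in Fail2ban is considerably more thorough.
AUTH_FAILURE = re.compile(
    r"\[client (?P<host>\d{1,3}(?:\.\d{1,3}){3})(?::\d+)?\] "
    r"AH01617: user (?P<user>\S+): authentication failure"
)


def parse_auth_failure(line):
    """Return (host, user) for an authentication failure line, else None."""
    match = AUTH_FAILURE.search(line)
    if match is None:
        return None
    return match.group("host"), match.group("user")


sample = (
    '[auth_basic:error] [pid 1234] [client 192.0.2.10] AH01617: '
    'user test123: authentication failure for "/test/file.html": '
    'Password Mismatch, referer: https://example.org/test/file.html'
)
print(parse_auth_failure(sample))  # ('192.0.2.10', 'test123')
```

Fail2ban does essentially this kind of matching for every new log line and remembers the extracted address.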
Configuration
[apache-auth]
port = http,https
logpath = %(apache_error_log)s
enabled = true
Overflow attacks
Overview
By sending invalid data, attackers can try to break into the server or the application running on it. An example log entry can look like the following:
[error] [client xxx.xxx.xxx.xxx] Invalid method in request \x80g\x01\x03\x01
In this case someone or something sent an encrypted (SSL/TLS) handshake to a port that expects plain HTTP, e.g. an https request sent to port 80 instead of port 443, so the web server cannot interpret the binary data as a request method.
Configuration
[apache-overflows]
port = http,https
logpath = %(apache_error_log)s
maxretry = 2
enabled = true
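To illustrate what maxretry = 2 means in practice, the following Python sketch counts matching lines per client and reports every client that reaches the limit. It is only an approximation under my own assumptions: the pattern is hypothetical and simplified, the function name hosts_to_ban is made up, and the time window that Fail2ban applies (findtime) is ignored completely.

```python
import re
from collections import Counter

# Hypothetical, simplified pattern for illustration only (IPv4 only).
INVALID_REQUEST = re.compile(
    r"\[client (?P<host>[\d.]+)(?::\d+)?\] Invalid method in request"
)


def hosts_to_ban(lines, maxretry=2):
    """Return the sorted hosts with at least `maxretry` matching lines."""
    hits = Counter()
    for line in lines:
        match = INVALID_REQUEST.search(line)
        if match:
            hits[match.group("host")] += 1
    return sorted(host for host, count in hits.items() if count >= maxretry)


log = [
    r"[error] [client 192.0.2.10] Invalid method in request \x80g\x01\x03\x01",
    r"[error] [client 192.0.2.10] Invalid method in request \x80g\x01\x03\x01",
    r"[error] [client 198.51.100.7] Invalid method in request \x80g\x01\x03\x01",
]
print(hosts_to_ban(log))  # ['192.0.2.10']
```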
Using ModSecurity
Overview
ModSecurity is an open-source web application firewall that is available for several web servers, including Apache HTTPd. Using configurable rules, it identifies invalid application behaviour and blocks it. The freely available Core Rule Set follows OWASP best practices. ModSecurity needs to be configured and loaded, which I do not describe in this post. An example error entry could look as follows:
TBD
Configuration
[apache-modsecurity]
enabled = true
port = http,https
logpath = %(apache_error_log)s
maxretry = 2
Shellshock
Overview
The Shellshock attack is one of the most severe attacks discovered in the recent past. It allows attackers to use vulnerable versions of bash to execute arbitrary commands, which can give the attacker access rights to the server. A good explanation of how the Shellshock attack works can be found here.
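As a rough illustration of what such an attack looks like on the wire, the Python sketch below checks request header values for the characteristic start of a bash function definition, "() {". This is only my own simplified assumption about the payload shape; the function name looks_like_shellshock is made up, and the Fail2ban filter itself works on log lines, not on raw headers.

```python
def looks_like_shellshock(header_value):
    """True if a header value begins with a bash function definition."""
    return header_value.lstrip().startswith("() {")


# Hypothetical request headers; the User-Agent carries a typical payload.
headers = {
    "User-Agent": "() { :; }; /bin/cat /etc/passwd",
    "Accept": "text/html",
}
suspicious = [
    name for name, value in headers.items() if looks_like_shellshock(value)
]
print(suspicious)  # ['User-Agent']
```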
A typical log entry could look as follows:
TBD
Configuration
[apache-shellshock]
enabled = true
port = http,https
logpath = %(apache_access_log)s
maxretry = 1
Blocking unwanted bots
Overview
Some bots crawl your site not to index it but, for example, to harvest email addresses for spamming. Fail2ban maintains a list of bots marked as “bad”, so you can easily restrict their access to your site.
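The principle can be sketched in a few lines of Python: take the user agent (the last quoted field of an access log line) and compare it against a list of known names. The list entries below are invented placeholders and the function name is_bad_bot is hypothetical; the filter shipped with Fail2ban brings its own, much longer list.

```python
import re

# Invented placeholder names; not the list used by Fail2ban.
BAD_BOTS = ("ExampleHarvester", "ExampleMailGrabber")

# Captures the client host and the last quoted field (the user agent).
LOG_LINE = re.compile(r'^(?P<host>\S+) .* "(?P<agent>[^"]*)"$')


def is_bad_bot(line, bad_bots=BAD_BOTS):
    """True if the user agent of an access log line contains a listed name."""
    match = LOG_LINE.match(line)
    if match is None:
        return False
    agent = match.group("agent")
    return any(bot in agent for bot in bad_bots)


bad = ('192.0.2.10 - - [01/Jan/2024:00:00:00 +0000] '
       '"GET / HTTP/1.1" 200 512 "-" "ExampleHarvester/1.0"')
good = ('198.51.100.7 - - [01/Jan/2024:00:00:01 +0000] '
        '"GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"')
print(is_bad_bot(bad), is_bad_bot(good))  # True False
```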
Configuration
[apache-badbots]
enabled = true
port = http,https
logpath = %(apache_access_log)s
bantime = 172800
maxretry = 1
In this case the bantime is set to 172800 seconds (48 hours, i.e. 2 days) instead of the standard time of 24 hours (1 day) described in the basic configuration.
Fake Google bots
Overview
Some bots pretend to be a Google bot, but they are not. On the one hand, these bots try to hide their real identity, which a “good” bot would not do. On the other hand, some web applications handle requests from Google differently, e.g. by granting more access than a normal user would have. A typical log entry can look as follows:
xxx.xxx.xxx.xxx - - [<date>] "GET /index.php HTTP/1.1" 404 5490 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Configuration
[apache-fakegooglebot]
port = http,https
logpath = %(apache_access_log)s
maxretry = 1
ignorecommand = %(ignorecommands_dir)s/apache-fakegooglebot <ip>
enabled = true
If activated, the filter initially treats every client that claims to be a Google bot as fake. So that the real Google bots can still index the site, Fail2ban calls the script /etc/fail2ban/filter.d/ignorecommands/apache-fakegooglebot, which performs a host name resolution for the IP found in the log file. All Google bots belong to the domain “googlebot.com”; if the resolved name does not match, the IP belongs to a fake Google bot and is banned. Otherwise it is a valid one, and the IP is ignored rather than banned.
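The host name check can be sketched in Python as follows. This is not the script shipped with Fail2ban, only a minimal illustration under my own assumptions: the function names are made up, only IPv4 is handled, and the additional forward lookup (resolving the name back to the IP) is an extra safety step I added that the description above does not mention. The resolver functions are passed in as parameters so the logic can be tried without network access.

```python
import socket


def reverse_lookup(ip):
    """Return the host name for an IP address, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(host):
    """Return the list of IPv4 addresses for a host name (may be empty)."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []


def is_real_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True if the IP resolves to *.googlebot.com and the name maps back."""
    host = reverse(ip)
    if not host or not host.endswith(".googlebot.com"):
        return False
    # Assumption: confirm the name with a forward lookup as well.
    return ip in forward(host)


# Offline demonstration with stubbed resolvers and placeholder addresses.
fake_dns = {"192.0.2.10": "crawl-192-0-2-10.googlebot.com"}
print(is_real_googlebot("192.0.2.10", reverse=fake_dns.get,
                        forward=lambda host: ["192.0.2.10"]))   # True
print(is_real_googlebot("198.51.100.7", reverse=fake_dns.get,
                        forward=lambda host: []))               # False
```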