LogSense generates reports and statistics from Ruby on Rails and Apache/Nginx log files.
Main features:
- Statistics for Rails apps in production and web server logs (combined format, which can be produced by both Apache and Nginx)
- Reports on performance, errors, visitors, and devices used to access your websites and webapps[fn:: LogSense also parses the data generated by the BrowserInfo gem, providing additional information for Rails apps, including devices, platforms and number of accesses to methods by device type.].
- Can combine one or more log files
- No need for cookies or other tracking technologies (but you need access to your log files)
- Filters let you analyze specific periods and distinguish traffic generated by self polls and crawlers.
- Reports can be generated in HTML, txt, ufw, and SQLite. HTML reports are responsive and come with dark and light themes.
LogSense is written in Ruby, runs from the command line, and can be installed on any system with a relatively recent version of Ruby. We use it with Ruby 3.1.4 and 3.3.0.
It is fast. On a ThinkPad P16, a 277M log file is parsed in 15 seconds (about 7740 events per second); a 569M log file is parsed in 50 seconds (about 4700 events per second).
LogSense understands the Rails production log and generates the following reports in TXT and HTML:
- Daily Distribution
- Time Distribution
- Statuses
- Statuses by Day
- Rails Performance
- Controller and Methods by Device
- Fatal Events
- Fatal Events (grouped by type)
- Job Error
- Job Errors (grouped)
- Browsers
- Platforms
- IPs
- Countries
- IP per hour
- Sessions
LogSense reads the Apache/Nginx combined log format and generates the following reports in TXT and HTML:
- Time Distribution
- 20x and 30x on HTML pages
- 20x and 30x on other resources
- 40x and 50x on HTML pages
- 40x and 50x on other resources
- 40x and 50x on HTML pages by IP
- 40x and 50x on other resources by IP
- Statuses
- Statuses by Day
- Browsers
- Platforms
- IPs
- Countries
- IP per hour
- Combined Platform Data
- Referers
- Sessions
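As an illustration of what reading the combined log format involves, here is a minimal sketch in Ruby. The regular expression, the `parse_combined` method, and the field names are assumptions made for this example; they are not LogSense's actual parser, and escaped quotes inside fields are not handled.

```ruby
require "time"

# Illustrative pattern for one "combined" log line (not LogSense's own).
COMBINED_LINE = %r{
  \A(?<ip>\S+)\s+\S+\s+(?<user>\S+)\s+
  \[(?<time>[^\]]+)\]\s+
  "(?<request>[^"]*)"\s+
  (?<status>\d{3})\s+(?<size>\d+|-)\s+
  "(?<referer>[^"]*)"\s+"(?<user_agent>[^"]*)"
}x

# Returns a Hash describing the request, or nil if the line does not match.
def parse_combined(line)
  m = COMBINED_LINE.match(line)
  return nil unless m

  method, path, _protocol = m[:request].split(" ", 3)
  {
    ip: m[:ip],
    time: Time.strptime(m[:time], "%d/%b/%Y:%H:%M:%S %z"),
    method: method,
    path: path,
    status: m[:status].to_i,
    size: m[:size] == "-" ? 0 : m[:size].to_i,
    referer: m[:referer],
    user_agent: m[:user_agent]
  }
end

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] ' \
       '"GET /apache_pb.gif HTTP/1.0" 200 2326 ' \
       '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
p parse_combined(line)
```

Lines that do not match return nil, so a caller can count or skip malformed entries instead of failing on them.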
The ufw output format generates directives for Uncomplicated Firewall, blacklisting IPs requesting URLs matching a given pattern.
We use it to blacklist IPs requesting WordPress login pages on our websites… since we don’t use WordPress for our websites.
$ log_sense -f apache -t ufw -i apache.log
# /users/sign_in/xmlrpc.php?rsd
ufw deny from
# /wp-login.php /wordpress/wp-login.php /blog/wp-login.php /wp/wp-login.php
ufw deny from
...
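The idea behind the ufw output can be sketched as follows. This is a simplified illustration, not LogSense's implementation: the `ufw_directives` method and the `entries` structure (an Array of Hashes with `:ip` and `:path` keys) are hypothetical, and the IP addresses come from the ranges reserved for documentation.

```ruby
# Build ufw directives for IPs that requested URLs matching a pattern.
# `entries` is a hypothetical Array of Hashes with :ip and :path keys.
def ufw_directives(entries, pattern: /php/)
  entries
    .select { |entry| pattern.match?(entry[:path]) }
    .group_by { |entry| entry[:ip] }
    .flat_map do |ip, hits|
      urls = hits.map { |entry| entry[:path] }.uniq.join(" ")
      # One comment line listing the offending URLs, then the deny rule.
      ["# #{urls}", "ufw deny from #{ip}"]
    end
end

entries = [
  { ip: "192.0.2.10", path: "/wp-login.php" },
  { ip: "192.0.2.10", path: "/blog/wp-login.php" },
  { ip: "198.51.100.7", path: "/index.html" }
]
puts ufw_directives(entries)
```

Grouping by IP keeps one rule per address, and the comment line records why each address was selected, which makes the generated file easier to review before it is applied.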
gem install log_sense
If you want to collect information about browsers, platforms and devices when generating Rails reports, add the browser gem to your bundle and the following code to application_controller.rb
# Gemfile
gem "browser"
# application_controller.rb
class ApplicationController < ActionController::Base
  # [...]
  before_action do |controller|
    user_agent = request.env['HTTP_USER_AGENT']
    ip = request.env['REMOTE_ADDR']
    hashed_ip = Digest::SHA256.hexdigest ip
    b = Browser.new(user_agent)
    now = DateTime.now
    logger = Rails.logger
    browser_data = [
      b.name,
      b.platform,
      b.device.name,
      controller.class.name,
      controller.action_name,
      request.format.symbol,
      hashed_ip,
      now
    ]
    browser_data_str = browser_data.map { |x| "\"#{x}\"" }.join(',')
    logger.info "BrowserInfo: #{browser_data_str}"
  end
  # [...]
end
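To show how a line written by the code above could be read back, here is a minimal sketch. The `parse_browser_info` method, the field names, and the log prefix in the sample line are assumptions made for illustration; LogSense's own parser may work differently.

```ruby
# Field names mirror the order of browser_data in the before_action.
BROWSER_INFO_FIELDS = %i[
  browser platform device controller action format hashed_ip time
].freeze

# Returns a Hash of fields, or nil if the line carries no BrowserInfo data.
def parse_browser_info(line)
  marker = "BrowserInfo: "
  start = line.index(marker)
  return nil unless start

  # Values are written as "a","b",...; quotes inside values are not escaped
  # by the logging code, so a simple scan is enough for this sketch.
  values = line[(start + marker.length)..].scan(/"([^"]*)"/).flatten
  return nil unless values.size == BROWSER_INFO_FIELDS.size

  BROWSER_INFO_FIELDS.zip(values).to_h
end

# Hypothetical log line; the prefix depends on your logger configuration.
line = 'I, [2024-01-01T10:00:00.000000 #123]  INFO -- : BrowserInfo: ' \
       '"Chrome","Linux","Unknown","PagesController","index","html",' \
       '"abc123","2024-01-01T10:00:00+00:00"'
p parse_browser_info(line)
```

Because only a SHA-256 digest of the address is logged, records from the same visitor can be grouped without keeping the raw IP in the BrowserInfo line.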
log_sense --help
Usage: log_sense [options] [logfile ...]
        --title=TITLE                Title to use in the report
    -f, --input-format=FORMAT        Log format (stored in log or sqlite3): rails or apache (DEFAULT: apache)
    -i, --input-files=file,file,     Input file(s), log file or sqlite3 (can also be passed as arguments)
    -t, --output-format=FORMAT       Output format: html, txt, sqlite, ufw (DEFAULT: html)
    -o, --output-file=OUTPUT_FILE    Output file. (DEFAULT: STDOUT)
    -b, --begin=DATE                 Consider only entries after or on DATE
    -e, --end=DATE                   Consider only entries before or on DATE
    -l, --limit=N                    Limit to the N most requested resources (DEFAULT: 100)
    -w, --width=WIDTH                Maximum width of long columns in textual reports
    -r, --rows=ROWS                  Maximum number of rows for columns with multiple entries in textual reports
    -p, --pattern=PATTERN            Pattern to use with ufw report to select IP to blacklist (DEFAULT: php)
    -c, --crawlers=POLICY            Decide what to do with crawlers (applies to Apache Logs)
        --no-selfpoll                Ignore self poll entries (requests from ::1; applies to Apache Logs) (DEFAULT: false)
        --no-geo                     Do not geolocate entries (DEFAULT: true)
        --verbose                    Inform about progress (output to STDERR) (DEFAULT: false)
    -v, --version                    Prints version information
    -h, --help                       Prints this help

This is version 2.0.0

Output formats:

- rails: txt, html, sqlite3, ufw
- apache: txt, html, sqlite3, ufw
log_sense -f apache -i access.log -t txt > access-data.txt
log_sense -f rails -i production.log -t html -o performance.html
LogSense focuses on privacy, data-ownership, and simplicity: no need to install JavaScript snippets, no tracking cookies, just plain and simple log analysis.
LogSense is also inspired by static website generators: statistics are generated from the command line and accessed as static HTML files. This significantly reduces the attack surface of your web server and installation headaches. We have a cron job running on our servers, generating statistics at night. The generated files are then made available in a private area on the web and rotated monthly.
Log poisoning is a technique whereby attackers send requests with unvalidated user input to forge log entries or inject malicious content into the logs.
log_sense sanitizes the entries of HTML reports to try to protect against log poisoning. Log entries and URLs in SQLite3 tables, however, are not sanitized: they are stored exactly as read from the log. This is generally not an issue, unless you use the unsanitized data from SQLite in environments where URLs can be opened or code can be executed with the URLs as arguments.
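If you read the unsanitized SQLite data and display it in your own HTML pages, escaping each value first is one way to limit the risk. The sketch below uses `ERB::Util.html_escape` from Ruby's standard library; the `sanitize_for_html` helper is a hypothetical name for this example and is not necessarily how log_sense sanitizes its own reports.

```ruby
require "erb"

# Escape the characters that are significant in HTML (&, <, >, " and ')
# so a value taken from a log is rendered as text, not as markup.
def sanitize_for_html(value)
  ERB::Util.html_escape(value.to_s)
end

# A request path an attacker might send hoping it ends up in a report.
poisoned = '/search?q=<script>alert("x")</script>'
puts sanitize_for_html(poisoned)
```

Escaping at output time keeps the stored data faithful to the log while making it safe for this one context; other contexts (shell commands, SQL, URLs to open) need their own handling.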
See the CHANGELOG file.
LogSense should run on any system on which a recent version of Ruby runs. We tested it with Ruby 2.6.9, Ruby 3.0.x, and Ruby 3.3.x.
- HTML reports use Zurb Foundation, Data Tables, and Apache ECharts
- The textual format is compatible with Org Mode and can be further processed to any format Org Mode can be exported to, including HTML and PDF, with the word of warning in the section above concerning log poisoning.
The code implements a pipeline, with the following steps:
- Parser: parses a log to a SQLite3 database. The database contains a table with a list of events and, in the case of Rails reports, a table with the errors.
- Aggregator: takes a SQLite DB as input and aggregates data, typically performing “group by” operations, which are simpler to write in Ruby than in SQL. The module outputs a Hash containing the different reporting data.
- GeoLocator: adds country information to all the reporting data that has an IP as one of the fields.
- Shaper: turns (geolocated) aggregated data (e.g. Hashes and such) into Arrays of Arrays, simplifying the structure of the code that builds the reports.
- Emitter: generates reports from shaped data using ERB.
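The steps after parsing can be illustrated with a toy version of the pipeline. Everything here is a simplified assumption: events live in an in-memory Array rather than in SQLite3, the country lookup is a stub table, and the method names (`aggregate`, `geolocate`, `shape`, `emit`) are chosen for this sketch and do not reflect LogSense's actual modules.

```ruby
require "erb"

# Hypothetical parsed events; LogSense stores these in SQLite3 instead.
EVENTS = [
  { ip: "192.0.2.10", status: 200 },
  { ip: "192.0.2.10", status: 404 },
  { ip: "198.51.100.7", status: 200 }
].freeze

# Aggregator: "group by" operations done in Ruby; returns a Hash.
def aggregate(events)
  {
    statuses: events.group_by { |e| e[:status] }.transform_values(&:size),
    ips: events.group_by { |e| e[:ip] }.transform_values(&:size)
  }
end

# GeoLocator: attaches a country to data keyed by IP (stub lookup table).
COUNTRIES = { "192.0.2.10" => "IT" }.freeze

def geolocate(data)
  ips = data[:ips].to_h do |ip, hits|
    [ip, { hits: hits, country: COUNTRIES.fetch(ip, "Unknown") }]
  end
  data.merge(ips: ips)
end

# Shaper: turns Hashes into Arrays of Arrays (one inner Array per row).
def shape(data)
  {
    statuses: data[:statuses].sort,
    ips: data[:ips].map { |ip, info| [ip, info[:hits], info[:country]] }
  }
end

# Emitter: renders the shaped rows with an ERB template.
TEMPLATE = ERB.new(<<~TXT, trim_mode: "-")
  Statuses
  <%- rows[:statuses].each do |status, hits| -%>
  <%= status %>: <%= hits %>
  <%- end -%>
TXT

def emit(rows)
  TEMPLATE.result_with_hash(rows: rows)
end

puts emit(shape(geolocate(aggregate(EVENTS))))
```

Keeping each step as a function from plain data to plain data means the templates only ever iterate over rows, which is the simplification the Shaper step is meant to provide.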
See todo.org
We have been running LogSense for quite a few years with no particular issues. There are no known bugs; there is an unknown number of unknown bugs.
You are most welcome to report issues and missing features, using the Issue tracker.
LogSense is distributed under the terms of the MIT License.
Geolocation is made possible by dbip’s IP to City database, released under a CC license.
The world map is distributed under the terms of the MIT License by Pareto Software, Simplemaps.com. It is used in LogSense with some changes to the class names and ids.