Log collection is essential to properly analyzing issues in production. An interface to search and be notified about exceptions on all your servers is a must. Well, if you have one server, you can easily ssh to it and check the logs, of course, but for larger deployments, collecting logs centrally is way more preferable than logging to 10 machines in order to find “what happened”.
There are many options to do that, roughly separated in two groups – 3rd party services and software to be installed by you.
3rd party (or “cloud-based” if you want) log collection services include Splunk,Loggly, Papertrail, Sumologic. They are very easy to setup and you pay for what you use. Basically, you send each message (e.g. via a custom logback appender) to a provider’s endpoint, and then use the dashboard to analyze the data. In many cases that would be the preferred way to go.
In other cases, however, company policy may frown upon using 3rd party services to store company-specific data, or additional costs may be undesired. In these cases extra effort needs to be put into installing and managing an internal log collection software. They work in a similar way, but implementation details may differ (e.g. instead of sending messages with an appender to a target endpoint, the software, using some sort of an agent, collects local logs and aggregates them). Open-source options include Graylog, FluentD, Flume, Logstash.
After a very quick research, I considered graylog to fit our needs best, so below is a description of the installation procedure on AWS (though the first part applies regardless of the infrastructure).
The first thing to look at are the ready-to-use images provided by graylog, including docker, openstack, vagrant and AWS. Unfortunately, the AWS version has two drawbacks – it’s using Ubuntu, rather than the Amazon AMI. That’s not a huge issue, although some generic scripts you use in your stack may have to be rewritten. The other was the dealbreaker – when you start it, it doesn’t run a web interface, although it claims it should. Only mongodb, elasticsearch and graylog-server are started. Having 2 instances – one web, and one for the rest would complicate things, so I opted for manual installation.
Graylog has two components – the server, which handles the input, indexing and searching, and the web interface, which is a nice UI that communicates with the server. The web interface uses mongodb for metadata, and the server uses elasticsearch to store the incoming logs. Below is a bash script (CentOS) that handles the installation. Note that there is no “sudo”, because initialization scripts are executed as root on AWS.