What’s so hard about logging? Throw a couple of “printfs” into the code and you’re done. Well, that’ll generate output, but it won’t generate usable logs. The first commandment of logging is, “Thou shalt not write log by yourself.” Relying on an established logging framework instead helps ensure that your logging works well with the rest of the system. Even when you use a tested framework to generate your logs, it isn’t easy to keep the logs from causing problems.
Effective logging requires thoughtful planning and consistent tuning and maintenance. There are more than three reasons why logging can be hard, but the three biggest challenges are keeping logs meaningful, managing their size, and securing them.
Arguably, the biggest logging challenge is making sure your logs contain meaningful information and that they’re easy to use. However, meaningful is a subjective characteristic.
To start, you need to decide to whom the content should be meaningful. For example, should your logs be readable by support personnel at a glance? If so, you will probably want to reduce the number of debug statements before going into production to make it easier for a support person to find and read the information that is meaningful to them. If the log will be fed into an analytics program, the messages can be more abundant and more compactly formatted, letting the software derive what is meaningful from the data.
Regardless, using a consistent log entry format will help ensure the logs are usable both by the people who need to read the raw data and by log analyzers. Every log entry should have a similar structure, with the date and time in the same format.
Finally, logs should be used, not just captured and stored. Logs give you important information about the health of your systems. If you’re not using them, you’re guaranteed to miss indicators of how well (or poorly) your systems are serving your teams and your customers.
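As a minimal sketch of what a consistent entry format can look like, the example below uses Python’s standard `logging` module. The field order, the ISO 8601 UTC timestamp, and the names `LOG_FORMAT`, `DATE_FORMAT`, `build_logger`, and the `orders` logger are illustrative assumptions, not a prescribed layout.

```python
import io
import logging
import time

# Hypothetical layout: timestamp, level, logger name, message (always in this order).
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s %(message)s"
DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_logger(name, stream):
    """Return a logger whose entries all share one structure and timestamp format."""
    formatter = logging.Formatter(fmt=LOG_FORMAT, datefmt=DATE_FORMAT)
    formatter.converter = time.gmtime  # render every timestamp in UTC
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


buffer = io.StringIO()
log = build_logger("orders", buffer)
log.info("order accepted id=%s", 1042)
log.warning("payment retry attempt=%d", 2)
lines = buffer.getvalue().splitlines()
```

Because every entry starts with the same fixed-width timestamp followed by the level and logger name, a person can scan the file by eye and a parser can split each line on the same positions.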
Log files take up space, often a lot of it, but obviously file size can’t be your primary deciding factor in choosing what to log. It’s unproductive to focus on trying to reduce the size of log files.
Ultimately, you don’t definitively know what information will be important and what information will be useless until you actually need it to resolve an unexpected problem in your production environment. If you log at too high a level, you won’t capture enough information. If you log at too low a level, log files can become enormous, with a lot of extraneous details that hide the important data, making it hard to find the signal in the resulting noise.
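To make the trade-off concrete, here is a small sketch using Python’s standard `logging` module. The helper `capture` and the messages are hypothetical; it emits one message per severity and reports which ones survive a given threshold.

```python
import io
import logging


def capture(level):
    """Emit one message at each severity and return the levels the threshold kept."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s"))

    logger = logging.getLogger(f"demo.{level}")
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.propagate = False
    logger.setLevel(level)

    logger.debug("cache lookup key=user:17")
    logger.info("request served")
    logger.warning("slow response")
    logger.error("upstream timed out")
    return stream.getvalue().split()


high_threshold = capture(logging.ERROR)  # ['ERROR']
low_threshold = capture(logging.DEBUG)   # ['DEBUG', 'INFO', 'WARNING', 'ERROR']
```

With the threshold at `ERROR`, the context leading up to the failure is lost; with it at `DEBUG`, everything is kept and the one error has to be found among the rest. Making the threshold configurable at deploy time is one common way to tune this without changing code.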
Managing these log files is also complex. You should have a detailed log management policy to document how your organization will retain logs. You also need to define a retention policy in accordance with compliance regulations. Accordingly, there should be a plan in place for backing up and moving files off primary storage when appropriate.
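Part of such a policy can be expressed in the logging configuration itself. The sketch below assumes a hypothetical 30-day retention period and uses the standard library’s `TimedRotatingFileHandler` to start a new file each day and delete the oldest ones. The names `RETENTION_DAYS` and `build_retained_logger` are illustrative, and backing up or moving files off primary storage would still need a separate process.

```python
import logging
import logging.handlers
import os
import tempfile

RETENTION_DAYS = 30  # hypothetical value; derive yours from compliance requirements


def build_retained_logger(log_path, retention_days=RETENTION_DAYS):
    """Rotate the log at midnight UTC and keep at most `retention_days` old files."""
    handler = logging.handlers.TimedRotatingFileHandler(
        log_path,
        when="midnight",
        backupCount=retention_days,  # older rotated files are deleted automatically
        utc=True,
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger("retained")
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger, handler


# A temporary directory stands in for the real log location in this sketch.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")
log, handler = build_retained_logger(log_path)
log.info("service started")
handler.close()
```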
As with any other sensitive files or systems, you need to proactively assign and manage permissions for log files, granting access only to the individuals who need the logs for support, analytics, or other business purposes.
Beyond ensuring that the right stakeholders (and only those individuals) have read access, you need to ensure that the log files cannot be modified to hide malicious activity. Michael Cobb wrote in ComputerWeekly, “No matter how extensive your logging, log files are worthless if you cannot trust their integrity. The first thing most hackers will do is try to alter log files to hide their presence.”
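One way to make tampering detectable, sketched here under stated assumptions, is to chain entries together with a keyed hash so that changing or removing an earlier entry invalidates every tag after it. The functions `chain_entries` and `verify_chain` are hypothetical helpers built on Python’s standard `hmac` and `hashlib` modules. The scheme only helps if the key is stored where an intruder on the logging host cannot read it, and it does not detect entries truncated from the end of the log.

```python
import hashlib
import hmac


def _tag(key, previous_tag, entry):
    """HMAC-SHA256 over the previous tag plus the current entry."""
    return hmac.new(key, previous_tag + entry.encode("utf-8"), hashlib.sha256).hexdigest()


def chain_entries(entries, key):
    """Return (entry, tag) pairs where each tag also covers the preceding tag."""
    previous = b""
    chained = []
    for entry in entries:
        tag = _tag(key, previous, entry)
        chained.append((entry, tag))
        previous = tag.encode("ascii")
    return chained


def verify_chain(chained, key):
    """Recompute every tag in order; any edited or dropped entry breaks the chain."""
    previous = b""
    for entry, tag in chained:
        if not hmac.compare_digest(_tag(key, previous, entry), tag):
            return False
        previous = tag.encode("ascii")
    return True


key = b"example-key-kept-off-the-logging-host"  # placeholder, not a real secret
records = chain_entries(["login user=ana", "sudo user=ana", "logout user=ana"], key)
intact = verify_chain(records, key)

tampered = list(records)
tampered[1] = ("sudo user=nobody", records[1][1])  # entry altered, old tag reused
detected = not verify_chain(tampered, key)
```

This makes the log tamper-evident rather than tamper-proof; shipping entries to a separate, write-restricted system remains the stronger protection.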
You also need to be careful when writing logs to avoid accidentally writing sensitive cleartext data into the log files. While it’s obvious that things like passwords, account numbers, and social security numbers shouldn’t be written to logs, there’s other data that may not seem sensitive but may still be subject to compliance rules.
The new European Union General Data Protection Regulation (GDPR) makes organizations responsible for securing the personal information of individuals in the European Union. The rule also requires businesses to correct or delete data upon request, so having these personal details saved within log files greatly complicates GDPR compliance.
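A last line of defense is to scrub known sensitive patterns before an entry is written. The sketch below attaches a `logging.Filter` to a handler. The two patterns and the `RedactingFilter` class are illustrative assumptions, and a pattern list like this will never catch everything, so it complements rather than replaces care at the call site.

```python
import io
import logging
import re

# Hypothetical patterns; extend them to match the data your systems handle.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PASSWORD_PATTERN = re.compile(r"(password=)\S+", re.IGNORECASE)


class RedactingFilter(logging.Filter):
    """Mask sensitive values in the fully formatted message before it is emitted."""

    def filter(self, record):
        message = record.getMessage()  # merge the arguments into the message first
        message = SSN_PATTERN.sub("[REDACTED-SSN]", message)
        message = PASSWORD_PATTERN.sub(r"\1[REDACTED]", message)
        record.msg = message
        record.args = ()  # arguments are already merged into msg
        return True  # keep the record, now scrubbed


buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(message)s"))
handler.addFilter(RedactingFilter())  # on the handler, so every record it writes is scrubbed

log = logging.getLogger("auth")
log.handlers.clear()
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

log.info("login user=%s password=%s ssn=%s", "ana", "hunter2", "123-45-6789")
written = buffer.getvalue().strip()
```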
Most of us really don’t relish spending our time and other resources on logging. It’s a necessary evil of running complex systems and infrastructure, not a core business activity. William Louth wrote on Medium, “Logging, metric monitoring and distributed tracing aren’t solutions they’re redefinitions, a substitution, of the original problem — that of observability.”
Until we’ve got a better way to see into our systems, we’ll keep struggling to write, manage, and use logs effectively.