Logs.
I know.
Logs are boring.
The mere thought of scanning through hundreds or thousands of lines of a log is enough to bring even the most diligent among us to tears.
Yet logs are one of the most valuable sources of information when we need to determine why something isn’t working in an application.
And that’s why it’s a good idea to incorporate logging in all your applications, because stepping through code with a debugger or, even worse, dynamically adding write statements is not the best use of time or resources.
But there’s more to logging than simply writing out unstructured data at various points in a program. Much more.
First, we need to determine how, what and how much to log.
Then we need the ability to semi-automatically analyze logs rather than manually scanning through all that information.
So let’s get started by discussing the “how.”
How to Log (Part One)
The important point when it comes to “how to log” is to ensure you do it in a structured manner. Simply writing out free-form text doesn’t cut it in today’s environments, where data volumes are massive.
Even if you can find people willing to sift through oceans of text, the sheer enormity of the task will quickly drive them to boredom. And following on the heels of boredom come mistakes and missed signals.
On the other hand, injecting structure into a log allows automated analysis to come into play, makes finding things easier and lets you home in on problems faster.
That’s because you can query for data (such as searching for certain dates, times or user IDs) to eliminate unnecessary entries. And because writing a structured log doesn’t take much more effort than writing an unstructured one, it makes tons of sense to go structured all the way, every day.
At the very least all log records should contain a timestamp, a logging level (such as error or info) and the message.
But extra information can also come in handy when attempting to pinpoint a specific problem, so if an error is being logged and a user’s ID and IP address are available, it rarely hurts to add them to the log.
Plus, if you standardize on labels, you set the stage for using automated tools to search, filter and analyze log entries. So ensure labels are in a standard format and used throughout the organization when possible.
The key is to ensure your logs can be easily read and digested by machines, rather than humans. And since there are existing structured formats available (such as JSON and XML) that enjoy wide support and have readily available parsers, you don’t have to invent anything – just reuse what’s already there.
A decade ago the go-to format was XML, but today it’s JSON. If you have a choice, I’d recommend JSON, but XML is still an acceptable (if somewhat more verbose) format.
So let’s look at an example.
{
"time_stamp": "12/02/2020 14:42:48",
"level": "error",
"message": "An error occurred"
}
Notice how each value (e.g. error) has a label (e.g. level). This allows you to search for specific levels and ignore others. But it also allows an automated process to parse the file to find, say, error level entries, bundle them up and send an email to a support person. The options are truly limitless when you structure logs the right way.
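To make that concrete, here’s a minimal sketch of the kind of automated processing a structured log enables. It’s written in Python purely for illustration (the idea is language-agnostic), and it assumes each entry is written as one JSON object per line to a file named app.log; the file name and field names simply mirror the example above.

import json

def find_errors(log_path):
    # Collect every entry whose "level" field is "error".
    errors = []
    with open(log_path) as log_file:
        for line in log_file:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if entry.get("level") == "error":
                errors.append(entry)
    return errors

# Print the timestamp and message of each error entry.
for entry in find_errors("app.log"):
    print(entry["time_stamp"], entry["message"])

From there it’s a short step to bundling those entries into an email or feeding them to a dashboard, none of which is practical with free-form text.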
Of course there is no International Log Standards Body, so feel free to format logs as you see fit, but always ensure they are easy for computers to parse and process.
What and How Much to Log
Obviously you should log exceptions and other unexpected events.
However, it really doesn’t hurt to log a variety of items that may help find problems in the future. Examples include the times users log in and out, calls to network services, important methods invoked, input parameters and such.
Keep in mind that while too much logging may impact performance (though usually not by much), if you’ve implemented a good logging structure you won’t have to worry about being overwhelmed by the amount of data, because your automated tools will guide you to the important parts with very little effort.
So log whatever you feel may be useful (but don’t log sensitive information, such as passwords, because logs tend to stay around for long periods of time and may not be encrypted).
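If you’re worried about sensitive values sneaking in anyway, one simple precaution is to scrub known sensitive fields before an entry is written. Here’s a minimal sketch (again in Python for illustration; the list of sensitive field names is just an assumption for the example).

SENSITIVE_FIELDS = {"password", "credit_card", "session_token"}  # example field names

def scrub(entry):
    # Replace the value of any sensitive field with a placeholder.
    return {key: ("***" if key in SENSITIVE_FIELDS else value)
            for key, value in entry.items()}

print(scrub({"level": "info", "message": "login ok", "password": "hunch42"}))
# {'level': 'info', 'message': 'login ok', 'password': '***'}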
Structure of a Log Entry
As I’ve already mentioned, log entries should have standardized labels so they can be processed by machines. In addition, they should be flexible enough to allow the addition of custom fields to the standard ones so specific non-standard data can also be captured.
Look at the following log entry.
{
"time_stamp": "12/02/2020 14:42:48",
"level": "error",
"message": "An error occurred",
"userID": "marvin",
"ipAddress": "10.9.42.0"
}
See how it includes two additional fields (userID and ipAddress) that aren’t part of the standard entry? This allows us to search for these custom labels if we need to find a specific user. When correlated with a time range, it significantly narrows down the information we have to look through.
So don’t ignore custom fields; they offer a powerful way to lead you directly to problems.
But you can also use custom fields for analytics: processing specific log entries to aggregate data and calculate summaries and statistics that tell you interesting things, such as the average length of time users are on the system, which modules they use most or how many users are logged in at a certain time.
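As a hypothetical example of that kind of analysis, here’s a small Python sketch that counts log entries per userID in a JSON-lines log file (the file name and field name simply match the earlier examples).

import json
from collections import Counter

def entries_per_user(log_path):
    # Count log entries per userID; entries without a userID are skipped.
    counts = Counter()
    with open(log_path) as log_file:
        for line in log_file:
            entry = json.loads(line)
            if "userID" in entry:
                counts[entry["userID"]] += 1
    return counts

print(entries_per_user("app.log").most_common(5))

The same pattern extends to timing sessions or counting module usage once those fields are in the log.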
Log data can do much more than just illuminate bugs and problems.
How to Log (Part Two)
Alrighty then. We’ve now covered the basics of logging, so let’s dive into the mechanics.
Yes, of course you can use the humble write statement to output your log entry, or even custom-build a logging function inside your application, but there is a better way.
What we’d really like is something that’s easy to use, flexible and allows output to a variety of different places (such as the terminal, files, HTTP endpoints or FTP servers).
It would also be neat if we could choose the output format (such as JSON or XML) and easily change formats on a whim.
And the icing on the cake would be to choose the format based on the output endpoint (perhaps we’re sending log entries to a server over HTTP and want the format to be XML while at the same time we want to write the entry to a local file in JSON).
Oh… and what about selectively logging to different endpoints based on level (so error entries go to a local file while info entries go to a server over HTTP and warnings go to a server via FTP)?
Now that would be cool. But how can all this be done?
Using a logging library… that’s how.
As part of my series on Functional Programming (FP) with Magik, I’ve written a logging library based on the FP paradigm. I’ve also hooked it into The Observer Pattern described in a previous article.
Here’s a brief overview.
Magik> zlog << zaphod_logger()
Magik> console_t << zaphod_transport(console_transport, _unset, xml_log_formatter)
Magik> console_o << beeble_observable.new(zlog).subscribe(_proc(e) console_t(e.data) _endproc)
Magik> zlog(:error, "An error occurred")
<?xml version="1.0" encoding="UTF-8" ?>
<logdata>
<time_stamp>12/02/2020 15:07:16</time_stamp>
<level>error</level>
<message>An error occurred</message>
</logdata>
Magik>
Before we dive into what’s happening here, let’s define a few terms.
- Logger: a procedure that takes a message and processes it so it’s ready for output.
- Transport: a procedure that receives the output from a logger, applies a desired format (via a formatter) and outputs the log entry to the appropriate destination (such as the terminal, a file or an HTTP endpoint).
- Formatter: a procedure that creates a specific output format (such as JSON or XML) from the internal format generated by the logger.
Since everything’s a procedure, we gain a lot of flexibility with configuration options because we can simply replace one procedure with another to obtain different functionality. For example, if we want XML output instead of JSON, we pass in the XML formatter procedure rather than the JSON one.
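If Magik isn’t your daily driver, the shape of the design still translates. Here’s a rough conceptual sketch in Python of those three roles, just to show how composing plain procedures buys that flexibility. It is not the library’s actual code, and it wires transports directly into the logger instead of going through the Observer pattern the Magik version uses.

import json
from datetime import datetime

def make_logger(transports):
    # Logger: builds the internal record and hands it to every transport.
    def log(level, message):
        record = {"time_stamp": datetime.now().isoformat(),
                  "level": level,
                  "message": message}
        for transport in transports:
            transport(record)
    return log

def json_formatter(record):
    # Formatter: turns the internal record into a specific output format.
    return json.dumps(record)

def console_transport(formatter):
    # Transport: applies a formatter and sends the result to a destination.
    def transport(record):
        print(formatter(record))
    return transport

log = make_logger([console_transport(json_formatter)])
log("error", "An error occurred")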
With those definitions out of the way, let’s analyze the commands.
In the first command we create a new logger. The second command defines a new transport that will send output to the console (i.e. the terminal) and use the XML formatter to write out XML.
In the third command we create an Observable based on the new logger. We then subscribe, giving it an unnamed callback procedure that simply invokes the transport (created in the second command) on the data passed to the callback (note the data are wrapped in an event, so we use e.data to extract the actual data).
At this point, when we log a message (the final command), the output is sent to the terminal in XML format, as shown in the XML output that follows the call.
Pretty simple.
Okay, let’s get a bit fancier.
Magik> file_t << zaphod_transport(file_transport, property_list.new_with(:file_name, "D:\beeble\logtext.txt"), json_log_formatter)
Magik> file_o << beeble_observable.new(zlog).subscribe(_proc(e) file_t(e.data) _endproc)
Magik> zlog(:error, "An error occurred")
<?xml version="1.0" encoding="UTF-8" ?>
<logdata>
<time_stamp>12/02/2020 15:20:48</time_stamp>
<level>error</level>
<message>An error occurred</message>
</logdata>
Magik>
Hmmm… nothing appeared to change. But is that really the case? Let’s look at the contents of the logtext.txt file.
{ "time_stamp": "12/02/2020 15:20:48", "level": "error", "message": "An error occurred" }
How about that? Logging the error message sent a log entry to the terminal in XML format as well as an entry to the logtext.txt file in JSON format. That’s because the first command set up a file transport, told it to use the json_log_formatter and pointed it at a file named logtext.txt. The second command then set up a new observable, again based on zlog, subscribed to it and passed the file transport to its callback.
At this point we have two observers watching for changes emitted by zlog. So when we log a message (the zlog call), the original observer (console_o) sends an XML-formatted entry to the terminal and the new observer (file_o) sends a JSON-formatted entry to the logtext.txt file.
Conceptually we could add as many observers as we want and each would format the entry as required and send it over the requested transport (such as HTTP or FTP).
But there’s more.
Remember those logging levels? What if we wanted to send all error messages to a file while logging all info messages to the terminal? How difficult would that be?
Let’s find out…
Logging Based on Levels
It’s late and dark outside with a deadline looming. Our application is misbehaving in a big way and we’re running short on time. The application is throwing all sorts of errors and we need help. We have to fix some minor issues with the UI, but need outside help with the backend errors.
What shall we do?
Well, to fix the minor UI problems, we’d like to display info level entries on the terminal where we can interactively work with them. But we want to log those pesky error entries to a file so we can email it to someone else.
Therefore we do this…
Magik> file_options << property_list.new()
Magik> file_options[:file_name] << "D:\beeble\logtext.txt"
Magik> file_options[:log_levels] << {:error}
Magik> file_t << zaphod_transport(file_transport, file_options, json_log_formatter)
Magik> file_o << beeble_observable.new(zlog).subscribe(_proc(e) file_t(e.data) _endproc)
Magik> console_options << property_list.new_with(:log_levels, {:info})
Magik> console_t << zaphod_transport(console_transport, console_options, xml_log_formatter)
Magik> console_o << beeble_observable.new(zlog).subscribe(_proc(e) console_t(e.data) _endproc)
Magik>
Alright… let’s analyze what we’ve done.
Just so you know, options are passed to a transport via a property_list. So for the file_options we set up the file name and, via the :log_levels option, the log level of error. This tells the transport to only log a message if its associated level is error; all other levels will not be logged by this transport.
We do something similar for the console_options (this time we’re telling the transport to only log messages of level info and no others).
We then create the file transport (passing in the file_options property_list) and the console transport (passing console_options), and create and subscribe to a new observable for each.
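Conceptually, level-based filtering is just a thin wrapper around a transport that drops anything outside the allowed set. A minimal Python sketch of the idea (not the library’s implementation), using print as a stand-in transport:

def leveled_transport(transport, allowed_levels):
    # Wrap a transport so it only forwards records at the allowed levels.
    def filtered(record):
        if record["level"] in allowed_levels:
            transport(record)
    return filtered

errors_only = leveled_transport(print, {"error"})
errors_only({"level": "info", "message": "ignored"})        # dropped
errors_only({"level": "error", "message": "written out"})   # printed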
At this point we’re ready to go. When we log a message with a level of info, it should only be sent to the terminal (not the file) and when we log a message with a level of error, it should only go to the file (not the terminal).
So let’s do that and see if it works.
Magik> zlog(:info, "This is an info-level message")
<?xml version="1.0" encoding="UTF-8" ?>
<logdata>
<time_stamp>12/02/2020 21:03:59</time_stamp>
<level>info</level>
<message>This is an info-level message</message>
</logdata>
Magik>
You’ll have to take my word for it that nothing was logged to the file, but we can see the XML-formatted log message displayed on the terminal. So far, so good.
Now let’s do the same thing with an error level message.
Magik> zlog(:error, "This is an error-level message")
Magik>
Nothing was logged to the terminal. Let’s look at the file…
{ "time_stamp": "12/02/2020 21:06:22", "level": "error", "message": "This is an error-level message" }
And sure enough, we see the log message. What we’ve just done is set up two transports: one logs info level messages to the terminal in XML format while the other logs error level messages to a file named logtext.txt in JSON format.
Both are triggered, via the Observer Pattern, whenever we invoke the zlog procedure.
Further, we passed in predefined procedures to configure output format (json_log_formatter and xml_log_formatter) and transports (console_transport and file_transport).
Other formatter and transport procedures can easily be written to support custom formats or other types of transports (such as HTTP and FTP for example).
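As a flavour of what writing a custom formatter involves, here’s a tiny key=value (“logfmt”-style) formatter sketched in Python; an actual Magik formatter would of course be a Magik procedure taking the library’s internal record instead.

def logfmt_formatter(record):
    # Render the record as space-separated key="value" pairs.
    return " ".join('%s="%s"' % (key, value) for key, value in record.items())

print(logfmt_formatter({"level": "error", "message": "An error occurred"}))
# level="error" message="An error occurred"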
That’s all quite useful, but let’s make things even better and revisit what I said earlier about adding custom fields to a message.
How would we do that?
Logging Custom Fields
Well… it’s as simple as…
Magik> zlog(:add, "userID", "trillian")(:add, "ipAddress", "10.9.42.0")(:info, "Info message")
{ "time_stamp": "12/02/2020 10:18:04", "level": "info", "message": "Info message", "userID": "trillian", "ipAddress": "10.9.42.0" }
Magik>
And if we want to remove the custom fields…
Magik> zlog(:remove, "userID")(:remove, "ipAddress")(:info, "Another Info message")
{ "time_stamp": "12/02/2020 10:19:08", "level": "info", "message": "Another Info message" }
Magik>
Just like that we can send the file of errors to someone else while we look at the info level problems on the screen. Plus we can add additional fields that allow everyone to glean further insight into the issues, thereby fixing the problems and meeting the deadline.
Whew!
And that wraps up our overview of logging.
Excelsior!