Logging – Best Practices

  1. Use key-value pairs.

Using key-value pairs in logs makes parsing the data much easier.  This ensures that a program, and not just a human being, can make sense of the data being logged.

Example:

key1=value1, key2=value2, key3=value3
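
A minimal sketch of the same idea using Python’s standard logging module (the logger name and field values are illustrative):

  import logging

  logging.basicConfig(level=logging.INFO,
                      format="%(asctime)s level=%(levelname)s %(message)s")
  log = logging.getLogger("orders")

  # Emit the event as key=value pairs so both humans and parsers can read it.
  order_id = "A1234"   # hypothetical value
  log.info("event=order_failed order_id=%s retries=%d", order_id, 3)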

  2. Create events that humans can read.

Avoid using complex encoding that would require lookups to make event information intelligible. For example, if logs are in a binary format, provide tools to easily convert them to a human-readable (ASCII) format. Don’t use a format that requires an arbitrary code to decipher it. And, don’t use different formats in the same file—split them out into individual files instead.

  3. Use timestamps for every event.

The correct time is critical to understanding the proper sequence of events. Timestamps are essential for debugging, analytics, and deriving transactions. A short formatting sketch follows the list below.

  • Use the most verbose time granularity possible.
  • Put the timestamp at the beginning of the line. The farther you place a timestamp from the beginning, the more difficult it is to tell it’s a timestamp and not other data.
  • Include a four-digit year.
  • Include a time zone, preferably a GMT/UTC offset.
  • Time should be rendered in microseconds in each event. The event could become detached from its original source file at some point, so having the most accurate data about an event is ideal.
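
As a sketch of such a timestamp in Python (the formatter class name is illustrative), a custom formatter can put an ISO 8601, microsecond-precision, UTC-offset timestamp at the start of every line:

  import logging
  from datetime import datetime, timezone

  class UTCMicrosecondFormatter(logging.Formatter):
      """Render the timestamp first: four-digit year, microseconds, UTC offset."""
      def formatTime(self, record, datefmt=None):
          return datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(
              timespec="microseconds")

  handler = logging.StreamHandler()
  handler.setFormatter(UTCMicrosecondFormatter("%(asctime)s %(levelname)s %(message)s"))
  log = logging.getLogger("timestamps")
  log.addHandler(handler)
  log.setLevel(logging.INFO)
  log.info("event=startup")   # e.g. 2024-05-01T09:30:15.123456+00:00 INFO event=startup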

  4. Use unique identifiers (IDs), such as user IDs and transaction IDs.

Unique identifiers such as transaction IDs and user IDs are tremendously helpful when debugging, and even more helpful when you are gathering analytics. Unique IDs can point you to the exact transaction. Without them, you might only have a time range to use. When possible, carry these IDs through multiple touch points and avoid changing the format of these IDs between modules. That way, you can track transactions through the system and follow them across machines, networks, and services.
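
One hedged way to carry these IDs through every event in Python is to pass them via the logging module’s extra mapping (the field names and values below are illustrative):

  import logging, uuid

  logging.basicConfig(level=logging.INFO,
                      format="%(asctime)s txn=%(txn_id)s user=%(user_id)s %(message)s")
  log = logging.getLogger("payments")

  # The same IDs are attached to every event for this transaction.
  ids = {"txn_id": str(uuid.uuid4()), "user_id": "u-42"}
  log.info("event=charge_started amount=19.99", extra=ids)
  log.info("event=charge_completed", extra=ids)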

  5. Log in text format.

Avoid logging binary information because it is difficult to search or analyze meaningfully. Binary logs might seem preferable because they are compressed, but they require decoding and cannot be broken into searchable segments. Instead of logging binary data directly, place textual metadata in the event so that you can search through it. For example, don’t log the binary data of a JPG file, but do log its image size, creation tool, username, camera, GPS location, and so on.
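
A small Python sketch of the JPG example above, logging searchable metadata rather than the bytes themselves (the metadata values are made up for illustration):

  import logging

  logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
  log = logging.getLogger("uploads")

  # Log metadata about the binary object, never the binary payload itself.
  photo = {"name": "IMG_0042.jpg", "bytes": 2483112, "camera": "X100V",
           "width": 6240, "height": 4160}
  log.info("event=photo_uploaded file=%(name)s size=%(bytes)d camera=%(camera)s "
           "dimensions=%(width)dx%(height)d", photo)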

  6. Use structured and developer-friendly logging formats.

Developers like to receive a stream of data over HTTP/S when possible, with the data structured so that it can be easily processed. Developer-friendly formats such as JSON are readable by both humans and machines.
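
For instance, a minimal JSON formatter can be sketched with Python’s standard logging and json modules (the field names chosen here are just one reasonable layout):

  import json, logging

  class JsonFormatter(logging.Formatter):
      """Emit each event as a single JSON object per line."""
      def format(self, record):
          return json.dumps({
              "time": self.formatTime(record),
              "level": record.levelname,
              "logger": record.name,
              "message": record.getMessage(),
          })

  handler = logging.StreamHandler()
  handler.setFormatter(JsonFormatter())
  log = logging.getLogger("api")
  log.addHandler(handler)
  log.setLevel(logging.INFO)
  log.info("request served")   # {"time": "...", "level": "INFO", "logger": "api", "message": "request served"}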

  7. Log more than just debugging events.

Put semantic meaning in events to get more out of your data. Log audit trails, what users are doing, transactions, timing information, and so on. Log anything that can add value when aggregated, charted, or further analyzed. In other words, log anything that is interesting to the business.

  8. Categorize the event.

Use the severity values TRACE, DEBUG, INFO, WARN, ERROR, and FATAL. A usage sketch follows the list below.

  • TRACE – Designates finer-grained informational events than DEBUG.
  • DEBUG – Designates fine-grained informational events that are most useful to debug an application.
  • INFO – Designates informational messages that highlight the progress of the application at a coarse-grained level.
  • WARN – Designates potentially harmful situations.
  • ERROR – Designates error events that might still allow the application to continue running.
  • FATAL – Designates very severe error events that will presumably lead the application to abort.
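
As a rough Python sketch of using these categories (Python’s standard logging module has no built-in TRACE level and treats FATAL as an alias of CRITICAL, so TRACE is added here as a custom level):

  import logging

  TRACE = 5                      # below DEBUG (10); an assumption, not a standard level
  logging.addLevelName(TRACE, "TRACE")
  logging.basicConfig(level=TRACE, format="%(levelname)s %(message)s")
  log = logging.getLogger("worker")

  log.log(TRACE, "entering inner loop")        # TRACE
  log.debug("cache size is 128")               # DEBUG
  log.info("job 42 finished")                  # INFO
  log.warning("retrying flaky endpoint")       # WARN
  log.error("could not write checkpoint")      # ERROR
  log.critical("out of disk space, aborting")  # FATAL (CRITICAL in Python)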

  9. Identify the source of the log event.

Include the source of the log event, such as the class, function, or filename.
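
In Python, for example, the standard format fields %(name)s, %(module)s, %(funcName)s, and %(lineno)d record the source automatically:

  import logging

  logging.basicConfig(
      level=logging.INFO,
      format="%(asctime)s %(levelname)s %(name)s %(module)s:%(funcName)s:%(lineno)d %(message)s")
  log = logging.getLogger(__name__)

  def load_config():
      log.info("config loaded")   # the record shows the module, function, and line number

  load_config()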

  10. Keep multi-line events to a minimum.

Multi-line events generate a lot of segments, which can affect indexing and search speed, as well as disk compression. Consider breaking multi-line events into separate events.
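
One way to do this in Python, sketched below, is to collapse an exception traceback onto a single line before logging it (the delimiter is an arbitrary choice):

  import logging
  import traceback

  logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
  log = logging.getLogger("tasks")

  try:
      1 / 0
  except ZeroDivisionError:
      # log.exception() would emit a multi-line traceback; collapsing it keeps
      # the whole event on one line, which indexes and searches more cleanly.
      log.error("event=task_failed trace=%s",
                traceback.format_exc().replace("\n", " | "))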

Operational Best Practices

  1. Log to local files first.

Logging to a local file provides a local buffer, so you aren’t blocked if the network goes down. The logs can then be forwarded to a remote, secure log server once the local file has been written, either in batch mode or as close to the time the log is created as required.
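
A minimal sketch of local-file-first logging in Python (the file path is illustrative; shipping the file to a central server is left to a separate forwarder or batch job):

  import logging

  handler = logging.FileHandler("app.log")   # local buffer; nothing blocks on the network
  handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
  log = logging.getLogger("myapp")
  log.addHandler(handler)
  log.setLevel(logging.INFO)
  log.info("event=service_started")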

  2. Use log rotation policies.

Logs can take up a lot of space. Compliance regulations may require you to keep years of archives, but you don’t want to fill up the file system on your production machines. So set up good rotation strategies and decide whether to destroy or back up your logs; this can be handled by housekeeping scripts.
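
If rotation is handled inside the application rather than by housekeeping scripts, Python’s standard handlers can sketch both strategies (the sizes and counts below are placeholders, not recommendations):

  import logging
  from logging.handlers import RotatingFileHandler, TimedRotatingFileHandler

  # Rotate by size: keep at most 10 backups of roughly 50 MB each.
  size_handler = RotatingFileHandler("app.log", maxBytes=50 * 1024 * 1024, backupCount=10)

  # Or rotate by time: start a new file every midnight and keep 30 days of history.
  time_handler = TimedRotatingFileHandler("app.log", when="midnight", backupCount=30)

  log = logging.getLogger("myapp")
  log.addHandler(size_handler)   # pick one strategy per log file
  log.setLevel(logging.INFO)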

  3. Collect data from as many sources as possible; this gives the log reviewer, or an automated tool, a bigger and fuller picture:
  • Operating System logs
  • Database logs
  • Network logs
  • Application logs
  • Batch Execution logs
  • Configuration files
  • Performance data (iostat, vmstat, ps, etc.)

References:

http://dev.splunk.com/view/logging-best-practices/SP-CAAADP6

https://journal.paul.querna.org/articles/2011/12/26/log-for-machines-in-json/

https://en.wikipedia.org/wiki/Log4j

http://www.tutorialspoint.com/log4j/log4j_logging_levels.htm

http://arctecgroup.net/pdf/howtoapplogging.pdf

JSON

JSON is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.  JSON is an easier-to-use alternative to XML.  JSON is a text format that is completely language independent.

JSON is built on two structures:

  • A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
  • An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures.
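
For example, a small Python sketch showing both structures in a single log event (the field values are invented):

  import json

  # The event itself is a collection of name/value pairs; "tags" is an ordered list.
  event = {
      "time": "2024-05-01T09:30:15.123456+00:00",
      "level": "INFO",
      "message": "user logged in",
      "tags": ["auth", "web", "success"],
  }
  line = json.dumps(event)
  print(line)                        # one self-describing event per line
  print(json.loads(line)["tags"])    # and trivially parsed back by a machine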

JSON is like XML because:

  • Both JSON and XML are “self-describing” (human readable)
  • Both JSON and XML are hierarchical (values within values)
  • Both JSON and XML can be parsed and used by many programming languages
  • Both JSON and XML can be fetched with an XMLHttpRequest

JSON is unlike XML because:

  • JSON doesn’t use end tags
  • JSON is shorter
  • JSON is quicker to read and write
  • JSON can use arrays
  • JSON is much easier to parse than XML

References:

http://www.json.org/

http://www.w3schools.com/json/default.asp

http://programmers.stackexchange.com/questions/170522/logging-in-json-effect-on-performance
