Adventures in GELF

December 8, 2025 · 747 words · 4 min

When you run applications in containers, the easiest logging method is to write on standard output. You can't get simpler than that: just print your messages to stdout, and let your container engine deal with the rest.

If your application is very terse, or if it serves very little traffic (because it has three users, including you and your dog), you can certainly run your logging service in-house. There are plenty of tutorials out there that might give you the false idea that running your own logging cluster is all unicorns and rainbows. It isn't. Therefore, you certainly want the possibility to send your logs to people who will deal with the complexity (and pain) that comes with real-time storing, indexing, and querying of semi-structured data. It's worth mentioning that these people can do more than just manage your logs: some systems are particularly suited to extracting insights from errors (think traceback dissection), and many modern tools go well beyond simple storage and search.

By default, the Docker Engine will capture the standard output (and standard error) of all your containers, and write it to files using the JSON format (hence the name of the default logging driver: `json-file`). When you use the `json-file` driver, you get two problems: log files grow without bound (no rotation is enabled by default), and the logs stay on the local disk of each host, which doesn't help when you want a consolidated view. The first issue can easily be fixed by giving the `max-size` and `max-file` options to the `json-file` driver, enabling log rotation. The second one is exactly why other logging drivers exist.

Alright: you can start developing (and even deploying) with the default `json-file` driver, and switch later. Docker supports a number of other logging drivers, including `syslog`, `journald`, `fluentd`, `awslogs`, `splunk`, and `gelf`. I'm going to stop the list here, because GELF has a few features that make it particularly interesting.

GELF stands for Graylog Extended Log Format. It was initially designed for the Graylog logging system. If you haven't heard about Graylog before, it's an open source project that pioneered "modern" logging systems. With the syslog protocol, a log message is mostly a raw string, with very little metadata.
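By contrast with those raw strings, a GELF message is a small JSON document. The sketch below builds and sends one over UDP, following the GELF 1.1 payload spec (version, host, short_message, timestamp, syslog-style level, underscore-prefixed custom fields). The helper names and the collector address `127.0.0.1:12201` are mine, chosen for illustration:

```python
import json
import socket
import time
import zlib

def gelf_payload(short_message, host="example-host", **extra_fields):
    """Build a GELF 1.1 message: a flat JSON document with a few
    well-known fields plus arbitrary custom fields (which the spec
    requires to be prefixed with an underscore)."""
    message = {
        "version": "1.1",
        "host": host,                  # who emitted the message
        "short_message": short_message,
        "timestamp": time.time(),      # seconds since the epoch
        "level": 6,                    # syslog severity: informational
    }
    message.update({"_" + key: value for key, value in extra_fields.items()})
    # GELF/UDP payloads may be sent raw or zlib/gzip-compressed;
    # collectors detect the encoding from the first bytes.
    return zlib.compress(json.dumps(message).encode("utf-8"))

def send_gelf(payload, address=("127.0.0.1", 12201)):
    """Fire-and-forget: one UDP datagram per message, no acknowledgment."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)

# Roughly the kind of record a GELF emitter produces for one log line.
send_gelf(gelf_payload("hello GELF", container_name="web-1"))
```

Note the custom field: any key/value pair rides along with the message, which is exactly what makes GELF nicer to query than a raw string.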
There is some kind of loose agreement between syslog emitters and receivers about a few fields (a priority, a timestamp, a hostname), but anything richer than that relies on convention. GELF made a very different move and decided that log messages would be structured: a GELF message is a dict (or map, or hash, pick your favorite word) with a handful of well-known fields — host, timestamp, short and long versions of the message — plus any number of arbitrary custom fields.

OK, so GELF is a convenient format that Docker can emit, and that is understood by a number of tools like Graylog, Logstash, and Fluentd. Moreover, you can switch from the default `json-file` driver to GELF very easily, which means that you can start with the defaults and move to GELF later, without touching your applications.

These options can be passed to `docker run`, indicating which logging driver you want for a container and how to configure it. (If you are using the Docker API to start your containers, these options are passed to the "create container" call, within the `HostConfig.LogConfig` structure.) The arbitrary options vary for each driver. In the case of the GELF driver, the essential one is `gelf-address`, which tells the driver where to send your logs:

```shell
docker run \
  --log-driver gelf --log-opt gelf-address=udp://1.2.3.4:12201 \
  alpine echo hello world
```

Then, the output of our container (here, the output of `echo`) is sent as GELF messages to the collector listening on 1.2.3.4:12201. But what happens when that collector moves to another address?

An easy technique to work around volatile IP addresses is to use DNS. Instead of specifying 1.2.3.4 as our GELF target, we will use a name like gelf.container.church, and make sure that this name points to 1.2.3.4. If the logging server moves, we update the DNS entry, and no container has to be reconfigured. Or so we think.

If you had to write some code sending data to a remote machine (say, gelf.container.church), it would make sense to use a TCP connection, and keep it up as long as we need it. If anything horrible happens to the logging server, we can trust the TCP state machine to detect it eventually (because timeouts and whatnot) and notify us. When that happens, we can then re-resolve the server name and re-connect. We just need a little bit of extra logic in the container engine, to deal with the unfortunate scenario where a write on the socket gives us an error.

GELF over UDP works differently. When you create a container using the GELF driver, the driver's `New` function is invoked, and it creates a new writer object by resolving the collector's address — once. Then, when the container prints something out, eventually, the `Log` method of the GELF driver is invoked. It essentially turns the log line into a GELF message and hands it to the writer. This GELF writer object is implemented by an external dependency, the go-gelf library from the Graylog project. Let's investigate this package, in particular the writer's constructor, its `WriteMessage` method, and the helpers called by the latter to actually put datagrams on the wire. The crucial detail is that the name resolution happens only when the writer is created; every subsequent message goes to the same IP address. And since UDP is connectionless, if the collector moves, nothing fails and nothing complains: the messages are simply sent into a black hole.

A slightly better solution is to send logs to a relay running on localhost, and let the relay forward them to the real logging server. This needs to run on each container host. It is very lightweight, and whenever the logging server's address is updated, instead of restarting your containers, you merely restart the relay. (You could also send your log packets to a virtual IP, and then use some fancy routing to steer them to the right machine.)

Another option is to run Logstash on each node (instead of just a dumb relay). Running Logstash (or another logging tool) on each node is also very useful if you want to be sure that you don't lose any log message, because it would be the perfect place to insert a queue (using Redis for simple scenarios, or Kafka if you need stronger guarantees). UDP packets sent to localhost (almost) can't be lost, while packets sent across the network can be dropped without anyone noticing. Even if running a cluster-wide logging service is relatively easy (especially with Swarm mode and its built-in service discovery), the per-node relay remains the more robust option.

There are already some GitHub issues related to this behavior of the GELF driver, and at least one of the maintainers has chimed in on them. So, what can you do? First, you can monitor the GitHub issues mentioned above, and comment if you have a relevant use case. In the meantime, the workarounds described earlier — a local relay, or a full-blown log shipper on each node — will keep your logs flowing.
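For the API route mentioned above: when creating a container through the Docker Engine API rather than the CLI, the same settings go in the `HostConfig.LogConfig` structure of the create-container request. A sketch of the relevant fragment of the request body, reusing the placeholder address from the `docker run` example:

```json
{
  "Image": "alpine",
  "HostConfig": {
    "LogConfig": {
      "Type": "gelf",
      "Config": {
        "gelf-address": "udp://1.2.3.4:12201"
      }
    }
  }
}
```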