in DevOps

Strange stunnel debug logging behavior in docker

I’ve been playing with xenserver lately to quickly model small Hadoop clusters. One of the frustrating things about xenserver is the lack of good graphical user interfaces that provide for a minimal amount of automation. This means I’m frequently dropping to the command line on the xenserver master and running the xe tools by hand in order to speed up certain operations. This can get tedious, so I started playing with Docker on my Mac to run the xe tools over an encrypted tunnel (via stunnel).

Down the stunnel rabbit hole

Now, xe handles the stunnel setup pretty seamlessly. You provide it a username, password, and the name of the xenserver master. It does the rest. When I started using this under docker, I quickly encountered weird behavior between the two tools. No matter what I did, stunnel would throw logging output regarding the session setup.

[root@8a4a435df6a8 stunnel]# xe vm-list name-label=vm params=uuid --minimal
stunnel: LOG5[137:139863907756096]: stunnel 4.56 on x86_64-redhat-linux-gnu platform
stunnel: LOG5[137:139863907756096]: Compiled/running with OpenSSL 1.0.1e-fips 11 Feb 2013
stunnel: LOG5[137:139863907756096]: Threading:PTHREAD Sockets:POLL,IPv6 SSL:ENGINE,OCSP,FIPS Auth:LIBWRAP
stunnel: LOG5[137:139863907756096]: Reading configuration from descriptor 4
stunnel: LOG5[137:139863907756096]: FIPS mode is disabled
stunnel: LOG5[137:139863907756096]: Configuration successful
stunnel: LOG5[137:139863907756096]: Service [stunnel] accepted connection from unnamed socket
stunnel: LOG5[137:139863907756096]: connect_blocking: connected 10.1.1.32:443
stunnel: LOG5[137:139863907756096]: Service [stunnel] connected remote server from 172.17.0.2:41462
6c281c5c-375c-48f8-0917-75cfc414f671

As you can see, that’s a bunch of crap dumped out. At first, I thought the logs were coming in via stderr, but attempts to redirect it went nowhere. It kept showing up. When running the same xe command on a non-Dockerized image, the output looked correct: I would get the uuid with no weird logging output.

realhost# xe vm-list name-label=vm params=uuid --minimal
6c281c5c-375c-48f8-0917-75cfc414f671

When good guesses go bad

My hunch was that logging was being written out on some file descriptor other than stderr. The quick way to confirm this was to run the command through strace and see what was happening under the hood when the syscalls were being invoked.

20:17:28.671410 execve("/usr/bin/stunnel", ["/usr/bin/stunnel", "-fd", "4"], [/* 10 vars */]) = 0 <0.000637>

The initial entry point that I found was possibly on file descriptor 4, but I realized this was being used as a communications input path from xe to stunnel, not output.

Further up in the strace output, I found where the stdin, stdout, and stderr file descriptors were being duplicated during the process fork, but eventually were closed out, so none of the standard channels were even in use. Where could it still be coming from?

My next stop was to see if there was anything interesting in and around the write() syscall outputting one of the messages.

95 20:17:28.831429 write(2, "2016.10.05 20:17:28 LOG5[95:140377872062528]: Compiled/running 
   with OpenSSL 1.0.1e-fips 11 Feb 2013\n", 100) = 100 <0.000251>
95 20:17:28.831867 mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
   0) = 0x7fac45337000 <0.000107>
95 20:17:28.832108 mremap(0x7fac45337000, 12288, 4096, MREMAP_MAYMOVE) = 0x7fac45337000 <0.000106>
95 20:17:28.832307 socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4 <0.000126>
95 20:17:28.832610 connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT 
   (No such file or directory) <0.000109>
95 20:17:28.832811 close(4) = 0 <0.000148>
95 20:17:28.833161 open("/dev/console", O_WRONLY|O_NOCTTY) = 4 <0.000115>
95 20:17:28.833375 fstat(4, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0 <0.000062>
95 20:17:28.833619 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
   0x7fac45339000 <0.000050>
95 20:17:28.833801 lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) <0.000046>
95 20:17:28.833927 write(4, "stunnel: LOG5[95:140377872062528]: Threading:PTHREAD 
   Sockets:POLL,IPv6 SSL:ENGINE,OCSP,FIPS Auth:LIBWRAP\r\n", 106) = 106 <0.000119>
95 20:17:28.834134 munmap(0x7fac45339000, 4096) = 0 <0.000057>
95 20:17:28.834354 close(4) = 0 <0.000063>

Here’s where it gets interesting.

Docker always gets you

Recall that I’m running this under Docker. For this test, I was doing something like:

$ docker run -i -t centos:7 /bin/bash
[root@8a4a435df6a8 /]#

This ends up running a pretty cut down environment where bash is doing the work of systemd or init. Nothing else is running.

[root@8a4a435df6a8 stunnel]# ps
 PID TTY TIME CMD
 1 ? 00:00:00 bash
 139 ? 00:00:00 ps

Ordinarily, this isn’t a big deal. Except in the case where a tool is attempting to use syslog messages to log information. Which stunnel does.

95 20:17:28.832610 connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT 
  (No such file or directory) <0.000109>
95 20:17:28.832811 close(4) = 0 <0.000148>
95 20:17:28.833161 open("/dev/console", O_WRONLY|O_NOCTTY) = 4 <0.000115>

A lightbulb appears

Here’s my “ahha!” moment. stunnel is attempting to talk to a local syslog daemon via the /dev/log device (a unix socket). On full installs of a system, this special device file exists to provide a pathway for system-local processes to communicate with syslog. But in a cut-down Docker environment, not everything is fully populated inside the container. So what happens?

  1. stunnel starts.
  2. It attempts to open the unix socket to communicate and log data to syslog.
  3. The syscall fails with No such file or directory because it doesn’t exist.
  4. stunnel falls through to a backup course of communication by looking for /dev/console.
  5. stunnel finds it, opens it, and writes a log to it.

But, why should I be seeing this? Am I running from a console? Yes. Because of the way I’ve invoked my container, the console is tied to my terminal, which I confirmed with the following:

[root@8a4a435df6a8 stunnel]# tty
/dev/console

stunnel was using a failsafe logging mechanism and writing directly to my terminal. The log output would show up on my screen, but because it wasn’t flowing through a file descriptor attached directly to the processes, I would never be able to redirect it.

I’m not sure how I’ll work around this. I do want the functionality of being able to run xe commands from my Mac, but I don’t want to go through the steps of building the xapi-xe packages natively. I tried that and failed after a few hours of staring at OCaml code.

For now, this is just annoying output and doesn’t hamper my ability to automate xenserver commands from the container, so I’ll just live with it.

A happy ending

A week after posting this, a few alternative methods were found to address the problem. I’m a happier person now.

Travis Campbell
Staff Systems Engineer at ghostar
Travis Campbell is a seasoned Linux Systems Engineer with nearly two decades of experience, ranging from dozens to tens of thousands of systems in the semiconductor industry, higher education, and high volume sites on the web. His current focus is on High Performance Computing, Big Data environments, and large scale web architectures.