Like any tightly managed environment, you probably have to create custom settings in your
/etc/security/limits.conf because of application-specific requirements. Maybe you have to allow for more open files. Maybe you have to reduce the memory allowed to a process. Or maybe you just like being ultra-hardcore in defining exactly what a user can do.
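For reference, each limits.conf line has four fields: domain, type, item, and value. The entries below are illustrative examples, not values from our environment:

```
# /etc/security/limits.conf -- illustrative entries
# <domain>    <type>   <item>    <value>
mysql         soft     nofile    96000
mysql         hard     nofile    96000
@developers   hard     as        2097152
```

Here nofile is the max-open-files item, as caps address space (in kilobytes), and an @-prefixed domain applies the rule to a group rather than a user.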
As an example, our environment requires that we up the number of open files. A lot. We tend to have a lot of open stuff in the file system. Ordinarily, this isn’t an issue. Except yesterday when we ran into a weird case after rebooting a server.
But first, let’s backtrack a bit.
limits.conf is part of the PAM chain. Specifically, it’s the configuration file for
pam_limits.so. In order to make use of this file, your process has to have been run through, or inherited an environment that ran through, PAM at some point in its history. For example, if you login with ssh, you run through PAM. If you use sudo, you use PAM. If you supply your username and password to an X Windows login screen, you probably use PAM.
The best way to tell if you’re able to use
limits.conf is to look in the PAM configs and see what commands invoke it. The configs for CentOS exist in
/etc/pam.d. Almost everything there includes the shared system-auth config, in which
pam_limits.so is set as a required module for the session phase.
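To see which services that is on a given box, you can grep the PAM configs directly. A quick sketch (the fallback messages are just for systems without these files):

```shell
# List PAM service files that load pam_limits.so directly, then the ones that
# pull in the shared system-auth stack (which loads it on CentOS).
grep -rl pam_limits.so /etc/pam.d/ 2>/dev/null || echo "no direct references"
grep -rl system-auth   /etc/pam.d/ 2>/dev/null || echo "no system-auth includes"
```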
So, now that we have a bit of background on what this is and where it’s from, let’s continue.
We use limits.conf pretty extensively, especially for our MySQL servers. We rebooted one of our servers yesterday and discovered that clients were having issues afterwards: the MySQL instance was complaining about not being able to open some tables. This was odd. We set our open file limit pretty high. High enough to know that if we were hitting it, we had a pretty crazy problem. We confirmed that our
limits.conf was correct, so we started poking the process itself to determine what was going on.
We wanted to see if
mysqld was not observing the correct limits setting. But, how do you determine that on a running process?
Every process shows its current set of limits in
/proc/$PID/limits. In our case, we found a surprisingly low setting.
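The same check works for any process. As a quick sketch against the current shell itself (Linux-specific, via the /proc/self shortcut):

```shell
# Print the current shell's effective open-file limits straight from /proc.
grep "Max open files" /proc/self/limits
```

The two numbers in the output are the soft and hard limits; ulimit -n reports the soft one.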
$ sudo cat /proc/$(sudo cat /var/lib/mysql/mysql.pid)/limits | grep open
Max open files            1185                 1185                 files
So, assumption confirmed. The running limit was definitely too low by an order of magnitude.
Now, we had just rebooted the machine, so we weren’t sure what was going on. We decided to restart the MySQL instance to see what happened. To our surprise, the open file settings went back to normal.
$ sudo cat /proc/$(sudo cat /var/lib/mysql/mysql.pid)/limits | grep open
Max open files            96000                96000                files
What the heck?
I had a suspicion. We knew several things.
- The system was rebooted.
- The system started mysql on boot.
- Restarting mysql fixed the problem.
- A process must go through PAM in order to use limits.conf.
- init has no direct hook into PAM.
At boot time,
init is invoking daemons and processes in order to get the system to a running state. We looked at other daemons to determine if they had similar issues. Some did, some didn’t.
I ended up posing this question on the LOPSA irc channel.
pop quiz. /etc/security/limits.conf settings only get honored if you have something that goes through a pam context that invokes pam_limits.so … but at boot time, init doesn’t do this, so none of the correct settings get configured for limits. What’s the work around for this?
And got several responses, including this one:
geekosaur: this is why many startup scripts use su
And this was the clue we needed. It helped describe why this was only affecting some daemons, including MySQL. Here’s why.
At boot time,
init has a default limit set for root. When
init starts running the scripts in
init.d/rcX.d, these scripts inherit that limit. If a script is starting a daemon AND that daemon needs some custom limit set, very often that script will be designed to
su to the user that needs the custom limits. Since
su is a pam-enabled thing,
pam_limits.so gets invoked and reads the new default limits.
In the case of MySQL, the init script invokes
mysqld_safe --user=mysql, which then invokes
/usr/sbin/mysqld --user=mysql. I suspect
mysqld is then just doing a
setuid()/seteuid() to go from root to mysql. This bypasses the entire PAM chain.
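That inheritance behavior is easy to demonstrate: rlimits are copied from parent to child across fork()/exec(), and nothing in that path touches PAM. A minimal sketch:

```shell
# Lower the open-file limit in a subshell, then start a child process: the
# child simply inherits the value. No PAM session runs, so limits.conf is
# never consulted -- the same situation as init (or a setuid() daemon)
# starting a process directly.
( ulimit -n 512; sh -c 'ulimit -n' )    # prints 512
```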
The workaround would be to have the init script either
su to mysql before invoking
mysqld_safe (possibly non-trivial change) or just modify the startup script to set the ulimits appropriately. We do this for supervisord, as an example.
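A sketch of that startup-script workaround, with a plain shell standing in for the daemon (the real script would run as root, use a value like 96000, and exec the actual mysqld_safe invocation):

```shell
# Hypothetical init-script fragment: set the limit explicitly before launching
# the daemon, since nothing in the boot path goes through PAM. The stand-in
# child just reports the limit it inherited.
ulimit -n 1024 || exit 1        # real script: ulimit -n 96000
sh -c 'ulimit -n'               # real script: exec mysqld_safe --user=mysql
```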
It’s important to know and understand how the different pieces of your architecture work together. In this case, we thought we understood how and why things worked. What we hadn’t taken into account was how these things interact directly after a system reboot. Our MySQL servers end up running longer than other systems in the environment, and they use a different mechanism to start up their MySQL processes compared to other daemons, so it wasn’t something that we would have quickly encountered.