UNIX Sysadminning 101: Process Control

kill is your weapon when dealing with processes.  Despite its name, kill can send any signal to a process you own.  If you’re root, it can send any signal to any process.  The full set of signals varies by system, but three are of vital importance:  SIGHUP, SIGTERM, and SIGKILL.
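As a minimal sketch of the basics, the following starts a throwaway sleep process, sends it SIGTERM by name, and decodes the resulting exit status.  The variable names are illustrative, and the "128 + signal number" exit status is the convention in bash and most sh implementations rather than anything specific to this post.

```shell
# Start a throwaway background process to practise on.
sleep 60 &
pid=$!

# Send SIGTERM by name.  Plain "kill $pid" sends the same signal.
kill -s TERM "$pid"

# Collect the exit status; the "|| status=$?" form stays safe under set -e.
status=0
wait "$pid" || status=$?

# kill -l with an exit status translates it back into a signal name.
signame=$(kill -l "$status")
echo "process $pid was ended by SIG$signame"
```

Running kill -l with no arguments lists every signal name your system knows about, which is handy when the names differ between platforms.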

SIGHUP (“hangup”) is the modem hangup signal.  The original intent was that this signal was delivered to all processes on a given TTY when the modem connecting that TTY hung up the line.  Today, we don’t connect through modems, but the kernel still sends SIGHUP when a controlling terminal goes away (a closed terminal window or a dropped SSH session, for example), and an interactive shell that receives it typically passes it on to its jobs before exiting.  The nohup command runs a program with SIGHUP ignored and can be used to keep processes alive after their controlling shell has exited.  Daemons usually catch SIGHUP, and the conventional behavior is to reload their config files; some also reset open sessions or flush internal buffers to disk.  Non-daemon processes usually don’t catch SIGHUP, so the default action applies:  the kernel terminates the process on the spot, with no cleanup.
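The three ways a process can treat SIGHUP can be sketched with small sh -c one-liners that signal themselves.  This is an illustration only:  the messages are made up, and the "caught" case merely prints a line where a real daemon would re-read its config.

```shell
# 1. Default action: the shell is terminated and never reaches the echo.
default_status=0
sh -c 'kill -HUP $$; echo "not reached"' || default_status=$?

# 2. Ignored, which is the state nohup arranges before starting a program.
ignored=$(sh -c 'trap "" HUP; kill -HUP $$; echo "survived"')

# 3. Caught, daemon style: the handler runs, then the process carries on.
caught=$(sh -c 'trap "echo reloading config" HUP; kill -HUP $$; echo "still running"')

echo "default action exit status: $default_status"
echo "ignored: $ignored"
echo "caught: $caught"
```

One caveat:  a non-interactive shell that starts with SIGHUP already ignored (for instance, because it was itself launched under nohup) cannot trap or reset it, so cases 1 and 3 behave differently there.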

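SIGTERM (“terminate”) is the polite request to exit.  It’s what kill sends when you don’t name a signal, and it’s what shutdown procedures typically send to every process before resorting to SIGKILL.  If the signal isn’t caught, the default action terminates the process immediately.  Daemons often catch SIGTERM to save state and clean up, and by convention they then exit promptly.  Nothing in POSIX forces a process that catches or ignores SIGTERM to exit, which is why SIGKILL exists.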
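A well-behaved SIGTERM handler might look roughly like this.  It’s a sketch under stated assumptions:  the "daemon" is just an sh -c script that signals itself, and the scratch file and the cleanup function name are hypothetical stand-ins for whatever state a real daemon would need to tidy up.

```shell
# Hypothetical scratch file standing in for a daemon's lock or temp state.
scratch=$(mktemp)
export scratch

out=$(sh -c '
    cleanup() {
        rm -f "$scratch"
        echo "state saved, exiting"
        exit 0
    }
    trap cleanup TERM
    kill -TERM $$        # stands in for an admin running: kill -TERM <pid>
    echo "not reached"
')

echo "$out"
```

The important part is the exit at the end of the handler.  Without it, the process would clean up and then carry on running, which is exactly the behavior that earns a SIGKILL.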

SIGKILL (“kill”) is the last-ditch signal for a process that’s gone off the trolley and doesn’t respond to any other external stimulus.  No process may catch, block, or ignore SIGKILL.  It’s signal #9 on essentially every UNIX, and the kernel acts on it without consulting the process at all.  If it appears not to work, the process is usually either a zombie (already dead and waiting for its parent to collect its exit status) or stuck in an uninterruptible wait inside the kernel, typically on hung disk or NFS I/O.  In the latter case it will die as soon as the I/O completes; if it never completes, a reboot may be the only cure.  But that’s a rare occurrence.  SIGKILL is the worst way to stop a process, as it gives no warning and no opportunity for the process to save internal state, flush buffers, or exit gracefully.  It should be a last resort for a process that’s off in deep space and can’t be shaken out of its catatonic state.
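To see the difference between the polite signals and SIGKILL, here is a small sketch in which a script ignores SIGHUP and SIGTERM, signals itself with both, and then gets SIGKILL.  The script and its messages are invented for illustration.

```shell
status=0
out=$(sh -c '
    trap "" HUP TERM            # ignore the polite signals
    kill -HUP $$
    kill -TERM $$
    echo "shrugged off HUP and TERM"
    kill -KILL $$               # cannot be caught, blocked, or ignored
    echo "not reached"
') || status=$?

echo "$out"
echo "killed by SIG$(kill -l "$status")"
```

Note that the script gets no chance to print anything or clean up after the SIGKILL; it simply stops existing.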

The usual procedure for dealing with unruly processes is threefold:

1.  Send two SIGHUPs (kill -HUP $pid; kill -HUP $pid).  This is the gentlest nudge.  A process that handles SIGHUP will typically reload its configuration and may flush buffers along the way, which is sometimes enough to unstick it.  A process that doesn’t handle SIGHUP is terminated by the default action.  If the process exits, then you’re done.

2.  Send one SIGTERM (kill -TERM $pid).  This requests the process to exit.  It’s a clear signal that we are asking the process to finish up its work, save its state, and exit.  Give it a few seconds.  If a process fails to exit in a timely manner after getting SIGTERM, then it’s in violation of UNIX convention and needs to die horribly through a merciless SIGKILL.

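3.  If necessary, send one SIGKILL (kill -KILL $pid, or kill -9 $pid).  This kills the process outright.  No opportunity to save state.  Just kills it.  Often leaves a bloody mess behind on the filesystem:  half-written files, stale lock and PID files, and orphaned temp files that nobody will clean up.  (The kernel does reclaim the process’s own memory and open file descriptors.)  If the dead process lingers in ps listings as “<defunct>”, that’s a zombie:  a process that has already exited but whose parent hasn’t yet collected its exit status.  Zombies are the parent’s fault rather than SIGKILL’s, and they go away when the parent calls wait() or exits.  This is never a *good* way to end a process’ life but it’s necessary when the process is completely and impossibly wedged and no amount of pleading will get it to budge.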

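If the process fails to die after a SIGKILL and isn’t a zombie, it’s stuck inside the kernel, usually waiting on I/O that will never complete.  If you can’t fix the underlying device or mount, then you need to reboot your machine.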
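The escalation above can be sketched as a small shell function.  Everything here is an assumption made for illustration:  the names is_alive and stop_process, the grace period between steps, and the use of "ps -o stat=" to spot zombies (supported by procps and BSD ps, but not guaranteed everywhere).  It also assumes you own the target process.

```shell
# Returns 0 while the PID is running, 1 once it is gone or a zombie.
is_alive() {
    kill -0 "$1" 2>/dev/null || return 1      # signal 0 only tests existence
    # A zombie still answers kill -0 but is already dead.  Where ps supports
    # "-o stat=" (an assumption), a zombie's state contains the letter Z.
    case $(ps -o stat= -p "$1" 2>/dev/null) in
        *Z*) return 1 ;;
    esac
    return 0
}

# Hypothetical helper: stop_process PID [GRACE_SECONDS]
# Escalates HUP, HUP, TERM, KILL and records the last signal sent.
stop_process() {
    target=$1
    grace=${2:-5}
    last_signal=none
    [ "$target" -gt 1 ] || return 2           # never signal PID 1 (or below)
    for sig in HUP HUP TERM KILL; do
        is_alive "$target" || return 0
        kill -s "$sig" "$target" 2>/dev/null || return 0
        last_signal=$sig
        sleep "$grace"
    done
    # Still here after SIGKILL: probably stuck in the kernel.
    is_alive "$target" && return 1
    return 0
}

# Usage: stop a throwaway sleep with a one-second grace period.
sleep 60 &
rc=0
stop_process "$!" 1 || rc=$?
echo "result $rc, last signal sent: $last_signal"
```

A return value of 1 means the process outlived SIGKILL, which is the "fix the I/O or reboot" case.  The grace period is worth tuning:  a database flushing to disk may legitimately need much longer than a few seconds after SIGTERM.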

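NOTE:  NEVER SEND SIGNALS TO PID 1!  init is always process #1.  init interprets the signals it handles as commands:  depending on the init system, a stray signal can make it re-read its configuration, re-execute itself, or start a reboot or shutdown.  If init ever exits, the kernel panics.  (Linux protects PID 1 from any signal it hasn’t installed a handler for, SIGKILL included, but don’t lean on that.)  Signaling init by hand is always a bad policy and on modern machines should never be necessary, as shutdown and telinit are more than capable of communicating with init in a safe manner that preserves system integrity.  Bottom line, never send any signal to init.  Ever.  Bad idea all around.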
