nixpkgs-suyu/nixos/modules/system/boot
aszlig 67223ee205
nixos/stage-1: Don't kill kernel threads
Unfortunately, pkill doesn't distinguish between kernel and user space
processes, so we need to make sure we don't accidentally kill kernel
threads.

Normally, a kernel thread ignores all signals, but there are a few that
don't. A quick grep on the kernel source tree (as of kernel 4.6.0) shows
the following source files which use allow_signal():

  drivers/isdn/mISDN/l1oip_core.c
  drivers/md/md.c
  drivers/misc/mic/cosm/cosm_scif_server.c
  drivers/misc/mic/cosm_client/cosm_scif_client.c
  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
  drivers/staging/rtl8188eu/core/rtw_cmd.c
  drivers/staging/rtl8712/rtl8712_cmd.c
  drivers/target/iscsi/iscsi_target.c
  drivers/target/iscsi/iscsi_target_login.c
  drivers/target/iscsi/iscsi_target_nego.c
  drivers/usb/atm/usbatm.c
  drivers/usb/gadget/function/f_mass_storage.c
  fs/jffs2/background.c
  fs/lockd/clntlock.c
  fs/lockd/svc.c
  fs/nfs/nfs4state.c
  fs/nfsd/nfssvc.c

While not all of these are necessarily kthreads and some functionality
may still be unimpeded, it's still quite harmful and can cause
unexpected side-effects, especially because some of these kthreads are
storage-related (which we obviously don't want to kill during bootup).

During discussion at #15226, @dezgeg suggested the following
implementation:

for pid in $(pgrep -v -f '@'); do
    if [ "$(cat /proc/$pid/cmdline)" != "" ]; then
        kill -9 "$pid"
    fi
done

This has a few downsides:

 * User space processes which use an empty string in their command line
   won't be killed.
 * It results in errors during bootup because some shell-related
   processes are already terminated (maybe it's pgrep itself, haven't
   checked).
 * The @ is searched within the full command line, not just at the
   beginning of the string. Of course, we already had this until now, so
   it's not a problem of his implementation.
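
To illustrate the last point, here is a small sketch of how the unanchored
and the anchored pattern differ. It uses grep -E as a stand-in for the
regular expression matching done by pgrep -f, and both the matches helper
and the two command lines are made up for the example. Since pgrep is
called with -v, a command line that matches the pattern is spared:

```shell
# Hypothetical helper: succeeds if the command line $2 matches pattern $1.
# grep -E is only a stand-in for the matching that pgrep -f performs.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Made-up example command lines, not taken from a real system.
storage_daemon='@unionfs-fuse /nix/.ro-store'
ordinary='ssh user@example.org'

for cmdline in "$storage_daemon" "$ordinary"; do
    for pattern in '@' '^@'; do
        if matches "$pattern" "$cmdline"; then
            echo "pattern '$pattern' spares: $cmdline"
        else
            echo "pattern '$pattern' would kill: $cmdline"
        fi
    done
done
```

With the unanchored '@', the ssh command line is spared as well, even
though it isn't a storage daemon; only '^@' restricts the match to the
beginning of the command line.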

I posted an alternative implementation which doesn't suffer from the
first point, but even that one wasn't sufficient:

for pid in $(pgrep -v -f '^@'); do
    readlink "/proc/$pid/exe" &> /dev/null || continue
    echo "$pid"
done | xargs kill -9

This one spawns a subshell, which would be included in the processes to
kill and actually kills itself during the process.

So what we have now even checks whether the shell process itself is
in the list to kill and avoids killing it, just to be sure.

Also, we don't spawn a subshell anymore and use /proc/$pid/exe to
distinguish between user space and kernel processes like in the comments
of the following StackOverflow answer:

http://stackoverflow.com/a/12231039

We don't need to take care of processes that are already terminating,
because terminating them IS what we actually want.
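
The approach described above could be sketched roughly as follows. This is
an illustration under assumptions, not necessarily the exact code that
ended up in stage-1-init.sh: the function names is_userspace_pid and
kill_userspace are made up, and the redirection is spelled out portably
instead of using &>.

```shell
# Succeeds if $1 looks like a user space process: /proc/<pid>/exe can be
# resolved for those, while readlink fails for kernel threads and for
# PIDs that no longer exist.
is_userspace_pid() {
    readlink "/proc/$1/exe" > /dev/null 2>&1
}

# Hypothetical helper: kill every PID passed as an argument, skipping
# kernel threads and the shell that is running this function.
kill_userspace() {
    for pid in "$@"; do
        # Don't touch kernel threads.
        is_userspace_pid "$pid" || continue
        # Try to avoid killing ourselves.
        [ "$pid" -eq "$$" ] && continue
        kill -9 "$pid"
    done
}
```

It would be called as kill_userspace $(pgrep -v -f '^@'). The loop runs in
the calling shell rather than in a pipeline, so there is no extra
long-lived subshell that could end up in the list.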

The only point where this (and any previous) approach falls short is if
we have processes that act like fork bombs, because they might spawn
additional processes between the pgrep and the killing. We can only
address this with process/control groups, and even that won't save us
because the root user can escape from those as well.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Fixes: #15226
2016-05-06 16:24:42 +02:00
loader treewide: Use correct output of config.nix.package in non-string contexts 2016-04-25 16:44:38 +02:00
coredump.nix Restore default core limit of 0:infinity 2016-04-14 13:18:09 +02:00
emergency-mode.nix
initrd-network.nix initrd-network: call postCommands only if network is up 2016-02-03 16:35:21 +03:00
initrd-ssh.nix initrd-ssh module: don't check if network is up 2016-02-03 16:37:10 +03:00
kernel.nix Revert "Add the tool "nixos-typecheck" that can check an option declaration to:" 2016-03-01 20:52:06 +01:00
kexec.nix
luksroot.nix treewide: Mass replace 'openssl}/bin' to refer the 'bin' output 2016-02-01 20:46:16 +02:00
modprobe.nix kmod-debian-aliases: init at 21-1 2015-09-13 10:55:44 +02:00
networkd.nix networkd: add DHCPServer config section 2015-12-23 06:04:39 +01:00
pbkdf2-sha512.c
readonly-mountpoint.c
resolved.nix Create systemd-{network,resolve} user/group unconditionally 2015-07-22 12:23:45 +02:00
shutdown.nix
stage-1-init.sh nixos/stage-1: Don't kill kernel threads 2016-05-06 16:24:42 +02:00
stage-1.nix stage-1: Remove doublePatchelf hack 2016-04-15 01:53:34 +03:00
stage-2-init.sh nixos/stage-1/2: Added -r option to read so that read interprets backslashes literally, and corrected the comment about optional logging. 2016-02-24 18:54:25 +11:00
stage-2.nix Make stage-1/2 logging unconditional, and drop log level to "debug" 2016-02-23 11:56:09 +01:00
systemd-lib.nix Set ‘allowSubstitutes = false’ on various derivations 2015-07-09 15:10:37 +02:00
systemd-unit-options.nix replace makeSearchPath tree-wise to take care of possible multiple outputs 2016-04-13 22:09:41 +03:00
systemd.nix systemd.generators: Generate folders via environment.etc. 2016-04-08 14:50:20 +02:00
timesyncd.nix Don't include networkd units unless enabled 2015-04-19 22:06:45 +02:00
tmp.nix