This accomplishes multiple things:
- Allows us to start systemd without stage-2-init.sh. This was not
possible before because the environment would have been wrong
- `systemctl daemon-reexec` also changes the environment, giving us
newer tools for the fs packages
- Starts systemd in a fully clean environment, making everything more
consistent and pure
At some point, I'd like to make another attempt at
71f1f4884b ("openssl: stop static binaries referencing libs"), which
was reverted in 195c7da07d. One problem with my previous attempt is
that I moved OpenSSL's libraries to a lib output, but many dependent
packages were hardcoding the out output as the location of the
libraries. This patch fixes every such case I could find in the tree.
It won't have any effect immediately, but will mean these packages
will automatically use an OpenSSL lib output if it is reintroduced in
future.
This patch should cause very few rebuilds, because it shouldn't make
any change at all to most packages I'm touching. The few rebuilds
that are introduced come from when I've changed a package builder not
to use variable names like openssl.out in scripts / substitution
patterns, which would be confusing since they don't hardcode the
output any more.
I started by making the following global replacements:
${pkgs.openssl.out}/lib -> ${lib.getLib pkgs.openssl}/lib
${openssl.out}/lib -> ${lib.getLib openssl}/lib
Then I removed the ".out" suffix when part of the argument to
lib.makeLibraryPath, since that function uses lib.getLib internally.
Then I fixed up cases where openssl was part of the -L flag to the
compiler/linker, since that unambigously is referring to libraries.
Then I manually investigated and fixed the following packages:
- pycurl
- citrix-workspace
- ppp
- wraith
- unbound
- gambit
- acl2
I'm reasonably confindent in my fixes for all of them.
For acl2, since the openssl library paths are manually provided above
anyway, I don't think openssl is required separately as a build input
at all. Removing it doesn't make a difference to the output size, the
file list, or the closure.
I've tested evaluation with the OfBorg meta checks, to protect against
introducing evaluation failures.
We can perform most of the mkdir/ln/rm using systemd-tmpfiles
instead which cleans up the script.
/bin and /home are created by their activation script snippets
usbfs is deprecated and unused.
hwclock seems to be automatically executed by systemd on startup.
The mkswap to prevent hibernation cycles seems to be executed by systemd
as well since the provided regression tests succeeds.
Currently it is only possible to add upstream _system_ units. The option
systemd.additionalUpstreamSystemUnits can be used for this.
However, this was not yet possible for systemd.user. In a similar
fashion this was added to systemd-user.nix.
This is intended to have other modules add upstream units.
Use a quoted heredoc to inject installBootLoader safely into the script,
and restore the previous invocation of `system` with a single argument so
that shell commands keep working.
As of systemd/systemd@e908434458,
systemd-networkd now automatically configures routes to addresses
specified in AllowedIPs unless explicitly disabled with
"RouteTable=off".
This bug is so obscure and unlikely that I was honestly not able to
properly write a test for it. What happens is that we are calling
handleModifiedUnit() with $unitsToStart=\%unitsToRestart. We do this to
make sure that the unit is stopped before it's started again which is
not possible by regular means because the stop phase is already done
when calling the activation script.
recordUnit() still gets $startListFile, however which is the wrong file.
The bug would be triggered if an activation script requests a service
restart for a service that has `stopIfChanged = true` and
switch-to-configuration is killed before the restart phase was run. If
the script is run again, but the activation script is not requesting
more restarts, the unit would be started instead of restarted.
When initializing a system (e.g. first boot / livecd) we have no good
reference source for time. systemd-timesyncd however would revert back
to its configured fallback time (in our case 01.01.1980). Since we
probably don't want to hardcode a specific date as fallback we are now
using the current system time (wherever that might have come from) to
initialize the reference clock file.
The only systems that might be remotely affected by this change are
machines that have highly unreliable RTCs or those where the battery
that backs the RTC is running empty.
Historically these systems always had a tough time with anything time
related and likely required manual intervention.
For stateless systems (those that wipe / between reboots or our
installer CDs) this has the consequence that time will always be reset
to whatever the system comes up with on boot. This is likely the correct
time coming from an RTC. No harm done here the situation is likely
unchanged for them.
For stateful systems (those that retain the / partition across reboots)
there shouldn't be a change at all. They'll provide an initial clock
value once on their lifetime (during first boot / after installation).
From then onwards systemd-timesyncd will update the file with the newer
fallback time (that will be picked up on the next boot).
This effectively fixes the majority of all VM tests which were broken
because `/dev/vda` (or any other block device) wasn't mountable:
machine # mounting /dev/vda on /...
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device[ 2.820976] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 2.821757] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
machine # [ 2.821757] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
machine # [ 2.821757] Call Trace:
machine # [ 2.821757] dump_stack+0x6b/0x83
machine # [ 2.821757] panic+0x101/0x2c8
machine # [ 2.821757] do_exit.cold+0x14/0xb3
machine # [ 2.821757] do_group_exit+0x33/0xa0
machine # [ 2.821757] __x64_sys_exit_group+0x14/0x20
machine # [ 2.821757] do_syscall_64+0x33/0x40
machine # [ 2.821757] entry_SYSCALL_64_after_hwframe+0x44/0xa9
machine # [ 2.821757] RIP: 0033:0x7f67ec2800f6
machine # [ 2.821757] Code: 00 4c 8b 0d 2c 5d 11 00 eb 19 66 2e 0f 1f 84 00 00 00 00 00 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 22 f4 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e2 f7 d8 64 41 89 01 eb da 66 2e 0f 1f 84 00
machine # [ 2.821757] RSP: 002b:00007fff8f5a71d8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
machine # [ 2.821757] RAX: ffffffffffffffda RBX: 0000000000699704 RCX: 00007f67ec2800f6
machine # [ 2.821757] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
machine # [ 2.821757] RBP: 0000000000000004 R08: 00000000000000e7 R09: ffffffffffffff80
machine # [ 2.821757] R10: 00007f67ec33f3e0 R11: 0000000000000202 R12: 000000000000000b
machine # [ 2.821757] R13: 00007fff8f5a75a8 R14: 0000000000000000 R15: 00000000004fc198
machine # [ 2.821757] Kernel Offset: 0x31e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
machine # [ 2.821757] Rebooting in 1 seconds..
This happened because the kernel failed to load modules such as `ext4`
from `boot.initrd.availableKernelModules`[1] on e.g. a `mount(2)` syscall.
The problem is that `kmod` isn't linked against `libpthread.so.0`
anymore because it got merged into `libc.so.6` (however, the .so still
exists), but still needs it:
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # writev(2, [{iov_base="/nix/store/kdc9n48ksdc1a8y8w512w"..., iov_len=69}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libpthread.so.0", iov_len=15}, {iov_base=": ", iov_len=2}, {iov_base="cy
machine # ) = 184
machine # exit_group(127) = ?
machine # +++ exited with 127 +++
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device
machine # [ 19.167180] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 19.167711] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
This is not a problem
* inside stage-1 because `LD_LIBRARY_PATH` points to `$out/lib` of
extra-utils where `libpthread.so.6` also exists.
* on a running system because `${pkgs.glibc}/lib` is part of kmod's
rpath.
However this is a problem inside the kernel which calls `modprobe` (in
our case `kmod`) to load modules and doesn't know about
`LD_LIBRARY_PATH`. Also, the rpath-reference was nuked.
To work around this, the kernel's `modprobe`
(i.e. `/proc/sys/kernel/modprobe`) now points to a wrapper which
explicitly declares `LD_LIBRARY_PATH`. We can't use `makeWrapper` here
because `modprobe` itself must not be renamed. Otherwise, `kmod` (which
is the link-target of `modprobe`) won't work because it expects
`argv[0] == "modprobe"` to perform modprobe's tasks.
[1] https://nixos.org/manual/nixos/stable/options.html#opt-boot.initrd.availableKernelModules
systemd needs this so special characters (like the ones in wireguard
units that appear because they are part of base64) can be escaped using
the \x syntax.
Root of the issue is that `glob()` handles the backslash internally
which is obviously not what we want here.
Also add a test case and fix some perlcritic issues in the subroutine.
wtmp and btmp are created by systemd, so the rules are more appropriate there.
They can be disabled explicitly with something like
services.ogrotate.paths = {
"/var/log/btmp".enable = false;
"/var/log/wtmp".enable = false;
};
if required.
This is accomplished by comparing the hashes that the unit files
contain. By filtering for a special key `X-Reload-Triggers` in the
`[Unit]` section, we can differentiate between reloads and restarts.
Since activation scripts can request reloads of units as well, more
checking of this behaviour is implemented. If a unit is to be restarted,
it's never reloaded as well which would make no sense.
Also removes a useless subroutine and perl dependencies that are
nowadays handled by the propagated build inputs feature of
`perl.withPackages`.
The mount options need to be passed as a comma-separated list of options so that they
end up one a single Options line in the resulting mount unit.
The current code passed the options as a list, resulting in several Options lines in
the mount unit, all but the first of these were ignored by systemd however.
This behaviour is not clearly defined in the systemd man page.
The `nix.*` options, apart from options for setting up the
daemon itself, currently provide a lot of setting mappings
for the Nix daemon configuration. The scope of the mapping yields
convience, but the line where an option is considered essential
is blurry. For instance, the `extra-sandbox-paths` mapping is
provided without its primary consumer, and the corresponding
`sandbox-paths` option is also not mapped.
The current system increases the maintenance burden as maintainers have to
closely follow upstream changes. In this case, there are two state versions
of Nix which have to be maintained collectively, with different options
avaliable.
This commit aims to following the standard outlined in RFC 42[1] to
implement a structural setting pattern. The Nix configuration is encoded
at its core as key-value pairs which maps nicely to attribute sets, making
it feasible to express in the Nix language itself. Some existing options are
kept such as `buildMachines` and `registry` which present a simplified interface
to managing the respective settings. The interface is exposed as `nix.settings`.
Legacy configurations are mapped to their corresponding options under `nix.settings`
for backwards compatibility.
Various options settings in other nixos modules and relevant tests have been
updated to use structural setting for consistency.
The generation and validation of the configration file has been modified to
use `writeTextFile` instead of `runCommand` for clarity. Note that validation
is now mandatory as strict checking of options has been pushed down to the
derivation level due to freeformType consuming unmatched options. Furthermore,
validation can not occur when cross-compiling due to current limitations.
A new option `publicHostKey` was added to the `buildMachines`
submodule corresponding to the base64 encoded public host key settings
exposed in the builder syntax. The build machine generation was subsequently
rewritten to use `concatStringsSep` for better performance by grouping
concatenations.
[1] - https://github.com/NixOS/rfcs/blob/master/rfcs/0042-config-option.md
This option behaves exactly like `boot.extraModprobeConfig`, except that it also includes the generated modprobe.d file in the initrd.
Many years ago, someone tried to include the normal modprobe.d/nixos.conf file generated by `boot.extraModprobeConfig` in the initrd: 0aa2c1dc46. This file contains a reference to a directory with firmware files inside. Including firmware in the initrd made it too big, so the commit was reverted again in 4a4c051a95.
The `boot.extraModprobeConfig` option not changing the initrd caused me much confusion because I tried to set the maximum cache size for ZFS and it didn't work.
Closes https://github.com/NixOS/nixpkgs/issues/25456.
Modules that do not depend on e.g. toplevel should not have to include it just to set
things in `system.build`. As a general rule, this keeps tests simple, usage flexible
and evaluation fast. While one module is insignificant, consistency and good practices
are.
If the Nix daemon has never been enabled (nix.enable has always been
set to false), the gcroots directory won't exist. If the Nix daemon
is later enabled, the GC roots for booted-system and current-system
will be missing, and they might end up being garbage collected. Since
it's cheap to add GC roots even if the daemon will never be enabled,
let's just always add them so we're okay in the case where the daemon
is enabled later.
- Fully get rid of `parseKeyValues` and use systemctl features for that
- Add some regex modifiers recommended by perlcritic
- Get rid of a postfix if
- Sort units when showing their status
- Clean the logic for showing what failed from `elif` to `next`
- Switch from `state` to `substate` for `auto-restart` because that's
actually where the value is stored
- Show status of units with one single systemctl call and get rid of
COLUMNS in favor of --full
- Add a test for failing units
This replaces the naive K=V unit parser with a proper INI parser from a
library and adds proper support for override files. Also adds a bunch of
comments about parsing, I hope this makes it easier to understand and
maintain in the future.
There are multiple reasons to do so, the first one is just general
correctness with is nice imo. But to get to more serious reasons (I
didn't put in all that effort for nothing) is that this is the first
step torwards more clever restart/reload handling. By using a library
like Data::Compare a future PR could replace the current way of
fingerprinting units (which is to compare store paths) by comparing the
hashes. This is more precise because units won't get restarted because
the order of the options change, comments are added, some dependency of
writeText changes, .... Also this allows us to add a feature like
`X-Reload-Triggers` so the unit can either be reloaded when these change
or restarted when everything else changes, giving module authors the
ability to have their services reloaded without having to fear that
updates are not applied because the service doesn't get restarted.
Another reason why this feature is nice is that now that the unit files
are parsed correctly (and values are just extracted from one section),
potential future rewrites can just rely on some INI library without
having to implement their own weird parser that is compatible with this
script.
This also comes with a new subroutine to handle systemd booleans because
I thought the current way of handling it was just ugly. This also allows
overriding values this script reads in an override file.
Apart from making this script more compatible with the world around it,
this also fixes two issues I saw bugging exactly 0 (zero) people. First
is that this script now supports multiple override files, also ones that
are not called override.conf and the second one is that `1` and `on` are
treated as bools by systemd but were previously not parsed as such by
switch-to-configuration.
This removes `/run/nixos/activation-reload-list` (which we will need in
the future when reworking the reload logic) and makes
`/run/nixos/activation-restart-list` honor `restartIfChanged` and
`reloadIfChanged`. This way activation scripts don't have to bother with
choosing between reloading and restarting.
I was confused why I couldn't find a mention of udev.log_priority in
systemd-udevd.service(8). It turns out that it was renamed[1] to
udev.log_level. The old name is still accepted, but it'll avoid
further confusion if we use the new name in our documentation.
[1]: 64a3494c3d
most modules can be evaluated for their documentation in a very
restricted environment that doesn't include all of nixpkgs. this
evaluation can then be cached and reused for subsequent builds, merging
only documentation that has changed into the cached set. since nixos
ships with a large number of modules of which only a few are used in any
given config this can save evaluation a huge percentage of nixos
options available in any given config.
in tests of this caching, despite having to copy most of nixos/, saves
about 80% of the time needed to build the system manual, or about two
second on the machine used for testing. build time for a full system
config shrank from 9.4s to 7.4s, while turning documentation off
entirely shortened the build to 7.1s.
Add `shellDryRun` to the generic stdenv and substitute it for uses of
`${stdenv.shell} -n`. The point of this layer of abstraction is to add
the flag `-O extglob`, which resolves#126344 in a more direct way.
some options have default that are best described in prose, such as
defaults that depend on the system stateVersion, defaults that are
derivations specific to the surrounding context, or those where the
expression is much longer and harder to understand than a simple text
snippet.
The first one doesn't make any sense because the directory where the
init binary resides does not contain other tools we need like
systemd-escape.
The second one doesn't make sense either because the errors are already
ignored.
Add `systemd.network.networks.*.dhcpServerStaticLeaseConfig` to allow
for configuring static DHCP leases through the `[DHCPServerStaticLease]`
section. See systemd.network(5) of systemd 249 for details.
Also adds the NixOS test `systemd-networkd-dhcpserver-static-lease` to
test the assignment of static leases.
This reverts commit 57961d2b83, reversing
changes made to b04f913afc.
(I.e. this reverts PR #141192.)
While well-intended, this change does unfortunately introduce very
serious regressions that are especially disruptive/noticeable on desktop
systems (e.g. users of Sway will loose their graphical session when
running "nixos-rebuild switch").
Therefore, this change has to be reverted ASAP instead of trying to fix
it in "production".
Note: An updated version should be extensively discussed, reviewed, and
tested before re-landing this change as an earlier version also had to
be reverted for the exact same issues [0].
Fix: #146727
[0]: https://github.com/NixOS/nixpkgs/pull/73871#issuecomment-559783752
mkDerivedConfig : Option a -> (a -> Definition b) -> Definition b
Create config definitions with the same priority as the definition of another option.
This should be used for option definitions where one option sets the value of another as a convenience.
For instance a config file could be set with a `text` or `source` option, where text translates to a `source`
value using `mkDerivedConfig options.text (pkgs.writeText "filename.conf")`.
It takes care of setting the right priority using `mkOverride`.
Before this change, one could set environment.etc.*.text and .source.
.source would always take precedence, regardless of the priorities set.
This change means that if, for instance, .text is set with mkForce but
.source is set normally, the .text content will be the one to take
effect. If they are set with the same priority they will conflict.
The regexp was only matching numbers and not the '.', so everyone using
systemd-boot would always see `could not find any previously installed
systemd-boot` on a `nixos-rebuild`.
By using the new extendModules function to produce the specialisations,
we avoid reimplementing the eval-config.nix logic in reverse and fix
cross compilation support for specialisations in the process.
Some specialisations (such as those which affect various boot-time
attributes) cannot be switched to at runtime. This allows picking the
specialisation at boot time.
/etc/crypttab can contain the _netdev option, which adds crypto devices
to the remote-cryptsetup.target.
remote-cryptsetup.target has a dependency on cryptsetup-pre.target. So
let's add both of them.
Currently, one needs to manually ssh in and invoke `systemctl start
systemd-cryptsetup@<name>.service` to unlock volumes.
After this change, systemd will properly add it to the target, and
assuming remote-cryptsetup.target is pulled in somewhere, you can simply
pass the passphrase by invoking `systemd-tty-ask-password-agent` after
ssh-ing in, without having to manually start these services.
Whether remote-cryptsetup.target should be added to multi-user.target
(as it is on other distros) is part of another discussion - right now
the following snippet will do:
```
systemd.targets.multi-user.wants = [ "remote-cryptsetup.target" ];
```
This makes the order of operations the same in dry-activate and a "true"
activate. Also fixes the indentation I messed up and drop a useless
unlink() call (we are already unlinking that file earlier).
The previous logic failed to detect that units were socket-activated
when the socket was stopped before switch-to-configuration was run. This
commit fixes that and also starts the socket in question.
The first FIXME is removed because it doesn't make sense to use
/proc/1/exe since that points to a directory that doesn't have all tools
the activation script needs (like systemd-escape).
The second one is removed because there is already no error handling
(compare with the restart logic where the return code is checked).
This commit changes a lot more that you'd expect but it also adds a lot
of new testing code so nothing breaks in the future. The main change is
that sockets are now restarted when they change. The main reason for
the large amount of changes is the ability of activation scripts to
restart/reload units. This also works for socket-activated units now,
and honors reloadIfChanged and restartIfChanged. The two changes don't
really work without each other so they are done in the one large commit.
The test should show what works now and ensure it will continue to do so
in the future.