* nixos/acme: Fix ordering of cert requests
When subsequent certificates would be added, they would
not wake up nginx correctly due to target units only being triggered
once. We now added more fine-grained systemd dependencies to make sure
nginx always is aware of new certificates and doesn't restart too early
resulting in a crash.
Furthermore, the acme module has been refactored. Mostly to get
rid of the deprecated PermissionStartOnly systemd options which were
deprecated. Below is a summary of changes made.
* Use SERVICE_RESULT to determine status
This was added in systemd v232. we don't have to keep track
of the EXITCODE ourselves anymore.
* Add regression test for requesting mutliple domains
* Deprecate 'directory' option
We now use systemd's StateDirectory option to manage
create and permissions of the acme state directory.
* The webroot is created using a systemd.tmpfiles.rules rule
instead of the preStart script.
* Depend on certs directly
By getting rid of the target units, we make sure ordering
is correct in the case that you add new certs after already
having deployed some.
Reason it broke before: acme-certificates.target would
be in active state, and if you then add a new cert, it
would still be active and hence nginx would restart
without even requesting a new cert. Not good! We
make the dependencies more fine-grained now. this should fix that
* Remove activationDelay option
It complicated the code a lot, and is rather arbitrary. What if
your activation script takes more than activationDelay seconds?
Instead, one should use systemd dependencies to make sure some
action happens before setting the certificate live.
e.g. If you want to wait until your cert is published in DNS DANE /
TLSA, you could create a unit that blocks until it appears in DNS:
```
RequiredBy=acme-${cert}.service
After=acme-${cert}.service
ExecStart=publish-wait-for-dns-script
```
Currently if you want to properly chroot a systemd service, you could do
it using BindReadOnlyPaths=/nix/store or use a separate derivation which
gathers the runtime closure of the service you want to chroot. The
former is the easier method and there is also a method directly offered
by systemd, called ProtectSystem, which still leaves the whole store
accessible. The latter however is a bit more involved, because you need
to bind-mount each store path of the runtime closure of the service you
want to chroot.
This can be achieved using pkgs.closureInfo and a small derivation that
packs everything into a systemd unit, which later can be added to
systemd.packages.
However, this process is a bit tedious, so the changes here implement
this in a more generic way.
Now if you want to chroot a systemd service, all you need to do is:
{
systemd.services.myservice = {
description = "My Shiny Service";
wantedBy = [ "multi-user.target" ];
confinement.enable = true;
serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
};
}
If more than the dependencies for the ExecStart* and ExecStop* (which
btw. also includes script and {pre,post}Start) need to be in the chroot,
it can be specified using the confinement.packages option. By default
(which uses the full-apivfs confinement mode), a user namespace is set
up as well and /proc, /sys and /dev are mounted appropriately.
In addition - and by default - a /bin/sh executable is provided, which
is useful for most programs that use the system() C library call to
execute commands via shell.
Unfortunately, there are a few limitations at the moment. The first
being that DynamicUser doesn't work in conjunction with tmpfs, because
systemd seems to ignore the TemporaryFileSystem option if DynamicUser is
enabled. I started implementing a workaround to do this, but I decided
to not include it as part of this pull request, because it needs a lot
more testing to ensure it's consistent with the behaviour without
DynamicUser.
The second limitation/issue is that RootDirectoryStartOnly doesn't work
right now, because it only affects the RootDirectory option and doesn't
include/exclude the individual bind mounts or the tmpfs.
A quirk we do have right now is that systemd tries to create a /usr
directory within the chroot, which subsequently fails. Fortunately, this
is just an ugly error and not a hard failure.
The changes also come with a changelog entry for NixOS 19.03, which is
why I asked for a vote of the NixOS 19.03 stable maintainers whether to
include it (I admit it's a bit late a few days before official release,
sorry for that):
@samueldr:
Via pull request comment[1]:
+1 for backporting as this only enhances the feature set of nixos,
and does not (at a glance) change existing behaviours.
Via IRC:
new feature: -1, tests +1, we're at zero, self-contained, with no
global effects without actively using it, +1, I think it's good
@lheckemann:
Via pull request comment[2]:
I'm neutral on backporting. On the one hand, as @samueldr says,
this doesn't change any existing functionality. On the other hand,
it's a new feature and we're well past the feature freeze, which
AFAIU is intended so that new, potentially buggy features aren't
introduced in the "stabilisation period". It is a cool feature
though? :)
A few other people on IRC didn't have opposition either against late
inclusion into NixOS 19.03:
@edolstra: "I'm not against it"
@Infinisil: "+1 from me as well"
@grahamc: "IMO its up to the RMs"
So that makes +1 from @samueldr, 0 from @lheckemann, 0 from @edolstra
and +1 from @Infinisil (even though he's not a release manager) and no
opposition from anyone, which is the reason why I'm merging this right
now.
I also would like to thank @Infinisil, @edolstra and @danbst for their
reviews.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-477322127
[2]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-477548395
So far we had MountFlags = "private", but as @Infinisil has correctly
noticed, there is a dedicated PrivateMounts option, which does exactly
that and is better integrated than providing raw mount flags.
When checking for the reason why I used MountFlags instead of
PrivateMounts, I found that at the time I wrote the initial version of
this module (Mar 12 06:15:58 2018 +0100) the PrivateMounts option didn't
exist yet and has been added to systemd in Jun 13 08:20:18 2018 +0200.
Signed-off-by: aszlig <aszlig@nix.build>
Noted by @Infinisil on IRC:
infinisil: Question regarding the confinement PR
infinisil: On line 136 you do different things depending on
RootDirectoryStartOnly
infinisil: But on line 157 you have an assertion that disallows that
option being true
infinisil: Is there a reason behind this or am I missing something
I originally left this in so that once systemd supports that, we can
just flip a switch and remove the assertion and thus support
RootDirectoryStartOnly for our confinement module.
However, this doesn't seem to be on the roadmap for systemd in the
foreseeable future, so I'll just remove this, especially because it's
very easy to add it again, once it is supported.
Signed-off-by: aszlig <aszlig@nix.build>
seems that this got broken when the config option was made to use enums. "secure" got replaced with "enum", which isn't a valid option for the failure mode.
My implementation was relying on PrivateDevices, PrivateTmp,
PrivateUsers and others to be false by default if chroot-only mode is
used.
However there is an ongoing effort[1] to change these defaults, which
then will actually increase the attack surface in chroot-only mode,
because it is expected that there is no /dev, /sys or /proc.
If for example PrivateDevices is enabled by default, there suddenly will
be a mounted /dev in the chroot and we wouldn't detect it.
Fortunately, our tests cover that, but I'm preparing for this anyway so
that we have a smoother transition without the need to fix our
implementation again.
Thanks to @Infinisil for the heads-up.
[1]: https://github.com/NixOS/nixpkgs/issues/14645
Signed-off-by: aszlig <aszlig@nix.build>
From @edolstra at [1]:
BTW we probably should take the closure of the whole unit rather than
just the exec commands, to handle things like Environment variables.
With this commit, there is now a "fullUnit" option, which can be enabled
to include the full closure of the service unit into the chroot.
However, I did not enable this by default, because I do disagree here
and *especially* things like environment variables or environment files
shouldn't be in the closure of the chroot.
For example if you have something like:
{ pkgs, ... }:
{
systemd.services.foobar = {
serviceConfig.EnvironmentFile = ${pkgs.writeText "secrets" ''
user=admin
password=abcdefg
'';
};
}
We really do not want the *file* to end up in the chroot, but rather
just the environment variables to be exported.
Another thing is that this makes it less predictable what actually will
end up in the chroot, because we have a "globalEnvironment" option that
will get merged in as well, so users adding stuff to that option will
also make it available in confined units.
I also added a big fat warning about that in the description of the
fullUnit option.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Another thing requested by @edolstra in [1]:
We should not provide a different /bin/sh in the chroot, that's just
asking for confusion and random shell script breakage. It should be
the same shell (i.e. bash) as in a regular environment.
While I personally would even go as far to even have a very restricted
shell that is not even a shell and basically *only* allows "/bin/sh -c"
with only *very* minimal parsing of shell syntax, I do agree that people
expect /bin/sh to be bash (or the one configured by environment.binsh)
on NixOS.
So this should make both others and me happy in that I could just use
confinement.binSh = "${pkgs.dash}/bin/dash" for the services I confine.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Quoting @edolstra from [1]:
I don't really like the name "chroot", something like "confine[ment]"
or "restrict" seems better. Conceptually we're not providing a
completely different filesystem tree but a restricted view of the same
tree.
I already used "confinement" as a sub-option and I do agree that
"chroot" sounds a bit too specific (especially because not *only* chroot
is involved).
So this changes the module name and its option to use "confinement"
instead of "chroot" and also renames the "chroot.confinement" to
"confinement.mode".
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Currently, if you want to properly chroot a systemd service, you could
do it using BindReadOnlyPaths=/nix/store (which is not what I'd call
"properly", because the whole store is still accessible) or use a
separate derivation that gathers the runtime closure of the service you
want to chroot. The former is the easier method and there is also a
method directly offered by systemd, called ProtectSystem, which still
leaves the whole store accessible. The latter however is a bit more
involved, because you need to bind-mount each store path of the runtime
closure of the service you want to chroot.
This can be achieved using pkgs.closureInfo and a small derivation that
packs everything into a systemd unit, which later can be added to
systemd.packages. That's also what I did several times[1][2] in the
past.
However, this process got a bit tedious, so I decided that it would be
generally useful for NixOS, so this very implementation was born.
Now if you want to chroot a systemd service, all you need to do is:
{
systemd.services.yourservice = {
description = "My Shiny Service";
wantedBy = [ "multi-user.target" ];
chroot.enable = true;
serviceConfig.ExecStart = "${pkgs.myservice}/bin/myservice";
};
}
If more than the dependencies for the ExecStart* and ExecStop* (which
btw. also includes "script" and {pre,post}Start) need to be in the
chroot, it can be specified using the chroot.packages option. By
default (which uses the "full-apivfs"[3] confinement mode), a user
namespace is set up as well and /proc, /sys and /dev are mounted
appropriately.
In addition - and by default - a /bin/sh executable is provided as well,
which is useful for most programs that use the system() C library call
to execute commands via shell. The shell providing /bin/sh is dash
instead of the default in NixOS (which is bash), because it's way more
lightweight and after all we're chrooting because we want to lower the
attack surface and it should be only used for "/bin/sh -c something".
Prior to submitting this here, I did a first implementation of this
outside[4] of nixpkgs, which duplicated the "pathSafeName" functionality
from systemd-lib.nix, just because it's only a single line.
However, I decided to just re-use the one from systemd here and
subsequently made it available when importing systemd-lib.nix, so that
the systemd-chroot implementation also benefits from fixes to that
functionality (which is now a proper function).
Unfortunately, we do have a few limitations as well. The first being
that DynamicUser doesn't work in conjunction with tmpfs, because it
already sets up a tmpfs in a different path and simply ignores the one
we define. We could probably solve this by detecting it and try to
bind-mount our paths to that different path whenever DynamicUser is
enabled.
The second limitation/issue is that RootDirectoryStartOnly doesn't work
right now, because it only affects the RootDirectory option and not the
individual bind mounts or our tmpfs. It would be helpful if systemd
would have a way to disable specific bind mounts as well or at least
have some way to ignore failures for the bind mounts/tmpfs setup.
Another quirk we do have right now is that systemd tries to create a
/usr directory within the chroot, which subsequently fails. Fortunately,
this is just an ugly error and not a hard failure.
[1]: https://github.com/headcounter/shabitica/blob/3bb01728a0237ad5e7/default.nix#L43-L62
[2]: https://github.com/aszlig/avonc/blob/dedf29e092481a33dc/nextcloud.nix#L103-L124
[3]: The reason this is called "full-apivfs" instead of just "full" is
to make room for a *real* "full" confinement mode, which is more
restrictive even.
[4]: https://github.com/aszlig/avonc/blob/92a20bece4df54625e/systemd-chroot.nix
Signed-off-by: aszlig <aszlig@nix.build>
For the hardened profile disable symmetric multi threading. There seems to be
no *proven* method of exploiting cache sharing between threads on the same CPU
core, so this may be considered quite paranoid, considering the perf cost.
SMT can be controlled at runtime, however. This is in keeping with OpenBSD
defaults.
TODO: since SMT is left to be controlled at runtime, changing the option
definition should take effect on system activation. Write to
/sys/devices/system/cpu/smt/control
For the hardened profile enable flushing whenever the hypervisor enters the
guest, but otherwise leave at kernel default (conditional flushing as of
writing).
Introduces the option security.protectKernelImage that is intended to control
various mitigations to protect the integrity of the running kernel
image (i.e., prevent replacing it without rebooting).
This makes sense as a dedicated module as it is otherwise somewhat difficult
to override for hardened profile users who want e.g., hibernation to work.
The OS Login package enables the following components:
AuthorizedKeysCommand to query valid SSH keys from the user's OS Login
profile during ssh authentication phase.
NSS Module to provide user and group information
PAM Module for the sshd service, providing authorization and
authentication support, allowing the system to use data stored in
Google Cloud IAM permissions to control both, the ability to log into
an instance, and to perform operations as root (sudo).
Having pam_unix set to "sufficient" means early-succeeding account
management group, as soon as pam_unix.so is succeeding.
This is not sufficient. For example, nixos modules might install nss
modules for user lookup, so pam_unix.so succeeds, and we end the stack
successfully, even though other pam account modules might want to do
more extensive checks.
Other distros seem to set pam_unix.so to 'required', so if there are
other pam modules in that management group, they get a chance to do some
validation too.
For SSSD, @PsyanticY already added a workaround knob in
https://github.com/NixOS/nixpkgs/pull/31969, while stating this should
be the default anyway.
I did some thinking in what could break - after this commit, we require
pam_unix to succeed, means we require `getent passwd $username` to
return something.
This is the case for all local users due to the passwd nss module, and
also the case for all modules installing their nss module to
nsswitch.conf - true for ldap (if not explicitly disabled) and sssd.
I'm not so sure about krb5, cc @eqyiel for opinions. Is there some nss
module loaded? Should the pam account module be placed before pam_unix?
We don't drop the `security.pam.services.<name?>.sssdStrictAccess`
option, as it's also used some lines below to tweak error behaviour
inside the pam sssd module itself (by changing it's 'control' field).
This is also required to get admin login for Google OS Login working
(#51566), as their pam_oslogin_admin accounts module takes care of sudo
configuration.
A module for security options that are too small to warrant their own module.
The impetus for adding this module is to make it more convenient to override
the behavior of the hardened profile wrt user namespaces.
Without a dedicated option for user namespaces, the user needs to
1) know which sysctl knob controls userns
2) know how large a value the sysctl knob needs to allow e.g.,
Nix sandbox builds to work
In the future, other mitigations currently enabled by the hardened profile may
be promoted to options in this module.
I think pam_lastlog is the only thing that writes to these files in
practice on a modern Linux system, so in a configuration that doesn't
use that module, we don't need to create these files.
I used tmpfiles.d instead of activation snippets to create the logs.
It's good enough for upstream and other distros; it's probably good
enough for us.
The order of sudoers entries is significant. The man page for sudoers(5)
notes:
Where there are multiple matches, the last match is used (which is not
necessarily the most specific match).
This module adds a rule for group "wheel" matching all commands. If you
wanted to add a more specific rule allowing members of the "wheel" group
to run command `foo` without a password, you'd need to use mkAfter to
ensure your rule comes after the more general rule.
extraRules = lib.mkAfter [
{
groups = [ "wheel" ];
commands = [
{
command = "${pkgs.foo}/bin/foo";
options = [ "NOPASSWD" "SETENV" ];
}
]
}
];
Otherwise, when configuration options are merged, if the general rule
ends up after the specific rule, it will dictate the behavior even when
running the `foo` command.
This introduces an option that allows us to turn off stateful generation
of Diffie-Hellman parameters, which in some way is still "stateful" as
the generated DH params file is non-deterministic.
However what we can avoid with this is to have an increased surface for
failures during system startup, because generation of the parameters is
done during build-time.
Aside from adding a NixOS VM test it also restructures the type of the
security.dhparams.params option, so that it's a submodule.
A new defaultBitSize option is also there to allow users to set a
system-wide default.
I added a release notes entry that described what has changed and also
included a few notes for module developers using this module, as the
first usage already popped up in NixOS/nixpkgs#39507.
Thanks to @Ekleog and @abbradar for reviewing.
This allows to set the default bit size for all the Diffie-Hellman
parameters defined in security.dhparams.params and it's particularly
useful so that we can set it to a very low value in tests (so it doesn't
take ages to generate).
Regardless for the use in testing, this also has an impact in production
systems if the owner wants to set all of them to a different size than
2048, they don't need to set it individually for every params that are
set.
I've added a subtest to the "dhparams" NixOS test to ensure this is
working properly.
Signed-off-by: aszlig <aszlig@nix.build>
@Ekleog writes in https://github.com/NixOS/nixpkgs/pull/39526:
> I think a default of 4096 is maybe too much? See certbot/certbot#4973;
> Let's Encrypt supposedly know what they are doing and use a
> pre-generated 2048-bit DH params (and using the same DH params as
> others is quite bad, even compared to lower bit size, if I correctly
> remember the attacks available -- because it increases by as much the
> value of breaking the group).
> Basically I don't have anything personal against 4096, but fear it may
> re-start the arms race: people like having "more security" than their
> distributions, and having NixOS already having more security than is
> actually useful (I personally don't know whether a real-size quantum
> computer will come before or after our being able to break 2048-bit
> keys, let alone 3072-bit ones -- see wikipedia for some numbers).
> So basically, I'd have set it to 3072 in order to both decrease build
> time and avoid having people setting it to 8192 and complaining about
> how slow things are, but that's just my opinion. :)
While he suggests is 3072 I'm using 2048 now, because it's the default
of "openssl dhparam". If users want to have a higher value, they can
still change it.
Signed-off-by: aszlig <aszlig@nix.build>
Previously the script would contain an empty `if` block (which is invalid
syntax) if both `data.activationDelay == null` and `data.postRun == ""`. Fix
this by adding a no-op `true`.
First of all let's start with a clean up the multiline string
indentation for descriptions, because having two indentation levels
after description is a waste of screen estate.
A quick survey in the form of the following also reveals that the
majority of multiline strings in nixpkgs is starting the two beginning
quotes in the same line:
$ find -name '*.nix' -exec sed -n -e '/=$/ { n; /'\'\''/p }' {} + | wc -l
817
$ find -name '*.nix' -exec grep "= *'' *\$" {} + | wc -l
14818
The next point is to get the type, default and example attributes on top
of the description because that's the way it's rendered in the manual.
Most services have their enable option close to the beginning of the
file, so let's move it to the top.
Also, I found the script attribute for dhparams-init.service a bit hard
to read as it was using string concatenation to split a "for" loop.
Now for the more substantial clean ups rather than just code style:
* Remove the "with lib;" at the beginning of the module, because it
makes it easier to do a quick check with "nix-instantiate --parse".
* Use ConditionPathExists instead of test -e for checking whether we
need to generate the dhparams file. This avoids spawning a shell if
the file exists already and it's probably more common that it will
exist, except for the initial creation of course.
* When cleaning up old dhparams file, use RemainAfterExit so that the
unit won't be triggered again whenever we stop and start a service
depending on it.
* Capitalize systemd unit descriptions to be more in par with most
other unit descriptions (also see 0c5e837b66).
* Use "=" instead of "==" for conditionals using []. It's just a very
small nitpick though and it will only fail for POSIX shells. Bash on
the other side accepts it anyway.
Signed-off-by: aszlig <aszlig@nix.build>
Cc: @Ekleog
This option allows us to turn off stateful generation of Diffie-Hellman
parameters, which in some way is still stateful as the generated DH
params file is non-deterministic.
However what we can avoid with this is to have an increased surface for
failures during system startup, because generation of the parameters is
done during build-time.
Another advantage of this is that we no longer need to take care of
cleaning up the files that are no longer used and in my humble opinion I
would have preferred that #11505 (which puts the dhparams in the Nix
store) would have been merged instead of #22634 (which we have now).
Luckily we can still change that and this change gives the user the
option to put the dhparams into the Nix store.
Beside of the more obvious advantages pointed out here, this also
effects test runtime if more services are starting to use this (for
example see #39507 and #39288), because generating DH params could take
a long time depending on the bit size which adds up to test runtime.
If we generate the DH params in a separate derivation, subsequent test
runs won't need to wait for DH params generation during bootup.
Of course, tests could still mock this by force-disabling the service
and adding a service or activation script that places pre-generated DH
params in /var/lib/dhparams but this would make tests less readable and
the workaround would have to be made for each test affected.
Note that the 'stateful' option is still true by default so that we are
backwards-compatible with existing systems.
Signed-off-by: aszlig <aszlig@nix.build>
Cc: @Ekleog, @abbradar, @fpletz
We're going to implement an option which allows us to turn off stateful
handling of Diffie-Hellman parameter files by putting them into the Nix
store.
However, modules now might need a way to reference these files, so we
add a now path option to every param specified, which carries a
read-only value of the path where to find the corresponding DH params
file.
I've also improved the description of security.dhparams.params a bit so
that it uses <warning/> and <note/>.
The NixOS VM test also reflects this change and checks whether the old
way to specify the bit size still works.
Signed-off-by: aszlig <aszlig@nix.build>
Cc: @Ekleog
This is needed because simp_le expects two certificates in fullchain.pem, leading to error:
> Not enough PEM encoded messages were found in fullchain.pem; at least 2 were expected, found 1.
We now create a CA and sign the key with it instead, providing correct fullchain.pem.
Also cleanup service a bit -- use PATH and a private temporary directory (which
is more suitable).
Kernel symlinks don't have st_size. Really thought I tested this, guess I ran the
wrong NixOS test :(
This reverts commit 6dab907ebe, reversing
changes made to eab479a5f0.