Massively reduce the time it takes running the test by building a
proper root disk image and increasing the virtualized core count to
4. This should make it much easier for the tests to pass even on
weaker systems.
With my laptop (AMD Ryzen 7 PRO 2700U) as the reference system, I see
the following test run times:
- No change:
Times out after 28 mins
- Building a root image:
7 mins, 48 secs
- Building a root image and bumping the core count:
7 mins, 17 secs
The times include the time it takes to build the image
(~1 min, 20 secs).
Massively reduce the time it takes running the test by building a
proper root disk image and increasing the virtualized core count to
4. This should make it much easier for the tests to pass even on
weaker systems.
With my laptop (AMD Ryzen 7 PRO 2700U, 32GB RAM) as the reference
system, I see the following test run times:
- No change:
25 mins, 49 secs
- Building a root image:
4 mins, 44 secs
- Building a root image and bumping the core count:
3 mins, 6 secs
The times include the time it takes to build the image (~40 secs).
pathsInNixDB isn't a very accurate name when a Nix store image is
built (virtualisation.useNixStoreImage); rename it to additionalPaths,
which should be general enough to cover both cases.
The existing tests for HDFS and YARN only check if the services come up and expose their web interfaces.
The new combined hadoop test will also test whether the services and roles work together as intended.
It spin up an HDFS+YARN cluster and submit a demo YARN application that uses the hadoop cluster for
storageand yarn cluster for compute.
The version 20 of Nextcloud will be EOLed by the end of this month[1].
Since the recommended default (that didn't raise an eval-warning) on
21.05 was Nextcloud 21, this shouldn't affect too many people.
In order to ensure that nobody does a (not working) upgrade across
several major-versions of Nextcloud, I replaced the derivation of
`nextcloud20` with a `throw` that provides instructions how to proceed.
The only case that I consider "risky" is a setup upgraded from 21.05 (or
older) with a `system.stateVersion` <21.11 and with
`services.nextcloud.package` not explicitly declared in its config. To
avoid that, I also left the `else-if` for `stateVersion < 21.03` which
now sets `services.nextcloud.package` to `pkgs.nextcloud20` and thus
leads to an eval-error. This condition can be removed
as soon as 21.05 is EOL because then it's safe to assume that only
21.11. is used as stable release where no Nextcloud <=20 exists that can
lead to such an issue.
It can't be removed earlier because then every `system.stateVersion <
21.11` would lead to `nextcloud21` which is a problem if `nextcloud19`
is still used.
[1] https://docs.nextcloud.com/server/20/admin_manual/release_schedule.html
during the rewrite the checkPasswords=false feature of the old module
was lost. restore it, and with it systems that allow any client to use
any username.
borg is able to process stdin during backups when backing up the special path -,
which can be very useful for backing up things that can be streamed (eg database
dumps, zfs snapshots).
mosquitto needs a lot of attention concerning its config because it doesn't
parse it very well, often ignoring trailing parts of lines, duplicated config
keys, or just looking back way further in the file to associated config keys
with previously defined items than might be expected.
this replaces the mosquitto module completely. we now have a hierarchical config
that flattens out to the mosquitto format (hopefully) without introducing spooky
action at a distance.
This commit changes a lot more that you'd expect but it also adds a lot
of new testing code so nothing breaks in the future. The main change is
that sockets are now restarted when they change. The main reason for
the large amount of changes is the ability of activation scripts to
restart/reload units. This also works for socket-activated units now,
and honors reloadIfChanged and restartIfChanged. The two changes don't
really work without each other so they are done in the one large commit.
The test should show what works now and ensure it will continue to do so
in the future.
allows configuration of foo-over-udp decapsulation endpoints. sadly networkd
seems to lack the features necessary to support local and peer address
configuration, so those are only supported when using scripted configuration.
A few minor changes to get #119638 - nextcloud: add option to set
datadir and extensions - ready:
* `cfg.datadir` now gets `cfg.home` as default to make the type
non-nullable.
* Enhanced the `basic` test to check the behavior with a custom datadir
that's not `/var/lib/nextcloud`.
* Fix hashes for apps in option example.
* Simplify if/else for `appstoreenable` in override config.
* Simplify a few `mapAttrsToList`-expressions in
`nextcloud-setup.service`.
The multipath-tools package had existed in Nixpkgs for some time but
without a nixos module to configure/drive it. This module provides
attributes to drive the majority of multipath configuration options
and is being successfully used in stage-1 and stage-2 boot to mount
/nix from a multipath-serviced iSCSI volume.
Credit goes to @grahamc for early contributions to the module and
authoring the NixOS module test.
Don't worry, it's is true by default. But I think this is important to
have because NixOS indeed shouldn't need Nix at run time when the
installation is not being modified, and now we can verify that.
NixOS images that cannot "self-modify" are a legitamate
use-case that this supports more minimally. One should be able to e.g. do a
sshfs mount and use `nixos-install` to modify them remotely, or just
discard them and build fresh ones if they are run VMs or something.
The next step would be to make generations optional, allowing just
baking `/etc` and friends rather than using activation scripts. But
that's more involved so I'm leaving it out.
The MariaDB version 10.6 doesn't seem supported with current Nextcloud
versions and the test fails with the following error[1]:
nextcloud # [ 14.950034] nextcloud-setup-start[1001]: Error while trying to initialise the database: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 4047 InnoDB refuses to write tables with ROW_FORMAT=COMPRESSED or KEY_BLOCK_SIZE.
According to a support-thread in upstream's Discourse[2] this is because
of a missing support so far.
Considering that we haven't received any bugreports so far - even though
the issue already exists on master - and the workaround[3] appears to
work fine, an evaluation warning for administrators should be
sufficient.
[1] https://hydra.nixos.org/build/155015223
[2] https://help.nextcloud.com/t/update-to-next-cloud-21-0-2-has-get-an-error/117028/15
[3] setting `innodb_read_only_compressed=0`
Moving the service before multi-user.target (so the `hardened` test
continue to work the way it did before) can result in locking the kernel
too early. It's better to lock it a bit later and changing the test to
wait specifically for the disable-kernel-module-loading.service.
The current name is misleading: it doesn't contain cli arguments,
but several constants and utility functions related to qemu.
This commit also removes the use of `with import ...` for clarity.
Previously the influxdb exporter test was flaky as even after the
service has started there is still a race before the service is actually
listening and accepting connection on port 9122.
With this commit the test will wait for the port to be open before
proceeding.
Hydra accepts timeouts as value of seconds after which the test is
terminated / considered failed. Using the value 30 here has the effect
that the test was terminate after 30 seconds. That time might be
sufficient for the test execution itself but it has another downside:
Jobs on hydra inherit the timeout of their parent. In this case all the
builds that are a dependency of the herbstluftwm test *must* finish
(each) within 30s. And since not all of the dependencies are cached in
the binary cache this could lead to an issue with pacakges that take
longer than 30s to build at the time when the herbstluftwm test is built
by hydra.
It is best to not set the timeout here and let hydra deal with it. Our
default timeout for builds is two hours which is more than sufficient
for most builds and tests. If the test fails we will spent ~2h doing
something or nothing at worst but at least we wont kill the build just
because a dependency wasn't fullfilled already.
This updates systemd to version v249.4 from version v247.6.
Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.
a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
The way symlinked units were handled was changed in such that the last
name of a unit file within one of the unit directories
(/run/systemd/system, /etc/systemd/system, ...) is used as the name
for the unit. Unfortunately that code didn't take into account that
the unit directories themselves could already be symlinks and thus
caused all our units to be recognized slightly different.
There is an upstream PR for this new patch:
https://github.com/systemd/systemd/pull/20479
b) The way the APIVFS is setup has been changed in such a way that we
now always have /run. This required a few changes to the
confinement tests which did assert that they didn't exist. Instead of
adding another patch we can just adopt the upstream behavior. An
empty /run doesn't seem harmful.
As part of this work I refactored the confinement test just a little
bit to allow better debugging of test failures. Previously it would
just fail at some point and it wasn't obvious which of the many
commands failed or what the unexpected string was. This should now be
more obvious.
c) Again related to the confinement tests the way a file was tested for
being accessible was optimized. Previously systemd would in some
situations open a file twice during that check. This was reduced to
one operation but required the procfs to be mounted in a units
namespace.
An upstream bug was filed and fixed. We are now carrying the
essential patch to fix that issue until it is backported to a new
release (likely only version 250). The good part about this story is
that upstream systemd now has a test case that looks very similar to
one of our confinement tests. Hopefully that will lead to less
friction in the long run.
https://github.com/systemd/systemd/issues/20514https://github.com/systemd/systemd/pull/20515
d) Previously we could grep for dlopen( somewhat reliably but now
upstream started using a wrapper around dlopen that is most of the
time used with linebreaks. This makes using grep not ergonomic
anymore.
With this bump we are grepping for anything that looks like a
dynamic library name (in contrast to a dlopen(3) call) and replace
those instead. That seems more robust. Time will tell if this holds.
I tried using coccinelle to patch all those call sites using its
tooling but unfornately it does stumble upon the _cleanup_
annotations that are very common in the systemd code.
e) We now have some machinery for libbpf support in our systemd build.
That being said it doesn't actually work as generating some skeletons
doesn't work just yet. It fails with the below error message and is
disabled by default (in both minimal and the regular build).
> FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
> /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
> libbpf: elf: socket_bind_bpf is not a valid eBPF object file
> Error: failed to open BPF object file: BPF object format invalid
> Traceback (most recent call last):
> File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
> bpf_build(args)
> File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
> gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
> File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
> skel = subprocess.check_output(bpftool_args, universal_newlines=True)
> File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
> return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
> File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
> raise CalledProcessError(retcode, process.args,
> subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
> [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
> ninja: build stopped: subcommand failed.
f) We do now have support for TPM2 based disk encryption in our
systemd build. The actual bits and pieces to make use of that are
missing but there are various ongoing efforts in that direction.
There is also the story about systemd in our initrd to enable this
being used for root volumes. None of this will yet work out of the
box but we can start improving on that front.
g) FIDO2 support was added systemd and consequently we can now use
that. Just with TPM2 there hasn't been any integration work with
NixOS and instead this just adds that capability to work on that.
Co-Authored-By: Jörg Thalheim <joerg@thalheim.io>
I currently do not have much time to work on nixpkgs. Remove
myself as a maintainer from a bunch of packages to avoid that
people are waiting on me for a review.
The primary use case is tools like sops-nix and agenix to restart units
when secrets change. There's probably other reasons to restart units as
well and a nice thing to have in general.
These are all packages that I stopped using and hence just create noise
in my inbox for each change affecting them and let's face it, while I
still enjoy contributing to nixpkgs, it doesn't really make sense to be
listed there if I can't do much anyways.
Each of these packages can be taken over by someone or removed if
people think that's reasonable.
Of course, if other maintainers face issues, I can answer some questions
if needed & possible.
This doesn't work anymore and thus breaks the installation leaving a
broken `/var/lib/nextcloud`.
It isn't a big deal since we set this value in the override config
before, so the correct table-prefix is still used. In order to confirm
that, I decided to add a custom prefix to the basic test.
twisted is used in matrix-synapse for smtp handling.
Mostly this is used for password resets, but also notifications
are delivered that way.
older versions of twisted require the e-mail server to
have TLS1.0 enabled.
Obviously, quite a lot of servers have this disabled which means
synapse won't be able to deliver mails using such servers.
matrix-synapse issue:
https://github.com/matrix-org/synapse/issues/6211
Deluge 1.x requires Python 2 which upstream has end-of-lifed. Deluge depends
on pythonPackages.twisted, Python 2 support for which upstream has
nowdropped. If pythonPackages.twisted is upgraded then Deluge 1.x breaks.
So, remove it instead of leaving it broken.
Deluge 2.x (deluge-2_x) is available and continues to work.
Perform the tests on the package that the `tests` attribute is a child
of, i.e. if `discourseAllPlugins.tests` is built, the tests will run
with the `discourseAllPlugins` package, not the `discourse` package as
previously.
The problem behind this is that the hardened patchset[1]. Quite recently
this led to a weird problem when Linux 5.12 was dropped (and thus had to
be removed from `nixpkgs`), there were no patches for 5.13, so
`linuxPackages_hardened_latest` had to be downgraded to 5.10 as base[2]
which may be rather unintuitive and unexpected.
To avoid these kind of "silent downgrades" in the future, it makes sense
to drop the attribute entirely. If somebody wants to use a hardened
kernel, it's better to explicitly pin it using the newly introduced
versioned attributes, e.g. `linuxPackages_4_14_hardened`.
[1] https://github.com/anthraxx/linux-hardened/
[2] https://github.com/NixOS/nixpkgs/pull/133587
The paperless project has moved on to paperless-ng and the original
paperless package in Nixpkgs has stopped working recently (due to
version incompatibility with the providede Django package).
Instead of investing more time into the old module we should migrate all
users to the new module instead.
Previously, for processes launched by doas the unwrapped doas binary preceded the
setuid-wrapped doas binary in PATH.
This caused error `doas: not installed setuid` when running doas from
processes launched by doas.
doas seems to short-circuit the PATH lookup when called like
`doas -u myuser doas -u myuser ...` so the error doesn't appear in this case.
I expect it suffices that the channel only blocks on one firefox ESR
test - the one for the default ESR. I didn't want to have the
information about the default in two places, so either of the tests will
be evaluated twice (but to the same *.drv I hope).
Analogous to 6325d15e90.
The test certificate expiration date was set to the default 30 days.
This certificate is generated through its own derivation. As with
every derivation, it gets cached by cache.nixos.org once we build it.
In practice, we rebuild this derivation only if one of its input
changes. The only inputs here being openssl and stdenv.
While it's not an issue on the unstable branches, it can be
problematic on a stable release: the test will fail after 30 days.
Extending the certificate lifespan from 1 month to 100 years to prevent
it from getting expired while being cached.
There is no generic services.kea.enable option. Instead kea consists of
four daemons (dhcp4, dhcp6, ddns, ctrlagent) that can be enabled
individually. In this test we're just looking at dhcp6.