Previous to this commit, the entire test driver environment was shared
with the actual python test environment.
This is a hefty api surface. This commit selectively exposes only those
symbols to the test environment that are actually meant to be used by
tests.
This is the case when the test-script is empty. `nixos-build-vms(8)` is
primarily supposed to be used as tool to test changes or to reproduce
bugs (IMHO) where "just spinning up a few VMs" is the primary use-case.
In the ongoing discussion about these changes[1] it was suggested to
only expose it when needed (i.e. in the case I described above) to keep
the API surface as slim as possible.
[1] https://github.com/NixOS/nixpkgs/pull/133675#discussion_r688112485
This is relevant for `nixos-build-vms(8)` which doesn't have a
test-script. In that case it's more intuitive to directly go into the
interactive mode which is IMHO more intuitive.
Previously the driver was configured exclusively through convoluted
environment variables.
Now the driver's defaults are configured through env variables.
Some additional concerns are in the github comments of this PR.
the use of python further restricts possible RFC1035 host labels since
dash is not allowed for use in python identifiers.
The previous implementation of this check was flawed, since it did not
check the `hostName` value that is actually used to construe the
identifier, but the node name, which can be anything, e.g. just `machine`.
The previous implementation, by further restricting RFC1035 labels, only
for the sake of testing seems to be an unacceptable restriction and should
be addressed separately.
The disk image calculator was using find + exec forking du for every
file in the disk image, making it very slow. Change du to accept files,
nul delimeted on stdin to speed it back up.
Before change:
nix-build nixos/tests/image-contents.nix 9.71s user 1.06s system 8% cpu 2:13.11 total
After change:
nix-build nixos/tests/image-contents.nix 9.93s user 1.23s system 21% cpu 51.601 total
nixos tests are blended with other system configurations, hence
their settings must be either enforced or defaulted.
This particular setting is set via lib.nixosSystem as
`system.nixos.revision = final.mkIf (self ? rev) self.rev;` which would
mean that without this change no flake generated nixos could be blended
with nixos testing.
This setting was made previously constant in
169c6b4b14 in order to avoid pointless
rebuilds of the testing VMs, but was set without enforcing it.
Apparently this looks like it was forgotten when doing commit
3884ff70ba, which refactored the test
runner and driver a bit.
The passthru argument actually was correctly reintroduced in
setupDriverForTest, but the actual makeTest function didn't use it.
This fixes the nixpkgs tarball job, which previously failed with:
attribute 'elkPackages' missing, at /build/source/pkgs/tools/misc/logstash/6.x.nix:58:30
Signed-off-by: aszlig <aszlig@nix.build>
Acked-by: David Arnold <dar@xoe.solutions>
Fixes: https://github.com/NixOS/nixpkgs/issues/127274
Merges: https://github.com/NixOS/nixpkgs/pull/127346
Less nesting, where that improves readability. More nesteing, where
that improves readability, but most importantly:
Expose individual functions separately so that they can be more easily
built directly, eg.:
`nix build --impure --expr '(import ./testing-python.nix {system = builtins.currentSystem;}).mkTestDriver'`
At nixpkgs root:
`rg redirectSerial ./` does not result in any other match
nor does
`rg USE_SERIAL ./` except for an unrelated match in:
pkgs/tools/graphics/argyllcms/default.nix
Bash's standard behavior of not propagating non-zero exit codes
through a pipeline is unexpected and almost universally
unwanted. Default to setting `pipefail` for the command being run;
it can still be turned off by prefixing the pipeline with
`set +o pipefail` if needed.
Also, set `errexit` and `nonunset` options to make the first command
of consecutive commands separated by `;` fail, and disallow
dereferencing unset variables respectively.
When importing Nixpkgs within Nixpkgs, we should not consider aliases
to ensure we don't rely on them internally.
There are probably more places that need to be converted.
The root filesystem resizing step, `resize2fs -M', does not provide any
control over the amount of slack left in the result. It can produce an
arbitrarily tight fit, depending on how well the payload aligns with
ext4 data structures.
This is problematic, as NixOS must create a few files and directories
during its first boot, before the root is enlarged to match the size of
the containing SD card.
An overly tight fit can cause failures in the first stage:
mkdir: can't create directory '/mnt-root/proc': No space left on device
or in the second stage:
install: cannot create directory '/var': No space left on device
A previous version of `make-ext4-fs' (before PR #79368) was explicitly
"reserving" 16 MiB of free space in the final filesystem. Manually
calculating the size of an ext4 filesystem is a perilous endeavor,
however, and the method it employed was apparently unreliable.
Reverting is consequently not a good option.
A solution would be to create some sort of "balloon" occupying inodes
and blocks in the image prior to invoking `resize2fs -M', and to remove
these temporary files/directories before the compression step.
This changeset takes the simpler approach of simply dropping the
resizing step.
Note that this does *not* result in a larger image in general, as the
current procedure does not truncate the `.img' file anyway. In fact, it
has been observed to yield *smaller* compressed images---probably
because of some "noise" left after resizing. E.g., before-vs-after:
-r--r--r-- 2 root root 607M 1. Jan 1970 nixos-sd-image-21.11pre-git-x86_64-linux.img.zst
-r--r--r-- 2 root root 606M 1. Jan 1970 nixos-sd-image-21.11pre-git-x86_64-linux.img.zst
For now you had to know that the actions are retried for 900s when
seeing an error like
> Traceback (most recent call last):
> File "/nix/store/dbvmxk60sv87xsxm7kwzzjm7a4fhgy6y-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 927, in run_tests
> exec(tests, globals())
> File "<string>", line 1, in <module>
> File "<string>", line 31, in <module>
> File "/nix/store/dbvmxk60sv87xsxm7kwzzjm7a4fhgy6y-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 565, in wait_for_file
> retry(check_file)
> File "/nix/store/dbvmxk60sv87xsxm7kwzzjm7a4fhgy6y-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 142, in retry
> raise Exception("action timed out")
> Exception: action timed out
in your (hydra) build failure. Due to the absence of timestamps you were
left guessing if the machine was just slow, someone passed a low timeout
value (which they couldn't until now) or whatever might have happened.
By making this error a bit more descriptive (by including the elapsed
time) these hopefully become more useful.
Our test driver exposes a bunch of variables and functions, which
pyflakes doesn't recognise by default because it assumes that the test
script is executed standalone. In reality however the test driver script
is using exec() on the testScript.
Fortunately pyflakes has $PYFLAKES_BUILTINS, which are the attributes
that are globally available on all modules to be checked. Since we only
have one module, using this environment variable is fine as opposed to
my first approach to this, which tried to use the unstable internal API
of pyflakes.
The attributes are gathered by the main derivation of the test driver,
because we don't want to end up defining a new attribute in the test
driver module just to being confused why using it in a test will result
in an error.
Another way we could have gathered these attributes would be in
mkDriver, which is where the linting takes place. However, we do have a
different set of Python dependencies in scope and duplicating these will
again just cause confusion over having it at one location only.
Signed-off-by: aszlig <aszlig@nix.build>
Co-Authored-By: aszlig <aszlig@nix.build>
So far, we have used "black" for formatting the test code, which is
rather strict and opinionated and when used inline in Nix expressions it
creates all sorts of trouble.
One of the main annoyances is that when using strings coming from Nix
expressions (eg. store paths or option definitions from NixOS modules),
completely unrelated changes could cause tests to fail, since eg. black
wants lines to be broken.
Another downside of enforcing a certain kind of formatting is that it
makes the Nix expression code inconsistent because we're mixing two
spaces of indentation (common in nixpkgs) with four spaces of
indentation as defined in PEP-8. While this is perfectly fine for
standalone Python files, it really looks ugly and inconsistent IMO when
used within Nix strings.
What we actually want though is a linter that catches problems early on
before actually running the test, because this is *actually* helping in
development because running the actual VM test takes much longer.
This is the reason why I switched from black to pyflakes, because the
latter actually has useful checks, eg. usage of undefined variables,
invalid format arguments, duplicate arguments, shadowed loop vars and
more.
Signed-off-by: aszlig <aszlig@nix.build>
Closes: https://github.com/NixOS/nixpkgs/issues/72964
On my system I have XWayland disabled and therefore only WAYLAND_DISPLAY
is set. This ensures that the graphical output will still be enabled on
such setups (both Wayland and X11 are supported by the viewer).
Work around missing /dev files inside runInLinuxVM by creating a
symlink before calling nixos-enter.
This fixes https://github.com/NixOS/nixpkgs/issues/93381.
I ran into this issue when trying to create a VMware image that boots from EFI.
Thanks @colemickens for reporting this and @danielfullmer for fixing the same thing in in qemu-vm.nix (37676e77cb) and explaining what the issue was.
This ensures the following gptfdisk warning won't happen:
```
Warning: File size is not a multiple of 512 bytes! Misbehavior is likely!
```
Additionally, helps towards aligning the partition to be more optimal
for the underlying storage.
It is actually impossible to align for the actual underlying storage
optimally because we don't know what the block device will be!
But aligning on 1MiB should help.
This is a bit of a thorny issue. See, the actual `diskSize` variable is
for the *total* disk size, not for the filesystem!
The automatic numbers are meant to compute the *filesystem* required
space. So we have to add any other reserved space!
We have different requirements for reserved space. E.g. there could be
none (when it's actually a filesystem image). There could also be 1MiB
for alignment for an MBR image, legacy+gpt needs 2MiB, then GPT with an
ESP ("bootSize") needs to take the boot partition and GPT size into
account too!
Though luckily(?) for this latter situation we can cheat! As noted in the
change, `bootSize` is NOT the boot partition size. It is actually the
offset where the target filesystem starts.