This hopefully fixes intermittent test failures like
http://hydra.nixos.org/build/29962437
router# [ 240.128835] INFO: task mke2fs:99 blocked for more than 120 seconds.
router# [ 240.130135] Not tainted 3.18.25 #1-NixOS
router# [ 240.131110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
assuming that these are caused by high load on the host.
... because we make it built-in by default.
I can't imagine anyone who wanted to purge this module from his/her system,
so let's keep it simple, at least for now.
This commit adds the options --build-host and --target-host to nixos-rebuild.
--build-host instructs nixos-rebuild to perform all nix builds on the
specified host (via ssh). Build results are then copied back to the
local machine and used when activating the system.
--build-target instructs nixos-rebuild to activate the configuration
not on the local machine but on the specified remote host. Build
results are copied to the target machine and then activated there (via ssh).
It is possible to combine the usage of --build-host and --target-host,
in which case you can perform the build on one remote machine and deploy
the configuration to another remote machine. The only requirement is that
the build host has a working ssh connection to the target host (if the
target is not local), and that the local machine can connect to both
the target and the build host. Also, your user must be allowed to copy
nix closures between the local machine and the target and host machines.
At no point in time are the configuration sources (the nix files) copied
anywhere. Instead, nix evaluation always happens locally
(with nix-instantiate). The drv-file is then copied and realised remotely
(with nix-store).
As a convenience, if only --target-host is specified, --build-host is
implicitly set to that host too. So if you want to build locally and deploy
remotely you have to explicitly set "--build-host localhost".
To activate (test, boot or switch) you need to have root access to the
target host. You can specify this by "--target-host root@myhost".
I have tested the obvious scenarios and they are working. Some of the
combinations of --build-host and --target-host and the various actions might
not make much sense, and should maybe be forbidden (like setting a remote
target host when building a VM), and some combinations might not work at all.
Previously this barfed with:
updating GRUB 2 menu...
fileparse(): need a valid pathname at /nix/store/zldbbngl0f8g5iv4rslygxwp0dbg1624-install-grub.pl line 391.
warning: error(s) occured while switching to the new configuration
* Patched fish to load /etc/fish/config.fish if it exists (by default,
it only loads config relative to itself)
* Added fish-foreign-env package to parse the system environment
closes#5331
We seem to be in an unfortunate situation: booting without 'nomodeset'
causes hangs when booting on some NVIDIA cards (6948c3ab80), but on the
other hand adding 'nomodeset' prevents X from starting on other hardware
(e.g. issue #10381 and my Thinkpad X250 with an integrated Broadwell GPU).
Attempt to remedy this situation a bit by adding a separate entry in the
ISOLINUX menu (with the non-'nomodeset' being the default).
The docker module used different code for socket-activated docker daemon than for the non-socket activated daemon.
In particular, if the socket-activated daemon is used, then modprobe wasn't set up to be usable and in PATH for
the docker daemon, which resulted in a failure to start the daemon with overlayfs as storageDriver if the
`overlay` kernel module wasn't already loaded. This commit fixes that bug (which only appears if socket
activation is used), and also reduces the duplication between code paths so that it's easier to keep
both in sync in future.
I think the name 'listenAddress' is more descriptive. Other NixOS
modules that define 'host' either use it as listen address or as address
a client connects to. listenAddress is unambiguous.
The addition of 'host' was added earlier today[1], so not bothering with
./nixos/modules/rename.nix.
[1]: 44ea184997 ("jenkins ci enhancement: add port and prefix option")
As named these options enable to specify a bind host and url prefix
to be used by jenkins. Adding these options in the config rather than
using extra arguments allows us to re-use those information in other
services using jenkins such as jenkins-job-builder or a reverse proxy.
Add new option declarations to control what information is published
by the avahi daemon. The default values are chosen to respect the
privacy of the user over the connectivity of the system.
It serves as a regression test, because right now if you enable
networking.useNetworkd the default loopback interface doesn't get
assigned any IP addresses.
To be sure, I have bisected this and it has been introduced with the
update to systemd 228 in 1da87d4.
Only the "scripted" networking tests have to succeed in order to trigger
a channel update of nixos-unstable, so I'm leaving this test as broken
and we have to figure out next what's the *exact* reason for the
breakage.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
The kernel default for `link_power_management_policy` is `"max_performance"`.
This commit:
f169f60575
set the NixOS default to `"min_performance"`.
This issue (https://github.com/NixOS/nixpkgs/issues/11276) details my long
journey to discover this after several file system failures incorrectly
attributed to `TRIM` and `NCQ` settings.
I think we should use the kernel default of `"max_performance"` to assure
the best experience for new users with SSDs and to conform to the defaults of
the kernel and other distros.
The three KDE package sets now have circular dependencies between them,
so they can only be built if they are merged into a single package set
during evaluation.
- if xserver.tty and/or display are set to null, then don't specify
them, or the -logfile argument in the xserverArgs
- For lightdm, we set default tty and display to null and we determine
those at runtime based on arguments passed. This is necessary because
we run multiple X servers so they can't all be on the same display
This reverts commit 02b568414d.
With a5bc11f and 6353f58 in place, we really don't need this anymore.
After running about 500 VM tests on my Hydra, it still didn't improve
very much.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
As @domenkozar noted in #10828, cache=writeback seems to do more harm
than good:
https://github.com/NixOS/nixpkgs/issues/10828#issuecomment-164426821
He has tested it using the openstack NixOS tests and found that
cache=none significantly improves startup performance.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
This seems to be the root cause of the random page allocation failures
and @wizeman did a very good job on not only finding the root problem
but also giving a detailed explanation of it in #10828.
Here is an excerpt:
The problem here is that the kernel is trying to allocate a contiguous
section of 2^7=128 pages, which is 512 KB. This is way too much:
kernel pages tend to get fragmented over time and kernel developers
often go to great lengths to try allocating at most only 1 contiguous
page at a time whenever they can.
From the error message, it looks like the culprit is unionfs, but this
is misleading: unionfs is the name of the userspace process that was
running when the system ran out of memory, but it wasn't unionfs who
was allocating the memory: it was the kernel; specifically it was the
v9fs_dir_readdir_dotl() function, which is the code for handling the
readdir() function in the 9p filesystem (the filesystem that is used
to share a directory structure between a qemu host and its VM).
If you look at the code, here's what it's doing at the moment it tries
to allocate memory:
buflen = fid->clnt->msize - P9_IOHDRSZ;
rdir = v9fs_alloc_rdir_buf(file, buflen);
If you look into v9fs_alloc_rdir_buf(), you will see that it will try
to allocate a contiguous buffer of memory (using kzalloc(), which is a
wrapper around kmalloc()) of size buflen + 8 bytes or so.
So in reality, this code actually allocates a buffer of size
proportional to fid->clnt->msize. What is this msize? If you follow
the definition of the structures, you will see that it's the
negotiated buffer transfer size between 9p client and 9p server. On
the client side, it can be controlled with the msize mount option.
What this all means is that, the reason for running out of memory is
that the code (which we can't easily change) tries to allocate a
contiguous buffer of size more or less equal to "negotiated 9p
protocol buffer size", which seems to be way too big (in our NixOS
tests, at least).
After that initial finding, @lethalman tested the gnome3 gdm test
without setting the msize parameter at all and it seems to have resolved
the problem.
The reason why I'm committing this without testing against all of the
NixOS VM test is basically that I think we can only go better but not
worse than the current state.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
- Add new service for `clamd`, the ClamAV daemon.
- Replace the old upstart "jobs" section with systemd.services
- Remove unnecessary config options.
- Use `mkEnableOption`
We hit page allocation failures a lot at random for VM tests, in case of
my own Hydra when it comes to the installer tests. The reason for this
is that once the memory of the VM gets heavily fragmented the kernel is
unable to allocate new pages.
Setting vm.min_free_kbytes to 16MB forces the kernel to keep a minimum
of 16 MB free.
I've done some testing accross repeated runs of the installer tests with
and without vm.min_free_kbytes set. So accross 30 test runs for each
settings, all of the tests with the option being set passed while 14
tests without that sysctl option triggered page allocation failures.
Sure, running 30 tests is not a guarantee that 16MB is enough, but we'll
see how it turns out in the long run across all VM tests.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Adds a new service module for shairport-sync. Tested with a local
and remote pulseaudio server. Needs to be run as a user in the pulse group
to access pulseaudio.
These options allow setting the start and stop scripts for the display
manager. Making these configurable is necessary to allow some hardware
configurations. Upstream ships empty scripts by default, anyway.
This new service invokes `simp_le` for a defined set of certs on a regular
basis with a systemd timer. `simp_le` is smart enough to handle account
registration, domain validation and renewal on its own. The only thing
required is an existing HTTP server that serves the path
`/.well-known/acme-challenge` from the webroot cert parameter.
Example:
services.simp_le.certs."foo.example.com" = {
webroot = "/var/www/challenges";
extraDomains = [ "www.example.com" ];
email = "foo@example.com";
validMin = 2592000;
renewInterval = "weekly";
};
Example Nginx vhost:
services.nginx.appendConfig = ''
http {
server {
server_name _;
listen 80;
listen [::]:80;
location /.well-known/acme-challenge {
root /var/www/challenges;
}
location / {
return 301 https://$host$request_uri;
}
}
}
'';