They're very expensive to run, especially on machines with few cores,
and can be flaky (it looks like their CI doesn't run things under the
same constraints as we tend to).
Move them to a separate derivation, and make them test the actual
installed output rather than the local copy.