The main problem was GHC exceeding the Hydra output limit with profiling
libs on aarch64-linux which made us disable the feature. Nowadays the
limit is 3GB, the GHC output is a bit over 2GB, so easily under the
limit.
aarch64-darwin uses a different codegen backend and was never really
affected by the problem: Its output with profiling enabled is around
1.6GB.
Consequently we can enable profiling for all platforms again, as we have
no output size issues for those we build on Hydra.
Thanks to flokli for helping me track down these up to date numbers.
Hadrian (the GHC build tool) is built separately from GHC. This means
that if `haskell.compiler.ghc961` is overridden to add patches, those
patches will _only_ be applied to the GHC portion of the build, and not
the Hadrian build. For example, backporting this patch to GHC 9.6.1
failed because the changes to `hadrian/` files were not reflected in the
Nix build:
5ed77deb1b
By lifting `src` and `hadrian` from variables defined in the function
body to parameters with default values, the `hadrian/` files can be
overridden using the `haskell.compiler.ghc961.override` function. For
example:
self.haskell.compiler.ghc961.override {
# The GHC 9.6 builder in nixpkgs first builds hadrian with the
# source tree provided here and then uses the built hadrian to
# build the rest of GHC. We need to make sure our patches get
# included in this `src`, then, rather than modifying the tree in
# the `patchPhase` or `postPatch` of the outer builder.
src = self.applyPatches {
src = let
version = "9.6.1";
in
self.fetchurl {
url = "https://downloads.haskell.org/ghc/${version}/ghc-${version}-src.tar.xz";
sha256 = "fe5ac909cb8bb087e235de97fa63aff47a8ae650efaa37a2140f4780e21f34cb";
};
patches = [
# Enable response files for linker if supported
(self.fetchpatch {
url = "5ed77deb1b.patch";
hash = "sha256-dvenK+EPTZJYXnyfKPdkvLp+zeUmsY9YrWpcGCzYStM=";
})
];
};
}
Note that we do have to re-declare the `src` we want, but I'm not sure
of a good way to avoid this while also sharing one set of patches
between the GHC and Hadrian builds.
We previously thought that we only need python if we were going to run
./boot or using emscripten which implements all its wrappers in
python (and likes to reinvoke them). As it turns out, though, hadrian
likes to invoke python itself for generating certain headers of rts
using a script shipped with the GHC source. This fact was obscured
before, since (presumably) sphinx would propagate python into PATH.
Due to link time dead code elimination not working on aarch64-darwin,
some unused store path references in Paths_* modules are retained. This
causes reference cycles when a separate `bin` output is used.
To prevent this, add a patch to Cabal as shipped by GHC which infers
based on the installation layout (which is influenced by
enableSeparateBinOutput, enableSeparateDataOutput etc. in a Nix build)
which references can be retained without causing a reference cycle. This
ensures that packages that were fine with a bin output will also work on
aarch64-darwin. Packages that cause a reference cycle anyways (by
actually using references that do cause one) fail due to a missing
symbol – here we are trading the overall benefit for a more confusing
error message.
For details, refer to the explanation comment in the patch.
`*.*.rts.*.opts` is actually copied from the migration GHC blog post,
but does not, actually, parse: The format is
`<stage>.<package>.<program>.<filetype>.<setting>`, so it would need to
be `*.rts.ghc.opts`. This is already achieved by the broader rule on the
next line.
- Christmas is over!
- Upstream has changed the name of the target triplet used for the JS
backend from js-unknown-ghcjs to javascript-unknown-ghcjs, since Cabal
calls the architecture "javascript":
6636b67023
Since the triplet is made up anyways, i.e. autoconf does not support
it and Rust uses different triplets for its emscripten backends, we'll
just change it as well.
- Upstream fixed the problem with ar(1) being invoked incorrectly by stage0:
e987e345c8
Hadrian does this automatically unfortunately, but unless we correctly
set enableShared as well, mkDerivation will try building shared libs
which will inevitably fail due to missing shared core packages.
Let's stay away from fully_static which does a lot of funky stuff and
was not working before anyways for pkgsStatic.
There is a code generation bug in Cabal-3.6.3.0. For packages configured with
--enable-relocatable, Cabal would generate code that doesn't compile.
There isn't an upstream issue, but the issue is described in the commit that
fixed it:
6c796218c9
It was fixed in Cabal-3.8.*
Backport the fix to the Cabal library that ships with ghc-9.4.4
Cabal 3.8 ships with ghc-9.6, so when 9.6 is released this fix shouldn't be
necessary.
`ghc ? hadrian` can be used to check if a GHC was built using hadrian.
This is often relevant since hadrian changed the ghc libdir location, so
we need to install libs to a different location as well.
GHC ships a [modified] config.sub so that js-unknown-ghcjs is accepted
by autotools. For some platforms, we automatically update config.sub
from upstream's source in order to prevent that builds fail when we use
an outdated config.sub. In this case of course the perfectly up to date
config.sub would reject the target platform we are trying to use, so we
must disable this mechanism for now.
I have asked in the GHC IRC channel if there are any plans on
upstreaming the platform. It would be nice if were able to drop this
change in the future.
This is now possible by building a cross compiler for js-unknown-ghjs
using `pkgsCross.ghcjs.buildPackages.haskell.compiler.ghcHEAD`.
To allow this, the following things needed to be done:
* Disable dependencies that wouldn't work:
- Don't pull in ncurses for terminfo
- Don't pull in libffi
- Don't pull in libiconv
- Don't enable the LLVM backend
- Enable gmp-less native-bignum backend
* Use emscripten instead of a C compiler. The way this works is inspired
by emscriptenPackages, but avoids the following flaws:
- Instead of using a custom configurePhase, just set
`configureScript = "emconfigure ./configure";` which is much simpler.
- Create writable EM_CACHE before configuring, as configure scripts
want to compile test programs.
Additionally, we need to disable the targetCC check, as it is not
applicable with emscripten which never appears as part of stdenv.
* Use generic $configureScript in installPhase to be able to work with
our emconfigure trick.
Note that the corresponding Haskell package set does not work yet. Cabal
doesn't seem to like GHC 9.7 yet and the generic-builder is clueless
about the JS backend.
gmp is part of buildInputs _and_ depsTargetTarget, so we need to check
the host and target platform to be correct. In practice this doesn't
change much though, as gmp.meta.platforms is _quite_ liberal.
Finally building a cross compiler using hadrian is possible, but there
are some outstanding issues regarding external libraries in the package
db which causes issues with ghc-bignum.
There is a code generation bug in Cabal-3.6.3.0. For packages configured with
--enable-relocatable, Cabal would generate code that doesn't compile.
There isn't an upstream issue, but the issue is described in the commit that
fixed it:
6c796218c9
It was fixed in Cabal-3.8.*
Backport the fix to the Cabal library that ships with ghc-9.2.4
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
Closes#200063.
Initial port of our GHC Nix expressions to the new hadrian build system,
as it has become required after 9.4. Unfortunately there are some
regressions affecting us, namely the inability to install a GHC
cross-compiler at the moment (see issue linked in relevant error
message). This means that a lot of specific configuration snippets for
cross-platforms and static compilation have been ported from make
speculatively, as we are unable to test them for the moment.
This shortens the bootstrap chain for 9.4.1 and should be kinder on
rebuilds. It requires some messing around in the configure file, since
it is not officially supported by upstream (but known to work). For now
it saves us the hassle of adding another bindist to nixpkgs. When we
support hadrian, we'll be able to use the already packaged 9.2.2
bindist.
Since the musl / alpine bindist now uses the GMP backend, we need to
learn to tell Hadrian bindists about GMP. Hadrian bindists no longer
have the buildinfo files, instead we need to patch the package db before
installing and recache it afterwards which is not too hard, luckily.
Same goes for libiconv and base as well as libffi and rts on
darwin (those bindists are all produced using hadrian).
See also: https://gitlab.haskell.org/ghc/ghc/-/issues/21554#note_431000
Note that pkgsMusl.haskell.compiler.ghc922Binary still has severe
issues: It can't produce shared libraries because the bindist ships
none (and using the GMP backend has a hard requirement for shared
objects, apparently) and ghci segfaults for unknown reasons at the
moment. However, I've successfully compiled hadrian with it so far, so
perhaps it's good enough.
This was disabled basically by accident before.
The links are jacked, but that was is true for every package; it is not
unique to this PR. I fixed it upstream here:
https://github.com/haskell/haddock/pull/1482
but it's not in any release distributions yet I don't think.
Fixes#171841
Otherwise attempt to build ghcHEAD from local checkout fails as:
$ nix build -L --impure --expr 'with import ~/nm {}; haskell.compiler.ghcHEAD.overrideAttrs (oa: { src = ./.; patches = []; nativeBuildInputs = oa.nativeBuildInputs ++ [ git ]; })' --keep-failed
...
ghc> checking C++ standard library flavour... ./configure: line 11487: /nix/store/r7r10qvsqlnvbzjkjinvscjlahqbxifl-gcc-wrapper-11.3.0/bin/cxx: No such file or directory
I think 'cxx' is not provided by stdenv.
Very confusingly, the `isPowerPC` predicate in
`lib/systems/inspect.nix` does *not* match `powerpc64le`!
This is because `isPowerPC` is defined as
isPowerPC = { cpu = cpuTypes.powerpc; };
Where `cpuTypes.powerpc` is:
{ bits = 32; significantByte = bigEndian; family = "power"; };
This means that the `isPowerPC` predicate actually only matches the
subset of machines marketed under this name which happen to be 32-bit
and running in big-endian mode which is equivalent to:
with stdenv.hostPlatform; isPower && isBigEndian && is32bit
This seems like a sharp edge that people could easily cut themselves
on. In fact, that has already happened: in
`linux/kernel/common-config.nix` there is a test which will always
fail:
(stdenv.hostPlatform.isPowerPC && stdenv.hostPlatform.is64bit)
A more subtle case of the strict isPowerPC being used instead of the
moreg general isPower accidentally are the GHC expressions:
Update pkgs/development/compilers/ghc/8.10.7.nix
Update pkgs/development/compilers/ghc/8.8.4.nix
Update pkgs/development/compilers/ghc/9.2.2.nix
Update pkgs/development/compilers/ghc/9.0.2.nix
Update pkgs/development/compilers/ghc/head.nix
Since the remaining legitimate use sites of isPowerPC are so few, remove
the isPowerPC predicate completely. The alternative expression above is
noted in the release notes as an alternative.
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
At some point in hadrian's installation Makefile used for e.g. alpine
bindists 'xxx' is used as a placeholder for three spaces and later
substituted back. This breaks if the store path itself contains 'xxx'.
We adapt an upstream patch which will be part of 9.4 (and presumably
9.0.3 and 9.2.3) as a workaround for this issue for 8.10.2 and 8.10.7
which are the binary GHCs in nixpkgs affected.
The issue we are working around is concerned with the lack of dead
code elimination in the aarch64-darwin LLVM codegen backend. Thus
the relevant condition is the target platform, not the host
platform as checked previously.
The condition used in the past to detect iOS was "is this
aarch64-darwin"? Since we have aarch64-darwin devices running macOS
nowadays which do allow large address space, let's use the more accurate
flag.
enableDwarf requires elfutils on the host, which doesn't support darwin.
Instead of hardcoded isDarwin/isWindows, switch to self-documenting
availableOn conditional for elfutils.
Fixes ghcHEAD cross-compilation when build = host = darwin,
target = linux.
Co-authored-by: sternenseemann <sternenseemann@systemli.org>
This is no substantial change, as we already assert that the
build->target and host->target LLVM are the same, but this brings the
terminology in the file in a more consistent order, since we use
build->target for CC/CXX and bintools already.
In fact we should be passing build->target to configure always,
host->target would come into play when updating GHC's settings file
after installing.
On aarch64-darwin we have additional wrappers for install_name_tool and
strip to deal with codesigning requirements (e. g. updating the
signature / checksum after changing a binary). These wrappers don't
exist on x86_64-darwin which is why the unwrapped versions were used
before, causing the kernel to kill (some) executables produced by GHC.
CC, CXX, LD, AR, …, LLC, OPT and CLANG will be invoked by GHC's build
system at build time in the build->target role. However, since we are
passing absolute paths, they will get saved in GHC's settings file and
later invoked at runtime, when they should be host->target. This means
that the build->target and host->target tools need to be the same for
our built GHC to work properly which is what we guard using these new
asserts.
Being able to drop these asserts would be a step towards cross-compiling
GHC (as opposed to building a GHC cross-compiler which still works).
* By taking clang from llvmPackages we make sure there is no version
mismatch between LLVM (where llc and opt come from) and clang (which
previously would be taken from stdenv on darwin for example).
* Only pass CLANG if useLLVM is true. Previously we also set this if
targetCC was clang. This would cause potentially confusing behavior if
llc and opt as well as clang are provided via PATH (and GHC is
compiled with useLLVM == false), because clang from PATH would be
ignored, but not llc and opt.
Since 4c75874560 it is possible to
introspect if ld.gold is contained in the used bintools, so we can also
check if it is available before deciding to use it as done in the other
GHC derivations in 0908812372.
These targets also have NCG support, but they are tested less (in fact
SPARC seems to be untested atm) and may have issues. In such cases being
able to fallback to -fllvm without rebuilding the compiler could be
useful. OTOH GHC will default to -fasm and the backends probably work
well enough in most cases.
GHC can actually accept absolute paths for its runtime tools (except for
touch) at configure time which are then saved in
`$out/lib/ghc-${version}/settings`. This allows us to drop the wrapper
entirely if we assume that a POSIX compliant touch is in PATH when we
run GHC later.
The touch problem can presumably be fixed by either patching the
configure file of GHC (although we need to take care not to change the
touch GHC uses during its compilation) or messing with the settings file
after installation.
The rationale for dropping the wrapper PATH entry completely is that
it's always possible to invoke GHC via its library which will bypass the
wrapper completely, leading to subtly different behavior.
Binary GHCs are not touched in this commit, but ideally they'll get a
similar treatment as well, so they are more robust, although we
generally don't need to use them as a library.
Note that GHC 8.8.4 doesn't care about install_name_tool or otool, so
the respective environment variables are not set.
This has two main benefits:
* GHC will work reliably outside of stdenv, even when using -fllvm since
everything it'll call at runtime will be provided in PATH via the
wrapper scripts.
* LLVM will no longer leak into haskell packages' configure
scripts. This was an issue with llvm-hs which fails to build if the
LLVM version of the compiler since the propagatedBuildInputs of GHC
take precedence over the nativeBuildInputs added in the derivation.
This brings the binary GHCs on parity with the source built ones in
terms of the wrapper. The upshot of this is that compiling something
using the binary GHCs no longer depends on PATH being populated with
the tools included in stdenv at all. We can even test this by running
the installCheck with an empty environment (via `env -i`).
Copy the approach from the normal GHC derivations for adding an
`export PATH` into the scripts in `$out/bin` and use it to put the
specific LLVM version of the binary GHC into its PATH. This will
prevent the LLVM version of the GHC we are building later to take
precedence over the LLVM version this GHC needs.
Since we inherit the platform list from the bootstrap GHC, we get
differing lists depending on which platform we evaluate the platform
list on (depending on whether 8.10.2 or 8.6.5 is used). This leads to
Hydra thinking aarch64-linux is not supported as it evaluates on
x86_64-linux usually.
Since LLVM itself doesn't depend on target at all, this doesn't change
anything *in effect* (i. e. rebuild count should be zero), but it is
more clear about the intention and what LLVM is used for here (i. e. in
depsBuildTarget).
This means we only have to update the llvmPackages attribute in one
place now and should prevent situations like with 8.6.5 where different
versions would be used in the package set compared to the compiler
build.
Drop comments in the configuration-ghc-X.Y.x.nix files as well, since
LLVM version isn't tied to the compiler minor version at
all (e. g. 8.10.2 and 8.10.7 have different support ranges).