This makes clang-format useful on those. Also add a bunch of forgotten transitive includes, whose absence otherwise prevented compilation.
This removes the explicit checks sprinkled all over the codebase; instead, the SW rasterizer simply exposes an implementation with no-ops for most operations.
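
For illustration only, the idea can be sketched as a shared interface with a no-op software implementation. All names below (RasterizerInterface, HardwareRasterizer, SoftwareRasterizer, CreateRasterizer, WriteGuestMemory) are hypothetical and not taken from the actual codebase; this is a minimal sketch of the pattern, not the real change.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical interface: every name here is illustrative only.
class RasterizerInterface {
public:
    virtual ~RasterizerInterface() = default;
    virtual void FlushRegion(std::uint32_t addr, std::uint32_t size) = 0;
    virtual void InvalidateRegion(std::uint32_t addr, std::uint32_t size) = 0;
};

// Stand-in for a hardware-backed rasterizer. A real one would touch GPU
// caches; this one only counts calls so the behavior is observable.
class HardwareRasterizer final : public RasterizerInterface {
public:
    void FlushRegion(std::uint32_t, std::uint32_t) override { ++flush_count; }
    void InvalidateRegion(std::uint32_t, std::uint32_t) override { ++invalidate_count; }

    int flush_count = 0;
    int invalidate_count = 0;
};

// Software rasterizer: cache maintenance has nothing to do, so these
// operations are deliberately empty.
class SoftwareRasterizer final : public RasterizerInterface {
public:
    void FlushRegion(std::uint32_t, std::uint32_t) override {}
    void InvalidateRegion(std::uint32_t, std::uint32_t) override {}
};

// Picks an implementation once, at setup time.
std::unique_ptr<RasterizerInterface> CreateRasterizer(bool use_hardware) {
    if (use_hardware) {
        return std::make_unique<HardwareRasterizer>();
    }
    return std::make_unique<SoftwareRasterizer>();
}

// Call sites no longer ask "is the hardware renderer enabled?" before
// notifying the rasterizer; they call the interface unconditionally.
void WriteGuestMemory(RasterizerInterface& rasterizer, std::uint32_t addr, std::uint32_t size) {
    rasterizer.InvalidateRegion(addr, size);
}
```

With this shape, the decision about which rasterizer is active is made once in the factory, and callers stay free of conditionals.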