Reduces some header churn and reduces rebuilds when some header
internals change.
While we're at it we can also resolve a missing include in buffer_cache.
All drivers (even Intel) seem to have a device local memory type that is
not host visible. Remove this flag so all devices follow the same path.
This fixes a crash when trying to map to host device local memory on
integrated devices.
The job of this abstraction is to provide staging buffers for temporary
operations. Think of image uploads or buffer uploads to device memory.
It automatically deletes unused buffers.