-- the data output is muxed between RAMs and register bank. Here we generate a combinatorial mux if we don't want the output to be registered. This gives us
-- memory access time of 2 clock cycles. Otherwise the ram output is handled by the main process.