All of the White Rabbit capable reference designs for the SPEC7 board rely on the Xilinx XDMA core to communicate with the host Front-End Computer over PCIe.
Unfortunately, there are several issues with the XDMA core on Xilinx 7 Series devices, including the Zynq-7000 family that powers the SPEC7 board.
The first issue is that the XDMA core does not support the PCIe Tandem modes, i.e. the transparent partial-reconfiguration approach that Xilinx provides to ensure that the PCIe hard IP is configured early in the bitstream loading process, so that the host BIOS can enumerate the PCIe device properly. For this reason, we cannot have the Zynq FSBL load an XDMA-based design and, at the same time, guarantee that the SPEC7 PCIe interface is configured less than 100 ms after power-up and ready for enumeration by the host BIOS.
It is important to stress that this issue is common to all of the Xilinx 7 Series device families, not only to the Zynq-7000. However, because the device on the SPEC7 is a Zynq-7000, we devised a flexible workaround that overcomes the issue by using the dual Arm Cortex-A9 processor.
Because of the way the Zynq-7000 loads a bitstream from the QSPI memory, the original SPEC7 start-up process used a BOOT binary with two partitions:
- An FSBL, generated in Vitis from the hardware description exported from Vivado for a specific design.
- The bitstream for the specific design generated by Vivado.
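As an illustration of this two-partition layout, the following minimal sketch renders the kind of BIF (Boot Image Format) file that the Xilinx `bootgen` tool consumes to assemble a BOOT binary. The file names `fsbl.elf` and `spec7_design.bit` and the helper `render_bif` are hypothetical and only serve the example; the actual SPEC7 build flow may differ.

```python
def render_bif(partitions, image_name="the_ROM_image"):
    """Return BIF text for bootgen.

    The first entry is tagged as the bootloader (the FSBL); the remaining
    entries are appended as plain partitions in the given order.
    """
    lines = [f"{image_name}:", "{"]
    for index, path in enumerate(partitions):
        prefix = "[bootloader] " if index == 0 else ""
        lines.append(f"    {prefix}{path}")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical file names for the original SPEC7 layout: FSBL + bitstream.
ORIGINAL_LAYOUT = ["fsbl.elf", "spec7_design.bit"]

if __name__ == "__main__":
    print(render_bif(ORIGINAL_LAYOUT), end="")
```

The generated text could then be saved as, for example, `boot.bif` and passed to `bootgen -arch zynq -image boot.bif -o BOOT.bin`. The same helper covers other layouts by changing the partition list.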
This BOOT binary performed the following basic actions at start-up:
- The FSBL configures the PS subsystem, i.e. the Arm Cortex-A9 complex and its peripherals, including the PS DDR, so that it can execute non-trivial software applications.
- The FSBL executes a portion of its code to fetch the bitstream from the appropriate offset in the storage medium where the BOOT binary is located and loads it into the Programmable Logic.
In this original start-up procedure for the SPEC7, which was valid for both the Standalone and PCIe Slave operation modes, the dual Arm Cortex-A9 was left completely unused once the bitstream had been loaded into the FPGA.
Before going further, we should consider how Zynq-7000 devices are typically used as standalone devices. In most use cases, the Processing System is the main element, where the intelligence resides, while the Programmable Logic is used to add real-time and hardware-accelerated capabilities to the software executed by the dual Arm Cortex-A9 processor.
When a bare-metal application runs in the PS, the standard BOOT binary includes a third partition, placed after the bitstream, that stores the application executable. In this way, just after the FSBL has loaded the bitstream, it loads the bare-metal application for the target processor core and passes control to it.
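The boot flow described so far can be summarised with a small model. This is a simplified sketch and not the Xilinx FSBL source: the `Partition` type, the `kind` labels and the action strings are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    kind: str  # hypothetical labels: "bitstream" or "executable"


def fsbl_boot(partitions):
    """Return the ordered actions a simplified FSBL would take.

    `partitions` lists what follows the FSBL inside the BOOT binary.
    """
    actions = ["init PS (clocks, MIO, DDR)"]
    handoff = None
    for part in partitions:
        if part.kind == "bitstream":
            actions.append(f"program PL with {part.name}")
        elif part.kind == "executable":
            actions.append(f"load {part.name} to memory")
            handoff = part.name  # control goes to the last executable loaded
        else:
            raise ValueError(f"unknown partition kind: {part.kind}")
    if handoff is None:
        # Original SPEC7 case: no application, the Cortex-A9 cores stay unused.
        actions.append("no application to hand off to")
    else:
        actions.append(f"hand off to {handoff}")
    return actions


if __name__ == "__main__":
    image = [Partition("design.bit", "bitstream"),
             Partition("app.elf", "executable")]
    for step in fsbl_boot(image):
        print(step)
```

With only a bitstream partition the model ends without a handoff, which mirrors the original SPEC7 start-up; adding an executable partition yields the three-partition flow.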
Note that this bare-metal application can be very different in nature depending on the features that the use case requires from the Zynq-7000 system, e.g.:
- A standard C or C++ application that makes use of the provided Xilinx standalone drivers.
- A Real-Time O.S., e.g. FreeRTOS, to execute time-deterministic control routines.
- A bootloader, e.g. U-Boot, to further configure the platform and bring-up a higher-level O.S. such as Linux or Android.
If we focus on the U-Boot + Linux use case, when there is no strong requirement to load the bitstream as fast as possible at start-up, the standard approach is simply to use a two-partition BOOT binary containing the FSBL and U-Boot. This is possible because the Zynq-7000 is PS-centric, i.e. you can boot even a fully-featured O.S. without configuring the PL at all, and then load the bitstream into the PL on demand, when it is actually required.
This configuration of the PL, with a bitstream residing in an external storage medium, can be carried out by either U-Boot or the Linux kernel itself.
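For the Linux case, the sketch below shows one way the PL could be configured on demand from user space. It assumes a Xilinx-flavoured kernel whose FPGA manager exposes the `flags` and `firmware` attributes under `/sys/class/fpga_manager/fpga0`, and a bitstream already placed in `/lib/firmware` in the format that kernel expects; other kernels may only support loading through device tree overlays. The function name `load_bitstream` and the file name in the usage line are hypothetical.

```python
from pathlib import Path

# Assumed sysfs location of the first FPGA manager on a Xilinx kernel.
DEFAULT_FPGA_MANAGER = "/sys/class/fpga_manager/fpga0"


def load_bitstream(firmware_name, sysfs_dir=DEFAULT_FPGA_MANAGER, flags=0):
    """Ask the kernel FPGA manager to program the PL.

    `firmware_name` is resolved by the kernel relative to /lib/firmware.
    `flags=0` is assumed here to request a full (non-partial) configuration.
    """
    base = Path(sysfs_dir)
    (base / "flags").write_text(f"{flags}\n")
    # Writing the file name triggers the actual programming in the kernel.
    (base / "firmware").write_text(f"{firmware_name}\n")


if __name__ == "__main__":
    # Hypothetical firmware file name; requires root on the target.
    load_bitstream("spec7_design.bit.bin")
```

The `sysfs_dir` parameter exists only so the function can be exercised against a scratch directory on a development machine before it is run on the target.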