White Rabbit Newsletter, May 2017
Quite a lot has happened in the WR scene since the last newsletter. With two companies producing switches, WR is now a true multi-vendor solution. We have also seen more open and proprietary hardware incorporating WR support. The WR user list keeps getting bigger and bigger. We also saw people starting to advertise WR-related jobs in the Netherlands, Switzerland and Germany. WR was even on Spanish TV!
A fiber cleaning course was recently organized to raise awareness about this important issue.
WR Switch firmware activities
We have released WR Switch firmware v5.0 with many improvements in both software and gateware. It includes an updated Buildroot and Linux kernel, VLAN configuration in the main dot-config file, improved Syslog messages, new tools for reading and writing SFP EEPROMs, and much more. On the HDL side, v5.0 brings, among other things, a new bandwidth throttling module for traffic going to the ARM CPU and a rewritten, more efficient multi-port linked list in the Ethernet switching core. Please check the official news message about the release. We are currently integrating WR switches into the CERN monitoring (SNMP) and configuration (dot-config generation from a central database) infrastructure.
We plan to prepare a v5.0.1 release soon, with several minor software fixes to v5.0. Please check the issues with target version v5.0.1 for more details.
WR PTP Core activities
We have released v4.0 of the WR PTP Core. This new release adds, among other things, SNMP and Syslog support for remote monitoring, as well as VLAN support. On the HDL side, we have rewritten the Mini-NIC module to fix problems with handling high traffic loads, and we have greatly improved the synthesis time of RAM initialization for Xilinx. We took the opportunity of this new release to reorganize the wr-cores HDL repository, adding platform and board support modules as well as a few reference designs for officially supported boards. Please check the release user manual for a detailed description of these new modules and instructions on how to use them in your custom designs. In the coming months we would like to expand the platform, board and reference design modules to also cover Xilinx 7 series FPGAs, and to work on porting the WR PTP Core to Kintex UltraScale FPGAs.
New manufacturer of the WR Switch hardware and the official Production Test Suite
Creotech Instruments S.A. is another company that started producing WR Switch hardware last year. We have bought a few devices and successfully evaluated them together with GSI to make sure they are produced from the same design files as switches sold by 7Solutions. Currently, we are working together with INCAA to prepare an official Production Test Suite (PTS) for the WR Switch hardware. This PTS consists of a set of tests that can be used either by the companies during production, to verify that all the hardware components are soldered and working properly, or by users to test the hardware in case of failures.
SPEC and SVEC PTS-es expanded to assign official MAC addresses by manufacturers.
Till now, the Ethernet MAC address of a WR port on SPEC and SVEC cards was generated from the 1-Wire thermometer ID. Last year we updated the Production Test Suites (PTS) for SPEC and SVEC cards to allow manufacturers to assign unique MAC addresses from the official pool. The MAC address is stored in the Flash memory and can be used by the default WR PTP Core firmware. The update to PTS-es was sent to all the companies producing these cards. Please contact your supplier if you're interested in having WR Nodes with official MAC addresses.
Proof-of-concept demo of super-low-latency trigger distribution in White Rabbit.
In 2016 we made a proof-of-concept implementation of super-low-latency trigger distribution in White Rabbit. We added a module that injects special 8B/10B characters into the Ethernet data stream. These special characters are then detected on the other side and interpreted as triggers. Because injection and reception are done in the physical layer, the overall latency of the trigger is determined only by the SerDes, PCB and fiber latencies. This opens new possibilities for using White Rabbit as a trigger distribution network in high energy physics experiments. This is currently a proof-of-concept implementation and still requires more work before it can be included in a stable release of the WR Switch and WR PTP Core. If you find this interesting for your application, please let us know and have a look at the supertrigger branch in the wr-cores HDL repository.
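As a rough illustration of the latency budget described above, the sketch below adds up the three contributions. All numbers are assumptions chosen for the example, not measurements from the demo; the fiber figure of about 4.9 ns per metre follows from a group index of roughly 1.47 for standard single-mode fiber.

```python
# Back-of-the-envelope latency budget for a physical-layer trigger.
# Every value passed in is an assumption for illustration only.

FIBER_NS_PER_M = 4.9  # approx. delay in single-mode fiber (group index ~1.47)


def trigger_latency_ns(serdes_tx_ns, serdes_rx_ns, pcb_ns, fiber_m):
    """Sum the SerDes, PCB and fiber contributions to the trigger latency."""
    return serdes_tx_ns + serdes_rx_ns + pcb_ns + fiber_m * FIBER_NS_PER_M


if __name__ == "__main__":
    # Hypothetical link: 100 ns per SerDes direction, 2 ns of PCB traces,
    # 1 km of fiber.
    print(f"{trigger_latency_ns(100.0, 100.0, 2.0, 1000.0):.1f} ns")
```

For links longer than a few tens of metres the fiber term dominates, so in such a budget the trigger latency is essentially set by the cable length.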
"Methods to Increase Reliability and Ensure Determinism in a White Rabbit Network" - thesis published and available online.
The PhD thesis by Maciej Lipinski on "Methods to Increase Reliability and Ensure Determinism in a White Rabbit Network" is available on the CERN Document Server (http://cds.cern.ch/record/2261452). It describes the developed methods as well as a reference design of the WR-based control and timing network for all the CERN accelerators. The thesis was defended in April 2017 at the Warsaw University of Technology.
We have adapted WR technology for a pre-clinical Positron Emission Tomography (PET) instrument. The PET ring consists of 8 detector modules, each containing 6 LYSO arrays coupled to an 8x8 SiPM matrix. The signals from the SiPMs are processed by a dedicated ASIC, then digitized by a commercial ADC and time-tagged by an FPGA-based TDC. Each module acts as a White Rabbit slave node to synchronize the time reference of its TDC. The extracted list-mode data are also transmitted via the WR link. A standard WR switch with modified firmware collects the data from all 8 detector modules. The list-mode data packets from each module are extracted, while other packets enter the switching matrix as normal. The list-mode data are then processed by dedicated timestamp coincidence logic. The results are re-packetized and sent out via a separate port to a reconstruction PC. The scheme greatly simplifies the PET design by replacing traditional off-ring electronics with a standard White Rabbit switch.
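The timestamp coincidence step can be illustrated with a minimal software sketch. This is not the firmware logic used in the modified switch; the event format, the function name and the window width are assumptions made for the example. It pairs events from different modules whose timestamps fall within a coincidence window.

```python
# Minimal coincidence finder over time-tagged events (illustrative only).
# An event is a (timestamp_ps, module_id) tuple; both fields are hypothetical.

def find_coincidences(events, window_ps):
    """Return pairs of events from different modules within window_ps."""
    ordered = sorted(events)  # sort by timestamp
    pairs = []
    for i in range(len(ordered)):
        t_i, m_i = ordered[i]
        j = i + 1
        # Only look ahead while the next event is still inside the window.
        while j < len(ordered) and ordered[j][0] - t_i <= window_ps:
            if ordered[j][1] != m_i:
                pairs.append((ordered[i], ordered[j]))
            j += 1
    return pairs


if __name__ == "__main__":
    events = [(1000, 0), (1200, 4), (50000, 1), (50100, 1), (90000, 2)]
    print(find_coincidences(events, window_ps=500))
    # -> [((1000, 0), (1200, 4))]
```

The two events on module 1 are close in time but are rejected because a coincidence needs hits on two different modules.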
The design of the dual-port Cute-WR mezzanine has been released. It can act as a slave with a redundant connection to improve reliability, or operate in daisy-chain mode to support a cascade topology. We have verified the WR functionality on each port in both master and slave modes. The daisy chain has been verified for PTP transmission, while the routing of data frames is still under development. With the dual-port mezzanine, we are also designing a portable calibration node that can play an important role in a WR auto-calibration procedure, WR array deployment and validation of the synchronization accuracy, bringing the advantages of higher integration, more convenience and a lower human workload (http://ieeexplore.ieee.org/document/7820218/).
A prototype array of 100 detector nodes is running in Tibet. For quite a long period we suffered from instabilities, randomly losing nodes during power-up and normal operation. The newly released WRPC 4.0 has greatly improved the situation. The LHAASO project has finally been approved, and more than 500 WRS and 7000 WR nodes will be deployed at an altitude of 4300 m asl. At the end of 2018, a quarter of the full array is foreseen to be in operation.
Nikhef and Vrije Universiteit Amsterdam
Xilinx Family-7 GTH (Nikhef)
VHDL sources are available for a deterministic PHY for the Xilinx Family-7 GTH (used by some Virtex-7 devices).
Rabbit_FX: White Rabbit Oscillators on an FMC mezzanine (Nikhef)
A White Rabbit FMC module that includes the DACs and Oscillators needed to run White Rabbit has been built (https://redmine.nikhef.nl/et/project/rabbit_fx/wiki). It enables code development on a vendor FPGA development board before new hardware is available.
White Rabbit Switch mini backplane for embedded applications. (Nikhef)
In embedded detector electronics there is usually not much space available. Work is in progress to create a small WR Switch mini backplane with two 2x5 SFP cages.
Calibration (Nikhef, Vrije Universiteit Amsterdam):
Calibration of WR gear is addressed by the following three main items:
1) Absolute delay calibration
It has been shown that hardware delay calibration parameters can be derived (https://www.ohwr.org/project/wr-calibration/wiki). These parameters define the timing relationship between measurable external phase planes (i.e. the PPS connector and the SFP electrical connector). An additional "mode abscal" still needs to be added to PPSi.
2) Electrical/Optical and Optical/Electrical calibration
Although the work is still at the laboratory stage, it has been shown that the hardware delay calibration parameters of electrical/optical and optical/electrical converters (i.e. SFPs in most cases) can be measured with picosecond precision. Storing the delay parameters in the SFP EEPROM will allow SFPs to be exchanged without re-calibration. A definition for storing calibration parameters can be found here: https://www.ohwr.org/project/sfp-plus-i2c/wikis/User_EEPROM.
3) In-situ asymmetry coefficient determination
Work is in progress to determine the fiber asymmetry coefficient in-situ using SFPs with a tunable laser.
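To show why the asymmetry coefficient matters, here is a small sketch of how it splits a measured round-trip fiber delay into two one-way delays. It assumes the definition commonly used in the WR link model, alpha = delay_ms / delay_sm - 1, where ms is master-to-slave and sm is slave-to-master. The function name and the example values are hypothetical, and fixed hardware delays are assumed to have been subtracted already.

```python
# Split a round-trip fiber delay into one-way delays using the asymmetry
# coefficient alpha, assuming alpha = delay_ms / delay_sm - 1.

def split_round_trip(fiber_rtt_ps, alpha):
    """Return (delay_ms, delay_sm) in ps for a given round-trip delay."""
    delay_sm = fiber_rtt_ps / (2.0 + alpha)   # slave-to-master
    delay_ms = (1.0 + alpha) * delay_sm       # master-to-slave
    return delay_ms, delay_sm


if __name__ == "__main__":
    # Hypothetical example: 10 us round trip, alpha chosen arbitrarily.
    ms, sm = split_round_trip(10_000_000.0, 2.0e-4)
    print(f"ms = {ms:.1f} ps, sm = {sm:.1f} ps, diff = {ms - sm:.1f} ps")
```

An error in alpha translates directly into an error in the one-way delay, and therefore into a clock offset at the slave, which is the motivation for measuring the coefficient on the installed fiber.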
Tools for calibration have been developed.
a) We plan to produce a batch of SFP+ Loop Back Modules for use in absolute calibration. Currently the mechanical housing causes some issues that prevent us from starting production. Once these issues are solved, an announcement will follow via the white-rabbit-dev mailing list so that interested people can sign up for a module.
b) A 10 Gbps capable Multi SFP crate is designed including "SaFariPark" software (https://www.ohwr.org/project/sfp-plus-i2c/wiki) which enables us to exercise SFP modules, both with respect to their serial link, as well as the I2C interface. The latter is needed for accessing digital diagnostics, storing calibration parameters and enabling SFP tuning. Moreover "SaFariPark" computes SFP checksums and can correct SFPs with corrupted EEPROM content (unfortunately something we experienced too often).
c) A proposal for a generic software library to access SFPs via the I2C bus has been launched (https://www.ohwr.org/project/libsfp/wiki). Once this software is deployed in WR gear, the calibration parameters stored in the EEPROM will be accessible, and the SFP Digital Diagnostics can be read out.
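As an illustration of the checksum handling mentioned in item b), the sketch below recomputes the two check codes of the SFP base ID page as defined in SFF-8472: CC_BASE (byte 63) is the low 8 bits of the sum of bytes 0 to 62, and CC_EXT (byte 95) is the low 8 bits of the sum of bytes 64 to 94. This is not code from "SaFariPark" or libsfp; the function names are hypothetical, and the I2C access that would fetch the EEPROM image is deliberately left out.

```python
# Verify and repair the SFF-8472 check codes in an SFP EEPROM image
# (address A0h, first 96 bytes). Function names are illustrative only.

def sfp_checksums(a0):
    """Return (CC_BASE, CC_EXT) computed from the first 96 bytes of A0h."""
    if len(a0) < 96:
        raise ValueError("need at least 96 bytes of the A0h page")
    cc_base = sum(a0[0:63]) & 0xFF    # bytes 0..62
    cc_ext = sum(a0[64:95]) & 0xFF    # bytes 64..94
    return cc_base, cc_ext


def checksums_ok(a0):
    """True if the stored check codes match the computed ones."""
    return (a0[63], a0[95]) == sfp_checksums(a0)


def fix_checksums(a0):
    """Return a copy of the image with both check codes corrected."""
    buf = bytearray(a0)
    buf[63], buf[95] = sfp_checksums(a0)
    return bytes(buf)


if __name__ == "__main__":
    image = bytes(range(96))          # dummy EEPROM content
    print(checksums_ok(image))        # False for this dummy image
    print(checksums_ok(fix_checksums(image)))  # True
```

Note that repairing the check codes only makes the page self-consistent again; it cannot recover data bytes that were themselves corrupted.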
The distributed measurement of synchrophasors via Phasor Measurement
Units (PMUs) represents one of the most advanced sensing layers in the
domain of Wide Area Monitoring and Control (WAMC) systems. According to
the IEEE Std. C37.118.1-2011, a PMU is a measurement device synchronized
to a common Coordinated Universal Time (UTC) reference, which reports
phase-aligned and time-stamped measurements of frequency, amplitude and
phase angle of the voltage and current phasors of a power system.
In general, the time synchronization of PMUs relies on the Global Positioning System (GPS) as it represents a good tradeoff between performance and cost. However, this synchronization system has three main drawbacks: (i) accuracy, (ii) accessibility and (iii) security. Concerning the first point, it is worth noting that, in order to correctly support synchrophasor based applications, PMUs must be characterized by high accuracy levels in estimating the synchrophasors, especially when power distribution systems applications are envisaged. Modern PMUs are adopting synchrophasor estimation algorithms exhibiting steady state phase accuracies in the order of tens of ppm of radians. Since these values correspond to time jitter that typically characterizes the GPS units usually adopted by the PMU hardware, we may conclude that one of the barriers to improve the steady state PMU accuracy is the uncertainty of the time dissemination technology.
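The link between phase accuracy and time uncertainty can be made concrete with a short calculation: a phase error of delta_phi radians at nominal frequency f corresponds to a time error of delta_phi / (2 * pi * f). The sketch below assumes a 50 Hz system and a phase error of 10 microradians, both chosen only for illustration.

```python
import math

# Convert a synchrophasor phase error into the equivalent timing error.
# The 50 Hz nominal frequency and the 10 urad figure are assumptions.


def phase_error_to_time_ns(phase_error_rad, f_nominal_hz):
    """Timing error (ns) that produces the given phase error (rad)."""
    return phase_error_rad / (2.0 * math.pi * f_nominal_hz) * 1e9


if __name__ == "__main__":
    print(f"{phase_error_to_time_ns(10e-6, 50.0):.2f} ns")  # about 31.83 ns
```

Under these assumptions, a phase accuracy of a few tens of microradians corresponds to a timing uncertainty of several tens of nanoseconds.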
Concerning the second point listed before, i.e., the accessibility of the GPS, we can observe that such a time source might not provide a stable and reliable time reference especially in cases where underground substations, with limited or no access to the sky, need to be equipped with PMUs. Therefore, a more suitable time dissemination technique, deployable over the legacy power system’s telecom infrastructure, might be required. Concerning the third point, security, recent works have shown that, since civilian GPS satellite signals are not authenticated, they can be spoofed by superimposing a fake signal with a higher signal-to-noise ratio, which would enable an attacker to manipulate the GPS clock.
Among the possible alternatives to GPS, White Rabbit (WR) is an excellent candidate as a time synchronization protocol thanks to its superior characteristics. In this project, we have developed a PMU integrating WR technology. Using a WR network composed of a WR switch and WR-cRIO modules, we assessed the performance of the WR-PMU by means of a PMU calibrator. Moreover, we compared the performance of the WR-PMU with that of a GPS-based PMU. The two PMUs used the same synchrophasor estimation algorithm and the same hardware platform, differing only in the time synchronization module and technique. The results show similar performance for the WR-PMU and the GPS-PMU, demonstrating WR's applicability to PMUs. The developed system can be used as an alternative in specific conditions where the GPS signal is not available.
Seven Solutions and University of Granada
White-Rabbit-enabled Data Acquisition System development by UGR
UGR is evaluating the feasibility of using White Rabbit technology to discipline all the clock domains of a data acquisition system. Existing systems provide synchronization capabilities, although the performance achieved is often not accurate enough for scientific infrastructure and high-end industrial applications. White Rabbit networks are known for their sub-nanosecond accuracy, attained over fiber-optic Ethernet links. By integrating an existing open hardware data acquisition FMC card into a White Rabbit node, we show that it is possible to tag the acquired data in a deterministic manner without loss of timing performance. This result can be the first step towards multi-device, remotely operated, synchronized data acquisition. Extended OS support and the use of middleware such as DDS to achieve high QoS are also under study.
Research on redundant topologies based on the HSR protocol (Seven Solutions - UGR)
UGR, together with Seven Solutions, is about to finish the development of
the High-availability Seamless Redundancy (HSR, IEC 62439-3) protocol
for WR devices in order to increase the availability and robustness of
White Rabbit technology and thus extend its use to
industrial networks such as the Smart Grid. Applications in this
field demand high availability and very short switch-over times. HSR
guarantees zero-time recovery in case of failure in ring and mesh
network topologies for both time and data frames, and it is based on the
duplication of frames in the network.
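The frame duplication idea can be sketched in a few lines. In HSR, a source sends each frame over both ring ports with the same sequence number, and the destination keeps the first copy and discards the second. The class below is a simplified illustration, not the WR-HSR implementation: the names are hypothetical, and a real implementation would also age out old entries and handle sequence number wrap-around.

```python
# Simplified HSR-style duplicate discard (illustrative, not the WR-HSR code).

class DuplicateDiscard:
    """Accept the first copy of each (source, sequence number) pair."""

    def __init__(self):
        self._seen = set()

    def accept(self, src_mac, seq):
        key = (src_mac, seq)
        if key in self._seen:
            return False      # second copy arriving via the other path
        self._seen.add(key)
        return True


def send_duplicated(src_mac, seq, payload):
    """Return the two copies a source would put on ring ports A and B."""
    frame = (src_mac, seq, payload)
    return {"A": frame, "B": frame}


if __name__ == "__main__":
    rx = DuplicateDiscard()
    copies = send_duplicated("00:11:22:33:44:55", 1, b"sync")
    for port, (mac, seq, _) in copies.items():
        print(port, rx.accept(mac, seq))   # A True, then B False
```

If one path fails, the copy travelling the other way still arrives and is accepted, so the receiver sees no interruption. This is what zero-time recovery refers to.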
UGR and Seven Solutions are finishing the implementation of the HSR protocol for WR devices, in particular the WR Switch, to increase availability and avoid a single point of failure for time and data distribution. This work has been done in the framework of the European ARTEMIS JU EMC2 project (http://www.artemis-emc2.eu/) and involves the development of the Peer-2-Peer mechanism for clock propagation, PeerDelay for the delay measurement, the use of HSR tags in frames, and the switch-over mechanism developed by CERN to recover from a node failure. Current progress and more information can be found on the OHWR WR-HSR webpage.
WR upgrading for KM3NeT by Seven Solutions
KM3NeT is a research infrastructure housing the next generation of neutrino telescopes, located in the deepest seas of the Mediterranean. The topology of KM3NeT consists of final nodes (DOMs) and the shore station. Each DOM has a unidirectional uplink to the shore station, while the shore station has a single unidirectional downlink shared by all the DOMs. This topology reduces the cost of communication resources, but it requires a higher level of customization at the shore station.
Seven Solutions has been upgrading the software and firmware of the WR switch to port the previous KM3NeT customization from v3.3 to v4.2. The new modifications include deep changes in the PPSi module, among them a modification of the WR standard (which uses bidirectional links to synchronize), a new monitoring tool, and the possibility to start up at least 150 final nodes in parallel with optimized start-up times.