White Rabbit High-availability Seamless Redundancy (WR-HSR)
WR-HSR is a research project to implement the High-availability Seamless
Redundancy (HSR) protocol (IEC 62439-3 Clause 5) on White Rabbit switches and
dual-port end nodes.
The implementation is not part of the roadmap of the White Rabbit
project.
Introduction
HSR guarantees zero-time recovery in case of a single point of failure. By implementing the protocol in WR devices, we could extend this redundancy to time and frequency distribution in WR ring networks.
The nodes (devices) in an HSR network are attached by two Ethernet ports. A source node sends the same frame over both ports. A destination should receive, in the fault-free state, two identical frames within a certain time skew, forward the first frame to the application and discard the second frame when (and if) it comes. A sequence number is used to recognize such duplicates.
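The duplicate-discard step described above can be sketched as follows. This is a minimal illustration, not the WR-HSR code: the class name `DuplicateFilter`, the `(source_mac, seq_nr)` key and the capacity-bounded table are assumptions made for the example. Real HSR implementations typically age entries out by time rather than by count.

```python
from collections import OrderedDict


class DuplicateFilter:
    """Remembers recently seen (source, sequence number) pairs (hypothetical helper)."""

    def __init__(self, capacity=1024):
        self.capacity = capacity      # assumed bound on remembered frames
        self._seen = OrderedDict()    # insertion-ordered, oldest entry first

    def accept(self, source_mac, seq_nr):
        """Return True for the first copy of a frame, False for its duplicate."""
        key = (source_mac, seq_nr)
        if key in self._seen:
            return False              # second copy arrived over the other path: discard
        self._seen[key] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # forget the oldest entry
        return True


if __name__ == "__main__":
    rx = DuplicateFilter()
    print(rx.accept("00:11:22:33:44:55", 7))  # first copy, forwarded to the application
    print(rx.accept("00:11:22:33:44:55", 7))  # duplicate from the other port, discarded
```

If one path of the ring is broken, the second copy simply never arrives and the first one is still delivered, which is why no recovery time is needed.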
Fig1: Typical HSR network topology (non-WR)
HSR nodes are arranged into a ring, which allows the network to operate
without dedicated switches, since every node is able to forward frames
from port to port. HSR originally meant "High-availability Seamless
Ring", but HSR is not limited to a simple ring topology. Redundant
connections to other HSR rings and to PRP networks are possible.
Since the forwarding delay of every node in an HSR ring adds to the total
network latency, it is important that frames are forwarded quickly. In
practice, special hardware support is required to bring down the per-hop
latency to a reasonable value, often using cut-through switching.
Another property of an HSR ring is that only about half of the network
bandwidth is available to applications (compared to RSTP). This is
because all frames are sent twice over the same network, even when there
is no failure.
WR-HSR Implementation
The implementation of WR-HSR relies on a peer-to-peer delay mechanism, instead of the current end-to-end method, to measure the link delay between the master node of the ring and the end nodes. The nodes that form the ring must implement a Transparent Clock (TC) able to forward and consume sync and follow_up messages carrying an HSR tag. When an untagged PTP message reaches a TC, the TC assumes it comes from a master clock: it tags the frame, duplicates it, and sends the copies out through its two HSR ports so that they follow different paths around the ring.
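As a rough sketch of the tagging and duplication step, the snippet below models what a ring node might do with an untagged PTP frame. The names `PtpFrame`, `RingEntry`, `port_a` and `port_b` are hypothetical and do not come from PPSi. The sequence counter wraps at 16 bits, matching the width of the sequence number field in the HSR tag.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PtpFrame:
    message_type: str               # e.g. "sync" or "follow_up"
    payload: bytes
    hsr_seq: Optional[int] = None   # None means the frame carries no HSR tag


class RingEntry:
    """Hypothetical model of a TC that receives frames from outside the ring."""

    def __init__(self) -> None:
        self._next_seq = 0

    def inject(self, frame: PtpFrame) -> List[Tuple[str, PtpFrame]]:
        """Tag an untagged frame and return one copy per HSR port."""
        if frame.hsr_seq is not None:
            raise ValueError("frame is already HSR-tagged")
        tagged = replace(frame, hsr_seq=self._next_seq)
        self._next_seq = (self._next_seq + 1) % 65536  # 16-bit sequence number
        # Both copies share the same sequence number so that receivers can
        # recognise them as duplicates of each other.
        return [("port_a", tagged), ("port_b", tagged)]


if __name__ == "__main__":
    node = RingEntry()
    for port, copy in node.inject(PtpFrame("sync", b"\x00")):
        print(port, copy.message_type, copy.hsr_seq)
```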
One of the challenges of this project is the syntonization process carried out by Synchronous Ethernet (SyncE): the current implementation requires Master/Slave states for frequency distribution and must be adapted to TC operation.
To sum up, this development implies:
- the development of peer delay message exchange to measure the link delay between two adjacent nodes (Fig.2 & Fig.3).
- the development of peer-to-peer for sync and follow_up messages over TC (Fig.4).
- computing the residence time of sync messages on each node, to be added to the correction field of follow_up messages. This correction field, accumulated over all nodes the frame passes through, together with the measured link delay, will be used by the end node to synchronize with the master (Fig.5).
- each PTP message includes an HSR tag, used to drop duplicated messages from the network and to check for possible transmission errors.
- PTP is computed per port, applying a Best Master Clock algorithm on each.
- in case of node failure, a switchover to the other path must be performed to keep synchronization to the master node active.
- adaptation of SyncE to TC for the syntonization process.
Fig2: Link delay measurement using peer delay mechanism
Fig3: Peer Delay message exchange diagram
Fig4: Peer-to-Peer Sync/Follow_up message exchange
Fig5: Residence Time Measurement for PTP Correction Field
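The arithmetic behind the peer delay measurement, the residence time and the final offset can be sketched as below. This follows the usual IEEE 1588 peer-to-peer model with a symmetric link; whether WR-HSR adds the ingress link delay to the correction field at every hop, as done here, is an assumption. The function names are hypothetical and all timestamps are plain integers in nanoseconds.

```python
def peer_link_delay(t1, t2, t3, t4):
    """Mean link delay between two adjacent nodes, assuming a symmetric link.

    t1: Pdelay_Req sent (requester clock)
    t2: Pdelay_Req received (responder clock)
    t3: Pdelay_Resp sent (responder clock)
    t4: Pdelay_Resp received (requester clock)
    """
    return ((t4 - t1) - (t3 - t2)) / 2


def add_hop(correction, ingress_ts, egress_ts, ingress_link_delay):
    """Correction field after one TC: residence time plus ingress link delay."""
    residence_time = egress_ts - ingress_ts
    return correction + residence_time + ingress_link_delay


def offset_from_master(origin_ts, rx_ts, correction, last_link_delay):
    """Offset of the end node's clock relative to the master."""
    return rx_ts - origin_ts - correction - last_link_delay


if __name__ == "__main__":
    # Made-up numbers: the responder clock runs 40 ns ahead, the link takes 500 ns.
    delay = peer_link_delay(t1=0, t2=540, t3=640, t4=1100)
    # One TC between master and end node: 200 ns residence, 500 ns ingress link.
    corr = add_hop(0, ingress_ts=1500, egress_ts=1700, ingress_link_delay=delay)
    # Last link takes 300 ns; the end node timestamps the sync with its own clock.
    print(offset_from_master(origin_ts=1000, rx_ts=2040, correction=corr,
                             last_link_delay=300))
```

The clock offset between the two peers cancels out of `peer_link_delay` because each difference is taken within a single clock. The residence time is likewise a difference of two local timestamps, so the TC needs the master's frequency but not its absolute time.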
Project information
Contacts
- José Luis Gutiérrez - University of Granada - General questions about the project.
- Javier Díaz - University of Granada - General questions about the project.
Project Status
Date | Event |
01-07-2014 | Start of brainstorming to increase reliability in WR networks. |
01-10-2014 | HSR as first candidate for adding reliability in WR networks. |
12-10-2014 | Starting to study the HSR protocol and how to implement it for WR |
08-04-2015 | Starting implementation on the WRS |
17-04-2015 | Problem TBD: what about WR syntonization using pure TC? Possible solution: fake TC = BC + forwarding sync/follow_up. Problem: switchover will not be zero-time recovery |
20-04-2015 | Porting César's PeerDelay implementation to current version of PPSi |
27-04-2015 | HSR: Adding HSR tag to PTP frames on PPSi. HSR tag is in another branch -> merge TBD after finishing TC/fwd |
06-05-2015 | P2P: Starting "TC" implementation (forwarding) |
18-05-2015 | P2P: Residence time for follow_up messages in 2-step clocks implementation |
03-06-2015 | P2P: First working version of "TC". Forwarding together with the calculation of Residence time and link delay working correctly for offset calculation |
15-06-2015 | P2P: Fixing correction field bugs of Peer-to-Peer mechanism. Now working correctly |
18-06-2015 | Switchover: Adding peerDelay and Peer-to-Peer mechanisms to Maciej's switchover for the switch |
18-06-2015 | HSR: Adapting PPSi and switchover to ring topologies |
19-10-2015 | HSR: backup-backup ports able to exchange wr-messages, LOCK and synchronize |
29-10-2015 | HSR: Ring closed with 3 WRS and working with switchover. PPS outputs sometimes behave oddly, but TRACK PHASE is still there and the servo is updating (to be fixed)... |
09-11-2015 | HSR: Outer Master of the ring added. When a non-HSR sync message arrives, HSR tag is added and forwarded to the ring. |
12-11-2015 | HSR CONFIG: HSR and forwarding setup is now in ppsi.conf file. |
30-11-2015 | HSR-TC: the WRS can now work as a BC+PTP messages forwarder, or as a Transparent Clock (syntonization only, no synchronization) |
04-12-2015 | HSR: Connecting a non-HSR slave out of the ring (using both BC and TC) |
November 12th, 2015