diff --git a/figures/robustness/clockDistribution.pdf b/figures/robustness/clockDistribution.pdf new file mode 100644 index 0000000000000000000000000000000000000000..313a6a07a8e3459043b19b3e6a406fff3deef235 Binary files /dev/null and b/figures/robustness/clockDistribution.pdf differ diff --git a/figures/src/robustness/clockDistribution.cdr b/figures/src/robustness/clockDistribution.cdr new file mode 100644 index 0000000000000000000000000000000000000000..46d3f354b79d9220015c50b257e189cfdc6f2e90 Binary files /dev/null and b/figures/src/robustness/clockDistribution.cdr differ diff --git a/papers/ICALEPCS2011/ClockDistribution.tex b/papers/ICALEPCS2011/ClockDistribution.tex new file mode 100644 index 0000000000000000000000000000000000000000..6b760e35660c90b504a0e4da06c7ce50baee3470 --- /dev/null +++ b/papers/ICALEPCS2011/ClockDistribution.tex @@ -0,0 +1,45 @@ +\section{Clock Distribution} + + +% The resilience of the Clock Distribution translates into continuous and stable +% synchronization of all the nodes and switches in the WRN (Table~\ref{tab:requirements}). +% A loss of time notion in a node can be caused by a link or switch failure - break of clock path +% between the TM and the node. In order to prevent synchronization break, redundancy of network +% elements (switches, cables) can be introduced ensuring redundant clock paths. However, +% the switch-over between redundant elements might introduce instability and render the network +% unreliable despite the costly redundancy. Therefore, the seamless switch-over between redundant +% clock paths is one of the design-goals to enable network topology redundancy and, as a consequence, +% offer robust and stable synchronization. The other reasons for the deterioration of synchronization +% accuracy are the variation of external conditions (e.g. temperature) and loss of Ethernet frames with +% timing information (PTP). 
+ +%\subsection{Switch-over} + +A seamless switch-over between redundant sources of timing (uplink ports) is heavily supported by +the Clock Recovery System (CRS) \cite{biblio:TomekMSc} of the switch and the WR extension to PTP +(WRPTP)\cite{biblio:WRPTP}. + +Figure~\ref{fig:switch-over} presents an example where a switch (timing slave) is connected +(by its uplinks 1 \& 2) +to two other switches (primary and secondary masters) -- the sources of timing. On both +uplinks the frequency is recovered from the signal and provided to the CRS. Similarly, WRPTP +measures delay and offset on each of the links and provides this data to the CRS. +The modified Best Master Clock (mBMC) algorithm \cite{biblio:WRPTP} decides which of the +timing masters is "better" and elects it the primary, the other is considered secondary (backup). +The information from {\it uplink 1} (primary) is used to control +the CRS and adjust the local time. However, at any time all the necessary information from the +{\it uplink 2} is available and a seamless switch-over can be performed in case of +primary master failure \cite{biblio:TomekMSc}. + +\begin{figure}[t] +\centering +\includegraphics[width=3.2in]{../../figures/robustness/clockDistribution.eps} +\caption{Seamless switch-over.} +\label{fig:switch-over} +\end{figure} + +%\subsection{Variable conditions and loss of PTP messages} +In addition to the switch-over-related synchronization instability, the variation of external temperature +can cause an accuracy degradation. This problem, however, is solved by the PTP standard itself. By +frequent link delay measurements, the fluctuation is compensated. 
+ diff --git a/papers/ICALEPCS2011/ControlDataDistribution.tex b/papers/ICALEPCS2011/ControlDataDistribution.tex new file mode 100644 index 0000000000000000000000000000000000000000..4905d9fb1401a12187f84fa3697dca49d0e0de4d --- /dev/null +++ b/papers/ICALEPCS2011/ControlDataDistribution.tex @@ -0,0 +1,108 @@ +\section{Data Resilience} + + +\subsection{Forward Error Correction} +\label{sec:fec} + +The objective of the FEC scheme is to decrease the loss rate of the CMs, preferably +to less than one per year. WR uses optical fiber and CAT-5 cable as its physical media. The ratio +of received corrupted bits to the total number of received bits is called the Bit Error Rate +(BER). The value of BER characterizes a physical medium and can be used to characterize the entire +switched network. +% if the following factors are taken into account: +% (1) {\it type of cabling} (fiber/twisted pair), +% (2) {\it logic topology}, +% (3) {\it network address} (broadcast/unicast). +A WRN can be seen as a Packet Erasure Channel (PEC) or as a Binary Erasure Channel (BEC) depending +on the effect of a bit error on the frame. If the frame is lost (e.g. dropped by the switch due to +a corrupted header or lost during switch-over between redundant components), the WRN is a PEC. +If the bit error happens in the link between a switch and a node, the corrupted frame +\modified{can be used (optional)} +%is used +to attempt frame recovery. In such a case, the channel is called a BEC. Each type of channel requires +a different FEC solution. Therefore, two concatenated FECs are used in WR. +Reed-Solomon (R-S) %\cite{biblio:r-s} +%\cite{biblio:coding} +coding is used for the PEC and +allows encoding $k$ original frames into $n$ encoded frames ($n>k$). +Reception of any $k$ encoded frames is sufficient to decode the original frames. +\modified{Hamming coding with additional parity (SEC-DED)} +%Hamming coding +is used for the BEC and allows detection of up to two simultaneous bit errors and +correction of a single error. 
+These two schemes (R-S and Hamming) are combined to encode each CM -- it is +split into two and encoded using R-S into four messages (two original and two +with redundant data). Each of the four messages is then encoded using Hamming. Such encoded +messages are sent in a burst of 4 Ethernet frames. Reception of any two of these frames is sufficient +to decode the original Control Message. +A systematic analysis, using the BER characteristic of the WRN, shows that the presented FEC scheme +keeps the loss below one CM per year due to physical medium +imperfection, as can be seen from Table~\ref{tab:gsi_cern_fec}. + +\begin{table}[ht] + \begin{center} +\caption{GSI and CERN FEC characteristics.} +\begin{tabular}{|p{4cm}|c|c|} \hline +% \cline{2-3} +% & \multicolumn{2}{|c|}{Use Case} \\ \cline{2-3} +\rowcolor{gray!35}{} +{\bf Parameter} & {\bf GSI} & {\bf CERN} \\ \hline + \multicolumn{1}{|p{4cm}|}{Control Message length} & 500 bytes & 1500 bytes \\ \hline + \multicolumn{1}{|p{4cm}|}{Control Messages per year} & $3.145*10^{11}$ & $3.145*10^{8}$ \\ \hline + \multicolumn{1}{|p{4cm}|}{Max Correctable Bits} & 1 & 1 \\ \hline +% \multicolumn{1}{|p{4cm}|}{Parity-Check Bits} & 13 & 13 \\ \hline +% \multicolumn{1}{|p{4cm}|}{PEC Code Overhead} & 3 & 2 \\ \hline +% \multicolumn{1}{|p{4cm}|}{Payload Length} & 400 b & 800b \\ \hline + \multicolumn{1}{|p{4cm}|}{Payload Length} & \modified{294 bytes} & \modified{854 bytes} \\ \hline + \multicolumn{1}{|p{4cm}|}{Number of Encoded Frames} & 4 & 4 \\ \hline + \multicolumn{1}{|p{4cm}|}{Frames Needed at Receiver} & 2 & 2 \\ \hline + \multicolumn{1}{|p{4cm}|}{Probability of Losing a CM} & $10^{-14}$ & $10^{-13}$\\ \hline + \end{tabular} + + \label{tab:gsi_cern_fec} + \end{center} +\end{table} + +\subsection{Rapid Spanning Tree Protocol (RSTP)} + +In an Ethernet network with redundant topology, the problem of loops (causing ``broadcast storms'') +is handled by the Rapid Spanning Tree Protocol (RSTP) +% \cite{biblio:IEEE8021D} +. 
It creates a loop-free +logical topology by blocking appropriate ports in switches, and unblocks them in case of topology +break (due to element failure). + +The functionality provided by the RSTP is essential for the WRN. However, the convergence speed +provided by the standard implementation of the RSTP (milliseconds +%\cite{biblio:RSTPperf} +at best) +would cause many CMs to be lost during the process. This is not acceptable: we need +a solution that is fast enough to prevent losing any CMs. Since we know the +size-range of the CMs (Table~\ref{tab:requirements}) and how they are FEC-encoded into Ethernet frames, +we can calculate the maximum value of the convergence time: 3$\mu s$. This time is smaller than +the duration of transmitting a single frame with FEC-encoded CM -- this ensures that no more than +two frames with FEC-encoded CM are lost, thus the CM can be recovered. + +In order to achieve a convergence time of 3$\mu s$, the switch-over between active +and backup connections needs to be performed in the hardware as soon as the link-down is detected. +It can only be done if the alternative topology is known in advance. The knowledge of alternative +topology is translated into an RSTP-assignment of alternative and backup roles of switch ports, +i.e., at least one port with an alternative role must be identified in every switch +(except the topology-root switch). +%\modified{, i.e at least one port with each of these roles must be identified in every switch}. +% +%If at least one port of a switch is assigned an alternative role, it means that +%the RSTP algorithm establishes more than one path to the topology-root switch and therefore +%the alternative topology is known in advance. 
+%Such ports are identified when the RSTP algorithm establishes more than one path to the +%topology-root switch and all paths can be used simultaneously, +% +If we ensure, by restricting the topology, that RSTP identifies the alternative links, +we can use its data to feed the hardware, consequently achieving the required convergence time +and staying standard-compatible: +the hardware switch-over is just a faster RSTP-driven convergence. The required topology +restrictions, described in \cite{biblio:robustness}, greatly overlap with these imposed by +the Time Distribution. + + + diff --git a/papers/ICALEPCS2011/Determinism.tex b/papers/ICALEPCS2011/Determinism.tex new file mode 100644 index 0000000000000000000000000000000000000000..cac3579b4ff8f533ab07fc808f63654a7c14aac4 --- /dev/null +++ b/papers/ICALEPCS2011/Determinism.tex @@ -0,0 +1,53 @@ +\section{Determinism} + +% The delivery latency of an Ethernet frame varies with cable length and the number of hops (switches) +% it has to traverse to reach its destination, the traffic load on the way and +% the assigned Class of Service (CoS, \cite{bilbio:vlan}). +A carefully configured and properly used WRN offers deterministic Ethernet frame delivery +thanks to the implementation of CoS and the fact that the delay introduced by the switch can be +verified by analysis of {\bf publicly available source code} \cite{biblio:whiteRabbit}. +Such analyses were performed to verify the worst-case upper bound +delivery latency of a CM against the requirements listed in the Table~\ref{tab:requirements}. +The results, presented in Table~\ref{tab:CMlatency} ({\it Store-and-forward} column), +take into account the fact that a CM is encoded into 4 Ethernet frames (as required by the FEC +and described in the next Section), it is sent with the highest priority (CoS) and it always +traverses 3 hops. 
+ +\begin{table}[ht] +\caption{Control Message (CM) delivery latency estimates.} +\centering + + \begin{tabular}{| c | c | c | c | c |} \hline +\rowcolor{gray!35}{} + & \multicolumn{4}{|>{\columncolor{gray!35}}c|}{\textbf{CM delivery latency}} \\ \cline{2-5} +\rowcolor{gray!35}{} +\textbf{CM size}& \multicolumn{2}{|>{\columncolor{gray!35}}c|}{\textbf{Store-and-forward}} + &\multicolumn{2}{|>{\columncolor{gray!35}}c|}{\textbf{Cut-through}} \\\cline{2-5} +\multicolumn{1}{|>{\columncolor{gray!35}}c|}{} & +\multicolumn{1}{|>{\columncolor{gray!35}}c|}{GSI} & +\multicolumn{1}{|>{\columncolor{gray!35}}c|}{CERN} & +\multicolumn{1}{|>{\columncolor{gray!35}}c|}{GSI} & +\multicolumn{1}{|>{\columncolor{gray!35}}c|}{CERN} \\ \hline +% & GSI & CERN & GSI & CERN \\ \hline +%200 bytes & ???$\mu s$ & ???$\mu s$ & ??$\mu s$ & ???$\mu s$ \\ \hline +500 bytes & 221$\mu s$ & 283$\mu s$ & 76$\mu s$ & 118$\mu s$ \\ \hline +1500 bytes & 285$\mu s$ & 325$\mu s$ & 102$\mu s$ & 142$\mu s$ \\ \hline +5000 bytes & 324$\mu s$ & 364$\mu s$ & 162$\mu s$ & 202$\mu s$ \\ \hline +\end{tabular} +\label{tab:CMlatency} +\end{table} + +The analysis revealed that GSI's requirements are not fulfilled: the upper-bound delivery latency +for the required size of CM and a maximum distance of 2km is greater than 100$\mu s$. + +The proposed solution for decreasing delivery latency targets the CD only and +takes advantage of its characteristics (broadcast within a VLAN, sent by a privileged node). +We propose to break the highest priority of +the CoS into two (unicast and broadcast) and use the highest priority broadcast Ethernet traffic only for +the CD. Moreover, this particular traffic shall be forwarded using the cut-through method +(unlike the store-and-forward method used normally in the switch), which can be fast +for the broadcast traffic with a single source (DM). The results, +presented in Table~\ref{tab:CMlatency} ({\it Cut-through} column), show a significant improvement. 
+The solution requires hardware-supported cut-through forwarding in the switch as described +in \cite{biblio:robustness}. + diff --git a/papers/ICALEPCS2011/FailureStudy.tex b/papers/ICALEPCS2011/FailureStudy.tex new file mode 100644 index 0000000000000000000000000000000000000000..894186694c253960a28434b0b16107dba2de6dff --- /dev/null +++ b/papers/ICALEPCS2011/FailureStudy.tex @@ -0,0 +1,56 @@ +\section{Failure Study} + +One of the main possible reasons for WRN failure, which affects both Timing and Data Distribution, is +a malfunction of its elements (switches or links). Since the distribution of information +in the WRN is of one-to-all character (Data/Timing Master to all nodes), all the elements of the WRN are +considered Single Points of Failure (SPoF)\cite{biblio:mtbf}. Malfunction of any SPoF +results in failure of the entire system. +SPoFs can be eliminated by introducing redundancy of the system components. Due to its special features +(distribution of frequency over physical layer) and strict requirements (determinism, low data loss), +the number of possible redundant topologies of the WRN is restricted, as explained in the +following sections. + +Imperfections of the physical medium as well as switching between redundant elements of the network +(which takes time) can cause loss or corruption of data. The deterministic and \modified{mostly} broadcast character +of the data distribution in the WRN enforces application of the Forward Error Correction (FEC) +%\cite{biblio:coding} +-- adding redundant information on transmission to enable recovery of lost or corrupted data +on reception. This brings constant data overhead and the probability that the added redundancy is +not sufficient to recover the data. However, it is the price to pay for ensuring low latency +and determinism of data delivery in the WRN. 
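+
+As a rough illustration of this residual probability (a sketch under assumptions that are not part
+of the study itself), suppose that a CM is sent as $n=4$ frames of which any $k=2$ suffice for
+decoding, as in the scheme of the Data Resilience section, and that each frame is lost independently
+with a hypothetical probability $q$. The CM is then unrecoverable only if at least three of the four
+frames are lost:
+\begin{equation}
+P_{loss} = \sum_{i=3}^{4} {4 \choose i}\, q^{i} (1-q)^{4-i} = 4q^{3}(1-q) + q^{4} = 4q^{3} - 3q^{4}.
+\end{equation}
+In this sketch the constant overhead is a factor of $n/k = 2$ in transmitted frames, before the
+Hamming parity bits are counted. The symbols $q$ and $P_{loss}$ are introduced here for illustration
+only; frame losses during a switch-over are correlated rather than independent, so the expression
+should be read as an intuition and not as the analysis behind the reported figures.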
+ +The delivery latency of an Ethernet frame varies with cable length, the number of hops (switches) +it has to traverse to reach its destination, the traffic load along the path and +the assigned Class of Service (CoS). Therefore, to ensure the required determinism +of the CD delivery, we need to make sure that there is no congestion of Ethernet frames +carrying CMs. Moreover, the number of hops (and thus the latency they introduce) needs to be +sufficiently small, which can be done by restricting the topology. + +The resilience of the Clock Distribution translates into continuous and stable +synchronization of all the nodes and switches in the WRN (Table~\ref{tab:requirements}). Although +network redundancy eliminates SPoFs, the switch-over between redundant elements might introduce +instability and render the network unreliable despite the costly redundancy. +Therefore, a seamless switch-over between redundant clock paths needs to be ensured. +Another reason for the deterioration of the synchronization +accuracy is the variation of external conditions (e.g. temperature), which needs to be compensated. + +% In terms of the Data Distribution reliability, the topology redundancy can turn out to be +% useless, if the switch-over between redundant elements causes more data to be lost then the +% capabilities of FEC scheme. +% {\it [add here, change the rest]} +% In summary, we need investigate how to : +% \begin{Itemize} +% \item eliminate/decrease data loss due to : +% \begin{Itemize} +% \item physical medium imperfection, +% \item switch over between redundant elements, +% \item traffic congestion, +% \end{Itemize} +% \item eliminate synchronization instability due to: +% \begin{Itemize} +% \item switch over between redundant data paths, +% \item external condition variations, +% \item Ethernet frame loss (PTP), +% \end{Itemize} +% \item ensure required upper-bound delivery latency of Control Data. 
+% \end{Itemize} diff --git a/papers/ICALEPCS2011/Makefile b/papers/ICALEPCS2011/Makefile new file mode 100644 index 0000000000000000000000000000000000000000..325c73c9fc73452ec96cc41f2877b3674d22c447 --- /dev/null +++ b/papers/ICALEPCS2011/Makefile @@ -0,0 +1,14 @@ +all : WhiteRabbit.pdf + +.PHONY : all clean + +WhiteRabbit.pdf : WhiteRabbit.tex + latex $^ + bibtex WhiteRabbit + latex $^ + latex $^ + dvips -j0 WhiteRabbit + ps2pdf -dPDFX -dEmbedAllFonts=true -dSubsetFonts=true -dEPSCrop=true WhiteRabbit.ps + +clean : + rm -f *.eps *.pdf *.dat *.log *.out *.aux *.dvi *.ps *~ *.bbl *.blg diff --git a/papers/ICALEPCS2011/OverallReliability.tex b/papers/ICALEPCS2011/OverallReliability.tex new file mode 100644 index 0000000000000000000000000000000000000000..b4a9145464610eefdef5a09a54e950c4a643f252 --- /dev/null +++ b/papers/ICALEPCS2011/OverallReliability.tex @@ -0,0 +1,67 @@ +\section{Overall Reliability} + +The final equation of the WRN reliability is a sum of the data and clock distribution reliabilities. +The clock distribution is assumed to be sufficiently accurate as long as there is a connection +between the TM and all the nodes. The same applies to the CD distribution: +as long as there is a valid connection, the FEC makes sure that the data is delivered with +a sufficient reliability and the latency calculations prove it to be deterministic while the +congestion is prevented by CoS and limited number of data sources (DM). Consequently, the overall +reliability is strongly dependent on the WRN topology, which needs to be appropriate for the proposed +solutions (SyncE, H/W-supported RSTP, upper-bound latency). + +For the comparison of different network topologies, we consider the reliability of a network of +switches. +%with M inputs (connected to DM/TM). +Each node is connected to such a network with M links +(each to a separate switch). The value of M reflects the level of redundancy +(M=1 for no redundancy, M=2 for double redundancy, etc). 
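+
+To give an intuition for why the level of redundancy M matters, consider a deliberately simplified
+sketch (an assumption made here for illustration, not the model used in the study): each of the M
+paths between a node and the DM/TM fails independently with the same hypothetical probability $p$,
+and the node is cut off only if all M paths fail at once. Then
+\begin{equation}
+P_{cut\,off} = p^{M},
+\end{equation}
+so that, for example, $p = 10^{-3}$ gives $10^{-3}$, $10^{-6}$ and $10^{-9}$ for M=1, 2 and 3.
+Real paths share switches and links, so the independence assumption is optimistic and these numbers
+are not comparable with the estimates in Table~\ref{tab:2000nodesReliability}.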
+ + +In the calculations of the network reliability we used the idea of Mean Time Between Failure (MTBF) +and its relation with the failure probability presented in \cite{biblio:mtbf} +(a very simplified mathematical model). In order to calculate the MTBF of the entire network, we need the +MTBFs of each network component: switches and links. Since the WR switches are still under +development (no MTBF measured), we used representative values for CISCO switches +({2, 10 and 100}$*10^4$[h]). Two estimation methods were used: "Fault Tree analysis" +\cite{biblio:faultTree} and analytic. Both provide just rough estimations of the reliability. +The former allowed to estimate two-terminal reliability (DM to single node) +%\cite{biblio:INF_TECH} +of simple non/double/triple-redundancy topologies ($P_f$). The most desired value is the +all-terminal network reliability ($P_{f\_Network}$), where : $P_f < P_{f\_Network} < N_{nodes}*P_f$. +Table~\ref{tab:2000nodesReliability} +presents rough estimations of $P_{f\_Network}$ using analytic calculations for the three considered +topologies ($MTBF_{Switch}$=200 000[h]). However, to meet the requirement of $\approx$2000 nodes and +only three network layers (hops), +\modified{the Data Master node is connected to more separate switches than +the level of redundancy (M).} +% the topologies are of the type M-inputs/N-outputs, where +% $N \geq M$. +The estimations show that a triple redundancy topology can barely satisfy the requirements by CERN +(Table~\ref{tab:requirements}). 
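+
+The bounds on $P_{f\_Network}$ quoted above can be motivated by a standard argument, added here as
+an explanatory sketch (the events $F_i$ are introduced only for this illustration). Let $F_i$ denote
+the event that node $i$ loses its connection to the DM, and assume that every node has the same
+two-terminal failure probability $P(F_i) = P_f$. The network fails as soon as any node is cut off:
+\begin{equation}
+P_{f\_Network} = P\Bigl(\bigcup_{i=1}^{N_{nodes}} F_i\Bigr), \qquad
+P_f \le P\Bigl(\bigcup_{i=1}^{N_{nodes}} F_i\Bigr) \le \sum_{i=1}^{N_{nodes}} P(F_i) = N_{nodes}*P_f.
+\end{equation}
+The lower bound is reached only if all nodes always fail together, and the upper bound (the union
+bound) only if no two nodes ever fail at the same time; in a network where nodes share some but not
+all switches and links, both inequalities are strict.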
+ +% \begin{figure}[t] +% \centering +% \includegraphics[width=3.4in]{fig/threeTopologies.ps} +% \caption{Examples of topologies with different level of redundancy.} +% \label{fig:threeTopology} +% \end{figure} + +\begin{table}[ht] + +%\caption{Different topologies ($\approx 2000$ nodes).} +\caption{Reliability of WRN topologies.} +\centering +%\rowcolors {0}{gray!35}{} + +\begin{tabular}{| c | c | c | c |} \hline +%{\bf Redundancy}& \textbf{Switches} & \multicolumn{2}{| c |}{\textbf{$MTBF_{Switch}$= 20 000[h] }} \\ +% & & $P_f$ & MTBF[h] \\ \hline +\rowcolor{gray!35}{} +{\bf Redundancy}& \textbf{Switches} & $P_f$ & MTBF[h] \\ \hline +No & 127 & $ 2.08*10^{-3}$ & $ 5.77*10^{3}$ \\ \hline +Double & 292 & $ 4.71*10^{-7}$ & $ 2.55*10^{7}$ \\ \hline +Triple & 495 & $ 3.06*10^{-11}$ & $ 4.08*10^{11}$ \\ \hline +\end{tabular} +\label{tab:2000nodesReliability} +\end{table} + diff --git a/papers/ICALEPCS2011/ReliabilityDefinition.tex b/papers/ICALEPCS2011/ReliabilityDefinition.tex new file mode 100644 index 0000000000000000000000000000000000000000..e506b99186f12dca609f0941b63576369d650905 --- /dev/null +++ b/papers/ICALEPCS2011/ReliabilityDefinition.tex @@ -0,0 +1,77 @@ +\section{Definition of reliability in a WRN} + +A WRN, consisting of White Rabbit Switches (switches) connected by fiber +or copper, is meant to transport information among White Rabbit Nodes (nodes). We distinguish +two types of information distributed over the WRN: +%(1) {\bf Timing} (frequency and International Atomic Time) and +(1) {\it Timing} (frequency and Coordinated Universal Time) and +(2) {\it Data} (the Ethernet traffic). +This translates into two types of services provided by the WRN which have their own requirements and +can be handled separately. The requirements are defined by GSI and CERN as the prospective +users of WR to control their accelerators. 
+ + +\subsection{Timing Distribution} + +Timing is distributed in the WRN from a switch/node called the Timing Master (TM) +to all the other nodes/switches in the network. +% The TM is usually connected +% to the external source, such as Global Positioning System (GPS) receiver. +All the devices in the +WRN lock their frequency (syntonize) and adjust their local clocks (synchronize) to that of the TM. +The deviation between the clock of the TM and that of any other node/switch is called {\bf accuracy}. +A stable and continuous synchronization of all the nodes with an appropriate accuracy is the key +requirement of the Timing Distribution in the WRN. + +\subsection{Data Distribution} + +The critical data distributed over the WRN carries sets of commands (events), which are +organized into Control Messages (CM). The CMs are sent by a privileged node (Data Master, DM) in the +payload of the Ethernet frame(s). Therefore, the Data Distribution in the WRN is broken into +(1) {\it Control Data (CD)} -- the Ethernet frames carrying CMs, critical, and +(2) {\it Standard Data (SD)} -- the Ethernet frames which do not carry CMs, non-critical. +The reliability of the WRN depends on the successful delivery of the CD to all +the designated nodes. The CMs are always broadcast within a VLAN +% \cite{bilbio:vlan} +, which can span +the entire network. The worst-case upper bound of their delivery latency from the DM to any node in +the network, regardless of its location ({\bf maximum distance from the DM}), is required to be +guaranteed by the network -- this is the {\bf determinism} requirement. + +\subsection{Reliability of the WRN} + +The reliability of the WRN relies on the {\bf deterministic} delivery of the CD +to all the designated nodes and their sufficiently {\bf accurate and stable synchronization}. +This means that the WRN is considered non-functional if one or more of the following occur: +\begin{Itemize} + \item A node is synchronized with insufficient accuracy. 
+ \item A designated node receives corrupted CD or no CD. + \item The upper-bound delivery latency has been exceeded. +\end{Itemize} +% (1) A node is synchronized with insufficient accuracy; +% (2) A designated node receives corrupted CD or no CD; +% (3) The upper-bound delivery latency has been exceeded. +Unreliability is translated into the number of CMs considered lost (not delivered, delivered +corrupted or in a non-deterministic way) in a given period of time. During this time, +the synchronization must be always of the required quality. +Quantitative requirements of the accelerator facilities are listed in Table~\ref{tab:requirements}. + +\begin{table}[ht] +\caption{GSI's and CERN's requirements summary.} +\centering + \begin{tabular}{| l | c | c |} \hline +%\textbf{Requirement}& \multicolumn{2}{|c|}{\textbf{Value(s)}} \\ +\rowcolor{gray!35}{} +\textbf{Requirement} & {\bf GSI} & {\bf CERN} \\ \hline +Max latency & 100$\mu s$ & 1000$\mu s$ \\ \hline +CM failure rate & $3.17*10^{-12}$ & $3.17*10^{-11}$ \\ \hline +CMs lost per year & 1 & 1 \\ \hline +$d_{max}$ from DM & 2km & 10km \\ \hline +CM size & 200-500 bytes & 1200-5000 bytes \\ \hline +Accuracy & probably 8ns & 1$\mu s$ to ~2ns \\ +%accuracy & & few nodes ~2ns \\ +\hline + +\end{tabular} +\label{tab:requirements} +\end{table} \ No newline at end of file diff --git a/papers/ICALEPCS2011/WhiteRabbit.tex b/papers/ICALEPCS2011/WhiteRabbit.tex new file mode 100644 index 0000000000000000000000000000000000000000..4c14992b041dccaab2e3f96e61162aae0ad0918a --- /dev/null +++ b/papers/ICALEPCS2011/WhiteRabbit.tex @@ -0,0 +1,59 @@ + + +\documentclass{../JAC2003} % A4 +%\documentclass[acus]{JAC2003} % US + +\usepackage{graphicx} +\usepackage{booktabs} +\usepackage{color} +\usepackage{multirow} +\usepackage{multicol} +\usepackage[table]{xcolor} +\usepackage{colortbl} +\usepackage{array} + +\setlength{\titleblockheight}{27mm} + +\hyphenation{op-tical net-works semi-conduc-tor} + + + +%\newcommand 
\modified[1]{{\textcolor{red}{#1}}} +\newcommand \modified[1]{{\textcolor{black}{#1}}} + +\begin{document} + +\title{RELIABILITY IN A WHITE RABBIT NETWORK} + +\input{authors} + +\maketitle + +\input{abstract} + +\input{introduction} + +\input{ReliabilityDefinition} + +\input{FailureStudy} + +\input{Determinism} + +\input{ControlDataDistribution} + +\input{ClockDistribution} + + + +\input{OverallReliability} + +\input{conclusion} + + + +\bibliographystyle{IEEEtran} +\bibliography{IEEEabrv,./biblio} + +\end{document} + + diff --git a/papers/ICALEPCS2011/abstract.tex b/papers/ICALEPCS2011/abstract.tex new file mode 100644 index 0000000000000000000000000000000000000000..aafd2581ff159be34851cbc6380e9637b896d017 --- /dev/null +++ b/papers/ICALEPCS2011/abstract.tex @@ -0,0 +1,27 @@ +\begin{abstract} + +White Rabbit (WR) is a time-deterministic, low-latency Ethernet-based network which enables +transparent, sub-ns accuracy timing distribution. It is being developed to replace +the General Machine Timing (GMT) +%\cite{biblio:GMT} +system currently used at CERN and will become +the foundation for the control system of the Facility for Antiproton and Ion Research (FAIR) +at GSI. High reliability is an important issue in WR's design, +since unavailability of the accelerator's +control system will directly translate into expensive downtime of the machine. +A typical WR network is required to lose not more than a single message per year. +Due to WR's complexity, the translation of this real-world-requirement into +a reliability-requirement constitutes an interesting issue on its own -- a WR network +is considered functional only if it provides all its services to all its clients at any time. +This paper defines reliability in WR and describes how it was addressed by dividing it into +sub-domains: deterministic packet delivery, data +%redundancy, +resilience, +topology redundancy and clock +resilience. 
The studies show that the Mean Time Between Failure (MTBF) of the WR Network +is the main factor affecting its reliability. Therefore, probability calculations for +different topologies were performed using the "Fault Tree analysis" and analytic estimations. +Results of the study show that the requirements of WR are demanding. Design changes might be needed +and further in-depth studies required, e.g. Monte Carlo simulations. Therefore, a direction +for further investigations is proposed. +\end{abstract} \ No newline at end of file diff --git a/papers/ICALEPCS2011/authors.tex b/papers/ICALEPCS2011/authors.tex new file mode 100644 index 0000000000000000000000000000000000000000..622720ebba8df037601bf330d06811a6181321b0 --- /dev/null +++ b/papers/ICALEPCS2011/authors.tex @@ -0,0 +1,7 @@ + +\author +{ + Maciej Lipi\'{n}ski, Javier Serrano, Tomasz W\l{}ostowski, CERN, Geneva, Switzerland\\ + Cesar Prados, GSI, Darmstadt, Germany +} + diff --git a/papers/ICALEPCS2011/biblio.bib b/papers/ICALEPCS2011/biblio.bib new file mode 100644 index 0000000000000000000000000000000000000000..7375f92ccacdbb90f1aae4cd47c84aa04fe76885 --- /dev/null +++ b/papers/ICALEPCS2011/biblio.bib @@ -0,0 +1,193 @@ +@standard{biblio:IEEE8021D, + title = "IEEE Standard for Local and metropolitan area networks + Media Access Control (MAC) Bridges", + organization = "IEEE", + address = "New York", + number = "802.1D", + year = "2004", +} + + +@standard{biblio:IEEE1588, + title = "IEEE Standard for a Precision + Clock Synchronization Protocol for Networked Measurement and Control Systems", + organization = "IEEE", + address = "New York", + number = "1588-2008", + year = "2008", +} + +@standard{biblio:IEEE8023, + title = "IEEE Standard for + Information Technology--Telecommunications and Information Exchange Between + Systems--Local and Metropolitan Area Networks--Specific Requirements Part 3: + Carrier Sense Multiple Access With Collision Detection (CSMA/CD) Access Method + and Physical Layer 
Specifications - Section Three", + year = "2008", + organization = "IEEE", + address = "New York", + number = "802.3-2008", +} + +@standard{bilbio:vlan, + title = "{IEEE Standard for Local and metropolitan area networks + Virtual Bridged Local Area Networks}", + year = "2005", + organization = "IEEE", + address = "New York", + number = "802.1Q-2005" +} + + +@standard{biblio:SynchE, + title = "Timing characteristics of a synchronous Ethernet equipment slave clock {(EEC)}", + year = "2007", + number = "G.8262", + organization = "ITU-T", +} + +@inproceedings{biblio:ISPCS2011, + author = "M.Lipinski and T.Wlostowski and J.Serrano and P.Alvarez", + title = "White Rabbit: a {PTP} Application for robust sub-nanosecond synchronization", + booktitle = "Proceedings of ISPCS2011", + address = "Munich, Germany", + year = "2011", +} + +@inproceedings{biblio:GMT, + author = "J.Serrano and P.Alvarez and D.Dominguez and J.Lewis", + title = "Nanosecond Level {UTC} Timing Generation and Stamping in {CERN}'s {LHC}", + booktitle = "Proceedings of ICALEPCS2003", + address = "Gyeongju, Korea", + year = "2003", +} + +@techreport{biblio:FAIRtimingSystem, + author = "T. Fleck and C. Prados and S. Rauch and M. Kreider", + title = "{FAIR} Timing System", + institution = "GSI", + address = "Darmstadt, Germany", + year = "2009", + note = "v1.2", +} + +@inproceedings{biblio:distOscilloscope, + author = "S. Deghaye and D. Jacquet and I. Kozsar and J. Serrano", + title = "{OASIS}: A NEW SYSTEM TO ACQUIRE AND DISPLAY THE ANALOG SIGNALS FOR {LHC}", + booktitle = "Proceedings of ICALEPCS2003", + address = "Gyeongju, Korea", + year = "2003", +} + +@inproceedings{biblio:PAC11, + author = "J.Serrano and P.Alvarez and M.Lipinski and T.Wlostowski", + title = "Accelerator Timing Systems Overview", + booktitle = "Proceedings of PAC11", + address = "New York, USA", + year = "2011", +} + +@Inproceedings{biblio:WRproject, + author = "J. Serrano and P. Alvarez and M. Cattin and E. G. 
Cota and others", + title = "{The White Rabbit Project}", + booktitle = "ICALEPCS", + address = "Kobe, Japan", + year = "2009", +} + +@Misc{biblio:WRPTP, + author = "E.G. Cota and M. Lipinski and T. Wlostowski and E.V.D. Bij and J. Serrano", + title = "{White Rabbit Specification: Draft for Comments}", + note = "v2.0", + month = "July", + year = "2011", + howpublished = {\url{http://www.ohwr.org/documents/21}} +} + +@Misc{biblio:CERNwrControlAndTiming, + author = "J-C.Bau and M.Lipinski", + title = "{White Rabbit CERN Control and Timing Network}", + month = "July", + year = "2011", + howpublished = {\url{http://www.ohwr.org/documents/85}} +} + +@Misc{biblio:robustness, + author = "C.Prados and M.Lipinski", + title = "{White Rabbit and Robustness}", + month = "March", + year = "2011", + howpublished = {\url{http://www.ohwr.org/documents/103}} +} + +@mastersthesis{biblio:TomekMSc, + author = "T.Wlostowski", + title = "Precise time and frequency transfer in a {White} {Rabbit} network", + month = "May", + year = "2011", + school = "Warsaw University of Technology", + howpublished = {\url{http://www.ohwr.org/documents/80}} +} + +@Inproceedings{biblio:Takahide, + author = "Takahide Murakami and Yukio Horiuchi", + title = "{A Master Redundancy Technique in IEEE 1588 Synchronization with a Link Congestion + Estimation}", + booktitle = "Proceedings of ISPCS", + year = "2010", +} + +@electronic{biblio:whiteRabbit, + title = "{White Rabbit}", + howpublished = {\url{http://www.ohwr.org/projects/white-rabbit}} +} + +@article{biblio:ohl, + author = "M.Giampietro", + title = "Hardware joins the open movement", + journal = "CERN Courier", + address = "CERN, Geneva", + year = "2011", + howpublished = {\url{http://cerncourier.com/cws/article/cern/46054}}, +} +@article{biblio:RSTPperf, + author = "Pallos, R. and Farkas, J. and Moldov{\'a}n, I. 
and Lukovszki, C.", + title = "Performance of Rapid Spanning Tree Protocol in Access and Metro Networks", + journal = "2nd International ICST Conference on Access Networks", + year = "2007", +} + +@article{biblio:r-s, + author = "I. S. Reed and G. Solomon", + title = "{Polynomial Codes Over Certain Finite Fields}", + journal = "SIAM Journal of Applied Math", + address = "USA", + year = "1960", +} + +@book{biblio:mtbf, + author = "K.Dooley", + title = "Designing Large-Scale LANs", + publisher = "O'Reilly", + year = "2002", +} + +@book{biblio:coding, + author = "S. Lin and D. J. Costello", + title = "Error Control Coding", + publisher = "Pearson Prentice Hall", + year = "2004", +} + +@book{biblio:INF_TECH, + author = "D.J.C. MacKay", + title = "Information Theory, Inference, and Learning Algorithms", + publisher = "Cambridge University Press", + year = "2005", +} + +@misc{biblio:faultTree, + title = "Reliability Workbench, Fault Tree", + publisher = "Isograph", + howpublished = {\url{www.isograph.com}}, +} \ No newline at end of file diff --git a/papers/ICALEPCS2011/conclusion.tex b/papers/ICALEPCS2011/conclusion.tex new file mode 100644 index 0000000000000000000000000000000000000000..7642f225cae0194ce7322f0c83764ed1c048399b --- /dev/null +++ b/papers/ICALEPCS2011/conclusion.tex @@ -0,0 +1,22 @@ +\section{Conclusions} + + +A WRN must be considered as an ordinary Ethernet network with extra optional built-in features +which, when properly used, can make it more robust and reliable. This, however, comes at the price +of topology restrictions and redundant elements (i.e. higher cost). The reliability study described in this +article and detailed in \cite{biblio:robustness} presents areas which need to be addressed to +increase the reliability of a WRN. The development of WR is an on-going effort and some of the +suggested solutions have already been investigated or developed (FEC, clock distribution) +while the others need further verification (RSTP, cut-through forwarding). 
+The suggested solutions make it possible to fulfill the requirements set by CERN and GSI. +However, the costs might prompt a re-examination and re-justification of at least two of them: +the upper-bound latency required by GSI and the number of CMs lost per year. +The former requires additional development effort to achieve the required 100~$\mu$s. +The latter requires a high level of network redundancy (triple or more), which is very costly. +Since the network topology and its reliability calculations turned out to be the dominant factor in +the overall system reliability, it is necessary to perform more precise calculations and +simulations to verify the rough estimations. This might include different techniques (e.g. Monte Carlo simulations) +but also more real-life use cases (i.e. the network layout suggested in +\cite{biblio:CERNwrControlAndTiming}, which was not available at the time of the described study). +\modified{In particular, the calculations need to take into account the fact that +not all the nodes connected to the WRN are equally critical in real-life applications.} diff --git a/papers/ICALEPCS2011/introduction.tex b/papers/ICALEPCS2011/introduction.tex new file mode 100644 index 0000000000000000000000000000000000000000..9778668b76b390f36896e32614ca4bf09f3fd9e9 --- /dev/null +++ b/papers/ICALEPCS2011/introduction.tex @@ -0,0 +1,27 @@ +\section{Introduction} + +The White Rabbit (WR) project is a multi-laboratory, +multi-company, international effort to create a universal fieldbus for control and timing systems +to be used at CERN, GSI and possibly other such facilities. The rationale behind WR, +the choice of the technologies and technical details of its functioning have already been +described in a number of papers \cite{biblio:WRproject}, \cite{biblio:TomekMSc}, +\cite{biblio:WRPTP}. +%, \cite{biblio:ISPCS2011}. +Resilience and robustness are among the key features of any fieldbus. 
+This article presents a study of the reliability of a White Rabbit Network (WRN), +assuming basic knowledge of WR. + +Reliability is defined as the ability of a system to provide its services to clients under both +routine and abnormal circumstances. It can be estimated by calculating the probability of +the system's failure ($P_f$). +% \begin{equation} +% \label{eq:reliability} +% R =1 - P_f +% \end{equation} +The lower the probability of WRN failure, the higher its reliability. Thus, in this article we +identify critical services of a WRN based on the study of WR's requirements. +Then, we analyze each critical service to identify possible +reasons for its failure and propose targeted counter-measures to increase reliability. +Finally, their impact on the overall system reliability is studied to +identify the largest contributor and the focus for further studies. +