docs/specs/robustness: use common figures

Commit 46137f46 authored Sep 03, 2014 by Grzegorz Daniluk
57 changed files
--- a/documents/specifications/robustness/robustness_doc/Makefile
+++ b/documents/specifications/robustness/robustness_doc/Makefile
+all : robustness.pdf
+
+.PHONY : all clean
+
+robustness.pdf : robustness.tex 
+	latex $^
+	latex $^
+	dvips robustness
+	ps2pdf robustness.ps
+
+clean :
+	rm -f *.eps *.pdf *.dat *.log *.out *.aux *.dvi *.ps *.toc
+
--- a/documents/specifications/robustness/robustness_doc/acronyms.tex
+++ b/documents/specifications/robustness/robustness_doc/acronyms.tex
+%\centering
+\paragraph*{List of Acronyms}
+
+\vspace{3 cm}
+
+\normalsize
+\begin{flushleft}
+	\begin{tabular}{lcl} 
+		WR & : & White Rabbit \\ 
+		PTP & : & Precision Time Protocol \\
+		WRPTP & : & White Rabbit extension PTP \\
+		\HP & : & High Priority. Used to indicate special WR Ethernet
+			Frames\\
+		\SP & : & Standard Priority. Used to indicate special WR
+			  Ethernet Frames\\
+		&& \\
+		WRN & : & White Rabbit Node \\
+		WRS & : & White Rabbit Switch \\
+		WRCM & : & White Rabbit Clock Master Node \\
+		WRMM & : & White Rabbit Management Master Node \\
+		WRDM & : & White Rabbit Data Master Node \\
+		&&\\
+		&&\\
+		SNMP & : & Simple Network Management Protocol \\
+		FEC & : & Forward Error Correction\\
+		PFC & : & Priority Flow Control \\
+
+%	     \cc{ } & : & Comment by Cesar \\
+%	     \cm{ } & : & Comment by Maciej \\
+
+	\end{tabular}
+\end{flushleft}
+
+A few of the terms (names) used in this document are still under discussion.
+The terms currently used do not seem appropriate and are sometimes
+confusing. They are listed below and are likely to change. Readers
+are kindly asked to suggest better names if they have something in mind.
+
+\begin{itemize}
+  \item \HighPriority (\HP),
+  \item \StandardPriority (\SP),
+  \item \GranularityWindow (\GW),
+  \item \ControlMessage (\CM).
+\end{itemize}
+
+\newpage
+
--- a/documents/specifications/robustness/robustness_doc/app1.tex
+++ b/documents/specifications/robustness/robustness_doc/app1.tex
+\chapter{Appendix: Reliability measure (Mean Time Between Failures)}
+\label{appA}
+
+
+
+The measurement of reliability by the number of Control Messages lost per year, 
+which has been mentioned in the requirements, is neither standard nor practical.
+Therefore, in this document, we estimate the reliability of a White Rabbit
+Network using a method which is readily and commonly applied to Large-Scale
+LANs. We use the Mean Time Between Failures of a single network component (be
+it a WR Switch, a fibre/copper link or a WR Node) to estimate the reliability
+(i.e. MTBF and failure probability) of the entire White Rabbit Network, as
+described in \cite{DesigningLSLANs}. 
+
+Mean Time Between Failures (MTBF) represents a statistical likelihood that half 
+of the number of devices, represented by a given MTBF factor, will not function 
+properly after the period given by MTBF. MTBF does not give the functional 
+relationship between time and number of failures. However, the estimation that 
+the function is linear is assumed to be sufficient (see \cite{DesigningLSLANs}, 
+page 36).
+
+For network MTBF derivation, the quantity of interest is the probability of
+failure, related to the number of failures of N devices per unit time:
+  \begin{equation}
+     \label{eq:MTBFprob}
+	P = \frac{N}{2 \cdot MTBF}  
+  \end{equation}
+
+However, for use in probability calculations, a net (per-device) value is
+required. Therefore, the equation below is used. It gives the probability of a
+single device failing in a single day, where M denotes the MTBF expressed in
+days:
+
+  \begin{equation}
+      \label{eq:MTBFprobNetto}
+	P = \frac{1}{2 \cdot M}  
+  \end{equation}
+
+Examples of common values of MTBF of network components and corresponding 
+probability of their failure per-day are presented in 
+Table~\ref{tab:MTBFtable}.
+
+
+\begin{table}[ht]
+\caption{Example MTBFs and probabilities of network units (src: 
+	\cite{DesigningLSLANs})} 
+\centering
+	\begin{tabular}{| l |  c | c |}          \hline
+\textbf{Component} & \textbf{MTBF [hours]} & \textbf{Probability [$\%$]}\\ 
+                   &                       &                           \\ \hline
+Fibre connection   &       1 000 000       & 0.0012                    \\
+\hline
+Router             &       200 000         & 0.0060                    \\ \hline
+\end{tabular}
+\label{tab:MTBFtable}
+\end{table}
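+
+As a quick check of Table~\ref{tab:MTBFtable}, the per-day failure probability
+of the router can be recomputed from its MTBF. This is only a sketch, assuming
+that the MTBF in hours is converted to days by dividing by 24:
+\begin{equation}
+	M = \frac{200\,000}{24} \approx 8333 \; \mathrm{days}, \qquad
+	P = \frac{1}{2 \cdot 8333} \approx 6.0 \cdot 10^{-5} = 0.0060\,\%
+\end{equation}
+The same calculation for the fibre connection ($M \approx 41\,667$ days) gives
+$P \approx 1.2 \cdot 10^{-5} = 0.0012\,\%$.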
+
+In order to calculate the reliability of the entire network, the meaning of
+network failure needs to be defined. In the case of a White Rabbit Network, it
+is critical that all the White Rabbit Nodes connected to the network receive
+Control Messages. In other words, a failure of the White Rabbit Network is a
+failure of any number of its components which prevents any WR Node from
+receiving Control Messages. If a single component causes network failure, such
+a component is called a single point of failure (SPoF).
+
+The probability of a failure of the entire network is calculated by adding the
+probabilities of all the combinations of simultaneous component failures which
+cause a failure of the entire network. 
+Using equation~\ref{eq:MTBFprobNetto}, the MTBF of the entire network can then
+be calculated from the network failure probability.
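+
+As an illustration only, consider a hypothetical chain of one router and two
+fibre connections (this topology is an assumption, it is not taken from this
+document), in which each component is a single point of failure. Neglecting
+simultaneous failures and using the values from Table~\ref{tab:MTBFtable}:
+\begin{equation}
+	P_{net} \approx 0.0060\,\% + 2 \cdot 0.0012\,\% = 0.0084\,\%, \qquad
+	M_{net} = \frac{1}{2 \cdot P_{net}} \approx 5952 \; \mathrm{days}
+\end{equation}
+which corresponds to an MTBF of roughly 143\,000 hours for this example network.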
--- a/documents/specifications/robustness/robustness_doc/app10.tex
+++ b/documents/specifications/robustness/robustness_doc/app10.tex
+\chapter{Appendix: Ethernet Frame Delivery Delay Estimation}
+\label{appH}
+
+\begin{center}
+	\includegraphics[scale=0.30]{../../../../figures/robustness/switchRouting.ps}
+	\captionof{figure}{WR Switch routing using Swcore and RTU (not to
+			  scale).}
+	\label{fig:swRouting}
+\end{center}
+
+It is estimated that WR Switch routing ($delay_{sw}$) takes between
+13 $\mu$s and 80 $\mu$s for highest priority traffic (size: 1500 bytes),
+provided no traffic congestion occurs and depending on the size of the output
+buffer. In order to estimate the minimum Granularity Window, a few more values
+need to be introduced and estimated. We define the time it takes for a WR Node
+to send an Ethernet frame as Transmission Delay ($delay_{n\_tx}$) and the time
+it takes a WR Node to receive an Ethernet frame as Reception Delay
+($delay_{n\_rx}$). The delay introduced by the physical connection (i.e. the
+time it takes for a frame to travel through the physical medium) is defined as
+Link Delay ($delay_{f\_link}$ [$\frac{\mu s}{km}$] and $delay_{c\_link}$
+[$\frac{\mu s}{km}$] for fibre and copper respectively). Thus, the delivery
+delay of a single Ethernet frame can be estimated with the following equation:
+
+\begin{equation}
+	Delay_{frame} = D_{f} * delay_{f\_link} + D_{c} * delay_{c\_link} + 
+	N * delay_{sw} + delay_{n\_tx} + delay_{n\_rx}
+\end{equation}	    
+
+where $D_f$ [$km$] is the total length of fibre connection and $D_c$ [$km$] is
+the total length of copper connection. $N$ is the number of WR Switches on
+the way.
+
+In the following estimations, the worst-case Ethernet frame size is
+taken into account, i.e. we always assume the maximum frame size of
+1500 bytes. Transmission of 1500 bytes over Gigabit Ethernet takes
+$\sim 13\mu s$. 
+
+
+\paragraph{Ethernet Frame Transmission Delay Estimation.} 
+
+For simplicity, no encoding is assumed. The delay of frame transmission depends
+on the remaining size of the frame currently being sent and the number of frames
+already enqueued in the output buffer. Assuming that sending 1500 bytes takes
+$\sim 13\mu s$, $delay_{n\_tx}= [ 0 \mu s \div (13 + B * 13)\mu s]$
+where $B$ is the number of frames in the output buffer (the maximum $B$ is the
+size of the output buffer). 
+
+\paragraph{Ethernet Frame Reception Delay estimation.} 
+
+The reception of a maximum size Ethernet frame is estimated to take
+$\sim 13\mu s$\footnote{The time of 1500-byte Ethernet Frame reception is
+$12.176\mu s$; in the calculations, it is overestimated to $13\mu s$.}.
+It is assumed that no decoding is performed. Therefore $delay_{n\_rx} = 13\mu
+s$. 
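+
+For reference, and assuming a line rate of exactly 1~Gbit/s (1~ns per bit),
+the bare 1500 bytes take $12\mu s$, whereas the footnote value corresponds to
+1522 bytes on the wire:
+\begin{equation}
+	1500 \cdot 8 \cdot 1 \; \mathrm{ns} = 12.000 \mu s, \qquad
+	1522 \cdot 8 \cdot 1 \; \mathrm{ns} = 12.176 \mu s
+\end{equation}
+Reading the additional 22 bytes as header, VLAN tag and checksum is an
+assumption; the document does not state it.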
+
+\paragraph{Link Delay estimation.} 
+
+The delay introduced by the link is estimated to be 5 [$\frac{\mu s}{km}$] for
+fibre \cite{PropagationDelay} (\cm{i'm not sure about the source}) and 5
+[$\frac{\mu s}{km}$] for copper \cite{FAIRtiming} links.
+
+\paragraph{Switch Routing Delay estimation.} 
+
+The delay introduced on the switch is more complicated to estimate. The
+reception delay and transmission delay overlap with the delay introduced by
+storing and routing.
+\begin{equation}
+	delay_{sw} =  \delta + delay_{RTU} + delay_{n\_tx}  
+          \; \; \; \; \;
+          \mathrm{for}
+	  \; \; \; \; \;
+	  (delay_{n\_rx} - \delta) < delay_{RTU}
+\end{equation}	 
+\begin{equation}
+	delay_{sw} =  delay_{n\_rx} + delay_{n\_tx}
+        \; \; \; \; \;
+          \mathrm{for}
+	\; \; \; \; \; 
+        (delay_{n\_rx} - \delta) > delay_{RTU}
+\end{equation}	
+
+where  $\delta$ is the time needed to receive frame's header and retrieve
+information necessary for Routing Table Unit, e.g.: VLAN, source and
+destination MAC. In general, it's always true that 
+
+\begin{equation}
+	delay_{sw} \geq  delay_{n\_rx} + delay_{n\_tx}
+\end{equation}	
+
+The Routing Table Unit delay is estimated as $delay_{RTU} = [0.5\mu s \div 3\mu
+s]$ (\cm{i'm not sure about this 3 us, need to make some tests/simulations }).
+So $delay_{sw} = (13 + B * 13)\mu s$, where $B$ is the number of
+frames in the output buffer (the maximum $B$ is the size of the output buffer).
+When considering the Frame Transmission Delay in the Node, the number of frames
+in the output buffer can be assumed to be 0 ($B = 0$). However, when considering
+the switch's Frame Transmission Delay, it is very likely that $B > 0$, since
+many ports can forward frames to the same port simultaneously. Therefore, in the
+worst case $B$ equals the size of the output buffer. In this document, we
+assume $B=5$. However, the number can be much greater. The final range of the
+delay for consideration is $delay_{sw} = [13 \mu s \div (13 + B * 13)\mu s]$.
+\paragraph{Ethernet Frame Delivery Delay estimation.}
+
+A summary of the above estimations is given in
+Table~\ref{app:tab:EtherFrameDelayGeneral}. Details of the final frame delivery
+delay estimation for GSI and CERN, taking into account the requirements
+concerning the length of physical links, are given in
+Table~\ref{tab:EtherFrameDelayNumbers}.
+
+\newpage
+
+\begin{table}[ht]
+\caption{Elements of Ethernet frame delivery delay estimation.} 
+\centering
+	\begin{tabular}{| l |  c | c | c |}          \hline
+\textbf{Name}&\textbf{Symbol}&\textbf{Value}&\textbf{Value}                  \\
+                                 &                &  Min& Max          \\ \hline
+% Sending node
+Ethernet Frame Transmission Delay&$delay_{n\_tx}$&$0\mu s$&$(13 + B * 13)\mu s$
+\\ \hline
+% Switch
+Switch Routing Delay            &$delay_{sw}$&$13\mu s$&$(13 + B * 13)\mu s$ 
+ 
+\\ \hline
+% Links
+Link Delay                       & $delay_{link}$ &5 [$\frac{\mu
+s}{km}$]&5 [$\frac{\mu s}{km}$]      
+\\ \hline
+% Receivning node
+Ethernet Frame Reception delay   & $delay_{n\_rx}$&$13\mu s$&$13\mu s$
+
+\\ \hline
+\end{tabular}
+\label{app:tab:EtherFrameDelayGeneral}
+\end{table}
+
+\begin{table}[ht]
+\caption{Parameters and numbers used to estimate Ethernet frame delivery delay
+in WR Network} 
+\centering
+	\begin{tabular}{| l |  c | c | c | c |}          \hline
+\textbf{Name}&\textbf{Symbol}&\textbf{Value}&\textbf{Value}&\textbf{Concerns} 
+\\
+                                 &                &  (GSI)&(CERN)   &Network el.
+\\ \hline
+% Frame param
+Frame size                       & $f\_size$      &1500 bytes&1500 bytes&Frame
+\\ \hline
+% Sending node
+Number of frames in output buffer& $B_{tx}$       &0         &0         &Tx Node
+\\ \cline{1-4}
+Ethernet Frame Transmission Delay& $delay_{n\_tx}$&13 $\mu s$&13 $\mu s$&      
+ \\ \hline
+% Switch
+Number of frames in output buffer& $B_{sw}$       &5         &5         &Switch
+\\ \cline{1-4}
+Switch Routing Delay             & $delay_{sw}$   &78 $\mu s$&78 $\mu s$&     
+\\ \hline
+Number of hops (switches)        & $N$            &3         &3         &   
+\\ \hline
+% Links
+Link Length                      & $D$            &2 km      &10 km     & Links
+\\ \cline{1-4}
+Link Delay                       & $delay_{link}$ &10 $\mu s$&50 $\mu s$&       
+\\ \hline
+% Receiving node
+Ethernet Frame Reception delay   & $delay_{n\_rx}$&13 $\mu s$&13 $\mu s$&Rx Node
+\\ 
+\hline
+\hline
+Ethernet frame delivery delay    & $Delay_{frame}$&270$\mu s$&310$\mu s$& ALL  
+\\ \hline
+\end{tabular}
+\label{tab:EtherFrameDelayNumbers}
+\end{table}
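+
+The last row of Table~\ref{tab:EtherFrameDelayNumbers} can be reproduced from
+the rows above it. As a sketch for the GSI column, assuming that the whole 2~km
+of link is counted at 5~$\frac{\mu s}{km}$ and that $B = 5$ in each of the 3
+switches:
+\begin{equation}
+	delay_{sw} = (13 + 5 * 13)\mu s = 78 \mu s
+\end{equation}
+\begin{equation}
+	Delay_{frame} = 2 * 5 \mu s + 3 * 78 \mu s + 13 \mu s + 13 \mu s = 270 \mu s
+\end{equation}
+For the CERN column only the link term changes ($10 * 5 \mu s = 50 \mu s$),
+which gives $310 \mu s$.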
--- a/documents/specifications/robustness/robustness_doc/app2.tex
+++ b/documents/specifications/robustness/robustness_doc/app2.tex
+\chapter{Appendix: Control Messages Size at CERN}
+\label{appB}
+
+\section{Control Messages Size at CERN}
+The appendix presents how the Control Message size for CERN was estimated.
+Since there is no official document on this yet, the below information is not
+official but can be useful.
+
+Each event consists of
+\begin{itemize}
+  \item Address, estimated size: 32 bits
+  \item Timestamps, estimated size: 64 bits
+  \item Event Header (IDs), size: 32 bits
+  \item Event Payload, size: 64 bits
+\end{itemize}
+
+The total size of single event is estimated to be 192 bits or 24 bytes.
+In the current control system at CERN, there are 7 events generated per each of
+7 distribution networks (per machine). This gives 49 events every Granularity
+Window. 
+
+As a consequence, for the current system, the Control Message size would equal
+1176 bytes. However, the current number of events is not sufficient, and it is
+therefore suggested to increase it. A desired number of events has not been
+defined: the more the better. A roughly four-fold increase, to 200 events, would
+be very much appreciated. This gives a Control Message size of 4800 bytes. 
+
+Therefore, the minimum Control Message size given in the document is 1200
+bytes (1176 bytes rounded up) and the maximum size is rounded up to 5000 bytes. 
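+
+As a worked check, the arithmetic behind these numbers, using only the field
+sizes and event counts listed above, is:
+\begin{equation}
+	(32 + 64 + 32 + 64) \; \mathrm{bits} = 192 \; \mathrm{bits} = 24 \; \mathrm{bytes}
+\end{equation}
+\begin{equation}
+	7 * 7 * 24 \; \mathrm{bytes} = 1176 \; \mathrm{bytes}, \qquad
+	200 * 24 \; \mathrm{bytes} = 4800 \; \mathrm{bytes}
+\end{equation}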
+
+
+\section{Control Messages Size at GSI}
+
--- a/documents/specifications/robustness/robustness_doc/app3.tex
+++ b/documents/specifications/robustness/robustness_doc/app3.tex
+\chapter{Appendix: HP Bypass Hardware Implementation}
+\label{appC}
+
+
+A method has been proposed to achieve a switch-over of ports' role and state
+(as understood in the RSTP specification \cite{IEEE8021D}), and consequently of
+HP Traffic routing, in less than 3 $\mu s$. The solution takes advantage of the
+fact that HP Traffic is routed using HP Bypass (see
+Chapter~\ref{jitterDeterminismNetworkDimention}). The hardware implementation
+of HP Bypass and the simplicity of routing enable extremely fast port
+switch-over. The changes proposed hereafter shall be integrated into the HP
+Bypass implementation.
+
+Two registers arranged in a table shall be available per port (RSTP Port Role
+Table). A table entry is addressed by VLAN ID. Each entry in the table
+represents an association between a VLAN number and the port's role in this
+particular VLAN. The entry size shall be 4 bits to enable encoding the following
+roles: Root, Designated, Alternate, Backup, Blocked. There shall be $2^{10}$
+entries in the register to represent 1024 VLANs. One of the registers stores the
+current role of a given port. It is called the RSTP Port Current Role Table and
+is used in the routing process of HP Traffic. It is read-only for software. The
+other register stores the next roles of a given port. It is called the RSTP Port
+Next Role Table and is write-only for software. If the contents of the two
+registers differ, the RSTP Port Current Role Table is updated from the Next Role
+Table when no HP Traffic is being forwarded. The timeslot when HP Traffic is
+not being forwarded (is not received) is called an HP Gap. Since HP Packages are
+always sent in bursts, an HP Gap can be easily detected. It is important to
+change port roles during an HP Gap to prevent HP Package loss. Such a loss
+can happen when the change is made between a port with a longer path to the Data
+Master and a port with a shorter path to the Data Master (see
+Appendix~\ref{appD} for use case analysis).
+
+
+\textbf{In normal operation}, the role of a given port for a given VLAN is set
+by the software (RSTP daemon) by writing the appropriate RSTP Port Next Role
+Register. The register shall be written as soon as the ports' roles have been
+established by means of the RSTP Algorithm. The HP Bypass algorithm shall verify
+the role of the port on which an HP Package is received for a given VLAN ID
+(provided in the header). If the port's role translates into the forwarding
+state, the algorithm checks the roles of all other ports for the given VLAN ID.
+The HP Package is forwarded to all the ports whose role translates into the
+forwarding state. The translation between a port's role and a port's state for
+HP Traffic is given in Table~\ref{tab:portRoleStatetrans}.
+
+\begin{table}[ht]
+\caption{Translation between port's role and state for HP Traffic.} 
+\centering
+	\begin{tabular}{| c | c | c |}          \hline
+\textbf{Port's Role}& \multicolumn{2}{|c|}{\textbf{Port's State}}  \\
+                    & Incoming   & Outgoing     \\ \hline
+Root                & Forward    & Forward      \\ \hline
+Designated          & Forward    & Forward      \\ \hline
+Alternate           & Block      & Block        \\ \hline
+Backup              & Forward    & Forward      \\ \hline
+Disabled            & Block      & Block        \\ \hline
+\end{tabular}
+\label{tab:portRoleStatetrans}
+\end{table}
+
+\textbf{In case of link failure}, as soon as the failure is detected by the
+Endpoint, the Endpoint shall notify HP Bypass, and the change of ports' roles
+stored in the RSTP Port Current Role Tables shall be triggered. The change
+concerns only VLANs for which the broken port was Root or Designated. For such
+VLANs the ports' roles shall change according to
+Table~\ref{tab:portRoleTransition}. The process of HP routing and ports' role
+change in case of link failure is presented in Figure~\ref{fig:wrRSTP}.
+
+\begin{table}[ht]
+\caption{Port's role transitions in case of link failure.} 
+\centering
+	\begin{tabular}{| c | c |}          \hline
+\textbf{Current Role}& \textbf{New Role}  \\
+                     &               \\ \hline
+Root                 & Disabled      \\ \hline
+Designated           & Disabled      \\ \hline
+Alternate            & Root          \\ \hline
+Backup               & Designated    \\ \hline
+\end{tabular}
+\label{tab:portRoleTransition}
+\end{table}
+
+\begin{center}
+	\includegraphics[scale=0.20]{../../../../figures/robustness/wrRSTP.ps}
+	\captionof{figure}{WR RSTP for HP Traffic}
+	\label{fig:wrRSTP}
+\end{center}
--- a/documents/specifications/robustness/robustness_doc/app4.tex
+++ b/documents/specifications/robustness/robustness_doc/app4.tex
--- a/documents/specifications/robustness/robustness_doc/app5.tex
+++ b/documents/specifications/robustness/robustness_doc/app5.tex
+\chapter{Appendix: Potential Modifications to RTU required by RSTP}
+\label{appE}
+
+Potential changes to RTU needed by RSTP: \\
+1. RTU@HW:
+\begin{itemize}
+  \item Blocking of incoming packages per-VLAN (currently only per-port).
+  \item Blocking of outgoing packages per-port (currently only per-VLAN).
+\end{itemize}
+2. RTU@SW:
+\begin{itemize}
+  \item Aging of information on a given port. This means querying the Filtering
+Database for the entries belonging to the given port and removing (aging out)
+these entries.
+  \item Changes enabling control of the new HW parameters.
+\end{itemize}
\ No newline at end of file
--- a/documents/specifications/robustness/robustness_doc/app6.tex
+++ b/documents/specifications/robustness/robustness_doc/app6.tex
--- a/documents/specifications/robustness/robustness_doc/app7.tex
+++ b/documents/specifications/robustness/robustness_doc/app7.tex
+\chapter{Appendix: Timing and Prioritizing of the Ideas Presented} 
+\label{appF}
+
+The number of ideas presented in this document is overwhelming. Not all the
+ideas are necessary for White Rabbit to work as a Control Network
+which is carefully controlled, managed and configured. Some of the ideas are
+intended for more general usage. The table below attempts to prioritize the
+ideas, group them into areas and present the planning.
+
+
+\begin{table}[ht]
+\caption{Timing and Prioritizing of the Ideas Presented.} 
+\centering
+\begin{tabular}{| p{3.5cm} | p{1cm} | p{3.5cm} | p{1.5cm} | p{3.5cm} |} \hline
+\textbf{Name}&\textbf{Prio}& \textbf{Approx. finished}&\textbf{Ref
+Chapter} & \textbf{Area}  \\ \hline
+FEC       & 1  & Workshop April 2011 & \ref{chapter:FEC} & Control Data\\ \hline
+HP Bypass & 2  & Workshop after April Workshop & \ref{chapter:FEC} & Control
+Data\\ \hline
+WR RSTP (HP only)   & 2  & Workshop after April Workshop& \ref{chapter:WRRSTP} &
+Control,
+Standard Data and a bit timing     
+\\ \hline
+Monitoring (limited) & 2  & Workshop after April Workshop &
+\ref{chapter:monitoring} &
+Diagnostics, Monitoring   
+\\ \hline \hline
+Congestion/Flow Control & 3  & 2012 & \ref{chapter:monitoring} &
+Standard and Control Data   
+\\ \hline
+Management & 3  & 2012 & \ref{chapter:monitoring} &
+Standard and Control Data
+\\ \hline
+Full Monitoring & 3  & 2012 & \ref{chapter:monitoring} &
+Diagnostics
+\\ \hline \hline
+Transparent Clocks PTP & 4  & ? & - & Timing Data
+\\ \hline
+Ring topologies      & 5  & ? & - & Timing and Control Data Earthquake 
+\\ \hline
+%Congestion/flow control, standard and for all the priorities      & 5  & ? & -
+%&
+%Control Data
+%\\ \hline
+Link Aggregation & 5  & ? & - & Timing, Control and Standard Data
+\\ \hline
+WR RSTP (SP traffic and other crazy ideas regarding SP and HP)   & 6  &  ? &
+\ref{chapter:WRRSTP} &
+Control and Standard Data
+\\ \hline
+
+
+\end{tabular}
+\label{tab:RobustnessPrioAndPlan}
+\end{table}
--- a/documents/specifications/robustness/robustness_doc/app8.tex
+++ b/documents/specifications/robustness/robustness_doc/app8.tex
+\chapter{Appendix: Flow Monitor} 
+\label{appSFlow}
+
+\section{sFlow}
+
+sFlow is a multi-vendor sampling technology embedded within switches and
+routers. It provides the ability to continuously monitor application level
+traffic flows at wire speed on all interfaces
+simultaneously~\ref{tab:sflow_info}.
+
+
+sFlow consists of:
+
+\begin{itemize}
+	\item sFlow Agent 
+	\item sFlow Collector
+\end{itemize}
+
+The sFlow Agent is a software process that runs in the White Rabbit Switches.
+It combines interface Counters and Flow Samples into sFlow datagrams that are
+sent across the network to an sFlow Collector. The Counters and Flow Samples
+will be implemented in hardware in order to speed up the
+processing of the data sampling. Flow samples are defined by a sampling
+rate: on average, 1 out of N packets is randomly sampled. This type of
+sampling provides quantifiable accuracy. A polling interval defines how often
+the network device sends interface counters. 
+
+The sFlow Agent packages the data into sFlow Datagrams that are sent on the
+network. The sFlow Collector receives the data from the Flow generators, stores
+the information and provides reports and analysis. 
+
+\subsubsection{Configuration}
+
+Every switch capable of sFlow must configure and enable:
+
+\begin{itemize}
+	\item local agent
+	\item sFlow Collector address 
+	\item ports to monitor
+\end{itemize}
+
+In order to acquire reliable network information in a WR network:
+
+\begin{itemize}
+	\item the statistics shall be collected every ?? (sec,msec..)
+	\item a sample is taken per port every ?? (sec,msec...)
+	\item ?? samples per port shall be sent to the CPU
+
+\end{itemize}
+
+
+\section{Requirements of a Flow Monitor}
+
+General requirements:
+\begin{itemize}
+	\item Network-wide view of usage and active switches. 
+	\item Measuring network traffic, collecting, storing, and analysing
+traffic data.
+	\item Monitoring links without impacting the performance of the switches
+and without adding significant network load.
+	\item Industry standard.
+\end{itemize}
+
+\noindent The Flow Monitor shall: 
+
+\begin{itemize}
+	\item Measure the volume and rate of the traffic by QoS level.
+	\item Measure the availability of the network and devices.
+	\item Measure the response time that a device takes to react to a given
+input.
+	\item Measure the throughput over the links.
+	\item Measure the latency and jitter of the network.	
+	\item Identify grouping of traffic by logical groups (Master, Node,
+Switch).
+	\item Identify grouping of traffic by protocols.
+	\item Define filters and exceptions associated with alarms and
+notification.
+\end{itemize} 
+
+
+\noindent The measurements shall be carried out either between network devices
+(Per-Link Measurements), monitoring:
+
+		\begin{itemize}	
+			\item number of packets
+			\item bytes
+			\item packets discarded on an interface
+			\item flow or burst of packets
+			\item packets per flow
+		\end{itemize}
+
+\noindent or as End-to-End Measurements:
+		\begin{itemize}	
+			\item path delay  
+			\item ....
+			\item ....
+		\end{itemize} 
+
+\noindent The combination of both measurements provides a global picture of the
+network.
+
+
+\vspace{10 mm}
+
+\noindent The monitoring shall perform:
+
+\begin{itemize}
+	\item Active Measurement: injecting network traffic and studying the
+reaction to this traffic.
+	\item Passive Measurement: monitoring the existing traffic for measurement.
+\end{itemize}
+
+\vspace{10 mm}
+
+\noindent Performance:
+
+\begin{itemize}
+
+	\item Reaction Time ... 
+	\item Sampling...
+	
+\end{itemize}
+
+
+\section{State of the Art of Flow Controller}
+
+Currently there are three main choices for traffic monitoring:
+
+\begin{itemize}
+
+	\item RMON, IETF standard.
+	\item NetFlow, Cisco Systems.
+	\item sFlow, Industry standard.
+\end{itemize}
+
+In a nutshell, all of them offer similar features and provide the same
+information, thus the selection criterion is the usage of resources by
+the Agent in the switches and by the collector of information.
+
+\begin{table}[ht]
+\begin{center}
+    \begin{tabular}{ | c | c | c | c | c | c |}
+\hline
+Flow Controllers & CPU & Memory & Bandwidth & RT Statistics & Implementation \\
+\hline
+RMON & high &  very high 8-32 MB & bursty & supported & sw \\ \hline
+NetFlow & high & high 4-8 MB & high bursty & not & sw  \\ \hline
+sFlow & very low & very low akB & low smooth & supported & sw/hw \\ \hline
+    \end{tabular}
+\end{center}
+\caption{Comparison of Flow Controllers}
+\label{tab:flow_controlers}
+\end{table}
+
+As Table~\ref{tab:flow_controlers} shows, sFlow requires fewer resources
+both in the Agent, which is placed in the switch, and in the Collector. Its
+usage of bandwidth is also more conservative, since the gathered information is
+sent at short intervals, unlike with the other controllers. sFlow therefore
+seems to be a good choice for White Rabbit. Besides, sFlow allows part of the
+Agent to be implemented in hardware, providing wire-speed sampling of frames. In
+addition, the license scheme of sFlow allows the White Rabbit project to modify
+and publish its own version.
+
--- a/documents/specifications/robustness/robustness_doc/app9.tex
+++ b/documents/specifications/robustness/robustness_doc/app9.tex
+\chapter{Appendix: WR-specific MIB definitions} 
+\label{appG}
+
+\section{WR PTP}
+\vspace{5 mm}
+\textbf{All applicable data sets} \\
+\vspace{5 mm}
+SYNTAX INTEGER {
+\begin{table}[!ht]
+%\begin{center}
+\scriptsize 
+\begin{tabular}{ l l }
+\textbf{ps1(23),}   & \textbf{-- The time is accurate to 1ps} \\
+\textbf{ps2p5(24),} & \textbf{-- The time is accurate to 2.5ps} \\
+\textbf{ps10(25),}  & \textbf{-- The time is accurate to 10ps} \\
+\textbf{ps25(26),}  & \textbf{-- The time is accurate to 25ps} \\
+\textbf{ps100(27),} & \textbf{-- The time is accurate to 100ps} \\
+\textbf{ps250(28),} & \textbf{-- The time is accurate to 250ps} \\
+\textbf{ns1(29),}   & \textbf{-- The time is accurate to 1ns} \\
+\textbf{ns2p5(30),} & \textbf{-- The time is accurate to 2.5ns} \\
+\textbf{ns10(31),}  & \textbf{-- The time is accurate to 10ns} \\
+ns25(32),   & -- The time is accurate to 25ns \\
+ns100(33),  & -- The time is accurate to 100ns \\
+ns250(34),  & -- The time is accurate to 250ns \\
+us1(35),    & -- The time is accurate to 1us \\
+us2p5(36),  & -- The time is accurate to 2.5us \\
+us10(37),   & -- The time is accurate to 10us \\
+us25(38),   & -- The time is accurate to 25us \\
+us100(39),  & -- The time is accurate to 100us \\
+us250(40),  & -- The time is accurate to 250us \\
+ms1(41),    & -- The time is accurate to 1ms \\
+ms2p5(42),  & -- The time is accurate to 2.5ms \\
+ms10(43),   & -- The time is accurate to 10ms \\
+ms25(44),   & -- The time is accurate to 25ms \\
+ms100(45),  & -- The time is accurate to 100ms \\
+ms250(46),  & -- The time is accurate to 250ms \\
+s1(47),     & -- The time is accurate to 1s \\
+s10(48),    & -- The time is accurate to 10s \\
+s10plus(49) & -- The time is accurate to >10s \\
+\end{tabular}
+%\end{center}
+\end{table}
+}
+\newline
+\vspace{5 mm}
+\textbf{Parent Data Set}  \\
+\vspace{5 mm}
+wrptpGrandmasterWrPortMode OBJECT-TYPE \\
+SYNTAX INTEGER \{ \\
+\tab NON\_WR (0), \\
+\tab WR\_SLAVE (1), \\
+\tab WR\_MASTER(2), \\
+\} \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Determines predefined function of the PTP grandmaster."  \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpParentDataSet 10 \} \\
+\\
+wrptpGrandmasterDeltaTx OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Grandmaster's $\Delta_{tx}$ measured in picoseconds and multiplied by
+ $2^{16}$."  \\
+REFERENCE  \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpParentDataSet 11 \} \\
+\\
+wrptpGrandmasterDeltaRx OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Grandmaster's $\Delta_{rx}$ measured in picoseconds and multiplied by
+ $2^{16}$."  \\
+REFERENCE  \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpParentDataSet 12 \} \\
+\\
+wrptpGrandmasterWrMode OBJECT-TYPE \\
+SYNTAX TruthValue \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"If TRUE, the grandmaster is working in WR mode." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpParentDataSet 13 \} \\
+\vspace{5 mm}
+\textbf{Port Data Set} \\
+\vspace{5 mm}
+
+wrptpPortState OBJECT-TYPE \\
+SYNTAX INTEGER \{ \\
+\tab idle(1), \\
+\tab present(2), \\
+\tab m\_lock(3), \\
+\tab s\_lock(4), \\
+\tab locked(5), \\
+\tab req\_calibration(6), \\
+\tab calibrated(7), \\
+\tab resp\_calib\_req(8), \\
+\tab wr\_link\_on(9) \} \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"White Rabbit State Machine." \\
+REFERENCE \\
+"WR Spec: Clause 6.5.2.1" \\
+DEFVAL \{ idle \} \\
+::= \{ ptpPortDataSet 11 \} \\
+\\
+wrptpWrPortMode OBJECT-TYPE  \\
+SYNTAX INTEGER \{ \\
+\tab NON\_WR (0), \\
+\tab WR\_SLAVE (1), \\ 
+\tab WR\_MASTER(2), \\
+\} \\
+MAX-ACCESS read-only \\ 
+STATUS current \\
+DESCRIPTION \\
+"Determines predefined function of WR port (static)." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 12 \} \\
+\\
+wrptpCalibrated OBJECT-TYPE \\
+SYNTAX TruthValue \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Indicates whether fixed delays of the given port are known." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 13 \} \\
+\\
+wrptpDeltaTx OBJECT-TYPE  \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Port's $\Delta_{tx}$ measured in picoseconds and multiplied by $2^{16}$." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 14 \} \\
+\\
+wrptpDeltaRx OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Port's $\Delta_{rx}$ measured in picoseconds and multiplied by $2^{16}$." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 15 \} \\
+\\
+wrptpCalPeriod OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Calibration period in microseconds." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 16 \} \\
+\\
+wrptpCalPattern OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Medium specific calibration pattern." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 17 \} \\
+\\
+wrptpCalPatternLen OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Number of bits of calPattern to be repeated." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 18 \} \\
+\\
+wrptpWrMode OBJECT-TYPE \\
+SYNTAX TruthValue \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"If TRUE, the port is working in WR mode." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 19 \} \\
+\\
+wrptpWrAlpha OBJECT-TYPE \\
+SYNTAX INTEGER \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"Medium correlation parameter as described in section 3.1.1." \\
+REFERENCE \\
+"WR Spec: Clause 6.2, Table 1" \\
+::= \{ ptpPortDataSet 20 \} \\
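+
+As an illustration of the scaled encoding used by the $\Delta_{tx}$ and
+$\Delta_{rx}$ objects above (a value in picoseconds multiplied by $2^{16}$),
+consider a fixed delay of 1ns. This figure is a hypothetical example value
+chosen only for the arithmetic; it is not taken from the WR Specification.
+\[
+  1\,\mathrm{ns} = 1000\,\mathrm{ps}, \qquad
+  1000 \cdot 2^{16} = 1000 \cdot 65536 = 65\,536\,000 .
+\]
+Conversely, dividing a reported value by $2^{16}$ yields the delay in
+picoseconds, so the 16 least significant bits carry the sub-picosecond
+fraction.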
+
+\section{SyncE}
+
+wrSynceUplink1State OBJECT-TYPE \\
+SYNTAX INTEGER \{ \\
+\tab UNSYNC (0), \\
+\tab PRIMARY (1), \\
+\tab SECONDARY(2) \\
+\} \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"SyncE state of uplink 1."  \\
+REFERENCE \\
+"not available" \\
+::= \{ syncE 1 \} \\
+\\
+
+wrSynceUplink2State OBJECT-TYPE \\
+SYNTAX INTEGER \{ \\
+\tab UNSYNC (0), \\
+\tab PRIMARY (1), \\
+\tab SECONDARY(2) \\
+\} \\
+MAX-ACCESS read-only \\
+STATUS current \\
+DESCRIPTION \\
+"SyncE state of uplink 2."  \\
+REFERENCE \\
+"not available" \\
+::= \{ syncE 2 \} \\
+\\
+
+\section{\HP\ Traffic}
+
+Open question: should \HP\ Bypass be controllable from the Network Management Node?
+
+\section{Control Data Statistics}
+
+
+MIBs for \textbf{Control Data Distribution Monitoring} still need to be defined.
--- a/documents/specifications/robustness/robustness_doc/biblio.tex
+++ b/documents/specifications/robustness/robustness_doc/biblio.tex
+
+\begin{thebibliography}{9}
+
+\bibitem{IEEE1588}
+  IEEE Std 1588-2008
+  \emph{IEEE Standard for a Precision Clock Synchronization Protocol for
+Networked Measurement and Control Systems}.
+  IEEE Instrumentation and Measurement Society, New York,
+  2008,
+  http://ieee1588.nist.gov/.
+
+\bibitem{IEEE8021D}
+  IEEE Std 802.1D-2004
+  \emph{IEEE Standard for Local and metropolitan area networks: Media Access
+  Control (MAC) Bridges}.
+  IEEE Computer Society, New York,
+  2004.
+
+\bibitem{IEEE8021Q}
+  IEEE Std 802.1Q-2005
+  \emph{IEEE Standard for Local and metropolitan area networks: Virtual Bridged
+Local Area Networks}.
+  LAN/MAN Standards Committee, New York,
+  2006.
+
+\bibitem{IEEE8023}
+  IEEE Std 802.3-2008
+  \emph{IEEE Standard for Information technology - Telecommunications and
+information exchange between systems - Local and metropolitan area networks -
+Specific requirements}.
+  IEEE Computer Society, New York,
+  2008.
+
+\bibitem{UplinkFast}
+  CISCO Document ID: 10575
+  \emph{Understanding and Configuring the Cisco UplinkFast Feature}.
+  http://www.cisco.com.
+
+\bibitem{SynchE}
+  ITU-T G.8262/Y.1362
+  \emph{Timing characteristics of a synchronous
+  Ethernet equipment slave clock}.
+  TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU, 
+  07/2010.
+
+\bibitem{WRPTP}
+ Emilio G. Cota, Maciej Lipinski, Tomasz Wlostowski, Erik van der Bij, Javier
+ Serrano,
+  \emph{White Rabbit Specification: Draft for Comments}.
+  CERN, Geneva,
+  09/2010.
+
+\bibitem{FAIR}
+  R. Bar,
+   \emph{The FAIR Accelerator Control System}.
+   Excerpt from the updated FAIR Technical Design Report,
+   Hamburg,
+   2008.
+
+\bibitem{DesigningLSLANs}
+  Kevin Dooley,
+  \emph{Designing Large-Scale LANs}.
+  O'Reilly,
+  2002.
+
+\bibitem{HWpresentation}
+  Tomasz Wlostowski,
+  \emph{White Rabbit HW status}.
+  White Rabbit Developers Meeting, Geneva, CERN,
+  December 2010,
+  http://www.ohwr.org/attachments/404/hw\_pres.odp
+
+\bibitem{PropagationDelay}
+  P.P.M. Jansweijer,
+  H.Z. Peek,
+  \emph{Measuring propagation delay over a 1.25 Gbps
+         bidirectional data link}.
+  National Institute for Subatomic Physics, Amsterdam,
+  May 31, 2010.
+
+\bibitem{FAIRtiming}
+   T. Fleck,
+   R. Bar,
+  \emph{FAIR Accelerator Control System
+        Baseline Technical Report}.
+  DRAFT,
+  Hamburg,
+  2009.
+
+\bibitem{CERNtiming}
+  Reference to be added
+  (current source: Javier and Julian,
+  CERN, Geneva).
+
+\bibitem{ciscoRSTP}
+  Reference to be added
+  (Cisco document, not located at the time of writing).
+
+\bibitem{FAIRtimingSystem}
+  T. Fleck, C. Prados, S. Rauch, M. Kreider,
+  \emph{FAIR Timing System}.
+  GSI,
+  v1.2,
+  12.05.2009.
+
+\bibitem{The All-New Switch Book: The Complete Guide to LAN Switching Technology}
+ Rich Seifert, James Edwards,
+ \emph{The All-New Switch Book: The Complete Guide to LAN Switching Technology}.
+ Wiley Publishing, Inc.
+
+\bibitem{IEEE8021Qbb} 
+ \emph{IEEE 802.1Qbb/D2.3. Draft Standard for Local and Metropolitan Area
+      Networks - Virtual Bridged Local Area Networks - Amendment XX:
+Priority-based Flow Control.}  
+   June 9,
+  2009.
+
+\bibitem{atm_traffic} 
+ \emph{Network Testing Solutions, ATM Traffic Management White paper}
+
+\bibitem{FlowControllers} 
+ \emph{... missing citation...}
+
+\bibitem{reed_solomon}
+  J. Peltotalo,
+  S. Peltotalo,
+  \emph{RFC 5510: Reed-Solomon Forward Error Correction (FEC) Schemes}.
+  Tampere University of Technology,
+  April 2009.
+
+\bibitem{reed_solomon_theory}
+  James Westall,
+  James Martin,
+  \emph{An Introduction to Galois Fields and Reed-Solomon Coding}.
+  School of Computing, Clemson University,
+  October 4, 2010.
+
+\bibitem{hamming_Codes}
+  Charles B. Cameron,
+  \emph{Hamming Codes}.
+  Electrical Engineering Department, United States Naval Academy,
+  April 19, 2005.
+
+\bibitem{TomekMSc}
+  Tomasz Wlostowski,
+  \emph{Precise time and frequency transfer in a White Rabbit network, MSc
+Thesis}.
+  Warsaw University of Technology,
+  to be published.
+
+\bibitem{WRdemo}
+  Tomasz Wlostowski,
+  Maciej Lipinski,
+  \emph{White Rabbit DEMO(2)}.
+  CERN, Geneva,
+  11/2010.
+
+\bibitem{FaultTree}
+ \emph{Reliability Workbench, FaultTree}.
+  www.Isograph.com.
+
+\end{thebibliography}
--- a/documents/specifications/robustness/robustness_doc/chap1.tex
+++ b/documents/specifications/robustness/robustness_doc/chap1.tex
+\chapter{Introduction}
+\label{introduction}
+
+\section{What is White Rabbit?} 
+
+White Rabbit is intended to be the next-generation deterministic network based
+on synchronous Ethernet, allowing for low-latency deterministic packet routing
+and transparent, high-precision timing transmission. The network consists of
+White Rabbit Nodes and White Rabbit Switches; nodes and/or switches that are
+not White Rabbit can also be integrated, although with restrictions.
+
+
+Resilience and robustness are key features of any fieldbus, especially of
+Ethernet-based fieldbuses for safety-critical systems such as White Rabbit.
+The reliability of WR rests on the deterministic delivery
+of Ethernet frames through a switching network and on the synchronization of the
+network devices. In order to provide a service with a low error rate, it is necessary
+to propose methods and techniques to overcome problems caused by the
+imperfection of the physical medium, dropped packets in WR switches and
+breakdowns of the network devices.
+
+
+A White Rabbit Network, consisting of White Rabbit Switches (WRS) connected
+by fibre or copper, is meant to transport information between White Rabbit
+Nodes (WRN). In this document three types of information transported over White
+Rabbit Network are distinguished:
+
+\begin{itemize}
+  \item  Timing Information - includes frequency and Coordinated Universal
+  Time (UTC); it is sent from the Timing Master to White Rabbit Switches and Nodes.
+  \item  Control Data - includes \ControlMessage s (\CM); it is broadcast from
+the Data Master to White Rabbit Nodes. 
+  \item  Standard Data - all other Ethernet traffic sent between nodes and
+  switches.
+\end{itemize}
+Timing Information and Control Data are considered to be critical. 
+The types of information are closely related to their source. A Timing Master is
+a White Rabbit Switch or Node which is connected to a GPS receiver. A Data Master
+is a White Rabbit Node which is responsible for \ControlMessage\ distribution.
+
+
+The main component of a White Rabbit Network is the White Rabbit Switch.
+It is a Layer~2 network bridge which supports Synchronous Ethernet
+(SyncE) \cite{SynchE} and implements the White Rabbit Protocol (WRPTP) \cite{WRPTP},
+which is an extension of the PTP standard (IEEE1588 \cite{IEEE1588}). WRPTP, along
+with SyncE, makes it possible to distribute a common notion of time and frequency over the
+entire White Rabbit Network with sub-nanosecond precision. 
+
+\section{WR Network Requirements} 
+
+The requirements for WR have been defined by GSI and CERN, since the
+Control Systems of their accelerators are going to be based on WR. A Control
+System for an accelerator facility requires high reliability, high precision and
+determinism. 
+
+Unreliability is expressed as the number of \ControlMessage s not delivered
+to one or more designated nodes within one year. 
+
+In terms of reliability, at most one \ControlMessage\ per year is expected to
+be lost on the way from the WR Data Master to the receiving WR Nodes. 
+
+Determinism in this chapter is understood as delivery of \ControlMessage\ to all
+designated nodes within the required time regardless of the node's location in
+the network. If the \ControlMessage\ delivery time is exceeded, the message is
+considered undelivered (lost).
+
+At GSI, White Rabbit is foreseen to control the new FAIR accelerator complex. The
+requirements for the new FAIR machines \cite{FAIRtiming} state that a Control
+Message should be delivered from the Data Master Node to all receiving devices
+(Nodes) within 100$\mu s$, given a maximum distance between the Data Master and a
+Node of 2km \cite{FAIRtimingSystem}.
+
+At CERN, the required time to deliver the message is 1ms, but the distance 
+between the Data Master and a Node is substantially larger (max of 10km)
+\cite{CERNtiming}.
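+
+As a rough plausibility check of these budgets, the share consumed by signal
+propagation alone can be estimated. The sketch below assumes a fibre group
+index of $n \approx 1.47$, a typical value for silica fibre that is assumed
+here and not taken from the cited requirements; $t_{prop}$, $L$ and $n$ are
+symbols introduced only for this example, and $c \approx 2.998 \cdot 10^{8}$m/s
+is the speed of light in vacuum.
+\[
+  t_{prop} = \frac{L \cdot n}{c}
+\]
+\[
+  t_{prop}(2\,\mathrm{km}) \approx \frac{2000 \cdot 1.47}{2.998 \cdot 10^{8}}\,\mathrm{s}
+  \approx 9.8\,\mu s ,
+  \qquad
+  t_{prop}(10\,\mathrm{km}) \approx \frac{10000 \cdot 1.47}{2.998 \cdot 10^{8}}\,\mathrm{s}
+  \approx 49\,\mu s .
+\]
+Under this assumption, propagation takes roughly 10\% of the 100$\mu s$ budget
+at GSI and roughly 5\% of the 1ms budget at CERN; the remainder is left for
+all other delay contributions.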
+
+Knowing how often a \ControlMessage\ is sent (every 100$\mu s$ at GSI and 
+1000$\mu s$ at CERN), we can calculate the maximum acceptable failure rate
+of the WR Network ($\lambda_{WRN_{max}}$):
+
+\begin{equation}
+ \label{eq:failureProbability}
+	\lambda_{WRN_{max}} = \frac{number\_of\_\ControlMessage
+s\_lost\_per\_year}{number\_of\_\ControlMessage s\_sent\_per\_year}
+\end{equation}
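+
+As a worked example, assume that exactly one \ControlMessage\ is lost per year,
+as required above, and that a year has 365 days, i.e.
+$365 \cdot 24 \cdot 3600 = 31\,536\,000$ seconds (the 365-day year is an
+assumption made here). $N_{GSI}$ and $N_{CERN}$ are shorthand, introduced only
+for this example, for the number of \ControlMessage s sent per year.
+\[
+  N_{GSI} = \frac{31\,536\,000\,\mathrm{s}}{100 \cdot 10^{-6}\,\mathrm{s}}
+          = 3.1536 \cdot 10^{11},
+  \qquad
+  \lambda_{WRN_{max}} = \frac{1}{N_{GSI}} \approx 3.171 \cdot 10^{-12}
+\]
+\[
+  N_{CERN} = \frac{31\,536\,000\,\mathrm{s}}{1000 \cdot 10^{-6}\,\mathrm{s}}
+           = 3.1536 \cdot 10^{10},
+  \qquad
+  \lambda_{WRN_{max}} = \frac{1}{N_{CERN}} \approx 3.171 \cdot 10^{-11}
+\]
+These values agree with the failure rates listed in
+Table~\ref{tab:requirements}.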
+
+
+The requirements concerning \ControlMessage\ size differ between GSI and CERN
+as well. While at the GSI facility a \ControlMessage\ of 1500 bytes (maximum
+Ethernet Frame payload) is more than sufficient, for the CERN facility it is an
+acceptable value, but the expectations are higher, e.g. 4800 bytes (see
+Appendix~\ref{appB}). Table~\ref{tab:requirements} summarizes the requirements
+of GSI and CERN.
+
+\begin{table}[ht]
+\caption{GSI's and CERN's requirements summary.} 
+\centering
+	\begin{tabular}{| c | c | c |}          \hline
+\textbf{Requirement name}& \multicolumn{2}{|c|}{\textbf{Value(s)}}  \\
+                         & GSI              & CERN          \\ \hline
+\GranularityWindow       & 100$\mu s$       & 1000$\mu s$   \\ \hline
+max Failure rate ($\lambda_{WRN_{max}}$) & $3.170979198 \times 10^{-12}$ &
+$3.170979198 \times 10^{-11}$  \\ \hline
+%min Reliability ($R_{WRN}$)& 0.999 999 999 997
+% & 0.999 999 999 968\\ \hline
+Maximum Link Length      & 2km              & 10km          \\ \hline
+\ControlMessage\ Size    & 300-1500 bytes   & 1200 - 5000 bytes     \\ \hline
+Synchronization accuracy & probably 8ns   & most nodes 1$\mu s$ \\
+                         &                & few nodes $\sim$2ns \\
+\hline
+
+\end{tabular}
+\label{tab:requirements}
+\end{table}
+
+\section{The Goals of This Document} 
+
+This document introduces methods and techniques which increase the
+reliability, robustness and determinism of a White Rabbit Network so that the
+requirements listed in the previous section are met. It also introduces
+techniques to guard the safety of the network and to monitor and 
+diagnose it. However, the presented solutions are considered optional. Their usage
+should be considered on an individual basis, according to the actual needs of 
+a particular use case.
+
+
+
+Determinism, as understood in this document, can be achieved by a precise
+knowledge of maximum delays introduced by each component of the network and 
+optimisation of the delay to meet the requirements. 
+
+\textbf{Chapter 2} discusses determinism of White Rabbit Network. In particular
+it focuses on the \ControlMessage\ maximum delivery time from the Data Master
+to WR Nodes. 
+
+Unreliability of the system is caused by physical imperfection of the network's
+components, in particular by data corruption on the physical links and failure
+of network components. Reliability of data distribution in a network can be
+increased by redundancy of data (special encoding) and of components
+(switches, links). In terms of redundancy, two types of information
+distributed over the White Rabbit Network are considered separately: Timing
+Information and Data (Control and Standard). 
+
+Timing Information is conveyed from a Timing Master to all nodes. Data is
+sent from any node (Data Master, in case of Control Data) to one or many nodes.
+Timing Information and Data are distributed along so-called Paths. A Path is
+understood as the cables and switches through which information travels from the
+sender to the receiver. Consequently, there are two types of
+paths:
+\begin{itemize}
+  \item Clock Path - path from Timing Master to the nodes.
+  \item Data Path - path from sending node to receiving node(s).
+\end{itemize}
+A layout of all possible Paths in a network is called topology. Therefore we
+define in White Rabbit Network:
+
+\begin{itemize}
+  \item Clock Path Topology - Layout of all possible Clock Paths in the
+network.
+  \item Data Path Topology - Layout of all possible Data Paths in the
+network.
+\end{itemize}
+
+One of the unique features of a White Rabbit Network is the precise synchronization
+and syntonization of all the White Rabbit network components. Failure to deliver
+Timing Information to any White Rabbit component, or deterioration of its quality
+(instability), might have severe consequences and render the network unreliable.
+\textbf{Chapter 3} describes how the quality of Timing Information is
+guarded, how its reliability is increased by introducing redundancy of the Clock Path and
+how its stability is ensured during switch-over between redundant Clock Paths.
+
+\textbf{Chapter 4} describes techniques to achieve the required reliability of
+Control Data delivery and to substantially increase the reliability of Standard Data.
+This is done by introducing redundancy of network components (topology) and
+redundancy of data (Forward Error Correction, FEC). Since congestion of network
+traffic is undesired and dangerous for reliable message delivery, this chapter
+deals with flow control and congestion problems as well. 
+
+\textbf{Chapter 5} focuses on techniques which allow for WR Network
+monitoring as well as fast and efficient diagnostics of the network.
+
--- a/documents/specifications/robustness/robustness_doc/chap2.tex
+++ b/documents/specifications/robustness/robustness_doc/chap2.tex
--- a/documents/specifications/robustness/robustness_doc/chap3.tex
+++ b/documents/specifications/robustness/robustness_doc/chap3.tex
+\chapter{Clock Path Resilience} 
+
+Clock Path Resilience translates into continuous and stable syntonization and
+synchronization of all the WR devices in the entire WR Network. This results in a very
+accurate common notion of UTC in all the devices. White Rabbit has been shown to
+achieve sub-nanosecond accuracy over a single fibre of 10km and is expected to
+achieve an accuracy of 30ns over copper \cite{TomekMSc}. The requirements of
+CERN are in the range of 1$\mu s$ for most of the nodes, but a few need an
+accuracy of 2ns. 
+
+A loss of UTC in a WR Node can be caused by a link or switch failure, i.e. a break of the
+clock path between the WR Timing Master and a WR Node. In order to prevent
+such a situation, redundancy of WR devices is introduced, ensuring redundant clock
+paths. However, switch-over might cause UTC instability. It is
+important to minimize (ideally eliminate) the instability of UTC caused by switch-over
+between redundant clock paths to avoid accuracy deterioration. The stability of
+UTC is guarded in a WR network by taking countermeasures against the following
+phenomena:
+\begin{itemize}
+  \item variable external conditions, e.g. variation of temperature,
+  \item temporary instability of frequency during switch-over,
+  \item loss of Ethernet frames with timing information.
+\end{itemize}
+  
+\section{Clock Distribution in WR}
+
+Timing Information is transmitted in White Rabbit Network over Clock Path by
+the means of:
+\begin{itemize}
+  \item Synchronous Ethernet (SyncE,\cite{SynchE}) - Physical Layer of OSI
+Model.
+  \item Precision Time Protocol \cite{IEEE1588} extended for White Rabbit
+(WRPTP,\cite{WRPTP}) - Application Layer of OSI Model
+\end{itemize}
+While WRPTP uses Ethernet frames to distribute common notion of time, SyncE
+uses the physical layer interface to distribute common notion of frequency. This
+fact imposes the following restrictions on the Clock Path:
+\begin{itemize}
+  \item A White Rabbit Switch needs to have at least one of its uplinks
+connected to a downlink of another White Rabbit Switch (except the Timing
+Master)
+  \item Timing Information (i.e. UTC notion) is sent in one direction only : 
+    \begin{itemize}
+      \item Network-wise : from Timing Master to the nodes.
+      \item Switch-wise: from active uplink to all downlinks.
+    \end{itemize}
+\end{itemize}
+
+The White Rabbit Switch is designed to support Timing Path redundancy. Each switch
+has two uplinks\footnote{WR Switch V3 has the hardware capability to support a
+greater number of uplinks.} which can be connected to sources of Timing
+Information, i.e. downlinks of another WR Switch or WR Node. A WR Switch (Node)
+being the source of Timing Information is called a WR Timing Master Switch (Node).
+A WR Switch (Node) receiving Timing Information is called a WR Timing Slave Switch
+(Node). A WR Switch can be Timing Slave and Timing Master at the same time. As
+mentioned before, a WR Timing Slave Switch can be connected by up to two links
+to up to two WR Timing Master Switches.
+
+\begin{center}
+	\includegraphics[scale=0.30]{../../../../figures/robustness/timePaths.ps}
+	\captionof{figure}{Possible Timing Paths between WR Switches}
+	\label{fig:timePaths}
+\end{center}
+
+Figure~\ref{fig:timePaths} depicts possible connections of a WR Switch. Clock
+Path redundancy can include redundant link and switch. This happens when each
+uplink of WR Timing Slave Switch is connected to independent WR Timing Master
+Switch (Figure~\ref{fig:timePaths}, a). In such case we assume that independent
+sources of Timing Information are synchronized with sub-nanosecond precision 
+(i.e. receive the same frequency and time from GPS). It is also possible to
+introduce only link redundancy as in Figure~\ref{fig:timePaths}, b). Since
+redundant Timing Path is optional, the
+White Rabbit network will work normally without redundancy
+(Figure~\ref{fig:timePaths}, c).
+
+
+\begin{center}
+	\includegraphics[scale=0.20]{../../../../figures/robustness/layer1redundancy.ps}
+	\captionof{figure}{Clock Path Redundancy}
+	\label{fig:clockRedundancy}
+\end{center}
+
+\section{Clock Path Switch-over}
+
+As shown in Figure~\ref{fig:clockRedundancy}, uplinks retrieve the frequency
+sent over a link by Timing Master Switches using the physical layer (SyncE). The
+delay and time offset are measured by WRPTP. At any given moment, the
+timing and frequency from a single uplink are used for syntonization and
+synchronization of the local clock to the Timing Master's UTC.
+
+Since two separate technologies are used to retrieve UTC, there are two
+possible sources of instability during clock path switch-over: SyncE and WRPTP.
+
+\subsubsection{SyncE}
+
+A detailed description of frequency recovery in WR Switch (i.e. description of
+Helper PLL and Main PLL) can be found in \cite{TomekMSc}. The most important
+feature of the implementation is the fact that at any given time, phase is
+measured and compensated on all the uplinks simultaneously. As a result, in
+theory, the switch-over between redundant links should be unnoticeable
+frequency-wise introducing no accuracy deterioration. This, however, needs to be
+proved by extensive tests.
+
+\subsubsection{WRPTP} 
+
+In principle, the values of offset and delay are measured by WRPTP on all
+uplinks at all times. The values from one arbitrarily chosen uplink, called the
+Primary Link, are used to synchronize the local clock, but the values from the
+backup uplink(s) are always ready. The Primary Link should be the same
+for SyncE and WRPTP. If a failure of the Primary Link is detected, the values of
+offset and delay available for the Secondary Link are used. Therefore, the
+switch-over WRPTP-wise is considered seamless; however, tests must be conducted
+to confirm this.
+
+The choice of the Primary Link is arbitrary. As soon as it is detected that the
+Primary Link is down, the Secondary Link becomes Primary.
+
+
+\section{Variable external conditions vs. stability}
+
+The stability of UTC in WR Timing Slaves is mainly endangered by variation of
+temperature, which causes changes of the signal propagation speed in the physical medium. 
+The propagation delay is measured using WRPTP, which updates the values of delay
+and offset with each PTP message exchange. The responsiveness of the system to
+temperature variation can be controlled with the frequency of PTP message exchange.
+Since the rate of temperature change is, in normal circumstances, low (a few
+degrees per hour) and the frequency of PTP message exchange is much higher, temperature
+variation is not expected to deteriorate UTC accuracy. A test of propagation delay
+variation is described in \cite{TomekMSc} and shown in \cite{WRdemo}. 
+
+\section{Loss of Ethernet frames with timing information}
+
+PTP is designed to tolerate loss of PTP-specific messages on the
+communication channel. This is done through timeouts. If an operation (e.g.
+delay and offset measurement) is disrupted due to PTP message loss, the
+operation is repeated after a timeout interval has elapsed, i.e. the message is
+re-sent.
+
+The White Rabbit extension to PTP (WRPTP) employs the same strategy during the
+WR-specific operation of White Rabbit Link Setup (see \cite{WRPTP}). In the case
+of message loss, the operation is repeated and the lost message is re-sent (up to a
+limited number of times). Additionally, WRPTP is much more tolerant of the loss of
+multiple messages exchanged to measure delay and offset. Unlike in
+standard PTP, these measurements are used only for synchronization
+(syntonization is done through SyncE). Therefore, once synchronization is
+achieved (delay and offset measured) at the beginning of the connection (e.g.
+after plugging in the physical link), the values change almost solely due
+to external conditions. The rate of measurements (exchange of messages) is
+supposed to be much higher than the rate of change of physical parameters caused by
+external conditions. Therefore, even a loss of a few consecutive PTP messages
+should have no influence on the WRPTP performance.
--- a/documents/specifications/robustness/robustness_doc/chap4.tex
+++ b/documents/specifications/robustness/robustness_doc/chap4.tex
--- a/documents/specifications/robustness/robustness_doc/chap5.tex
+++ b/documents/specifications/robustness/robustness_doc/chap5.tex
--- a/documents/specifications/robustness/robustness_doc/chap6.tex
+++ b/documents/specifications/robustness/robustness_doc/chap6.tex
+%\chapter{Flow and Congestion Control}
+\section{Flow and Congestion Control}
+\label{chap:flow_congestion}
+
+As a part of a reliable network implementation, Flow Control and Congestion
+Control are responsible for ensuring that data is transmitted at a rate compatible
+with the capacities of both receivers and switches. Flow Control aims at
+preventing congestion in the network, while Congestion Control provides the
+mechanism to overcome congestion.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+\subsection{Flow Control}
+Flow Control provides a mechanism for the receiver to control the transmission, so that
+the receiving node is not overwhelmed with data from the transmitting node
+\cite{atm_traffic}. 
+
+\vspace{10 mm}
+
+\subsubsection{White Rabbit Flow Control}
+
+Since WR distinguishes two types of traffic (\HP\ and \SP), and the most
+important traffic is \HP, which is treated in a special way, two
+different levels of flow control are needed. The configuration of flow control
+is gathered in the Flow Control Policy.
+
+In a White Rabbit network, \HP\ Traffic will flow from the Data Master Node to all
+White Rabbit Nodes. The Data Master Node is the only node that can send \HighPriority\ 
+frames\footnote{Recommended configuration.}. 
+
+There are two situations regarding the flow of \HP\ Traffic that could indicate
+a malfunction or wrong configuration of a Node and cause congestion in the
+network:
+\begin{itemize}
+    \item the Data Master sends more frames than it should,
+    \item a non-Data Master Node sends \HP\ frames.
+\end{itemize}
+Therefore a simple but effective Flow Control mechanism is proposed for these
+situations. In the first case, the Data Master shall be notified so that it can
+take appropriate action to restore the proper \HP\ Package sending rate.
+In the second case, destructive consequences for \HP\ Traffic are prevented
+by blocking \HP\ Traffic on all ports connected to non-Data Master Nodes.
+Appendix~\ref{flow_control} presents a proposal for the Flow Control of
+\HP\ traffic.
+ 
+For \SP\ Traffic, the Ethernet Flow Control described in the IEEE 802.3
+standard \cite{IEEE8023} is used. The downside of this scheme is the lack of CoS
+criteria: all CoS priorities are treated equally. The authors of
+this document will follow the development of the IEEE 802.1Qbb \cite{IEEE8021Qbb}
+specification, in which the different CoS levels are taken into account, and
+assess the suitability of that standard for WR. 
+
+\vspace{10 mm}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\subsection{Congestion Control}
+
+Congestion control is responsible for the control and regulation of the traffic 
+entering the WR Network. The goal is to avoid saturating or overloading switches
+in the network. The incoming traffic in a switch, $\lambda_{in}$, should be equal to
+the outgoing traffic, $\lambda_{out}$. When $\lambda_{out} < \lambda_{in}$, there
+is a situation of congestion and the symptoms are:
+\begin{itemize}
+	\item Lost packets, buffer overflow,
+	\item Long delays, queueing in buffers
+\end{itemize}
+
+\noindent and it causes:
+
+\begin{itemize}
+	\item Increased delay,
+	\item Packet loss.
+\end{itemize}
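+
+The link between congestion and packet loss can be sketched with a simple
+fluid model. The model, the free buffer capacity $B$ and the overflow time
+$t_{overflow}$ are assumptions introduced here for illustration and are not
+part of the WR design. If the incoming rate exceeds the outgoing rate, the
+backlog grows at a rate $\lambda_{in} - \lambda_{out}$ and the buffer is full
+after
+\[
+  t_{overflow} = \frac{B}{\lambda_{in} - \lambda_{out}},
+  \qquad \lambda_{in} > \lambda_{out} .
+\]
+For example, with hypothetical values of $B = 1$Mbit,
+$\lambda_{in} = 1$Gbit/s and $\lambda_{out} = 0.9$Gbit/s, the buffer overflows
+after $10^{6} / 10^{8}\,\mathrm{s} = 10$ms; queueing delay grows until that
+point and frames are dropped afterwards.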
+
+%\subsubsection{WR Explicit Congestion Signalling for \HP}
+\paragraph{WR Explicit Congestion Signalling for \HP}
+
+Among the different schemes for Congestion Control, Explicit Congestion 
+Signalling is the scheme that fulfils the responsiveness and reliability
+that \HP\ requires, since the scheme avoids congestion and consequently the loss
+of frames due to buffer overflow.
+
+The aim of explicit signalling is to stop a device from sending traffic in order to
+avoid congestion. 
+
+
--- a/documents/specifications/robustness/robustness_doc/chap7.tex
+++ b/documents/specifications/robustness/robustness_doc/chap7.tex
+\chapter{Network Monitoring and Diagnostic}
+\label{chapter:monitoring}
+So far we have presented explicit mechanisms and techniques that provide
+reliability to the WR Network, allowing it to tolerate (to some extent) component failures and
+data corruption. Although these are the best mechanisms to guarantee continuity
+of message delivery, Network Monitoring and Diagnostic can provide early
+detection of future malfunctions, decreasing the number of failures.
+A White Rabbit network provides special features, i.e. precise time
+synchronization, deterministic traffic and high reliability of package
+delivery, which need to be carefully monitored. Of course, common parameters 
+of standard Ethernet networks are also important to monitor in order to
+obtain a full picture of the WR Network performance. This chapter describes the
+Monitoring and Diagnostic strategies used in the WR Network.
+
+
+\section{WR-specific Diagnostics}
+
+A White Rabbit Network is designed to meet very demanding requirements in terms
+of reliability and determinism of Critical Data delivery. In such a network it is
+vital that:
+\begin{itemize}
+	\item a network failure (i.e. not meeting the requirements) can be
+precisely diagnosed so that the cause of the failure can be fixed immediately,
+	\item any suspicious behaviour of the network which might create a
+potential problem can be detected early and precisely located.
+\end{itemize}
+Therefore, it is important to monitor the WR-specific network characteristics. 
+In particular, it is important to know the
+precise performance of:
+\begin{itemize}
+	\item \textbf{Timing Data distribution}: UTC clock stability (WR
+	      PTP), frequency distribution (SyncE),
+	\item \textbf{Control Information distribution}: lost \HP\ Packages and
+	      lost \ControlMessage s.
+\end{itemize}
+
+
+\subsection{Timing Data Distribution Monitoring}
+
+As defined in the IEEE 802.3 standard \cite{IEEE8023}, the loss of three
+consecutive symbols on an uplink is interpreted as link-down. It is also
+possible to compare the phase retrieved from all uplinks and detect instability of the
+frequency retrieved on an uplink; in such a case the stable uplink will be chosen.
+This means that monitoring of frequency distribution is limited to indicating
+whether recovery of frequency is working on a given uplink or not. 
+
+
+WR PTP offers more performance-related parameters than the IEEE 1588
+standard \cite{IEEE1588}, which defines the following performance monitoring features:
+\begin{itemize}
+    \item status,
+    \item observed parent offset variance,
+    \item observed parent clock phase change rate.
+\end{itemize}
+
+White Rabbit will add:
+\begin{itemize}
+    \item link asymmetry,
+    \item port type (uplink/downlink) and mode (WR or non-WR mode),
+    \item Rx/Tx delay,
+    \item link length,
+    \item observed delays variance.
+\end{itemize}
+
+
+\subsection{Control Data Distribution Monitoring}
+\label{chap:CTRLdataMonitoring}
+The reception of each \ControlMessage\ by all the WR Receiving Nodes is crucial
+for the WR Network. Therefore, the Data Master shall provide each \ControlMessage\ with a
+unique ID number. This enables the Receiving Nodes to detect that a
+\ControlMessage\ has not been delivered. 
+
+
+Each \ControlMessage\ is FEC-encoded into a number of \HP\ Packages. FEC makes it possible
+to retrieve a \ControlMessage\ even if one of the \HP\ Packages is lost. However,
+the fact that an \HP\ Package was lost might indicate a malfunctioning component
+of the network and more severe problems to come. This is why the FEC encoder
+shall provide each \HP\ Package with a unique ID number. This ID number shall
+consist of:
+\begin{itemize}
+    \item \ControlMessage\ ID,
+    \item ID of \HP\ Package unique within a single \ControlMessage.
+\end{itemize}
+
+The redundancy of the \ControlMessage\ ID is intentional. It shall be present in both the
+\ControlMessage\ header and the \HP\ Package header (added by FEC), as
+Figure~\ref{fig:fec_header} shows. This allows the delay of individual \HP\
+Packages to be measured precisely and the delay of a \ControlMessage\ to be easily calculated at different
+points of the network (the timestamp of receiving the last \HP\ Package minus the timestamp
+of sending the first \HP\ Package).
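+
+The \ControlMessage\ delay described above can be sketched as a formula. The
+symbols $d_{\mathrm{CM}}$, $t^{\mathrm{tx}}_{\mathrm{first}}$ and
+$t^{\mathrm{rx}}_{\mathrm{last}}$ are notation introduced here for illustration
+only; they are not defined elsewhere in this document. Expressed as a
+non-negative quantity, the delay observed at a given point of the network
+would be
+\[
+  d_{\mathrm{CM}} = t^{\mathrm{rx}}_{\mathrm{last}} - t^{\mathrm{tx}}_{\mathrm{first}},
+\]
+where $t^{\mathrm{tx}}_{\mathrm{first}}$ is the timestamp of sending the first
+\HP\ Package of the \ControlMessage\ and $t^{\mathrm{rx}}_{\mathrm{last}}$ is the
+timestamp of receiving its last \HP\ Package at that point. Both timestamps are
+assumed to refer to the same \ControlMessage\ ID.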
+
+Each WR Switch shall verify the \HP\ Packages' ID sequence and identify any
+unusual behaviour, i.e.:
+\begin{itemize}
+    \item lost packages,
+    \item wrong ID sequence.
+\end{itemize}
+If a fault is detected by a WR Switch, the Management Node shall
+be notified. This makes it possible to precisely locate the cause of a problem, e.g.\ a
+malfunctioning port or link.
+
+As a consequence, the following functionality shall be provided by the WR Network:
+\begin{itemize}
+    \item A WR Management Node shall be able to gather information about the
+	  timestamps of a given \ControlMessage\ (represented by its ID) from all the
+	  switches and nodes. Such monitoring might be conducted on demand
+	  and/or periodically (polling).
+    \item A WR Management Node shall be notified by a WR Switch/Node if an \HP\
+	  Package or \ControlMessage\ sequence error is detected.
+\end{itemize}
+
+
+\begin{center}
+	\includegraphics[scale=0.35]{../../../../figures/robustness/delayMonitoring.ps}
+	\captionof{figure}{\ControlMessage\ and \HP\ Package delivery delay
+monitoring.}
+	\label{fig:pathDelayMonitoring}
+\end{center}
+ 
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+\section{Flow Monitoring}
+
+Flow monitoring is a scalable technique for measuring network traffic and for
+collecting, storing, and analysing traffic data. As explained in
+Chapter~\ref{chapter:cos}, traffic with different priorities and functionalities
+will flow within the White Rabbit Network. Therefore, it is of vital importance to
+detect, diagnose and fix network problems, especially for the \HighPriority\
+Traffic \cite{FlowControllers}.
+
+Monitoring traffic flows on the interfaces of the WR Switches
+provides visibility into how the network is performing, replacing guesswork,
+and supports:
+
+\begin{itemize}
+	\item \textbf{Troubleshooting} Network problems are often first
+detectable in abnormal traffic. A flow monitor makes these abnormal traffic
+patterns visible to enable rapid identification, diagnosis, and correction.
+
+	\item \textbf{Controlling Congestion} By monitoring the traffic on the
+ports, congested links can be identified and communicated to the Congestion
+Control.
+
+	\item \textbf{Routing Profiling} A traffic profile of a network can 
+help to understand the bottlenecks and hotspots in the network. 
+
+\end{itemize}
+
+\vspace{10 mm}
+
+A Flow Monitor is based on packet counters with a statistical sampling of the
+state of traffic. The sampled information in the switches is immediately sent to
+a central collector for analysis. Either the WR Nodes or the Switches will be
+endowed with the sFlow Monitor. Appendix~\ref{appSFlow} presents the
+main characteristics of the monitor.
+
+The sFlow will measure the following parameters of the traffic between
+network devices:
+
+\noindent Per-Link:
+		\begin{itemize}	
+			\item number of packets,
+			\item number of bytes,
+			\item packets discarded,
+			\item flow or burst of packets,
+			\item packets per flow.
+		\end{itemize}
+
+\noindent It will also perform End-to-End Measurements of:
+		\begin{itemize}	
+			\item path delay  
+			\item ....
+			\item ....
+		\end{itemize} 
+
+\noindent The combination of both measurements provides a global picture of the
+network.
+
+
+\vspace{10 mm}
+
+\noindent sFlow shall perform:
+
+\begin{itemize}
+	\item Active Measurement - injecting network traffic and studying the
+reaction to that traffic,
+	\item Passive Measurement - monitoring the existing traffic for measurement.
+\end{itemize}
+
+\vspace{10 mm}
+
+\noindent The sFlow will be configured to achieve:
+\begin{itemize}
+
+	\item Reaction Time of ... 
+	\item Sampling...
+	
+\end{itemize}
+
+
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\subsection{Architecture}
+
+
+Figure~\ref{fig:archi} shows how sFlow, the Flow Control Policy and the
+Congestion Control work in every device of a WR network.
+
+The Management Node houses the sFlow Collector and the database (DDBB) where it will
+store the gathered statistics. The Flow Control Policy will be defined and
+distributed from this node as well. Like any other networking device, the node
+is also endowed with the Congestion Control mechanism.
+
+As explained, the sFlow agent will monitor the traffic in the switch and will
+propagate these statistics to the MMN, but it will also watch over the Flow
+Control Policy. If sFlow reports traffic that does not respect the
+policy, the Congestion Control mechanism defined for the Critical Broadcast and
+the Non-Critical traffic will carry out the actions explained in this chapter.
+
+\begin{center}
+
+        \includegraphics[scale=0.60
+]{../../../../figures/robustness/architecture_management_flow_congestion_control.ps}
+        \captionof{figure}{Architecture of Flow Monitoring, Congestion and Flow
+Control}
+		\label{fig:archi}
+\end{center}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+
--- a/documents/specifications/robustness/robustness_doc/chap8.tex
+++ b/documents/specifications/robustness/robustness_doc/chap8.tex
+\chapter{Summary}
+
+
+This document helps to understand issues related to determinism and robustness
+in a White Rabbit Network. The final system's performance is a result of
+connecting a number of existing technologies/techniques/standards, extending
+them and providing hardware support. These are depicted in
+Figure~\ref{fig:osiLayers} with reference to the OSI Model. 
+
+
+\begin{center}
+	\includegraphics[scale=0.35]{../../../../figures/robustness/osiLayers.ps}
+	\captionof{figure}{Placing methods used in WR according to OSI Layers.}
+	\label{fig:osiLayers}
+\end{center}
+
+The topics which this document shall bring under discussion are:
+\begin{itemize}
+    \item We might need to consider dropping frames when an \HP\ Package arrives in
+order to decrease \ControlMessage\ jitter. The influence of such a solution
+on \SP\ Traffic's throughput needs to be tested.
+    \item GSI's requirement of 100~$\mu$s should be more thoroughly justified, as
+it requires extensive effort to achieve (e.g.\ \HP\ Bypass).
+    \item The estimations of reliability presented in
+Chapter~\ref{WRnetworkTopologyExamples} indicate that it is required to
+provide triple or higher redundancy of the network in order to meet the
+reliability requirement. Therefore:
+   \begin{itemize}
+      \item the implementation of $N > 2$ uplink ports in the V3 Switch is desired,
+      \item thorough calculations of reliability for various topologies need to
+	    be conducted.
+   \end{itemize}
+ \item Calculation of the overall WR Network
+reliability turned out to be
+much harder than anticipated. The current estimations need to be verified and
+more precise calculations provided in further versions of the document.
+\end{itemize}
+
+
+
--- a/documents/specifications/robustness/robustness_doc/new_index
+++ b/documents/specifications/robustness/robustness_doc/new_index
+1.  Introduction (ML):
+    - explanation of WR
+    - introduce different kinds of info: Timing Info, Control Data, Standard Data
+    - we need to point out that increasing robustness in WR is optional: it
+will work just as well with no redundancy (as long as all components work);
+intention: not to scare potential clients
+1.1 WR Network Requirements (regarding Robustness and Determinism)
+    - Control Messages :
+       * one message lost in one year
+       * small GW
+    - Timing Info:
+       * received by all Nodes
+       * if not received by a node, there is no sense in it receiving Control Messages
+       * reliability of Timing must be greater than or equal to that of Control Messages
+1.2 Reliability: MTBF
+    - we need to introduce how we are going to "measure/estimate" reliability
+    - short introduction of Mean Time Between Failures
+
+2. Physical Medium and BER (Layer 1)[CP]
+   
+3 Forward Error Correction (upper layer)
+3.1. Brief introduction (overview) of the concepts used with loads of references
+3.2. FEC in WR  
+
+4. QoS and Traffic Prioritization (CP)
+   - say that Control Messages are to be broadcast priority 7
+   - say that non-Control Messages are called SP
+
+5 Jitter, Determinism (ML)
+   - estimate normal routing time, it's not enough to meet requirements
+   - introduce the idea of "bypass" for broadcast priority 7, call it HP
+   - say how HP improves things, make estimations regarding the number of
+     hops vs jitter
+   - say that SP traffic can also be deterministic if used with brains,
+estimate GW for SP
+
+6. Redundancy
+   - define redundancy
+   - say that we measure its effectiveness with reliability
+   - introduce terms:
+     * clock path
+     * data path
+
+6.1 Clock Path Redundancy
+6.1.1 Layer 1 (SyncE)
+  - explain how it works
+  - explain restrictions
+6.1.2 Layer 2 (WRPTP)
+  - explain how it works
+6.1.3 Clock Path Topologies
+  - show possible topologies
+6.2 Data Path Redundancy
+6.2.1 Rapid Spanning Tree Protocol (RSTP)
+  - explain how it works
+  - say how it helps
+6.2.2 RSTP in WR
+6.2.2.1 SP Traffic
+  - hardware link down detection
+  - use the idea from Cisco (UplinkFast) - I will investigate if it's possible
+6.2.2.2 HP Traffic
+  - port change in HW
+  - how we want to avoid losing HP Packages (wait for gap between bursts)
+6.2.3 Possible Topologies for WR RSTP
+  - pros/cons
+  - costs
+  - we need to consider many levels of redundancy (reliability), so analysing
+different topologies is perfect
+  - need to show topology without any redundancy and that it works
+  - 
+6.3 Data and Clock Path redundancy strategy
+  - so we now consider Data and Clock Paths together
+  - the ultimate strategy for the Redundancy in WR
+
+7 Network Dimensions
+  - network dimension vs Granularity Window (some nice formula)
+  - Network Dimension vs topology (some nice pic)
+
+8 Diagnostic with White Rabbit Switch
+  - management IP
+    * option to have HP traffic or not to have
+    * option to have FEC in SP or not
+    * etc
+  - giving seq_numbers to HP Packages to check number of lost ones
+  - giving seq_numbers to Control Messages to check number of lost ones
+  - measuring latency in switches ???? possible, maybe optional (forward
+messages to control port, have dummy packets to measure latency...)
+   
+x Flow Control
+x.1 Flow control in Layer 1
+x.2 Flow control in Layer 2
+
+
+x = I'm not sure which chapter, but it's quite self-contained, so it can be put
+somewhere later. I would say between 5 and 6
+
+ 
\ No newline at end of file
--- a/documents/specifications/robustness/robustness_doc/revision_history_table.tex
+++ b/documents/specifications/robustness/robustness_doc/revision_history_table.tex
+%\centering
+\paragraph*{Revision History Table}
+
+\begin{center}
+\begin{tabular}{|p{1.5 cm}|p{2 cm}|p{1.5 cm}|p{6 cm}|} \hline
+
+\textbf{Version} & \textbf{Date} & \textbf{Authors} & \textbf{Description} \\
+\hline
+0.1 & 1/09/2010 & C.P & first draft \\ \hline
+0.2 & 3/02/2011 & M.L. & made a lot of mess \\ \hline
+0.3 & 23/02/2011 & M.L. & Change of doc's structure based on feedback (yet more
+mess...)\\ \hline
+0.4 & 15/03/2011 & C.P \& M.L. & Minor, and less minor changes to make it look
+ better and be more readable. \\ \hline
+\end{tabular}
+\end{center}
+
+
--- a/documents/specifications/robustness/robustness_doc/robustness.tex
+++ b/documents/specifications/robustness/robustness_doc/robustness.tex
+\documentclass[a4paper,11pt]{report}
+%
+%--------------------   start of the 'preamble'
+%
+\usepackage{amssymb,amstext,amsmath}
+\usepackage{pdfpages} 
+\usepackage[latin1]{inputenc}
+\usepackage{fullpage}
+\usepackage{caption}
+\usepackage{color}
+\usepackage{epstopdf}
+\usepackage{graphicx}
+\usepackage{todonotes}
+\usepackage{graphics}
+\usepackage[pdftex]{epsfig}
+\usepackage{lscape}
+\usepackage{rotating}
+\usepackage{enumerate}
+\usepackage{multirow}
+\usepackage{array}
+
+
+%
+%%    homebrew commands -- to save typing
+\newcommand\etc{\textsl{etc}}
+\newcommand\eg{\textsl{eg.}\ }
+\newcommand\etal{\textsl{et al.}}
+\newcommand\Quote[1]{\lq\textsl{#1}\rq}
+\newcommand\fr[2]{{\textstyle\frac{#1}{#2}}}
+\newcommand\miktex{\textsl{MikTeX}}
+\newcommand\comp{\textsl{The Companion}}
+\newcommand\nss{\textsl{Not so Short}}
+\newcommand{\HRule}{\rule{\linewidth}{0.5mm}}
+\newcommand \cc[1]{\textcolor{red}{\textsl{-CESAR-}}{\textcolor{red}{#1}}} %comments from cesar
+\newcommand \cm[1]{\textcolor{blue}{\textsl{-MACIEJ-}}{\textcolor{blue}{#1}}}
+
+
+\newcommand{\tab}{\hspace*{2em}}
+
+%=============== Temporary solutions to naming problem ======================
+
+% full names
+% if you need space after the command,use "\" i.e. "
+% \HighPriority is" ==>> will change into "High Priorityis"
+% \HighPriority\ is" ==>> will change into "High Priority is"
+\newcommand \HighPriority[0]{High Priority}
+\newcommand \StandardPriority[0]{Standard Priority}
+\newcommand \GranularityWindow[0]{Granularity Window}
+\newcommand \ControlMessage[0]{Control Message} %White Rabbit Information Block
+
+
+% abbreviations
+\newcommand \HP[0]{HP}
+\newcommand \SP[0]{SP}
+\newcommand \GW[0]{GW}
+\newcommand \CM[0]{CM} % WRIB
+
+%=============================================================================
+
+%comments from cesar
+%
+\graphicspath{{fig/}}
+%\graphicspath{{.}}
+%---------------------   end of the 'preamble'
+%
+
+%TO REMOVE THE WRITTEN CHAPTER FROM THE CHAPTER
+\makeatletter
+\renewcommand{\@makechapterhead}[1]{%
+\vspace*{0pt}%
+{\setlength{\parindent}{0pt} \raggedright \normalfont
+\bfseries\huge\thechapter.\ #1
+\par\nobreak\vspace{40 pt}}}
+\makeatother
+
+
+
+\begin{document}
+%-----------------------------------------------------------
+\title{Robustness and Determinism in White Rabbit}
+\author{Cesar Prados (GSI) \and Maciej Lipinski (CERN)}
+\date
+\maketitle
+
+\begin{titlepage}
+ 
+\begin{center}
+
+% Upper part of the page
+%\includegraphics[scale=0.50]{fig/white_rabbit.jpg}\\[1cm]
+%\includegraphics[height=80mm]{white_rabbit.ps}\\[1cm]
+\includegraphics[height=70mm]{../../../../figures/logo/WRlogo.ps}\\[1cm]
+
+% Title
+\HRule \\[0.4cm]
+{ \huge \bfseries White Rabbit and Robustness \\ [0.8cm]
+  \Large Draft for Comments}\\[0.4cm]
+\HRule \\[1.0cm]
+ 
+\textsc{\normalsize  GSI, Helmholtzzentrum f\"ur Schwerionenforschung GmbH}
+\newline
+\textsc{\normalsize  CERN, Organisation Europ\'eenne pour la Recherche Nucl\'eaire}
+\\[0.5cm]
+
+% Author and supervisor
+%\begin{minipage}{0.4\textwidth}
+\begin{flushright} \large
+Cesar Prados, Maciej Lipinski
+\end{flushright}
+%\end{minipage}
+ 
+\vfill
+ 
+% Bottom of the page
+\begin{flushright}
+{\large \today}
+\end{flushright}
+ 
+\end{center}
+ 
+\end{titlepage}
+
+
+%-----------------------------------------------------------
+%\begin{abstract}\centering
+%\end{abstract}
+%-----------------------------------------------------------
+\tableofcontents
+%-----------------------------------------------------------
+\include{revision_history_table}
+\include{acronyms}
+\include{chap1}
+\include{chap2}
+\include{chap3}
+\include{chap4}
+\include{chap5}
+\include{chap6}
+\include{chap7}
+\include{chap8}
+%\include{chap9}
+%-----------------------------------------------------------
+\addcontentsline{toc}{chapter}{\numberline{}Bibliography}
+\include{biblio}
+%-----------------------------------------------------------
+\appendix
+\include{app1}
+\include{app2}
+\include{app3}
+\include{app4}
+\include{app5}
+\include{app6}
+\include{app7}
+\include{app8}
+\include{app9}
+\include{app10}
+%-------------------------------------------------
+\end{document}
--- a/figures/network/hierarchy2.pdf
+++ b/figures/network/hierarchy2.pdf
--- a/figures/robustness/CMdelayHP.pdf
+++ b/figures/robustness/CMdelayHP.pdf
--- a/figures/robustness/P_error_control_msg_CERN.pdf
+++ b/figures/robustness/P_error_control_msg_CERN.pdf
--- a/figures/robustness/P_error_control_msg_GSI.pdf
+++ b/figures/robustness/P_error_control_msg_GSI.pdf
--- a/figures/robustness/VLAN_Tag_GigaPeek.pdf
+++ b/figures/robustness/VLAN_Tag_GigaPeek.pdf
--- a/figures/robustness/WRRSTPcase1.pdf
+++ b/figures/robustness/WRRSTPcase1.pdf
--- a/figures/robustness/WRRSTPcase2.pdf
+++ b/figures/robustness/WRRSTPcase2.pdf
--- a/figures/robustness/WRRSTPcase3.pdf
+++ b/figures/robustness/WRRSTPcase3.pdf
--- a/figures/robustness/WRRSTPcase4.pdf
+++ b/figures/robustness/WRRSTPcase4.pdf
--- a/figures/robustness/WRRSTPcase5.pdf
+++ b/figures/robustness/WRRSTPcase5.pdf
--- a/figures/robustness/WRRSTPforHP.pdf
+++ b/figures/robustness/WRRSTPforHP.pdf
--- a/figures/robustness/WRRSTPforHP2.pdf
+++ b/figures/robustness/WRRSTPforHP2.pdf
--- a/figures/robustness/architecture_management_flow_congestion_control.pdf
+++ b/figures/robustness/architecture_management_flow_congestion_control.pdf
--- a/figures/robustness/channels.pdf
+++ b/figures/robustness/channels.pdf
--- a/figures/robustness/deliveryDelayChart.pdf
+++ b/figures/robustness/deliveryDelayChart.pdf
--- a/figures/robustness/dual_link.pdf
+++ b/figures/robustness/dual_link.pdf
--- a/figures/robustness/fec_header.pdf
+++ b/figures/robustness/fec_header.pdf
--- a/figures/robustness/fullyRedundantTopologies.pdf
+++ b/figures/robustness/fullyRedundantTopologies.pdf
--- a/figures/robustness/hamming.pdf
+++ b/figures/robustness/hamming.pdf
--- a/figures/robustness/hpRouting.pdf
+++ b/figures/robustness/hpRouting.pdf
--- a/figures/robustness/indirect_change_explamation.pdf
+++ b/figures/robustness/indirect_change_explamation.pdf
--- a/figures/robustness/layer1redundancy.pdf
+++ b/figures/robustness/layer1redundancy.pdf
--- a/figures/robustness/network_beginning.pdf
+++ b/figures/robustness/network_beginning.pdf
--- a/figures/robustness/network_spanning.pdf
+++ b/figures/robustness/network_spanning.pdf
--- a/figures/robustness/osiLayers.png
+++ b/figures/robustness/osiLayers.png
--- a/figures/robustness/overhead_cern.pdf
+++ b/figures/robustness/overhead_cern.pdf
--- a/figures/robustness/overhead_gsi.pdf
+++ b/figures/robustness/overhead_gsi.pdf
--- a/figures/robustness/switchRouting.pdf
+++ b/figures/robustness/switchRouting.pdf
--- a/figures/robustness/threeTopologies.pdf
+++ b/figures/robustness/threeTopologies.pdf
--- a/figures/robustness/timePaths.pdf
+++ b/figures/robustness/timePaths.pdf
--- a/figures/robustness/topologyConsideration.pdf
+++ b/figures/robustness/topologyConsideration.pdf
--- a/figures/robustness/wrRSTP.pdf
+++ b/figures/robustness/wrRSTP.pdf
--- a/figures/robustness/wrRSTPtopologies.png
+++ b/figures/robustness/wrRSTPtopologies.png