Commit ea3f148a authored by Grzegorz Daniluk

docs/specs/robustness: copy the rest of robustness materials from old svn repo

parent 46137f46
DATA NETWORK CONGESTION
***********************
Congestion occurs when the number of packets being transmitted through a network approaches the packet-handling capacity of the network.
Network nodes are overloaded –> queues start filling up
Arrival Rate to the Switch = a
Departure Rate from the Switch = b
Stability Condition: p = a/b < 1
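The stability condition can be illustrated with a small sketch. It assumes a simple fluid model that is not in the notes: the backlog of the switch changes by (a - b) packets per second. The function names are hypothetical.

```python
def utilization(arrival_rate: float, departure_rate: float) -> float:
    """Return p = a / b, the load offered to the switch."""
    if departure_rate <= 0:
        raise ValueError("departure rate must be positive")
    return arrival_rate / departure_rate


def is_stable(arrival_rate: float, departure_rate: float) -> bool:
    """The queue stays bounded only while p < 1."""
    return utilization(arrival_rate, departure_rate) < 1.0


def queue_length_after(arrival_rate: float, departure_rate: float,
                       seconds: float, initial: float = 0.0) -> float:
    """Fluid approximation (an assumption): backlog changes by (a - b) per second."""
    return max(0.0, initial + (arrival_rate - departure_rate) * seconds)
```

With a = 1200 and b = 1000 packets per second the condition is violated and the backlog grows by 200 packets every second; with a = 800 an existing backlog drains instead.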
CONGESTION CONTROL TECHNIQUES
******************************
Backpressure
– Request from destination to source to reduce rate
– Propagates hop-by-hop backward along path
– Suitable for virtual circuits
Choke Packet
- Could be a good fit for a switched network
- A host receiving a choke packet should reduce the traffic to the
specified destination. A variation (Hop-by-Hop Choke Packets) operates similarly but takes
effect at each hop while the choke packet travels back to the source.
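A minimal sketch of the choke packet idea follows. The queue threshold, the halving factor and the class names are all assumptions made for illustration; the notes do not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # Sending rate per destination, in packets per second.
    rates: dict = field(default_factory=dict)

    def on_choke(self, destination: str, factor: float = 0.5) -> None:
        """Reduce traffic towards the destination named in the choke packet."""
        self.rates[destination] = self.rates.get(destination, 0.0) * factor


@dataclass
class Switch:
    threshold: int  # queue length that triggers a choke packet (assumed value)
    queue: list = field(default_factory=list)

    def enqueue(self, source: Host, destination: str) -> bool:
        """Queue one packet; return True if a choke packet was sent back."""
        self.queue.append((source.name, destination))
        if len(self.queue) > self.threshold:
            source.on_choke(destination)
            return True
        return False
```

The sketch never drains the queue, so every packet beyond the threshold triggers another choke packet; a real switch would also dequeue packets and rate-limit the choke packets it sends.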
Implicit congestion signaling -> TCP
Explicit congestion signaling
– Network responsibility to alert end systems of growing congestion
– Direction
• Backward: network notifies the source
• Forward: network notifies the destination
– Approaches
• Binary: on/off approach
• Credit-based: how much data can be sent
• Rate-based: how fast data can be transmitted
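As a sketch of the credit-based approach: the sender may only transmit while it holds credits, and the receiver (or the network) grants more when it can accept more data. The class and method names are hypothetical.

```python
class CreditSender:
    """Sender that may only transmit while it holds credits."""

    def __init__(self, initial_credits: int = 0) -> None:
        self.credits = initial_credits
        self.sent = 0

    def grant(self, credits: int) -> None:
        """The receiver (or network) signals how much more data may be sent."""
        self.credits += credits

    def send(self, packets: int) -> int:
        """Try to send `packets`; return how many actually went out."""
        allowed = min(packets, self.credits)
        self.credits -= allowed
        self.sent += allowed
        return allowed
```

A rate-based scheme would instead signal a maximum sending rate, and a binary scheme only an on/off indication.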
Fairness
Congestion effects should be distributed equally to traffic flows. Last-in-first-discarded may not be fair
Quality of Service
Differentiation based on application requirements
• Voice, video: delay sensitive, loss insensitive
• File transfer, mail: delay insensitive, loss sensitive
• Interactive computing: delay and loss sensitive
Reservation scheme
To provide guaranteed services
• Traffic policing: excess traffic discarded or handled on a best-effort basis
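One common way to police traffic is a token bucket. The notes do not name a specific algorithm, so this is an assumption, and the class name and parameters are hypothetical. A packet that does not conform is the "excess traffic" that the caller may discard or forward as best-effort.

```python
class TokenBucketPolicer:
    """Token bucket policer: `rate` bytes per second, bursts up to `burst` bytes."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # tokens (bytes) added per second
        self.burst = burst    # bucket depth in bytes
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the previous packet, in seconds

    def conforms(self, now: float, size: int) -> bool:
        """Return True if the packet is within the contract, False if it is excess."""
        elapsed = now - self.last
        self.last = now
        # Refill the bucket for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```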
CONCLUSION
**********
Choke packets and explicit congestion signaling are the most interesting options for now.
Implementation:
In the nodes, but where? Not in the WR Core, so in software on the general-purpose CPU.
An alternative to these congestion control options would be switch scheduling, but it makes the switch considerably more complex.
http://www.stanford.edu/class/ee384x/papers/PIM.pdf
IDEAS
*****
We should combine different strategies: one for the highest-priority (HP) traffic and another for the rest.
E.g. block a port if anyone other than the data master is sending HP packets on it.
Documentation
Book (I didn't find it, but it seems to be the bible for these topics)
High-Speed Networks and Internets Performance and Quality of Service, Second Edition
Web Site
http://williamstallings.com/HsNet2e.html
http://www.citidel.org/bitstream/10117/1021/1/Congestion+Control.pdf
Flow and congestion control in data networks
http://www.sis.pitt.edu/~wcerroni/Lecture03.pdf
http://www.sis.pitt.edu/~wcerroni/wans.html
The Network Layer Functions: Congestion Control
http://meseec.ce.rit.edu/eecc694-spring2000/694-3-30-2000.pdf
Modeling the Interactions of Congestion Control and Switch Scheduling
http://webee.technion.ac.il/~isaac/p/tr08-03_modeling.pdf
Techniques in Internet Congestion Control
http://www.ee.mu.oz.au/research/cubin/downloads/cubin_BartekWydrowski_Thesis.pdf
http://tools.ietf.org/html/rfc2914
Approaches to Congestion Control in Packet Networks
http://utopia.duth.gr/~emamatas/jie2007.pdf
http://en.wikipedia.org/wiki/Congestion_control#Classification_of_congestion_control_algorithms
------------------------------------------------------------------------------------------------------------------------------------------
FLOW CONTROL
************
Correction: flow control in Ethernet is a Layer 2 mechanism.
Ethernet flow control is a mechanism for temporarily stopping the transmission of data on an Ethernet computer network. End to end, the switches are involved.
Article (it doesn't apply to us, but for reference):
http://www.smallnetbuilder.com/index2.php?option=com_content&task=view&id=30212&pop=1&page=0&Itemid=54
Alternatives
************
Standard
Priority-based Flow Control, as defined by the standard IEEE 802.1Qbb provides a link level flow control mechanism that can be controlled independently for each Class of Service (CoS), as defined by IEEE 802.1p. The goal of this mechanism is to ensure zero loss under congestion in Data Center Bridging (DCB) networks.
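The per-class behaviour can be sketched as below. This is only an illustration: the pause timers of the real standard are left out, strict-priority scheduling is an assumption, and the class and method names are hypothetical.

```python
from collections import deque


class PfcPort:
    """Egress port with one queue per priority, each of which can be paused."""

    NUM_PRIORITIES = 8  # the 3-bit IEEE 802.1p priority field allows values 0-7

    def __init__(self) -> None:
        self.queues = [deque() for _ in range(self.NUM_PRIORITIES)]
        self.paused = [False] * self.NUM_PRIORITIES

    def enqueue(self, priority: int, frame: str) -> None:
        self.queues[priority].append(frame)

    def pause(self, priority: int) -> None:
        """The link partner asked us to stop sending this class only."""
        self.paused[priority] = True

    def resume(self, priority: int) -> None:
        self.paused[priority] = False

    def next_frame(self):
        """Strict priority (an assumption): highest non-paused, non-empty class first."""
        for priority in range(self.NUM_PRIORITIES - 1, -1, -1):
            if self.queues[priority] and not self.paused[priority]:
                return self.queues[priority].popleft()
        return None
```

Pausing a low-priority class leaves the high-priority class untouched, which is the property that makes the mechanism interesting when HP and SP traffic share a link.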
Flow Control Techniques
• Stop-and-wait flow control
• Sliding-window protocol
I don't like them for WR Network.....
CONCLUSION
Study Priority-based Flow Control and check its current status.
Think about a flow control policy for the White Rabbit network: defined by the user, set via SNMP and supervised by the network analyzer.
LEVELS OF FLOW CONTROL
**********************
HP
SP
etc.....
DOCUMENTATION
Priority Flow Control: Build Reliable Layer 2 Infrastructure
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-542809_ns783_Networking_Solutions_White_Paper.html
http://en.wikipedia.org/wiki/Ethernet_flow_control
http://nptel.iitm.ac.in/courses/Webcourse-contents/IIT%20Kharagpur/Computer%20networks/pdf/M3L3.pdf
Flow Control Techniques for Multicsatis in Gigabit Networks
http://www.ieee-icnp.org/1996/papers/1996-12.pdf
http://www.cs.virginia.edu/~zaher/classes/CS457/lectures/flow-control.pdf
http://drona.csa.iisc.ernet.in/~deepakd/verification-common/spin-book-ch4.pdf
NETFLOW
http://neye.unsupported.info/
http://trac.netflowdashboard.com/netflowdashboard/
http://en.wikipedia.org/wiki/NetFlow
http://en.wikipedia.org/wiki/Network_monitoring
http://en.wikipedia.org/wiki/Network_traffic_measurement
http://www.caida.org/tools/measurement/netramet/ntm-site/
*******************************
NETWORK MONITORING (Diagnostic)
*******************************
Monitoring Requirements
***********************
- Guarantee the availability of the function
- Maintenance
- Automatic reaction on operation anomalies:
    - Real-time configuration modification in case of error
    - Activation of redundant components
- Dynamic reactions to changes on the network and environments
- Dynamic reaction to changes on the network
- Dynamic adaptation
- Network Control
- Collection of information
- Definition of a database of network configurations
Traffic Analysis Applications requirements
- Identify growth and abnormal occurrences in the network
Common Measurement Metrics
***********************
Performance measurement
- Availability: percentage of time that a network system, component or application is available to a user. It is based on the reliability of the individual components of a network.
Availability = MTBF / (MTBF + MTTR)
MTBF: mean time between failures
MTTR: mean time to repair following a failure
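A short worked example of the availability formula. The series combination assumes that every component is needed and that failures are independent; that assumption and the function names are not from the notes.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(*components: float) -> float:
    """All components must work (assumed independent): multiply availabilities."""
    result = 1.0
    for a in components:
        result *= a
    return result
```

For example, a device with an MTBF of 999 hours and an MTTR of 1 hour is available 99.9% of the time, and two such devices in series are available slightly less often than either one alone.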
- Response time: the time it takes a system to react to a given input.
- Accuracy:
- Throughput: quantity of data that can be sent over a link in a specified amount of time.
- Utilization: a more fine-grained measure than throughput. It refers to determining the percentage of time that a resource is in use over a given period of time.
- Latency and Jitter:
• Latency: amount of time it takes a packet to travel from source to destination.
• Jitter: variance of the inter-packet delay on a unidirectional link.
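A sketch of how both metrics could be derived from timestamps. It assumes that sender and receiver clocks are synchronized, and it takes jitter as the variance of the one-way delays, which is one possible reading of the definition above. The function names are hypothetical.

```python
from statistics import mean, pvariance


def one_way_delays(sent, received):
    """Per-packet delay from matching send and receive timestamps."""
    if len(sent) != len(received):
        raise ValueError("need one receive timestamp per sent packet")
    return [rx - tx for tx, rx in zip(sent, received)]


def latency_and_jitter(sent, received):
    """Return (mean delay, variance of the delay) in the timestamps' units."""
    delays = one_way_delays(sent, received)
    return mean(delays), pvariance(delays)
```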
Per-Link Measurements
*********************
• Metrics available on a link
— # packets, # bytes, # packets discarded on a specific interface over the last minute
— # flows, # of packets per flow
• It does not provide global network statistics.
• Useful to ISPs for traffic measurements.
• Examples:
— SNMP MIBs
— RTFM (Real-Time Flow Measurement)
— Cisco NetFlow
End-to-End Measurements
***********************
• Network performance != Application performance
• Most network measurements are end-to-end.
• Per path statistics
• How does the network behave with long/short probe packets?
Monitoring
**********
Active Measurement
— To inject network traffic and study how the network reacts to the traffic (e.g. ping).
- Active measurements are often end-to-end
Passive Measurement
— To monitor network traffic for the purpose of measurement (e.g. use the TCP three way handshake to measure network round-trip time).
- passive measurements are limited to the link where the traffic is captured
• Neither approach is better in general; both are useful, depending on the case:
— Passive monitoring on a switched network can be an issue.
— Injecting traffic on a satellite link is often doable only by the satellite provider.
• Usually the best is to combine both approaches and compare results.
Remote Monitoring with SNMP + Cisco NetFlow or RMON-1 or sFLOW (Layer 1 and Layer 2)
NetFlow / RMON / sFlow -> "flow-based" monitoring
SNMP -> "device-based" monitoring
SNMP vs RMON
************
• The SNMP protocol is used to control and configure a probe. Usually GUI managers mask the complexity of SNMP-based configuration.
• Statistics and saved traffic are retrieved using SNMP by management applications to record statistics on a network and, possibly selected portions of the network traffic.
SNMP and RMON differ in the way they gather traffic statistics:
— SNMP is a periodic poll-request process: it requires a query of the SNMP device to get network statistics (the network status is kept by the manager).
— RMON, on the other hand, reduces the stress of the manager by gathering and storing the statistics in counters or buckets for retrieval by a management station.
TRAFFIC MONITORING
******************
RMON
****
Collect data and periodically report it to a more central management station, which potentially reduces traffic on WAN links and polling overhead on the management station.
• Report on what hosts are attached to the LAN, how much they talk, and to whom
• "see" all LAN traffic, full LAN utilization, and not just the traffic to or through the router.
• Filter and capture packets (so you don't have to visit a remote LAN and attach a LAN Analyzer) : it is basically a remote sniffer that can capture real-time traffic (until the integrated memory buffer is full).
• Automatically collect data, compare to thresholds, and send traps to your management station -- which offloads much of the work that might bog down the management station.
The RMON1 MIB consists of ten groups:
1. Statistics: real-time LAN statistics e.g. utilization, collisions, CRC errors
2. History: history of selected statistics
3. Alarm: definitions for RMON SNMP traps to be sent when statistics exceed defined thresholds
4. Hosts: host specific LAN statistics e.g. bytes sent/received, frames sent/received
5. Hosts top N: record of the N most active hosts over a given time period
6. Matrix: the sent-received traffic matrix between systems
7. Filter: defines packet data patterns of interest e.g. MAC address or TCP port
8. Capture: collect and forward packets matching the Filter
9. Event: send alerts (SNMP traps) for the Alarm group
10. Token Ring: extensions specific to Token Ring
• RMON is useful for computing network utilization.
• Network Utilization can be calculated for all the ports of a given switch at regular intervals. This information can be gathered over the course of a day and be used to generate a network utilization profile of a switch or hub.
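As a sketch, utilization over one interval can be derived from two readings of a port's octet counter. The function name and parameters are hypothetical, and the calculation counts only the octets reported by the counter, so per-frame overhead such as preamble and inter-frame gap is ignored.

```python
def link_utilization(octets_start: int, octets_end: int,
                     interval_s: float, link_speed_bps: float,
                     counter_bits: int = 32) -> float:
    """Percentage of link capacity used between two counter readings."""
    modulus = 1 << counter_bits
    # The modulo keeps the delta correct if the counter wrapped once.
    delta = (octets_end - octets_start) % modulus
    return 100.0 * delta * 8 / (interval_s * link_speed_bps)
```

Calling this at regular intervals for every port and storing the results over a day gives the utilization profile described above.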
NetFlow
*******
From Cisco, which means we cannot use it, though part of the protocol has been standardized.
sFlow
*****
(Copied from Wikipedia)
sFlow uses sampling to achieve scalability and is, for this reason, applicable to high-speed networks (gigabit per second speeds and higher). sFlow is supported by multiple network device manufacturers and network management software vendors.
An sFlow system consists of multiple devices performing two types of sampling: random sampling of packets or application layer operations, and time-based sampling of counters. The sampled packet/operation and counter information, referred to as flow samples and counter samples respectively, are sent as sFlow datagrams to a central server running software that analyzes and reports on network traffic: the sFlow collector.
Flow samples
Based on a defined sampling rate, an average of 1 out of N packets/operations is randomly sampled. This type of sampling does not provide a 100% accurate result, but it does provide a result with quantifiable accuracy.
Counter samples
A polling interval defines how often the network device sends interface counters. sFlow counter sampling is more efficient than SNMP polling when monitoring a large number of interfaces.
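The flow sampling idea can be sketched as follows. This is a simplified illustration, not the sFlow agent algorithm: each packet is picked independently with probability 1/N, and the collector scales the sampled counts by N to estimate the real traffic. The function names are hypothetical.

```python
import random


def sample_packets(packet_sizes, sampling_rate, rng):
    """Pick each packet with probability 1/sampling_rate (1 in N on average)."""
    return [size for size in packet_sizes if rng.random() < 1.0 / sampling_rate]


def estimate_totals(samples, sampling_rate):
    """Scale the sampled counts back up: (estimated packets, estimated bytes)."""
    return len(samples) * sampling_rate, sum(samples) * sampling_rate


if __name__ == "__main__":
    rng = random.Random(1)          # fixed seed so the run is repeatable
    traffic = [100] * 100_000       # 100,000 packets of 100 bytes each
    samples = sample_packets(traffic, 100, rng)
    print(estimate_totals(samples, 100))
```

The estimate is not exact, but its error shrinks as the number of samples grows, which is what "quantifiable accuracy" refers to.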
PERFORMANCE AND FEATURES
************************
Read the documentation
Conclusion
**********
sFlow appears to suit us well: it can be implemented in hardware and has low bandwidth usage, low memory usage, good scalability and low server load.
That covers the switches. As for the server side, the Master Management Node: take a look at the last document in the list below, where a dedicated server is used. We should decide whether the Master Management Node will handle the protocols and then store everything on a server; that is what it looks like to me.
Documentation
*************
http://www.sflow.org/sFlowOverview.pdf
http://www.inmon.com/pdf/EmbeddedTM.pdf
http://www.uknof.org.uk/uknof9/Hobden-LINX-sflow.pdf
\documentclass[a4paper,11pt]{report}
\usepackage{multirow}
\usepackage{lscape}
\usepackage{longtable}
\usepackage{amsmath}
\usepackage[a4paper]{geometry}
%\usepackage{fullpage}
\newcommand\pfeil{$\rightarrow$}
\newcommand\visto{$\surd$}
\geometry{top=1.0in, bottom=1.0in, left=0.5in, right=0.5in}
\begin{document}
\begin{landscape}
\thispagestyle{empty}
\begin{center}
\LARGE{Status of Timing and Data Resilience and Management in WR Networks}
\end{center}
\begin{table}[ht]
\centering
\begin{tabular}{ c | c | c | c | c | c | c |} \cline{2-7}
&Protocol / Technology & Software & Hardware & WR Device & Soft. Develop. & HW Develop. \\ \cline{1-7}
\multicolumn{1}{|c|}{\multirow{5}{*}{WR Data}} &
\multicolumn{1}{|c|}{VLAN} & \visto&\visto &Switch/MMN &100 \% &80 \% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{GVRP} &\visto&$\surd$ &Switch/MMN &Open Implementation, TBT &0\% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{FEC} &\visto &\visto&Switch/MMN/Node &30 \% &30 \% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{STP} &\visto &\visto &Switch/MMN &Kernel Support, TBT & 0 \% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{Link Aggregation}&\visto &\visto&Switch/MMN &Linux Support, TBT & 0 \% \\
\hline
\hline
\multicolumn{1}{|c|}{\multirow{2}{*}{WR Timing}} &
\multicolumn{1}{|c|}{PTP} & \visto &\visto &Switch/MMN/Node& 0 \% &0 \% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{GVRP} &-- &$\surd$&Switch/MMN& 0 \% &0\% \\ \cline{2-7}
\hline
\hline
\multicolumn{1}{|c|}{\multirow{5}{*}{Resilience \& Management}} &
\multicolumn{1}{|c|}{Traffic Monitor} & \visto&\visto&Switch & SFlow, TBT & 0 \% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{Flow Control} &\visto&$\surd$ &Switch/MMN/Node &0 \%, TBT &0\% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{Congestion Control} &\visto &\visto&Switch/MMN &0 \% &0 \% \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{LLDP} &\visto & -- &Switch/MMN &LLDPd, TBT & -- \\ \cline{2-7}
\multicolumn{1}{|c|}{} &
\multicolumn{1}{|c|}{SNMP}&\visto &\visto& Switch/MMN/Node& SNMPd,TBT & -- \\
\hline
\end{tabular}
\end{table}
MMN \pfeil Master Management Node
TBT \pfeil To Be Tested
\end{landscape}
\end{document}
\documentclass[a4paper,11pt]{article}
\usepackage{multirow}
\usepackage{rotating}
\usepackage{lscape}
\usepackage{longtable}
\usepackage{amsmath}
\usepackage[a4paper]{geometry}
\usepackage{fullpage}
\usepackage{color}
\usepackage{pdfpages}
\newcommand\pfeil{$\rightarrow$}
\newcommand\visto{$\surd$}
\begin{document}
{\Large{\textbf{White Rabbit: a Resilient Network}}}
White Rabbit is a network meant to convey data deterministically and to distribute high-accuracy timing over Ethernet.
Redundancy will be applied wherever a single point of failure cannot be avoided and would bring down the complete functionality of the network:
\begin{itemize}
\item Data Master Node
\item Cabling to critical WR nodes of the network, etc.
\end{itemize}
Redundancy increases the cost of deployment, maintenance and management; therefore White Rabbit aims for reliable failover mechanisms to provide a truly resilient network with maximum uptime.
The equation for a robust and resilient network is:
\begin{center}
Robust and resilient network = Hardware + Network Design + Networking Protocols
\end{center}
\textbf{Hardware}
The WR network devices should remain available during a failure and provide mechanisms for diagnostics. Hardware elements such as core routing and power supplies need redundant physical components as well as hot-swappable cards. Hardware support for acquiring
real-time information about the state of the hardware provides the means to prevent failures.
\textbf{Network Design}
The complexity of a network design contributes positively or negatively to its resiliency. Too many redundant connections and network elements can create a solution that is difficult to troubleshoot and maintain. Too few connections or network elements may create single points of failure or traffic
bottlenecks. Not all layers of a network need the same robustness, since a failure in one layer may affect only a few receivers while a failure in another may affect thousands of them. A well-defined network hierarchy therefore helps to identify where the effort should be invested. The Network Core Layer is the prime target, followed by the Distribution and Access Layers.
{\color{blue}{I keep reading about ``five 9s reliability''; we could use the same target}}
\textbf{Networking Protocol}
Protocols are the logical means of ensuring network resiliency, since they provide the mechanisms for avoiding failures and recovering from them. The protocols used in the network sit on top of the hardware support and the network design; a thorough definition and design of the two previous points is therefore vital in order to fit the protocols required by the network and the use case.
\end{document}