Jumbo frames dropped after a particular rate
=============== REPORT from Antonio.
We are evaluating version 6.0.1 for the WR switches and we have an issue related to packet losses with jumbo frames.
With MTU >= 2000 we can see that the switches start to lose packets. You can find attached a document (WR_Tests_6.0.1_v1_.pdf) with experiments where version 5.0.1 works fine under different conditions, but version 6.0.1 does not behave the same way. You can see the wrsw_pstat outputs.
Activity
- Author Maintainer
Reported by Antonio:
Version wr-switch-sw-v5.0.1-20170825_binaries.tar works fine, but wr-switch-sw-v6.0.1-20210705_binaries.tar fails. With MTU <= 1800 it works fine, but with MTU >= 2000 it starts to fail (see the wrsw_pstat outputs).
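A minimal way to probe this MTU threshold from a host attached to the switch might look like the sketch below; the interface name and receiver address are placeholders, and the sizes assume IPv4 ICMP (IP packet size = payload + 28 B):

```sh
# Placeholder interface; raise the host MTU first so the host itself does not fragment.
ip link set dev eth0 mtu 9000

# Non-fragmenting echoes through the switch (placeholder receiver address):
ping -M do -c 1000 -s 1772 192.168.7.2   # 1772 B payload -> 1800 B IP packet, expected to pass
ping -M do -c 1000 -s 1972 192.168.7.2   # 1972 B payload -> 2000 B IP packet, losses expected
```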
Edited by Maciej Lipinski - Author Maintainer
Just for the record: the problem seems to be a brief loss of synchronization on the receiving link. In the pstats you see Rsyn-l, which tells you that the endpoint's RX lost synchronization with the incoming data; see wr-cores/modules/wr_endpoint/ep_rx_pcs_16bit.vhd:
```vhdl
U_SYNC_DET : ep_sync_detect_16bit
  port map (
    rst_n_i  => rst_n_rx,
    rbclk_i  => phy_rx_clk_i,
    en_i     => rx_sync_enable,
    data_i   => phy_rx_data_i,
    k_i      => phy_rx_k_i,
    err_i    => phy_rx_enc_err_i,
    synced_o => rx_synced,       -- -> Rsyn-l
    cal_i    => d_is_cal);
```
This seems to happen only on the ports that support LPDC (1-12). I do not think you would see this error when transferring data between two WR switches.
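To watch the counter increment live while traffic is flowing, something like the loop below (run on the switch) should work, assuming wrsw_pstats with no arguments dumps all per-port counters as in the attached outputs:

```sh
# Dump the counters once per second and show only the lines that changed;
# the Rsyn-l counter of the receiving port should grow while frames are lost.
wrsw_pstats > /tmp/pstats.prev
while true; do
  sleep 1
  wrsw_pstats > /tmp/pstats.now
  diff /tmp/pstats.prev /tmp/pstats.now
  mv /tmp/pstats.now /tmp/pstats.prev
done
```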
Edited by Maciej Lipinski - Author Maintainer
Reported by Antonio:
These are the results using Topology="2>1" (only 1 sender and 1 receiver) with version 6.0.1. The switch loses packets. 4 tests (pdf test_WR_6_20230405.pdf):
- MTU 8972 and bandwidth 90 MB/s: it loses > 91% of packets.
- MTU 8972 and bandwidth 43 MB/s: it loses > 89% of packets.
- MTU 8972 and bandwidth 17 MB/s: it loses > 83% of packets.
- MTU 2000 and bandwidth 4 MB/s: it loses > 16% of packets.
These are the measurements obtained with one WRS to check the ports (pdf with full wrsw_pstats: WR_Tests_FW6_20230406.pdf):
- You are right, the affected ports are 1-12. Ports 13-18 work fine.
- Full duplex loses more packets than half duplex.
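For reference, the four tests listed above could be approximated with plain iperf UDP streams; the host address is a placeholder and iperf2-style options are assumed (note iperf takes the rate in bit/s, so 90 MB/s is roughly 720M):

```sh
# receiver
iperf -s -u -i 1

# sender: 8972 B UDP datagrams at ~90 MB/s for 60 s
# (adjust -l so the resulting IP packet fits the configured MTU)
iperf -c 192.168.7.2 -u -l 8972 -b 720M -t 60
```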
Edited by Maciej Lipinski - Author Maintainer
Here is a proposed set of experiment scenarios: KM3Net_debugging_v2.pptx
The presentation includes commands to configure VLANs. I did not test this configuration and I might have made mistakes, so please review it carefully.
Please note that the WR switch is optimized for broadcast and multicast traffic within a VLAN. Broadcast optimization is enabled by default; multicast addresses need to be explicitly configured.
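In case it helps to script the broadcast part of these scenarios from a Linux host, here is a minimal sketch; the broadcast address, port and payload size are placeholders, and each socat invocation reads the whole payload in one chunk and sends it as a single UDP datagram:

```sh
# Send 20000 UDP broadcast datagrams with a 1972 B payload (a 2000 B IP packet).
# This forks socat once per packet, so it is only suitable for low rates.
for i in $(seq 1 20000); do
  head -c 1972 /dev/zero | socat -u - UDP-DATAGRAM:192.168.7.255:5000,broadcast
done
```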
- Maciej Lipinski mentioned in issue wr-switch-sw#277 (closed)
- Reporter
These are the results for the scenario with broadcast packets from WRS2.port18:
**MTU 1400: 20000 UDP broadcast packets from WRS2.port18**
WRS1: WRS2:
**MTU 2000: 20000 UDP broadcast packets from WRS2.port18**
WRS1: WRS2:
**MTU 4000: 20000 UDP broadcast packets from WRS2.port18**
WRS1: WRS2:
**MTU 8000: 20000 UDP broadcast packets from WRS2.port18**
WRS1: WRS2:
- Maintainer
It was possible to reproduce the frame loss using a Xena tester with pre-6.1 WRS firmware. VLANs were not used.
The generated traffic was from wri1 to wri18, frame size 4000 B. With jumbo frames enabled (EP_RFCR_A_GIANT set in the RFCR register) but the MRU not changed (0x800), at a rate of ~10 pkt/s it was possible to observe link down/link up every 10-20 seconds.
As the packet rate increases, the time between link restarts decreases.
With the MRU set to 9000 B (devmem 0x10030008 32 0x02328002), link down/up is not observed at 10 pkt/s and 100 pkt/s, but it is observed at 1000 pkt/s (values between 100 and 1000 were not tested). The conclusion is that changing the MRU helps a little, but does not solve the problem.
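For the record, the devmem value above is consistent with a 9000 B MRU shifted into the upper field of RFCR with the A_GIANT bit kept set; this decomposition is my reading of the value, not something checked against the wbgen register map:

```sh
# (9000 << 12) is the MRU field, 0x2 keeps EP_RFCR_A_GIANT set -> prints 0x02328002
printf '0x%08X\n' $(( (9000 << 12) | 0x2 ))
```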
Notifications about link down/up in syslog:
```
2023-05-09T11:11:10.611734+00:00 ctdwa-774-cxnatest1.cern.ch kernel: wri1: Link up, lpa 0x4020.
2023-05-09T11:11:10.616078+00:00 ctdwa-774-cxnatest1.cern.ch hald: <30>Info (/wr/bin/wrsw_hal):wri1: Link state change detected: was down, is up
2023-05-09T11:11:29.493004+00:00 ctdwa-774-cxnatest1.cern.ch hald: <30>Info (/wr/bin/wrsw_hal):port_fsm_state_link_down:wri1: bitslide= 0 [ps]
2023-05-09T11:11:35.210656+00:00 ctdwa-774-cxnatest1.cern.ch kernel: wri1: Link down.
```
Traffic between non-LPDC ports (e.g. from wri18 to wri17) shows no problems at 90% link utilization (4000 B * 27985 pkt/s, roughly 0.9 Gbit/s).
- Adam Wujek added Done label
- Author Maintainer
Fixed with commit 35e9419 to wr-cores: wr-cores@35e94190
- Maciej Lipinski closed
- Maciej Lipinski added v7.0 label