Overwritten EEPROM on SFPs when CONFIG_READ_SFP_DIAG_ENABLE=y
When monitoring of SFPs is enabled with CONFIG_READ_SFP_DIAG_ENABLE=y in dot-config, some bytes of an SFP's EEPROM can be overwritten. In the observed case, the first two bytes were overwritten with 0xA0 0x00.
The bug occurs rarely, but it was seen during a sequence of restarts.
Potential cause:
While the periodic read of monitoring data from an SFP is in progress, it can be interrupted by a reset of the switch, which leaves the I2C transfer unfinished. The I2C slave is not aware of the switch reset, so it treats new I2C transfers as a continuation of the previous one. Because the original transfer can be cut off at any point, the results are unpredictable and can include a write to the SFP's EEPROM. The problem can only manifest on SFPs that have a writable EEPROM (e.g. FS).
Proposed solution:
Implement an "I2C slave reset", described as "Solution 1: Clocking Through the Problem" in AN-686 (https://www.analog.com/media/en/technical-documentation/application-notes/54305147357414AN686_0.pdf). Alternatively, use an I2C master in the GW, which would finish the I2C transfer independently of the wrsw_hald process.
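
To make the first option more concrete, here is a minimal sketch of the "clocking through" recovery as it is commonly described: the master releases SDA, pulses SCL (at most nine times) until the slave lets go of SDA, then generates a STOP. It is not taken from the switch sources. The `struct i2c_lines` hooks, `i2c_bus_recover()`, and the simulated slave (`sim_*`, `demo_recover()`) are hypothetical names introduced only for illustration, and the simulated slave is a simplification that holds SDA low for a fixed number of remaining bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical GPIO hooks for a bit-banged I2C bus (not an existing API). */
struct i2c_lines {
    void (*set_scl)(void *ctx, bool high);
    void (*set_sda)(void *ctx, bool high); /* high = release the line */
    bool (*get_sda)(void *ctx);
    void (*half_period)(void *ctx);        /* wait half an SCL period */
    void *ctx;
};

#define RECOVERY_MAX_PULSES 9 /* 8 data bits + 1 ACK slot */

/* Returns the number of SCL pulses used, or -1 if SDA is still held low. */
int i2c_bus_recover(const struct i2c_lines *b)
{
    int pulses = 0;

    assert(b != NULL);
    b->set_sda(b->ctx, true);              /* master releases SDA */
    b->set_scl(b->ctx, true);
    b->half_period(b->ctx);

    while (!b->get_sda(b->ctx)) {          /* slave still drives SDA low */
        if (pulses == RECOVERY_MAX_PULSES)
            return -1;
        b->set_scl(b->ctx, false);         /* one 1-0-1 pulse on SCL */
        b->half_period(b->ctx);
        b->set_scl(b->ctx, true);
        b->half_period(b->ctx);
        pulses++;
    }

    /* STOP condition: SDA goes low -> high while SCL is high. */
    b->set_scl(b->ctx, false);
    b->half_period(b->ctx);
    b->set_sda(b->ctx, false);
    b->half_period(b->ctx);
    b->set_scl(b->ctx, true);
    b->half_period(b->ctx);
    b->set_sda(b->ctx, true);
    b->half_period(b->ctx);
    return pulses;
}

/* Simulated slave (assumption): holds SDA low for `stuck_bits` more bits. */
struct sim_bus { bool scl; bool master_sda; int stuck_bits; bool stop_seen; };

static bool sim_sda_level(const struct sim_bus *s)
{
    return s->master_sda && s->stuck_bits == 0; /* open drain: any low wins */
}

static void sim_set_scl(void *ctx, bool high)
{
    struct sim_bus *s = ctx;
    if (s->scl && !high && s->stuck_bits > 0)
        s->stuck_bits--;                   /* slave advances on falling SCL */
    s->scl = high;
}

static void sim_set_sda(void *ctx, bool high)
{
    struct sim_bus *s = ctx;
    bool before = sim_sda_level(s);
    s->master_sda = high;
    if (s->scl && !before && sim_sda_level(s))
        s->stop_seen = true;               /* rising SDA while SCL is high */
}

static bool sim_get_sda(void *ctx) { return sim_sda_level(ctx); }
static void sim_delay(void *ctx) { (void)ctx; }

/* Runs the recovery against the simulated slave; -1 if no STOP was seen. */
int demo_recover(int stuck_bits)
{
    struct sim_bus s = { .scl = true, .master_sda = true,
                         .stuck_bits = stuck_bits, .stop_seen = false };
    struct i2c_lines b = { sim_set_scl, sim_set_sda, sim_get_sda,
                           sim_delay, &s };
    int pulses = i2c_bus_recover(&b);
    return (pulses >= 0 && s.stop_seen) ? pulses : -1;
}
```

On real hardware the hooks would drive the SFP's SCL/SDA lines, and the recovery would presumably run once at startup, before the first monitoring read, so that a transfer cut off by the previous reset is flushed before any new transfer begins. The second option (an I2C master in the GW) is not sketched here.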