stuck at insmod spec-fmc-carrier.ko
After some acquisitions were started, messages like the following appeared in dmesg:
[33941.487485] fmc_tdc fmc-tdc.24.auto: DMA failed for channel 1: can't arm ZIO trigger
[33941.488484] fmc_tdc fmc-tdc.24.auto: DMA failed for channel 1: can't arm ZIO trigger
[33941.489486] fmc_tdc fmc-tdc.24.auto: DMA failed for channel 1: can't arm ZIO trigger
After rem_reset was performed, it was no longer possible to load the spec-fmc-carrier driver.
insmod spec-fmc-carrier.ko
(or modprobe) hangs with ~100% CPU utilization.
dmesg shows (related to SPEC with fine delay):
[ 58.816416] fpga_manager fpga0: gn412x-fcl.1.auto registered
[ 58.817865] spec-fmc-carrier 0000:02:00.0: SPRI_DONE 1
[ 58.818104] i2c i2c-1: Added multiplexed i2c bus 2
[ 58.818243] i2c i2c-1: Added multiplexed i2c bus 3
[ 58.818396] at24 2-0050: 256 byte 24c02 EEPROM, writable, 8 bytes/write
[ 58.818420] spec-fmc-carrier spec-0000:02:00.0: Not able to find DMA engine: platform_device missing
(related to SPEC with TDC)
[ 58.821847] fpga_manager fpga1: gn412x-fcl.6.auto registered
[ 58.823302] spec-fmc-carrier 0000:03:00.0: SPRI_DONE 1
Further remote resets with rem_reset did not fix the problem. A full power cycle (unplugging the power cable) fixed it.
NOTEs:
acquisition command (for personal reference)
python -d -m pytest --usr-acq-count=10000 --usr-acq-period-ns=100000 test-01-functionalities/test_fmctdc_acquisition.py::TestFmctdcAcquisition::test_acq_timestamp_multiple_hist -s --tdc-id-ch 0x18:1 --fd-id-ch 0x1a:1 --samples-file=test_sample --bin-min-ps -5000 --bin-max-ps 5000 --bins-num 1000 --histogram-file=test_hist --tdc-wr-on
Solution: There are two problems:
- It is possible to make the SPEC/TDC generate interrupts constantly
- The SPEC is not power cycled during the FEC reset
When the TDC library for Python is used and an acquisition is not explicitly stopped, interrupts may still be generated after the program exits. This can leave the SPEC generating interrupts all the time.
If the FEC is restarted while the SPEC keeps generating interrupts, it is not possible to load the spec-fmc-carrier driver after boot.
A restart of the FEC does not power cycle the SPEC, so it keeps generating interrupts even after the FEC reset. IMHO, the SPEC driver (spec-fmc-carrier) probably cannot be loaded because the SPEC is still generating interrupts. To bring the system back to a sane state it is necessary to power cycle the SPEC. This can be done by a power reset of the FEC.
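One way to make sure an acquisition is always stopped explicitly is to wrap it in a context manager. This is only a sketch: `FakeTdcChannel` and its `start()`/`stop()` methods are hypothetical stand-ins, not the real TDC Python library API, and would have to be replaced by whatever calls the library actually provides.

```python
from contextlib import contextmanager


class FakeTdcChannel:
    """Hypothetical stand-in for a TDC channel object (NOT the real library API)."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


@contextmanager
def acquisition(channel):
    """Start an acquisition and stop it on exit, even if the body raises."""
    channel.start()
    try:
        yield channel
    finally:
        # Runs on normal exit and on exceptions (including KeyboardInterrupt).
        channel.stop()


if __name__ == "__main__":
    ch = FakeTdcChannel()
    with acquisition(ch):
        print("acquiring:", ch.running)
    print("acquiring after exit:", ch.running)
```

Note that a `finally` block does not run if the process is killed with SIGKILL, so this reduces the risk but does not remove it.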
Workarounds:
- When logs like
can't arm ZIO trigger
are printed to dmesg, unload the tdc and zio drivers and load them again. This stops the SPEC from generating interrupts.
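To notice the condition early, the dmesg output can be scanned for the message shown in this report. The sketch below only detects the failure; the regular expression is derived from the three log lines above, and the function name `find_arm_failures` is made up for illustration. Reloading the drivers is left to the operator, since the exact module names are not stated here.

```python
import re

# Pattern based on the log lines quoted in this report.
ARM_FAILURE = re.compile(
    r"fmc_tdc (?P<dev>\S+): DMA failed for channel (?P<chan>\d+): "
    r"can't arm ZIO trigger"
)


def find_arm_failures(dmesg_text):
    """Return sorted unique (device, channel) pairs that reported the failure."""
    hits = set()
    for line in dmesg_text.splitlines():
        match = ARM_FAILURE.search(line)
        if match:
            hits.add((match.group("dev"), int(match.group("chan"))))
    return sorted(hits)


if __name__ == "__main__":
    import subprocess

    # Reading dmesg may require elevated privileges on some systems.
    text = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    for dev, chan in find_arm_failures(text):
        print(f"{dev}: channel {chan} cannot arm ZIO trigger, reload tdc and zio drivers")
```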