Simple PCIe FMC carrier SPEC - Software issues
https://ohwr.org/project/spec-sw/issues
2019-02-12T10:13:40Z
https://ohwr.org/project/spec-sw/issues/4
inappropriate module initialization/cleanup
2019-02-12T10:13:40Z
Piotr Miedzik
Module initialization/cleanup is inappropriate when the parameters
**test\_irq** or **use\_msi** are used:
[root@sdapc009 ~]# modprobe spec test_irq=1
[root@sdapc009 ~]# dmesg -c
[ 0.000000] Linux version 3.10.0-327.4.5.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Mon Jan 25 22:07:14 UTC 2016
...
[ 320.873923] fmc: module verification failed: signature and/or required key missing - tainting kernel
[ 320.897785] spec 0000:03:00.0: probe for device 0003:0000
[ 320.948298] spec 0000:03:00.0: got file "fmc/spec-init.bin", 1485236 (0x16a9b4) bytes
[ 321.139869] spec 0000:03:00.0: FPGA programming successful
[ 321.556011] spec 0000:03:00.0: received interrupt 16
[ 321.556015] spec 0000:03:00.0: Interrupts work as expected
[ 321.556020] spec 0000:03:00.0: mezzanine 0
[ 321.556022] Manufacturer: CERN
[ 321.556024] Product name: FmcAdc100m14b4cha
[root@sdapc009 ~]# rmmod spec
[root@sdapc009 ~]# modprobe spec use_msi=1
[root@sdapc009 ~]# dmesg -c
[ 452.371586] spec 0000:03:00.0: probe for device 0003:0000
[ 452.371909] spec 0000:03:00.0: irq 32 for MSI/MSI-X
[ 452.372593] spec 0000:03:00.0: got file "fmc/spec-init.bin", 1485236 (0x16a9b4) bytes
[ 452.564164] spec 0000:03:00.0: FPGA programming successful
[ 452.931262] spec 0000:03:00.0: mezzanine 0
[ 452.931269] Manufacturer: CERN
[ 452.931270] Product name: FmcAdc100m14b4cha
[root@sdapc009 ~]# rmmod spec
[root@sdapc009 ~]# modprobe spec test_irq=1
Message from syslogd@sdapc009 at Feb 2 18:51:12 ...
kernel:do_IRQ: 1.224 No irq handler for vector (irq -1)
[root@sdapc009 ~]# dmesg -c
[ 485.935712] spec 0000:03:00.0: remove
[ 491.951932] spec 0000:03:00.0: probe for device 0003:0000
[ 491.952742] spec 0000:03:00.0: got file "fmc/spec-init.bin", 1485236 (0x16a9b4) bytes
[ 492.144313] spec 0000:03:00.0: FPGA programming successful
[ 492.511266] do_IRQ: 1.224 No irq handler for vector (irq -1)
[ 492.561012] spec 0000:03:00.0: received interrupt 32
[ 492.561017] spec 0000:03:00.0: Interrupts work as expected
[ 492.561021] spec 0000:03:00.0: mezzanine 0
[ 492.561023] Manufacturer: CERN
[ 492.561025] Product name: FmcAdc100m14b4cha
After reboot:
[root@sdapc009 ~]# rmmod spec
[root@sdapc009 ~]# modprobe spec use_msi=1 test_irq=1
[root@sdapc009 ~]# dmesg -c
[ 230.692595] spec 0000:03:00.0: remove
[ 234.013838] spec 0000:03:00.0: probe for device 0003:0000
[ 234.013853] ------------[ cut here ]------------
[ 234.013859] WARNING: at drivers/pci/msi.c:955 pci_enable_msi_block+0x98/0xb0()
[ 234.013860] Modules linked in: spec(OE+) fmc(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter coretemp kvm iTCO_wdt iTCO_vendor_support dcdbas i2c_i801 ppdev pcspkr snd_hda_codec_analog snd_hda_codec_generic snd_hda_intel snd_hda_codec sg lpc_ich mfd_core snd_hda_core snd_hwdep shpchp snd_seq snd_seq_device x38_edac parport_pc snd_pcm snd_timer snd parport soundcore edac_core binfmt_misc nfsd nfs_acl lockd grace
[ 234.013895] auth_rpcgss sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common sr_mod cdrom nouveau ahci libahci video mxm_wmi wmi i2c_algo_bit drm_kms_helper serio_raw ttm libata tg3 drm ptp pps_core i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: spec]
[ 234.013913] CPU: 1 PID: 2100 Comm: modprobe Tainted: G OE ------------ 3.10.0-327.4.5.el7.x86_64 #1
[ 234.013915] Hardware name: Dell Inc. Precision WorkStation T3400 /0TP412, BIOS A08 08/14/2008
[ 234.013916] 0000000000000000 00000000cad93fd3 ffff88005ca1fb00 ffffffff8163515c
[ 234.013919] ffff88005ca1fb38 ffffffff8107b200 ffff88007c3b0000 0000000000000001
[ 234.013921] ffff88007c3b0000 ffff88007c3b0098 ffff88007c3b0098 ffff88005ca1fb48
[ 234.013924] Call Trace:
[ 234.013929] [<ffffffff8163515c>] dump_stack+0x19/0x1b
[ 234.013932] [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
[ 234.013934] [<ffffffff8107b34a>] warn_slowpath_null+0x1a/0x20
[ 234.013936] [<ffffffff813423c8>] pci_enable_msi_block+0x98/0xb0
[ 234.013940] [<ffffffffa079e545>] spec_probe+0x275/0x2c0 [spec]
[ 234.013944] [<ffffffff81327d75>] local_pci_probe+0x45/0xa0
[ 234.013946] [<ffffffff81329065>] ? pci_match_device+0xe5/0x120
[ 234.013948] [<ffffffff813291d9>] pci_device_probe+0xf9/0x150
[ 234.013951] [<ffffffff813f6417>] driver_probe_device+0x87/0x390
[ 234.013953] [<ffffffff813f67f3>] __driver_attach+0x93/0xa0
[ 234.013955] [<ffffffff813f6760>] ? __device_attach+0x40/0x40
[ 234.013957] [<ffffffff813f4183>] bus_for_each_dev+0x73/0xc0
[ 234.013959] [<ffffffff813f5e6e>] driver_attach+0x1e/0x20
[ 234.013961] [<ffffffff813f59c0>] bus_add_driver+0x200/0x2d0
[ 234.013963] [<ffffffff813f6e74>] driver_register+0x64/0xf0
[ 234.013966] [<ffffffff81328d15>] __pci_register_driver+0xa5/0xc0
[ 234.013970] [<ffffffffa07a5000>] ? 0xffffffffa07a4fff
[ 234.013973] [<ffffffffa07a501e>] spec_init+0x1e/0x1000 [spec]
[ 234.013976] [<ffffffff810020e8>] do_one_initcall+0xb8/0x230
[ 234.013979] [<ffffffff810ed4ae>] load_module+0x134e/0x1b50
[ 234.013982] [<ffffffff81316800>] ? ddebug_proc_write+0xf0/0xf0
[ 234.013985] [<ffffffff810e9743>] ? copy_module_from_fd.isra.42+0x53/0x150
[ 234.013987] [<ffffffff810ede66>] SyS_finit_module+0xa6/0xd0
[ 234.013990] [<ffffffff816458c9>] system_call_fastpath+0x16/0x1b
[ 234.013992] ---[ end trace 5c4efa379a592a47 ]---
[ 234.015037] spec 0000:03:00.0: irq 33 for MSI/MSI-X
[ 234.015634] spec 0000:03:00.0: irq 34 for MSI/MSI-X
[ 234.015642] ------------[ cut here ]------------
[ 234.015647] WARNING: at fs/sysfs/dir.c:526 sysfs_add_one+0xa5/0xd0()
[ 234.015648] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:1c.0/0000:03:00.0/msi_irqs'
[ 234.015649] Modules linked in: spec(OE+) fmc(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter coretemp kvm iTCO_wdt iTCO_vendor_support dcdbas i2c_i801 ppdev pcspkr snd_hda_codec_analog snd_hda_codec_generic snd_hda_intel snd_hda_codec sg lpc_ich mfd_core snd_hda_core snd_hwdep shpchp snd_seq snd_seq_device x38_edac parport_pc snd_pcm snd_timer snd parport soundcore edac_core binfmt_misc nfsd nfs_acl lockd grace
[ 234.015682] auth_rpcgss sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common sr_mod cdrom nouveau ahci libahci video mxm_wmi wmi i2c_algo_bit drm_kms_helper serio_raw ttm libata tg3 drm ptp pps_core i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: spec]
[ 234.015699] CPU: 1 PID: 2100 Comm: modprobe Tainted: G W OE ------------ 3.10.0-327.4.5.el7.x86_64 #1
[ 234.015700] Hardware name: Dell Inc. Precision WorkStation T3400 /0TP412, BIOS A08 08/14/2008
[ 234.015702] ffff88005ca1f948 00000000cad93fd3 ffff88005ca1f900 ffffffff8163515c
[ 234.015704] ffff88005ca1f938 ffffffff8107b200 00000000ffffffef ffff88005dc7b2a0
[ 234.015706] ffff88005ca1f9e8 ffff88005d95d000 0000000000000000 ffff88005ca1f9a0
[ 234.015708] Call Trace:
[ 234.015712] [<ffffffff8163515c>] dump_stack+0x19/0x1b
[ 234.015715] [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
[ 234.015717] [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
[ 234.015719] [<ffffffff8125a6b5>] sysfs_add_one+0xa5/0xd0
[ 234.015721] [<ffffffff8125a8ac>] create_dir+0x7c/0xe0
[ 234.015723] [<ffffffff8125abac>] sysfs_create_subdir+0x1c/0x20
[ 234.015726] [<ffffffff8125c4fd>] internal_create_group+0x6d/0x290
[ 234.015728] [<ffffffff8125c94a>] sysfs_create_groups+0x4a/0xa0
[ 234.015731] [<ffffffff81341bdd>] populate_msi_sysfs+0x1cd/0x210
[ 234.015733] [<ffffffff81342259>] msi_capability_init+0x179/0x250
[ 234.015736] [<ffffffff81342396>] pci_enable_msi_block+0x66/0xb0
[ 234.015740] [<ffffffffa079e545>] spec_probe+0x275/0x2c0 [spec]
[ 234.015742] [<ffffffff81327d75>] local_pci_probe+0x45/0xa0
[ 234.015745] [<ffffffff81329065>] ? pci_match_device+0xe5/0x120
[ 234.015747] [<ffffffff813291d9>] pci_device_probe+0xf9/0x150
[ 234.015749] [<ffffffff813f6417>] driver_probe_device+0x87/0x390
[ 234.015752] [<ffffffff813f67f3>] __driver_attach+0x93/0xa0
[ 234.015754] [<ffffffff813f6760>] ? __device_attach+0x40/0x40
[ 234.015755] [<ffffffff813f4183>] bus_for_each_dev+0x73/0xc0
[ 234.015758] [<ffffffff813f5e6e>] driver_attach+0x1e/0x20
[ 234.015759] [<ffffffff813f59c0>] bus_add_driver+0x200/0x2d0
[ 234.015762] [<ffffffff813f6e74>] driver_register+0x64/0xf0
[ 234.015764] [<ffffffff81328d15>] __pci_register_driver+0xa5/0xc0
[ 234.015768] [<ffffffffa07a5000>] ? 0xffffffffa07a4fff
[ 234.015771] [<ffffffffa07a501e>] spec_init+0x1e/0x1000 [spec]
[ 234.015773] [<ffffffff810020e8>] do_one_initcall+0xb8/0x230
[ 234.015775] [<ffffffff810ed4ae>] load_module+0x134e/0x1b50
[ 234.015777] [<ffffffff81316800>] ? ddebug_proc_write+0xf0/0xf0
[ 234.015780] [<ffffffff810e9743>] ? copy_module_from_fd.isra.42+0x53/0x150
[ 234.015782] [<ffffffff810ede66>] SyS_finit_module+0xa6/0xd0
[ 234.015785] [<ffffffff816458c9>] system_call_fastpath+0x16/0x1b
[ 234.015786] ---[ end trace 5c4efa379a592a48 ]---
[ 234.015862] spec 0000:03:00.0: spec_probe: enable msi block: error -17
[ 234.017045] spec 0000:03:00.0: got file "fmc/spec-init.bin", 1485236 (0x16a9b4) bytes
[ 234.208615] spec 0000:03:00.0: FPGA programming successful
[ 234.575195] spec 0000:03:00.0: invalid msi control: 0x0084
[ 234.625012] spec 0000:03:00.0: received interrupt 32
[ 234.625016] spec 0000:03:00.0: Interrupts work as expected
[ 234.625021] spec 0000:03:00.0: mezzanine 0
[ 234.625023] Manufacturer: CERN
[ 234.625024] Product name: FmcAdc100m14b4cha
In addition, the kernel is then unable to reboot.
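The load/unload sequence shown in the logs can be replayed with a small script, which may help when checking a fix. This is only a sketch: the function names are made up for illustration, and the patterns are taken from the log lines quoted in this report, not from the driver source. It needs root, and given the reboot problem it is best run on a test machine.

```sh
#!/bin/sh
# Sketch: replay the modprobe/rmmod sequence from the report and flag symptoms.
# Only modprobe, rmmod, dmesg and the module parameters come from the report;
# the function names below are hypothetical.

# Print how many lines of kernel log text on stdin match a symptom
# seen in the report. "|| true" keeps the exit status 0 when grep
# finds nothing (grep still prints 0 in that case).
count_symptoms() {
    grep -c -E 'WARNING: at |No irq handler for vector|enable msi block: error|invalid msi control' || true
}

# Load the spec module with the given parameters, report how many
# symptom lines the load produced, then unload the module again.
try_params() {
    dmesg -c >/dev/null                 # start from an empty kernel log
    modprobe spec "$@" || return 1
    echo "$*: $(dmesg -c | count_symptoms) symptom line(s)"
    rmmod spec
}

# The order of loads that led to the stray interrupt in the report.
replay_report() {
    try_params test_irq=1 &&
    try_params use_msi=1 &&
    try_params test_irq=1
}
```

Calling `replay_report` as root prints one summary line per load; a non-zero count on the last load would match the behaviour reported here.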
https://ohwr.org/project/spec-sw/issues/3
Cannot compile WR NIC with kernel >=4.5
2019-02-12T10:13:40Z
Dimitris Lampridis
Running make on a host computer with Linux kernel version >= 4.5
produces:
```
CC [M] spec-sw/kernel/wr-nic-gpio.o
spec-sw/kernel/wr-nic-gpio.c: In function ‘gc_to_fmc’:
spec-sw/kernel/wr-nic-gpio.c:20:25: error: ‘struct gpio_chip’ has no member named ‘dev’
struct device *dev = gc->dev;
^
spec-sw/kernel/wr-nic-gpio.c: In function ‘wrn_gpio_init’:
spec-sw/kernel/wr-nic-gpio.c:75:4: error: ‘struct gpio_chip’ has no member named ‘dev’
gc->dev = &fmc->dev;
^
```
This is due to a change in struct gpio\_chip introduced in 4.5, where
the "dev" field was renamed to
"parent":
http://lxr.free-electrons.com/diff/include/linux/gpio/driver.h?v=4.4;diffvar=v;diffval=4.5
I believe that the attached patch (against current master) solves this.
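To see quickly which field name the installed kernel headers use, a grep over the header is enough. This is a heuristic sketch, not part of the attached patch: the function name is hypothetical, and it assumes the field is declared as `struct device *parent;` inside `include/linux/gpio/driver.h`.

```sh
#!/bin/sh
# Sketch: report whether a gpio driver.h declares the "parent" field
# (4.5 and later, per this report) or still uses the older "dev" name.
# $1: path to include/linux/gpio/driver.h
gpio_chip_field() {
    if grep -q -E 'struct device[[:space:]]*\*[[:space:]]*parent;' "$1"; then
        echo parent
    else
        echo dev
    fi
}
```

A typical call, assuming the usual `build` symlink of the running kernel exists, would be `gpio_chip_field "/lib/modules/$(uname -r)/build/include/linux/gpio/driver.h"`.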
### Files
* [0001-wr_nic-update-to-new-4.5-kernel-struct-gpio_chip.patch](/uploads/2b8ce038a8c84fa8fa7c494ab67d6256/0001-wr_nic-update-to-new-4.5-kernel-struct-gpio_chip.patch)
https://ohwr.org/project/spec-sw/issues/2
uninstall option for kernel modules
2019-02-12T10:13:39Z
Tjeerd Pinkert
Would it be possible to add an 'uninstall' target to the Makefile that
removes the previously installed files?
That would help keep our systems clean. For an update we could then
uninstall, update the source, compile and install.
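One possible shape for such a cleanup, written as a shell function that a Makefile target could call. It is a sketch under assumptions: the function name is hypothetical, and both the install directory and the module file names must be adapted to what the install step actually wrote on your system.

```sh
#!/bin/sh
# Sketch: remove named module files from one directory and list what was removed.
# $1: directory the modules were installed into
# remaining arguments: module file names to remove
uninstall_modules() {
    dir=$1
    shift
    for name in "$@"; do
        # Only touch files that exist; other modules in the directory stay.
        if [ -f "$dir/$name" ]; then
            rm -f "$dir/$name" && echo "removed $dir/$name"
        fi
    done
}
```

For example, `uninstall_modules "/lib/modules/$(uname -r)/extra" spec.ko wr-nic.ko && depmod -a`, where the `extra` directory and the two file names are assumptions, not something confirmed by the spec-sw Makefile.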
https://ohwr.org/project/spec-sw/issues/1
kernel crashes upon multiple (five to ten) module reloads.
2019-02-12T10:13:38Z
Tjeerd Pinkert
When I repeatedly reload the kernel modules for our usual restart tests,
things break after a few reloads, typically five to ten.
I'm using the wrpc-v3.0 from the provided binaries.
Nov 24 17:58:01 white-rabbit kernel: [ 975.714609] spec 0000:03:00.0: remove
Nov 24 17:58:01 white-rabbit kernel: [ 975.717550] spec 0000:03:00.0: probe for device 0003:0000
Nov 24 17:58:01 white-rabbit kernel: [ 975.718124] spec 0000:03:00.0: firmware: direct-loading firmware fmc/spec-init.bin
Nov 24 17:58:01 white-rabbit kernel: [ 975.718131] spec 0000:03:00.0: got file "fmc/spec-init.bin", 1485236 (0x16a9b4) bytes
Nov 24 17:58:02 white-rabbit kernel: [ 975.909666] spec 0000:03:00.0: FPGA programming successful
Nov 24 17:58:02 white-rabbit kernel: [ 976.339285] spec 0000:03:00.0: mezzanine 0
Nov 24 17:58:02 white-rabbit kernel: [ 976.339287] Manufacturer: CERN
Nov 24 17:58:02 white-rabbit kernel: [ 976.339288] Product name: FmcDIO5chTTLa
Nov 24 17:58:02 white-rabbit kernel: [ 976.340368] fmc FmcDIO5chTTLa-0300: Driver has no ID: matches all
Nov 24 17:58:02 white-rabbit kernel: [ 976.340407] spec 0000:03:00.0: reprogramming with fmc/wrpc_v3.0.bin
Nov 24 17:58:02 white-rabbit kernel: [ 976.340749] spec 0000:03:00.0: firmware: direct-loading firmware fmc/wrpc_v3.0.bin
Nov 24 17:58:02 white-rabbit kernel: [ 976.532347] spec 0000:03:00.0: FPGA programming successful
Nov 24 17:58:02 white-rabbit kernel: [ 976.575620] fmc_trivial FmcDIO5chTTLa-0300: Can't find SDB at address 0x0
Nov 24 17:59:01 white-rabbit CRON[1266]: (root) CMD ( modprobe -r fmc-trivial spec fmc; modprobe spec; modprobe fmc-trivial gateware=fmc/wrpc_v3.0.bin)
Nov 24 17:59:01 white-rabbit kernel: [ 1035.568446] spec 0000:03:00.0: remove
Nov 24 17:59:01 white-rabbit kernel: [ 1035.571587] spec 0000:03:00.0: probe for device 0003:0000
Nov 24 17:59:01 white-rabbit kernel: [ 1035.572087] spec 0000:03:00.0: firmware: direct-loading firmware fmc/spec-init.bin
Nov 24 17:59:01 white-rabbit kernel: [ 1035.572093] spec 0000:03:00.0: got file "fmc/spec-init.bin", 1485236 (0x16a9b4) bytes
Nov 24 17:59:02 white-rabbit kernel: [ 1035.763627] spec 0000:03:00.0: FPGA programming successful
Nov 24 17:59:02 white-rabbit kernel: [ 1035.803920] spec 0000:03:00.0: Can't find SDB magic
The machine crashes. It appears as though the SDB is broken, but I
suspect something different is going on. After a reboot I can again
reload 5 to 10 times before the same error appears. My tentative
conclusion is that something is broken not so much in the v3.0
spec-init.bin binary as in the kernel modules.
At the LM32 terminal the output simply stops, so there is no visible
trace of booting and crashing.
This might be a duplicate of a report by a colleague on the tracker...
We learn:
0-59/1 * * * * root modprobe -r fmc-trivial spec fmc; modprobe spec; modprobe fmc-trivial gateware=fmc/wrpc_v3.0.bin
in /etc/crontab is great for tracking down all kinds of issues...
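For a faster reproduction than one reload per minute, the same commands can be run in a loop that stops at the first failure. This is a sketch, not a tested tool: the function names are hypothetical, the commands are chained with `&&` where the crontab line uses `;`, and the failure pattern is simply the last log line quoted above. Since the machine crashes, the final message may never appear, so the cycle number is written to stderr before each reload.

```sh
#!/bin/sh
# Sketch: reload the modules repeatedly and stop at the first failure.

# One reload cycle, using the commands from the crontab line in the report.
reload_once() {
    modprobe -r fmc-trivial spec fmc &&
    modprobe spec &&
    modprobe fmc-trivial gateware=fmc/wrpc_v3.0.bin
}

# Print and clear the kernel log (needs root).
read_new_log() {
    dmesg -c
}

# Succeeds if the log text on stdin contains the failure from the report.
sdb_failure_seen() {
    grep -q "Can't find SDB magic"
}

# $1: maximum number of cycles
# $2: reload command (default: reload_once)
# $3: command that prints the new log lines (default: read_new_log)
stress_reload() {
    max=$1
    reload=${2:-reload_once}
    readlog=${3:-read_new_log}
    i=1
    while [ "$i" -le "$max" ]; do
        echo "cycle $i" >&2
        # Fail if the reload itself fails or the log shows the SDB error.
        if ! "$reload" || "$readlog" | sdb_failure_seen; then
            echo "failed at cycle $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $max cycles"
}
```

Running `stress_reload 20` as root would attempt up to twenty reloads; the second and third arguments exist only so the loop can be exercised with stand-in commands on a machine without the card.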
Yours, Tjeerd