google/periph

mfrc522 constantly returns "IRQ error" after first read

moqmar opened this issue · 11 comments

Describe the bug
After calling *Dev.ReadCard() for the first time, subsequent calls always immediately return the error mfrc522 lowlevel: IRQ error (which is thrown in low_level.go:326 for unexplained reasons - seems like the timer interrupt is enabled according to the datasheet?!). If a card is present during the call to ReadCard, it will be read normally, but the interrupt is not working anymore.

To Reproduce
Steps to reproduce the behavior:

  1. Create program
package main

import (
	"fmt"
	"periph.io/x/periph/conn/gpio/gpioreg"
	"periph.io/x/periph/conn/spi/spireg"
	"periph.io/x/periph/experimental/devices/mfrc522"
	"periph.io/x/periph/experimental/devices/mfrc522/commands"
	"periph.io/x/periph/host"
	"time"
)

func assertNil(v interface{}) { if v != nil { panic(v) } }
func assertNotNil(v interface{}) { if v == nil { panic(v) } }

func main() {
	_, err := host.Init()
	assertNil(err)

	port, err := spireg.Open("SPI0.0")
	assertNil(err)

	reset := gpioreg.ByName("149")
	assertNotNil(reset)

	irq := gpioreg.ByName("200")
	assertNotNil(irq)

	conn, err := mfrc522.NewSPI(port, reset, irq)
	assertNil(err)

	ver, err := conn.LowLevel.DevRead(commands.VersionReg)
	assertNil(err)
	fmt.Printf("Chip Version: %v\n", ver)

	for {
		data, err := conn.ReadCard(-1, commands.PICC_AUTHENT1A, 1, 0, mfrc522.DefaultKey)
		if err == nil {
			fmt.Printf("Card Data: %x\n", data)
		} else {
			fmt.Printf("Card Error: %s\n", err)
		}
		time.Sleep(500 * time.Millisecond)
	}
}
  1. Run it.
  2. Touch a card against the reader -> will be read
  3. Remove the card -> IRQ errors start showing up:
    image
  4. Restart the program -> "fixes" the error, we're now at step 2 again

Expected behavior
When running ReadCard multiple times, I should get the same result as the first time (i.e. waiting for an interrupt to occur).

Platform (please complete the following information):

  • OS: Poky from Yocto Project (with sysfs GPIO)
  • Board: Jetson Nano Developer Kit

Closing & reopening the port after each read is a workaround (see below), but I couldn't find any code called by NewSPI() that could solve this problem and is not also called from ReadCard()...

assertNil(port.Close())
port, err = spireg.Open("SPI0.0")
assertNil(err)
conn, err = mfrc522.NewSPI(port, reset, irq)
assertNil(err)

Thanks a lot for detailed report! Can you confirm the interrupt line with an oscilloscope? If not, we'll probably need help with debugging this.

I too have encountered this issue on a Raspberry Pi 3. Reinitializing the driver (halting & recreating the PortCloser and Dev) solve the issue.

Thanks a lot for detailed report! Can you confirm the interrupt line with an oscilloscope? If not, we'll probably need help with debugging this.

I am facing the same issue, here's a trace of the IRQ line:
DS2_QuickPrint1

It's reading the UID 3 times then you see the issue trailing off after that, so the IRQ line is falling but I'm unsure if that's before or after the error is seen.

Code

package main

import (
	"encoding/hex"
	"log"
	"time"

	"periph.io/x/periph/conn/spi/spireg"
	"periph.io/x/periph/experimental/devices/mfrc522"
	"periph.io/x/periph/host"
	"periph.io/x/periph/host/rpi"
)

func main() {
	if _, err := host.Init(); err != nil {
		log.Fatal("Failed to init periphio: ", err)
	}

	p, err := spireg.Open("")
	if err != nil {
		log.Fatal("Failed to open spireg: ", err)
	}

	nfc, err := mfrc522.NewSPI(p, rpi.P1_22, rpi.P1_18)
	if err != nil {
		log.Fatal("Failed to init MFRC522: ", err)
	}
	defer nfc.Halt()

	for {
		uid, err := nfc.ReadUID(10 * time.Second)
		log.Printf("UID: %s, err: %s", hex.EncodeToString(uid), err)
	}
}

Log
alan@brackenpi:~ $ ./mfrc522test 
2021/03/06 15:50:00 UID: f76783d97e, err: %!s(<nil>)
2021/03/06 15:50:00 UID: f76783d97e, err: %!s(<nil>)
2021/03/06 15:50:00 UID: f76783d97e, err: %!s(<nil>)
2021/03/06 15:50:00 UID: , err: mfrc522: back data expected 5, actual 1
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:00 UID: , err: mfrc522 lowlevel: IRQ error
2021/03/06 15:50:01 UID: , err: mfrc522 lowlevel: IRQ error
... etc

Happy do do more debug.

This is interesting:

periph/host/sysfs/gpio.go

Lines 191 to 192 in 2d212de

// TODO(maruel): According to pigpio, the correct way to consume the
// interrupt is to call Seek().

Yeah that's a separate work item. It's possible that edges are missed. It's fairly rare on RPi but on AllWinner based CPUs it's sadly totally unreliable.

I see. Do you think It's the MFRC522 interrupting immediately (even without a tag) and the code reading that as an error somehow, or do you think the interrupt line falling is just an artefact of the repeated read attempts? I suppose we'd need to instrument the code with GPIO's to tell for sure (my scope is only 2-channel).

I have no idea, I never used this device myself.

DS2_QuickPrint2

When reading a UID once there are always two device interrupts, as seen above. There is only one call to gpio.PinIn.WaitForEdge() at the start of selectCard, this means that the next call to WaitForEdge will return immediately as there has been an edge since the last call. So we either need to prevent the second device interrupt from happening or service the host interrupt appropriately.


The device interrupt selected (above) interrupts when "receiver has detected the end of a valid data stream". We will get many of these during any transaction, and apparently at least one if we try to read with no card present as this condition doesn't clear itself after one failed read. There doesn't appear to be a more fine grained interrupt available, so it looks like we have to pick a sensible point to service the host GPIO interrupt.

I think we can call WaitForEdge(0) before line 198 in low_level.go: https://github.com/google/periph/blob/master/experimental/devices/mfrc522/commands/low_level.go#L196-L209
This will clear the host GPIO interrupt before we asynchronously call WatForEdge(timeout), we don't expect a device interrupt until after ComIrqReg is cleared by line 207. The docstring for this function already says it is waiting for IRQ to "strobe", not for it to have previously "strobed".

In theory we could prevent the second device interrupt by not clearing ComIrqReg in Init() which is called after WaitForEdge() in selectCard(), but doing so prevents any UID being read, I'm not sure why.

I also realise this whole conversation should probably move to this repo? https://github.com/periph/devices

Yes, I didn't transfer the issues to the new repos yet. I just realized it's actually fairly easy, it's just a bit tedious as I have to do it for individual issues. Will try on an inactive issue first to confirm it works.

Attempting to migrate an issue results in a 500 (!) so let's stick it here for now.