pimoroni/mlx90640-library

MLX90640 Jetson Nano Support

S4WRXTTCS opened this issue · 11 comments

Are there any plans on adding support for the NVIDIA Jetson Products?

It seems to me that the Jetson Nano is the perfect host for this kind of device because of the ability to run Neural Networks really quickly. Here is a great example of what I'd like to do with the Pimoroni MLX90640 Thermal Camera breakout board.

https://towardsdatascience.com/detecting-people-with-a-raspberrypi-a-thermal-camera-and-machine-learning-376d3bbcd45c

I didn't get very far with this, as I get I2C read errors when I run any of the examples. Googling, I could only find one other person struggling to get it to work on a Jetson platform. I posted a more detailed version of what I tried, and what I'm going to try, there.

https://devtalk.nvidia.com/default/topic/1065576/jetson-tx2/i2c-read-error-mlx90640-thermal-sensor/post/5410830/#5410830

Sorry, we just don't have the people-power to support anything beyond the Pi/Raspbian.

I'm concerned that this library is causing the MLX90640 to become inoperative when it's run on an NVIDIA Jetson device.

As an example, I tested a narrow-angle MLX90640 on the Raspberry Pi using this library, and both the rawrgb and sdlscale examples worked fine. Then I tried a Python library for it, located at https://github.com/a-kore/mlx90640-python, and that worked fine as well. At least it executed fine and displayed data from the sensor.

Then I connected it to a Jetson Nano. This time I started with the Python library from above, and that worked fine on the Jetson Nano once I modified the I2C port number.

Then I tried this library, running the test example and the video example. Neither of them worked, and the video one reported I2C read errors. I did check the I2C port, and I believe I had to modify it (on mine it's /dev/i2c-0).

Then I went back to the Python library, and that no longer worked.

So then I turned off the Jetson Nano, moved the sensor back over to the Raspberry Pi, and booted it. When I tried the sdlscale app, it started, in that it switched the screen, but it didn't show a window like it normally does. Then I tried rawrgb, and it reported back the "frameData timeout error waiting for dataReady" message.

Now I did save the EEPROM contents before starting this experiment so I do believe I can somehow restore this sensor. My first step is to compare the content values to see what's getting messed up.

I can totally understand not having people-power to support anything beyond the PI/Raspbian, but there needs to be some kind of notice telling people not to try it with the NVIDIA Jetson lineup.

So far it's made three of the sensors I have inoperative, and the person from the post had a few as well. It's not an electrical issue as the Jetson Nano has a 3.3VDC port.

I'm going to try to contact Melexis to see if they know anything about the issue, and what it could be.

I've verified that on the Jetson Nano this library is writing to the EEPROM contents of the sensor. I don't see any reason that it should be doing that, and I'm assuming it's unintentional behavior resulting from the I2C interface being different. The Jetson Nano is, after all, an unsupported device.

My recommendation would be to add bulletproofing to the code, such as verifying each register write, so that instead of bricking the sensor it simply displays an error message.
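As a rough illustration of what such bulletproofing could look like, here is a minimal Python sketch. It is not code from this library or from the Melexis driver. The `FakeBus` class, the `guarded_write` helper, and the `WriteVerifyError` exception are hypothetical names, and the EEPROM address range is an assumption that should be checked against the MLX90640 datasheet. The idea is to refuse any write that targets the EEPROM range and to read back every other write to confirm it landed.

```python
# Assumed MLX90640 EEPROM word-address range (verify against the datasheet).
EEPROM_FIRST = 0x2400
EEPROM_LAST = 0x273F


class WriteVerifyError(Exception):
    """Raised when a write is refused or cannot be confirmed."""


class FakeBus:
    """Hypothetical in-memory stand-in for a device with 16-bit registers."""

    def __init__(self, drop_writes=False):
        self.regs = {}
        self.drop_writes = drop_writes  # simulate a write that never lands

    def write_word(self, address, value):
        if not self.drop_writes:
            self.regs[address] = value & 0xFFFF

    def read_word(self, address):
        # Unwritten locations read back as 0xFFFF in this stand-in.
        return self.regs.get(address, 0xFFFF)


def guarded_write(bus, address, value, mask=0xFFFF):
    """Write one word, refusing EEPROM addresses and verifying by read-back.

    `mask` selects the bits that must match on read-back, for registers
    where the device itself may change some bits.
    """
    if EEPROM_FIRST <= address <= EEPROM_LAST:
        raise WriteVerifyError(
            f"refusing to write EEPROM address 0x{address:04X}"
        )
    bus.write_word(address, value)
    readback = bus.read_word(address)
    if (readback & mask) != (value & mask):
        raise WriteVerifyError(
            f"write to 0x{address:04X} not confirmed: "
            f"wrote 0x{value:04X}, read 0x{readback:04X}"
        )
    return readback
```

With a wrapper like this, a write aimed at the EEPROM range raises an error before anything is sent to the bus, and a write that does not take effect is reported instead of being silently ignored. The register address 0x800D used in any test of this sketch is only an example value outside the assumed EEPROM range.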

I don't know how often the bricking is going to happen, but lots of vendors have MLX90640 boards, and most end-users like myself are going to use the Pimoroni fork because, as far as I know, it's the best starting point: it has examples, etc.

I've attached a ZIP file of the EEPROM contents before and after using this library on the Jetson Nano.
EEPROM_Comparison.zip

Here is a list of the changes:
The value at 0x018 changed from 0x01 to 0xFF
The value at 0x019 changed from 0x19 to 0xFF
The value at 0x0EA changed from 0xEE to 0xFF
The value at 0x0EC changed from 0xEE to 0xFF
The value at 0x0ED changed from 0xFB to 0xFF
The value at 0x1BC changed from 0x50 to 0xFF
The value at 0x1BD changed from 0xF8 to 0xFF
The value at 0x1C0 changed from 0x12 to 0xFF
The value at 0x1C1 changed from 0x14 to 0xFF
The value at 0x290 changed from 0xF0 to 0xFF
The value at 0x291 changed from 0x0F to 0xFF
The value at 0x294 changed from 0x30 to 0xFF
The value at 0x295 changed from 0x23 to 0xFF
The value at 0x362 changed from 0x50 to 0xFF
The value at 0x363 changed from 0xF4 to 0xFF
The value at 0x368 changed from 0x02 to 0xFF
The value at 0x369 changed from 0xFC to 0xFF
The value at 0x436 changed from 0xD0 to 0xFF
The value at 0x437 changed from 0xE4 to 0xFF
The value at 0x43C changed from 0x22 to 0xFF
The value at 0x43D changed from 0x00 to 0xFF
The value at 0x508 changed from 0x20 to 0xFF
The value at 0x509 changed from 0xF8 to 0xFF
The value at 0x510 changed from 0x30 to 0xFF
The value at 0x511 changed from 0xF8 to 0xFF
The value at 0x5DC changed from 0x6E to 0xFF
The value at 0x5DD changed from 0xFC to 0xFF
The value at 0x5E4 changed from 0x7E to 0xFF
The value at 0x5E5 changed from 0xF8 to 0xFF
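
For anyone who wants to run the same comparison on their own sensor, a list like the one above can be produced with a few lines of Python. This is only a sketch: it assumes the two dumps are available as raw byte strings of equal length (for example, read from binary files saved before and after), and the function names `diff_dumps` and `format_change` are made up for illustration.

```python
from typing import List, Tuple


def diff_dumps(before: bytes, after: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, old_value, new_value) for every byte that differs."""
    if len(before) != len(after):
        raise ValueError("dumps differ in length; cannot compare byte by byte")
    return [
        (offset, old, new)
        for offset, (old, new) in enumerate(zip(before, after))
        if old != new
    ]


def format_change(offset: int, old: int, new: int) -> str:
    """Render one change in the same style as the list above."""
    return (
        f"The value at 0x{offset:03X} changed "
        f"from 0x{old:02X} to 0x{new:02X}"
    )


if __name__ == "__main__":
    # Synthetic data that mimics the first two entries of the list.
    before = bytearray(0x20)
    before[0x18] = 0x01
    before[0x19] = 0x19
    after = bytearray(before)
    after[0x18] = 0xFF
    after[0x19] = 0xFF
    for change in diff_dumps(bytes(before), bytes(after)):
        print(format_change(*change))
```

To compare real dumps, read each file with `open(path, "rb").read()` and pass the two results to `diff_dumps`.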

Hello everyone, has there been any progress on this, such as why it is happening?
@S4WRXTTCS have you found any way to resolve this?

@S4WRXTTCS thanks for your sleuthing here-

First and foremost do you know if it's possible to restore a corrupted EEPROM on an MLX90640?

Have you arrived at any conclusions about why the library might be affecting the EEPROM at all? And, further to this, why changing the EEPROM (which as far as I'm aware is just calibration values?) would cause communication with the device to fail?

I don't intend to diverge this codebase too far from the upstream C library which it wraps, so perhaps (unless it's really something in this library causing the issue) the bulletproofing you suggest is something Melexis should be implementing across the board in https://github.com/melexis/mlx90640-library?

Without a Jetson Nano, time, or any real insight into the mlx90640-library (since this is just a minimal Python wrapper) I don't really have a hope at finding a fix, but I would be remiss not to post a warning for MLX90640/Jetson Nano users as you suggest.

I think a warning works well, but I'd say "bricked" rather than "damaged". It might be helpful to link to the following Python project for people who want to use the MLX90640 with the Jetson Nano. As far as I can tell it works, but I haven't had a chance to fully vet it.

https://github.com/a-kore/mlx90640-python

I did start a dialog with Melexis to figure out how to restore the EEPROM. On one of them I saved the prior EEPROM (before the corruption), and on the other one I didn't (it was before I knew what was going on).

They're also looking into why the corruption happens. They say it's most likely related to the repeated start which is required to read data. They didn't have a Jetson Nano, but ordered one after I let them know about this issue. This sensor is really attractive to people like myself who are building "AI things" that utilize thermal sensors along with the Jetson Nano.

Has Melexis provided a fix for this? I'm in the same boat, have two bricked sensors. Still have one working one so I can grab the EEPROM from it if needed.

Might be worth raising an issue here: https://github.com/melexis/mlx90640-library/issues

Albeit it may require reproducing the problem using just their C++ library (which, given that the outcome is a dead sensor, may not be particularly favourable).

I don't think it's clear cut that the problem is with Melexis code and not our integration of it, either, but I can't do much other than speculate here.

I'll raise an issue with MLX as well, may be worth a try. After playing around with the C++ examples, I did manage to get some output from both bricked sensors. What I did was run the "step" example first (which failed with a seg fault error), but it put the sensor into measurement mode and let me run the "rawrgb" example. One sensor showed a nearly normal picture but with incorrect temp values, another seemed to output more garbage so it may have been more damaged. I think what did it was the "MLX90640_StartMeasurement(MLX_I2C_ADDR, 0);" function that the step example uses. After adding it to the rawrgb code and recompiling, I was able to get output from it right away, without having to run the step example first.

I'm having the same issue: two bricked sensors, but it happened on a rpi module 3. I am trying to make things work with both a Pi camera and the IR camera.
Does anybody know how I can restore the EEPROM data?

Same issue for me. Sensor worked fine with the python backend but got corrupted EEPROM after using the C API. Would love a way of restoring EEPROM values.