
WeatherChimes devices crash after a variable length of time

Closed this issue · 8 comments

Describe the bug
Devices crash after running for anywhere from a few days to a week.

Hardware in Use
Standard WeatherChimes hardware (Tipping Bucket)

Expected behavior
These devices should not crash.


/**
 * In-lab use case example for the WeatherChimes project
 *
 * This project uses SDI12, TSL2591 and an SHT31 sensor to log environment data to both the SD card and MQTT/MongoDB
 */
#include <Loom_Manager.h>
#include <Logger.h>

#include <Hardware/Loom_Hypnos/Loom_Hypnos.h>

#include <Sensors/Loom_Analog/Loom_Analog.h>
#include <Sensors/I2C/Loom_SHT31/Loom_SHT31.h>
#include <Sensors/I2C/Loom_TSL2591/Loom_TSL2591.h>
#include <Sensors/I2C/Loom_MS5803/Loom_MS5803.h>

#include <Internet/Logging/Loom_MQTT/Loom_MQTT.h>
#include <Internet/Connectivity/Loom_LTE/Loom_LTE.h>

// Pin to have the secondary interrupt triggered from
#define INT_PIN A0

volatile bool sampleFlag = true; // Starts true so we sample on the first cycle; set to true in the ISR, set to false at the end of the sample loop
volatile bool tipFlag = false;
volatile int counter = 0;

// Used to track timing for debounce
unsigned long tip_time = 0;
unsigned long last_tip_time = 0;

Manager manager("Chime", 3);

// Create a new Hypnos object
Loom_Hypnos hypnos(manager, HYPNOS_VERSION::V3_3, TIME_ZONE::PST, true);

// Analog for reading battery voltage
Loom_Analog analog(manager);

// Create sensor classes
Loom_SHT31 sht(manager);
Loom_TSL2591 tsl(manager);
Loom_MS5803 ms_water(manager, 119); // 119(0x77) if CSB=LOW external, 118(0x76) if CSB=HIGH on WC PCB
Loom_MS5803 ms_air(manager, 118); // 118(0x76) if CSB=HIGH on WC PCB

Loom_LTE lte(manager, "hologram", "", "");
Loom_MQTT mqtt(manager, lte.getClient());

/* Calculate the water height based on the difference of pressures*/
float calculateWaterHeight(){
  // ((Water Pressure - Air Pressure) * 100 (conversion to pascals)) / (Water Density * Gravity)
  return (((ms_water.getPressure()-ms_air.getPressure()) * 100) / (997.77 * 9.81));
}

// Called when the RTC interrupt is triggered
void isrTrigger(){
  sampleFlag = true;
}

// Called when the tipping bucket interrupt is triggered
void tipTrigger() {
  hypnos.shouldPowerUp = false;
  tipFlag = true;
}

void setup() {

  // Enable debug SD logging and function summaries

  // Set the interrupt pin to pullup

  // Wait 20 seconds for the serial console to open

  // Enable the hypnos rails

  // Read the MQTT creds file to supply the device with MQTT credentials

  // Initialize all in-use modules

  // Register the ISR and attach to the interrupt

  attachInterrupt(INT_PIN, tipTrigger, FALLING);
}

void loop() {

    // Measure and package the data

    manager.addData("Tip_Bucket", "Tip_Count", counter);

    // Add the water height calculation to the data
    manager.addData("Water", "Height_(m)", calculateWaterHeight());
    // Print the current JSON packet

    // Log the data to the SD card              

    // Publish the collected data to MQTT

    // Set the RTC interrupt alarm to wake the device in 15 min
    hypnos.setInterruptDuration(TimeSpan(0, 0, 15, 0));

    // Reattach to the interrupt after we have set the alarm so we can have repeat triggers
    attachInterrupt(INT_PIN, tipTrigger, FALLING);
    attachInterrupt(INT_PIN, tipTrigger, FALLING);
    sampleFlag = false;

    digitalWrite(LED_BUILTIN, HIGH);
    tipFlag = false;
    attachInterrupt(INT_PIN, tipTrigger, FALLING);
    attachInterrupt(INT_PIN, tipTrigger, FALLING);
    digitalWrite(LED_BUILTIN, LOW);
  // Put the device into a deep sleep, operation HALTS here until the interrupt is triggered
}

EDIT 8/11/2023 - Updated code to latest version where issue is still present
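For reference, the arithmetic in `calculateWaterHeight()` can be checked off-device. The following is a minimal sketch, assuming both pressure readings are in millibar (which the "× 100 to pascals" comment in the sketch implies). The helper name `waterHeightMeters` and the constant names are hypothetical and not part of Loom; the density and gravity values are the ones used in the sketch above.

```cpp
// Constants copied from the sketch in this issue
constexpr double WATER_DENSITY_KG_M3 = 997.77;
constexpr double GRAVITY_M_S2 = 9.81;
constexpr double PA_PER_MBAR = 100.0;

// Hypothetical helper (not a Loom function): hydrostatic water height in meters
// from the submerged and atmospheric pressure readings, both assumed to be in mbar.
double waterHeightMeters(double waterMbar, double airMbar) {
    // h = (P_water - P_air) / (rho * g), with the pressure difference in pascals
    return ((waterMbar - airMbar) * PA_PER_MBAR) / (WATER_DENSITY_KG_M3 * GRAVITY_M_S2);
}
```

With these constants, a 10 mbar difference between the two sensors corresponds to roughly 0.102 m of water.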

Additional context
Mainly opening this issue so I have a place to log why it was crashing. A device just crashed, so I will pull the SD card and look at where it crashed when I am in the lab next.

Serial log showed that the device crashed before the first print of the next cycle

Upon further review, it appears that the power_up call to one of the sensors is initiating the hang

Unsure what the problem is currently. In the interest of keeping the Chimes working in Alaska, I moved the watchdog timer so that it turns on as soon as the device exits sleep mode, to prevent hangs. Currently testing.
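To illustrate why the watchdog ordering matters, here is a minimal host-side sketch. It is a simulation, not the Loom or Hypnos implementation: `SimWatchdog`, `wakeCycleTriggersReset`, and the timeout values are all hypothetical. It models a watchdog that is armed as the first action after wake, so a hang during sensor power-up is caught instead of stalling the device indefinitely.

```cpp
#include <cstdint>

// Hypothetical software model of a hardware watchdog, for illustration only.
// On real hardware the timer runs independently and resets the MCU on expiry.
struct SimWatchdog {
    bool enabled = false;
    uint32_t timeoutMs = 0;
    uint32_t lastFeedMs = 0;

    void enable(uint32_t nowMs, uint32_t timeout) {
        enabled = true;
        timeoutMs = timeout;
        lastFeedMs = nowMs;
    }
    void feed(uint32_t nowMs) { if (enabled) lastFeedMs = nowMs; }
    void disable() { enabled = false; }

    // True if the watchdog would have fired by nowMs
    bool wouldReset(uint32_t nowMs) const {
        return enabled && (nowMs - lastFeedMs) >= timeoutMs;
    }
};

// Models one wake cycle: arm the watchdog first, then power up the sensors.
// powerUpMs is how long power-up takes (very large = simulated hang).
// Returns true if the watchdog would have reset the device.
bool wakeCycleTriggersReset(uint32_t wakeMs, uint32_t powerUpMs, uint32_t timeoutMs) {
    SimWatchdog wdt;
    wdt.enable(wakeMs, timeoutMs);            // first action after exiting sleep
    const uint32_t afterPowerUp = wakeMs + powerUpMs;
    if (wdt.wouldReset(afterPowerUp)) {
        return true;                          // hang caught, device would reboot
    }
    wdt.feed(afterPowerUp);                   // power-up finished in time
    wdt.disable();                            // disarm before going back to sleep
    return false;
}
```

In this model a 200 ms power-up under an 8 s timeout completes normally, while a 60 s stall would trigger a reset. If the watchdog were armed only after power-up, the stall would never be caught.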

Issue #67 may have something to do with this

Getting better log output to determine cause of failure

I'm almost certain this was caused by the string memory leak that plagued all of Loom. I need to run duration tests to be sure.

Nope. The strings overhaul seemed to fix most other devices, but the WeatherChimes still hangs in what appears to be the SD card initialization.

Bump. We have narrowed it down: the likely culprit is the additional interrupt for the tipping bucket, which causes the device to hang. Still unsure why.
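For context on the tipping bucket path, the sketch declares `tip_time` and `last_tip_time` for debouncing, with the ISR only setting `tipFlag`. Below is a minimal host-side sketch of that pattern, assuming a debounce window of 250 ms. `TipCounter`, `onTip`, and `DEBOUNCE_MS` are hypothetical names and values, not taken from the WeatherChimes code, and this does not explain the hang itself. It only shows the counting logic living in the main loop rather than in the ISR.

```cpp
#include <cstdint>

// Hypothetical debounce window; the right value depends on the bucket's switch.
constexpr uint32_t DEBOUNCE_MS = 250;

// Hypothetical helper that the main loop would call when the ISR flag is set.
struct TipCounter {
    uint32_t lastTipMs = 0;   // timestamp of the last accepted tip
    bool haveTip = false;     // false until the first tip is accepted
    int count = 0;            // number of accepted tips

    // nowMs would come from millis() on the device.
    // Returns true if the tip was counted, false if rejected as switch bounce.
    bool onTip(uint32_t nowMs) {
        // Unsigned subtraction stays correct across a millis() rollover
        if (haveTip && (nowMs - lastTipMs) < DEBOUNCE_MS) {
            return false;
        }
        lastTipMs = nowMs;
        haveTip = true;
        ++count;
        return true;
    }
};
```

Keeping the ISR down to a single flag write and doing the timing comparison in the loop means nothing slow runs in interrupt context.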

Standard WeatherChimes functionality currently appears to be stable on main. Closing... finally.