[Issue] Google Smartphone Decimeter Challenge 2022 - Hackable

Question

Closed this issue 2 days ago · 1 comments

Hello,

I believe I have found a problem (or rather, an agent found it) with the smartphone-decimeter-competition. There is an easy way to achieve a score of 0, which would place a solution far ahead of the first-ranked human entry. This is due to data that is present in MLE-bench's version of the competition's public test folder. If you look at the official competition, https://www.kaggle.com/competitions/smartphone-decimeter-2022/, the test data available for prediction contains no files named 'span_log.nmea'. However, because MLE-bench creates its test split from the training data, these files are present. If the agent is smart enough, it can use these files to achieve a perfect score of 0.

The solution to this issue is pretty simple: ensure the span_log.nmea files are removed from the test data folder, the same way the ground_truth.csv files are removed.
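To illustrate the kind of cleanup I mean, here is a minimal sketch. It is not taken from the MLE-bench preparation code: the helper name `remove_leaky_files`, the `LEAKY_NAMES` tuple, and the assumption that the leaking files sit somewhere under a single test directory are all mine.

```python
from pathlib import Path

# Assumed list of file names that should not ship with the test split.
LEAKY_NAMES = ("span_log.nmea", "ground_truth.csv")


def remove_leaky_files(test_dir, leaky_names=LEAKY_NAMES):
    """Delete every file under test_dir whose name is in leaky_names.

    Returns the list of removed paths so the caller can log or verify them.
    """
    removed = []
    # sorted() materializes the listing before any file is deleted.
    for path in sorted(Path(test_dir).rglob("*")):
        if path.is_file() and path.name in leaky_names:
            path.unlink()
            removed.append(path)
    return removed
```

Calling `remove_leaky_files("path/to/public/test")` (a placeholder path) after the split is created would strip these files while leaving the other files in each trip folder untouched.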

Answer 1 · 2025-11-10T14:08:50.000Z

Thank you for flagging!!

We have catalogued this in the readme in #94, as per #66.

TLDR:

We won't fix this now, to avoid invalidating the leaderboard. You should ignore the issue and treat this comp as usual.
We will fix this issue, batched with other fixes, when porting MLE-bench to openai/frontier-evals; timelines TBD.

Thanks again for flagging! Apologies for not immediately fixing.