tilezen/mapbox-vector-tile

Travis build failing with seg fault

rmarianski opened this issue · 6 comments

Travis CI build has started failing with a seg fault.

https://travis-ci.org/tilezen/mapbox-vector-tile/builds/293683076

I think it's worth noting that the tests pass OK, and the segfault happens when it tries to run the tests a second time for coverage. That being said, I ran the coverage locally without a problem. Perhaps the coverage run uses more memory and an allocation ends up failing?

Perhaps the coverage run uses more memory and an allocation ends up failing?

That's certainly a possibility, or if anything is multithreaded, it could change order of operations with timing changes that don't happen locally.

Unfortunately, working with travis and core dumps is near-impossible.

👋 How can I help move this along to publishing @rmarianski @zerebubuth? Would be great to get the changes in #97 published.

Hi! Thanks for getting in touch.

Looks like we're still stuck with test failures. Although I'm not seeing the segfault any more, we've now got a protobuf-related problem (from #108 (comment)):

As far as I can tell, it seems to be something to do with unittest "manually" loading modules and ending up loading something from protobuf twice, which it doesn't like. Perhaps related to protocolbuffers/protobuf#3276. In any case, seems like an upstream bug, and I haven't figured out a work-around other than running the tests individually.
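To make the "loaded twice" hypothesis concrete, here is a toy sketch of how executing the same file under two module names can trip a process-wide registry. Everything in it is hypothetical (`toy_registry`, `toy_pb2`, the `pkg.toy_pb2` name); it only mimics the general shape of a duplicate registration and is not protobuf's actual API or the confirmed cause of this failure.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_twice():
    """Execute one file under two module names; return the resulting error text."""
    tmp = Path(tempfile.mkdtemp())
    # Hypothetical registry that refuses to register the same name twice.
    (tmp / "toy_registry.py").write_text(
        "SEEN = set()\n"
        "def register(name):\n"
        "    if name in SEEN:\n"
        "        raise TypeError('duplicate registration: ' + name)\n"
        "    SEEN.add(name)\n"
    )
    # Hypothetical generated module that registers itself at import time.
    (tmp / "toy_pb2.py").write_text(
        "import toy_registry\n"
        "toy_registry.register('toy.proto')\n"
    )
    sys.path.insert(0, str(tmp))
    try:
        # First load: a normal import, cached in sys.modules as "toy_pb2".
        importlib.import_module("toy_pb2")
        # Second load: same file, different module name, so the cache is
        # bypassed and the top-level code runs again.
        spec = importlib.util.spec_from_file_location(
            "pkg.toy_pb2", tmp / "toy_pb2.py"
        )
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except TypeError as exc:
            return str(exc)
        return None
    finally:
        sys.path.remove(str(tmp))
```

If something similar is happening here, comparing the names under which the generated module appears in `sys.modules` during the full test run might show whether it is being imported by two different paths.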

Fixing the tests would be my preferred solution. But I don't really understand why it's failing, so I've got no clues to help fix it 😕 @rmarianski, @iandees - have you seen anything like this before? Any idea where to start looking for a fix?

Plan B might be to change the travis.yml to run the tests (and the coverage?) individually... but that kludge would feel quite wrong to me...

Any chance it's just a simple relative import issue: https://stackoverflow.com/a/51688380?

Good idea! I tried that and pushed it up to a branch, but unfortunately it's still failing. Please take a look at the branch - perhaps there's something I missed or messed up.

I also tried (on a different branch) to update the protobuf generated code, but got the same error again. 😖

Finally, I tried running as PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python python setup.py test, which didn't throw the same protobuf-sourced error, but instead:

test_dev_errors (test_polygon.TestPolygonMakeValid) ... Segmentation fault (core dumped)

Which also passes OK when it's run as a single test. 😩

Something odd is definitely going on, as it seems to be running some of the tests a second time?
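Since core dumps are hard to get at on Travis, one option for the segfault might be Python's standard-library `faulthandler` module (Python 3.3+), which prints the Python-level traceback when the process receives a fatal signal. A minimal sketch, assuming the tests run on Python 3; the `enable_crash_traces` helper name is hypothetical.

```python
import faulthandler
import sys


def enable_crash_traces(stream=None):
    """Dump Python tracebacks for all threads on fatal signals such as SIGSEGV."""
    # The stream must stay open for as long as the handler is enabled.
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this at the top of the test package, or running with `python -X faulthandler` or `PYTHONFAULTHANDLER=1`, should show which Python frame was active when the crash happened, without needing the core file.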