trace_events: when enabling via inspector after startup, process abort
vmarchaud opened this issue · 10 comments
- Version: 10.10.0
- Platform:
Linux 4.4.0-122-generic #146-Ubuntu SMP Mon Apr 23 15:34:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- Subsystem: trace_events + NodeTracing Inspector Domain
I'm playing around with the new NodeTracing domain (cf nodejs/diagnostics#220), trying to collect trace events at runtime in our APM (implementation is still wip so don't mind the code).
Everything was working great in test but when i deployed in our staging environment (and putting a some load with wrk-node
), i apparently hit a race condition that cause that make the process abort : https://github.com/nodejs/node/blob/master/deps/v8/src/api.cc#L1059
I only get the debug line outputted by V8 before the abort happen, so not so much information here.
I can reproduce almost every time with our API, i assume i will be able to make a sample so everyone can reproduce on their own, in the mean time i'm already open the issue to discuss it.
I can recompile node with any patch if needed and test directly on my staging env.
@eugeneo if you have any idea, i'm able to put some hours into debugging it the coming week
@ofrobots Do you have any idea where i should look at to find the root cause of this bug ?
cc @nodejs/diagnostics
@vmarchaud if you could share a sample code without 3rd party libraries to reproduce this failure, or a core dump
(if there's no sensitive information in your application) / node-report
, I can take a look into it.
@vmarchaud Are you still experiencing this problem?
@Trott I believe its still a problem but i need to verify against the latest 10.x or 11.x version.
I'm able to reproduce the crash with 10.13.0, i have a core dump that i can send you @mmarchini , do you have any email i can contact you ?
I will try to make a simple example that reproduce the problem
And there is a sample that reproduce the problem for me :
const http = require('http')
const port = 3000
const inspector = require('inspector')
const fs = require('fs')
const session = new inspector.Session()
session.connect()
const opts = {
includedCategories: [
'node'
],
recordContinuously: false
}
const requestHandler = (request, response) => {
response.end('Hello Node.js Server!')
}
const server = http.createServer(requestHandler)
server.listen(port, (err) => {
if (err) {
return console.log('something bad happened', err)
}
console.log(`server is listening on ${port}`)
// launching the tracing
session.post('NodeTracing.start', {
traceConfig: opts
})
console.log(`Starting NodeTracing`)
// lets stop after 10 sec
setTimeout(_ => {
const events = []
session.on('NodeTracing.tracingComplete', () => {
console.log(`Finished with success`)
const data = events.reduce((agg, events) => {
agg = agg.concat(events.value)
return agg
}, [])
fs.writeFileSync('./trace.json', JSON.stringify(data))
console.log(`Saved file with trace events`)
return process.exit(0)
})
session.on('NodeTracing.dataCollected', (event) => {
events.push(event.data)
})
console.log(`Stopping NodeTracing`)
session.post('NodeTracing.stop', {
traceConfig: opts
})
}, 5000)
})
You just need to make a few request to the server, for example i'm using wrk-node
:
wrk -c 10 -t 3 -d 5 http://localhost:3000
I can reproduce the abort with the provided test. Taking a look....
(lldb) bt
* thread #2, stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
* frame #0: 0x0000000100cab9f2 node`v8::base::OS::Abort() + 18
frame #1: 0x00000001000f7714 node`node::inspector::(anonymous namespace)::JSBindingsConnection::JSBindingsSessionDelegate::SendMessageToFrontend(v8_inspector::StringView const&) + 44
frame #2: 0x00000001000f2491 node`node::inspector::(anonymous namespace)::ChannelImpl::sendMessageToFrontend(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 41
frame #3: 0x00000001000f2450 node`non-virtual thunk to node::inspector::(anonymous namespace)::ChannelImpl::sendProtocolNotification(std::__1::unique_ptr<node::inspector::protocol::Serializable, std::__1::default_delete<node::inspector::protocol::Serializable> >) + 44
frame #4: 0x0000000100143bf3 node`node::inspector::protocol::NodeTracing::Frontend::sendRawNotification(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 81
frame #5: 0x0000000100100e6e node`node::inspector::protocol::(anonymous namespace)::InspectorTraceWriter::Flush(bool) + 550
frame #6: 0x00000001000dfeba node`node::tracing::Agent::Flush(bool) + 116
frame #7: 0x00000001000e176a node`node::tracing::NodeTraceBuffer::NonBlockingFlushSignalCb(uv_async_s*) + 66
frame #8: 0x0000000100a000a3 node`uv__async_io + 318
frame #9: 0x0000000100a0fa78 node`uv__io_poll + 1739
frame #10: 0x0000000100a0052a node`uv_run + 336