XiaoMi/open-falcon

Agent panic

IveCode opened this issue · 1 comments

When the network is unstable, the agent panics with the following trace:
panic(0x780420, 0xc4200120c0)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/open-falcon/agent/g.(*SingleConnRpcClient).Call(0x0, 0x7ead8c, 0xf, 0x758620, 0xc42012c000, 0x7771c0, 0xc420334570, 0x0, 0x0)
/home/sofeng/gowork/src/github.com/open-falcon/agent/g/rpc.go:58 +0x65
github.com/open-falcon/agent/g.updateMetrics(0xc420138540, 0x13, 0xc420398000, 0x21, 0x40, 0xc420334570, 0xc420016000)
/home/sofeng/gowork/src/github.com/open-falcon/agent/g/transfer.go:50 +0x15c
github.com/open-falcon/agent/g.SendMetrics(0xc420398000, 0x21, 0x40, 0xc420334570)
/home/sofeng/gowork/src/github.com/open-falcon/agent/g/transfer.go:24 +0x14e
github.com/open-falcon/agent/g.SendToTransfer(0xc420398000, 0x21, 0x40)
/home/sofeng/gowork/src/github.com/open-falcon/agent/g/var.go:60 +0xd9
github.com/open-falcon/agent/cron.collect(0x3c, 0xc42013a038, 0x1, 0x1)
/home/sofeng/gowork/src/github.com/open-falcon/agent/cron/collector.go:73 +0x3df
created by github.com/open-falcon/agent/cron.Collect
/home/sofeng/gowork/src/github.com/open-falcon/agent/cron/collector.go:30 +0xb2

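The cause is a locking problem around TransferLock, in the SendMetrics function.
Analysis:
Suppose two goroutines are working with the same addr: one acquires TransferLock.RLock() in updateMetrics while the other acquires TransferLock.Lock() in closeTransferClient.
If closeTransferClient runs to completion first, the entry for addr is gone from the TransferClients map by the time updateMetrics gets the read lock, so TransferClients[addr] yields a nil client and calling Call on it panics (note the 0x0 receiver in the trace above). updateMetrics therefore needs to check whether addr is still in the map: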
func updateMetrics(addr string, metrics []*model.MetricValue, resp *model.TransferResponse) bool {
	TransferLock.RLock()
	defer TransferLock.RUnlock()
	if _, ok := TransferClients[addr]; ok {
		err := TransferClients[addr].Call("Transfer.Update", metrics, resp)
		if err != nil {
			log.Println("call Transfer.Update fail", addr, err)
			return false
		}
	}

	return true
}
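
For anyone who wants to reproduce the pattern outside the agent, here is a minimal, self-contained sketch of the guarded lookup. The names rpcClient, clients, closeClient and update are hypothetical stand-ins for SingleConnRpcClient, TransferClients, closeTransferClient and updateMetrics, and it assumes the close path deletes the map entry under the write lock. It is an illustration of the idea, not the agent's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// rpcClient is a hypothetical stand-in for the agent's SingleConnRpcClient.
type rpcClient struct {
	addr string
}

// Call reads a field of its receiver, so invoking it on a nil *rpcClient
// panics with a nil pointer dereference, like the 0x0 receiver in the trace.
func (c *rpcClient) Call(method string) error {
	if c.addr == "" {
		return errors.New("client has no address")
	}
	return nil
}

var (
	lock    sync.RWMutex
	clients = map[string]*rpcClient{}
)

// closeClient removes the client under the write lock
// (assumed behaviour of closeTransferClient).
func closeClient(addr string) {
	lock.Lock()
	defer lock.Unlock()
	delete(clients, addr)
}

// update looks the client up once under the read lock and reports failure
// when the entry is missing, instead of calling through a nil pointer.
func update(addr string) bool {
	lock.RLock()
	defer lock.RUnlock()
	c, ok := clients[addr]
	if !ok || c == nil {
		return false
	}
	if err := c.Call("Transfer.Update"); err != nil {
		return false
	}
	return true
}

func main() {
	clients["t1:8433"] = &rpcClient{addr: "t1:8433"}
	fmt.Println(update("t1:8433")) // true: client present
	closeClient("t1:8433")
	fmt.Println(update("t1:8433")) // false: entry was removed, no panic
}
```

One difference from the snippet above: this sketch returns false when the client is missing, whereas the snippet returns true. If the caller treats true as "metrics delivered", returning false may be the safer choice, since it lets the caller retry or move on to another address.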

yubo commented

Please try the latest version. I could not find updateMetrics().