定时器提前执行了
Opened this issue · 8 comments
imxyb commented
代码如下:
func (m *Manager) addTimer(t *task.Task) (timer *timingwheel.Timer, err error) {
taskConfig, ok := m.taskConfigs[t.Name]
if !ok {
return nil, errors.Errorf("taskConfig %s not found", t.Name)
}
fn, err := taskConfig.GeneratorFunc(t)
if err != nil {
return nil, err
}
diff, err := util.CalcDiffWithNow(t.ExecDateTime)
if err != nil {
return nil, err
}
if diff < 0 {
log.Warnf("timer diff is less the zero")
return nil, nil
}
// 计算当前时间差(单位是秒)
timer = m.tw.AfterFunc(time.Duration(diff)*time.Second, fn)
log.Infof("add timer successfully, diff seconds:%d, task id:%s, object id:%s ", diff, t.ID)
return timer, nil
}
我打印的日志:
add timer successfully, diff seconds:703726, task id:ObjectID("5e8 1fba9b8d1fca6422ec4c2")
也就是说会在703726秒后执行,大概是8天左右,但是这个任务第二天就执行了。。而且这是偶发的问题,是因为我这个时间间隔过大导致的吗
imxyb commented
RussellLuo commented
@imxyb 你好,谢谢关注!
- 请问
m.tw
是如何初始化的?tick 和 wheelSize 的值是多少? - 另外
m.tw
里只有这一个 703726s 的任务吗?如果还有别的任务,数量和时间间隔分布大概是怎样的? - 异常情况下,这个 703726s 的任务最终是间隔多久执行的?
imxyb commented
- 代码如下:
tw := timingwheel.NewTimingWheel(time.Millisecond, 3600)
tw.Start()
- 不是的,大概有200个任务左右,基本都是一些比较久远的延迟任务,比如8天后,几个月后,甚至几年后都有。
- 异常情况下,这个任务是3-30号凌晨添加的,但是3-31凌晨的时候就已经执行了
RussellLuo commented
好的。这两天比较忙,我周末分析一下。
RussellLuo commented
复现失败
我用如下代码尝试复现:
package main
import (
"fmt"
"time"
"github.com/RussellLuo/timingwheel"
)
func main() {
tw := timingwheel.NewTimingWheel(time.Millisecond, 3600)
tw.Start()
defer tw.Stop()
exitC := make(chan time.Time, 1)
tw.AfterFunc(703726 * time.Second, func() {
exitC <- time.Now().UTC()
})
fmt.Printf("Task created at: %v\n", time.Now().UTC())
fmt.Printf("Task executed at: %v\n", <-exitC)
}
目前为止(2020-04-04 03:10:00 +0000 UTC),任务还没有执行:
$ go run main.go
Task created at: 2020-04-02 13:09:30.229318917 +0000 UTC
时间间隔为几个月或几年的任务,由于时间成本太高,短时间很难尝试。
原因猜想
- 因为 AfterFunc 会计算 expiration 绝对时间(其中 UnixNano 可以表示 1678 年 ~ 2262 年)。所以只要绝对时间不超过 2262 年,那应该跟时间间隔无关。
- 按照 “时间单位 tick 为 1s,大小 wheelSize 为 3600” 的规格,第一层时间轮可以表示 1s x 3600 = 1h,第二层时间轮可以表示 1h x 3600 = 3600h,第三层时间轮可以表示 3600h x 3600 = 12960000h。假设最长的时间间隔按 10 年计算,对应 24h x 365 x 10 = 87600h,最多只会用到 3 层时间轮。所以时间轮的层数也并不多。
- 每次 Poll 延迟队列 的时候,会 根据当前时间查询到期的任务。如果上述 expiration 的计算没有问题,则有可能是当前时间出了问题。所以建议加日志打印下每次 Poll 中的 时间 now,确认下时间是否有问题?
如果有更好的思路和排查手段,欢迎一起探讨 @imxyb
imxyb commented
@RussellLuo 日志有时间的,时间确实是3-11日,我这边可以在poll的时候加个日志观察下。感谢你的帮忙,我这边有进展也会继续在这同步。
FJSDS commented
遇到类似的问题,两个连接同时连上之后分别调用 ScheduleFunc
func (this_ *Session) Next(prevTime time.Time) time.Time {
if this_.isClose() {
return time.Time{}
}
return prevTime.Add(time.Second * 2)
}
然后出现调用延迟的状况
2020-09-23T14:59:52.772+0800 DEBUG tcp/acceptor.go:93 session connected {"local": "127.0.0.1:9999", "remote": "127.0.0.1:55229"}
2020-09-23T14:59:52.772+0800 DEBUG tcp/acceptor.go:93 session connected {"local": "127.0.0.1:9999", "remote": "127.0.0.1:55230"}
2020-09-23T14:59:54.771+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T14:59:56.773+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T14:59:58.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:00.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:02.401+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:02.401+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:02.401+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:02.401+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:02.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:04.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:06.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:08.771+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55230", "session name": "", "session id": ""}
2020-09-23T15:00:10.400+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:10.400+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:10.400+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:10.400+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:10.771+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:12.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:14.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
2020-09-23T15:00:16.772+0800 DEBUG tcp/session.go:196 send ping msg success {"remote": "127.0.0.1:55229", "session name": "", "session id": ""}
Debugger finished with exit code 0
正常情况应该是 "remote": "127.0.0.1:55229"和"remote": "127.0.0.1:55230" 交叉出现,但是现在是成堆出现,间隔时间应该是2s.时间出现得也很奇怪
FJSDS commented
@RussellLuo 空了看一下,我感觉是同一毫毛内不能有多个任务,从目前观察来看,同一MS只会执行一个任务,剩下得任务会隔N个间隔时间之后批量出来