floraison/flor

quickstart fails waiting for execution

Closed this issue · 5 comments

Hi,

I used to use ruote (Thanks a lot, btw), and now I am checking flor. Unfortunately I hit an issue right away. Figured I report it now in case I can't dig deeper.

irb(main):024> payload: { log: [ "started at #{Time.now}" ] })
=> "domain0-u0-20231219.0706.jibowesoba"
irb(main):025> r = flor.wait(exid)
/home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/waiter.rb:109:in `block in wait': timeout for #<Flor::Waiter:0x000074c7149646f0{:exid=>"domain0-u0-20231219.0706.jibowesoba", :timeout=>5}>, msg_waiter? true, row_waiter? false (RuntimeError)

Hello,

thanks for having used ruote and thanks for reporting this.

I just tried the quickstart and it works for me (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-openbsd7.3]).

Could you please try again with ENV['FLOR_DEBUG'] = 'dbg,stdout' commented in at (https://github.com/floraison/flor/blob/master/doc/quickstart0/quickstart.rb#L5)?

That should tell us what is going wrong.

Hi, here is the output

irb(main):017> flor.start
=> 
#<Flor::Scheduler:0x000078446e8f76d8
 @archive=nil,
 @caller=#<Flor::Caller:0x000078446ef2d840 @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @conf=
  {"loader"=>Flor::HashLoader,
   "sto_uri"=>"sqlite://flor_qs.db",
   "log_dbg"=>1,
   "log_msg"=>1,
   "log_err"=>1,
   "log_src"=>1,
   "log_tree"=>1,
   "log_tree_rw"=>1,
   "log_run"=>1,
   "log_out"=>"stdout",
   "env"=>"dev"},
 @env="dev",
 @executors=[],
 @ganger=#<Flor::Ganger:0x000078446eb6af00 @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @heart_rate=0.3,
 @hooker=
  #<Flor::Hooker:0x000078446ef2d070
   @hooks=
    [["logger",
      {},
      #<Flor::Logger:0x000078446eb6ae10
       @level=1,
       @out=#<Flor::Logger::StdOut:0x000078446eb6a9d8 @f=#<IO:<STDOUT>>, @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
       @uni="sch-u0-i192.168.1.31-p9510-oh10",
       @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
      nil],
     ["wlist",
      {},
      #<Flor::WaitList:0x000078446eb72188
       @msg_waiters=[],
       @mutex=#<Thread::Mutex:0x000078446eb69808>,
       @row_frequency=1,
       @row_thread=nil,
       @row_thread_status=nil,
       @row_waiters=[],
       @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
      nil]],
   @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @identifier="sch-u0-i192.168.1.31-p9510-oh10",
 @idle_count=0,
 @loader=#<Flor::HashLoader:0x000078446ef2eec0 @environment={"taskers"=>[["", "alice", DemoTasker], ["", "bob", DemoTasker]]}, @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @logger=
  #<Flor::Logger:0x000078446eb6ae10
   @level=1,
   @out=#<Flor::Logger::StdOut:0x000078446eb6a9d8 @f=#<IO:<STDOUT>>, @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
   @uni="sch-u0-i192.168.1.31-p9510-oh10",
   @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @max_executors=1,
 @msg_max_res_time=600,
 @next_time=nil,
 @reload_after=60,
 @reloaded_at=2023-12-20 07:44:27.199872835 +1000,
 @spooler=#<Flor::Spooler:0x000078446eb694e8 @dir=nil, @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @storage=
  #<Flor::Storage:0x000078446eccb908
   @archive=nil,
   @callbacks={},
   @db=#<Sequel::SQLite::Database: "sqlite://flor_qs.db">,
   @db_logger=#<Flor::Storage::DbLogger:0x000078446eb6b130 @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
   @models={},
   @mutex=nil,
   @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>,
 @thread=#<Thread:0x000078446ebf7400 /home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/scheduler.rb:128 run>,
 @thread_status=:running,
 @wake_up=true,
 @wlist=
  #<Flor::WaitList:0x000078446eb72188
   @msg_waiters=[],
   @mutex=#<Thread::Mutex:0x000078446eb69808>,
   @row_frequency=1,
   @row_thread=nil,
   @row_thread_status=nil,
   @row_waiters=[],
   @unit=#<Flor::Scheduler:0x000078446e8f76d8 ...>>>
irb(main):018* exid = flor.launch(
irb(main):019'   %q{
irb(main):020'     sequence
irb(main):021'       alice 'hello' times: 2
irb(main):022'       bob 'world'
irb(main):023*   },
irb(main):024> payload: { log: [ "started at #{Time.now}" ] })
+---
| {payload:{log:["started at 2023-12-20 07:45:18 +1000"]}}
|
|   1 
|   2 sequence
|   3   alice 'hello' times: 2
|   4   bob 'world'
|   5 
.
+--- 
| 0 sequence L2
| 0_0 alice L3
| 0_0_0 _att L3
| 0_0_0_0 _sqs hello L3
| 0_0_1 _att L3
| 0_0_1_0 times L3
| 0_0_1_1 _num 2 L3
| 0_1 bob L4
| 0_1_0 _att L4
| 0_1_0_0 _sqs world L4
.
=> "domain0-u0-20231219.2145.fejigajaba"
irb(main):025>     /--- run starts Flor::UnitExecutor 71660 domain0-u0-20231219.2145.fejigajaba
    |   { thread: 71680 }
    |   { counters: {}, nodes: 0, execution_size: 0.00k }
   jaba 0 exe [sequence L2] [alice,[[_att,[[_sqs,hello,3]],3],[_att,[[times,... m1s_ f.ret null vars:
   jaba 0_0 exe [task L3] [_att,[[_sqs,alice,3]],3],[_att,[[_sqs,hello,3]]... m2s1r1>1 from 0 f.ret null
   jaba 0_0_0 exe [_att L3] [_sqs,alice,3] m3s2r1>1 from 0_0 f.ret null
   jaba 0_0_0_0 exe [_sqs L3] alice m4s3r1>1 from 0_0_0 f.ret alice
   jaba 0_0_0 rec [_att L3] hp:_att m5s4r1>1 from 0_0_0_0 f.ret alice
   jaba 0_0 rec [task L3] hp:task m6s5r1>1 from 0_0_0 f.ret alice
   jaba 0_0_1 exe [_att L3] [_sqs,hello,3] m7s6r1>1 from 0_0 f.ret alice
   jaba 0_0_1_0 exe [_sqs L3] hello m8s7r1>1 from 0_0_1 f.ret hello
   jaba 0_0_1 rec [_att L3] hp:_att m9s8r1>1 from 0_0_1_0 f.ret hello
   jaba 0_0 rec [task L3] hp:task m10s9r1>1 from 0_0_1 f.ret hello
   jaba 0_0_2 exe [_att L3] [times,[],3],[_num,2,3] m11s10r1>1 from 0_0 f.ret hello
   jaba 0_0_2_0 exe [_sqs L3] times m12s11r1>1 from 0_0_2 f.ret times
   jaba 0_0_2 rec [_att L3] hp:_att m13s12r1>1 from 0_0_2_0 f.ret times
   jaba 0_0_2_1 exe [_num L3] 2 m14s13r1>1 from 0_0_2 f.ret 2
   jaba 0_0_2 rec [_att L3] hp:_att m15s14r1>1 from 0_0_2_1 f.ret null
   jaba 0_0 rec [task L3] hp:task m16s15r1>1 from 0_0_2 f.ret null
   jaba 0_0 tas [task L3] hp:task m17s16r1>1 from 0_0 f.ret null
    |   run ends Flor::Logger 71700 domain0-u0-20231219.2145.fejigajaba
    |   { started: "2023-12-19T21:45:18.583588Z", took: 0.058192093 }
    |   { thread: 71680, consumed: 17, traps: 0 }
    |   { counters: { runs: 1, msgs: 17, omsgs: 0 }, nodes: 2, execution_size: 0.55k }
    \--- .
    /--- run starts Flor::UnitExecutor 71720 domain0-u0-20231219.2145.fejigajaba
    |   { thread: 71740 }
    |   { counters: { runs: 1, msgs: 17, omsgs: 0 }, nodes: 2, execution_size: 0.55k }
   jaba 0_0 ret [task L3] hp:task m18s_ f.ret null
   jaba 0_0 rec [task L3] hp:task m19s_r2>2 f.ret null
   jaba 0 rec [sequence L2] hp:sequence m20s19r2>2 from 0_0 f.ret null vars:
   jaba 0_1 exe [task L4] [_att,[[_sqs,bob,4]],4],[_att,[[_sqs,world,4]],4... m21s20r2>2 from 0 f.ret null
   jaba 0_1_0 exe [_att L4] [_sqs,bob,4] m22s21r2>2 from 0_1 f.ret null
   jaba 0_1_0_0 exe [_sqs L4] bob m23s22r2>2 from 0_1_0 f.ret bob
   jaba 0_1_0 rec [_att L4] hp:_att m24s23r2>2 from 0_1_0_0 f.ret bob
   jaba 0_1 rec [task L4] hp:task m25s24r2>2 from 0_1_0 f.ret bob
   jaba 0_1_1 exe [_att L4] [_sqs,world,4] m26s25r2>2 from 0_1 f.ret bob
   jaba 0_1_1_0 exe [_sqs L4] world m27s26r2>2 from 0_1_1 f.ret world
   jaba 0_1_1 rec [_att L4] hp:_att m28s27r2>2 from 0_1_1_0 f.ret world
   jaba 0_1 rec [task L4] hp:task m29s28r2>2 from 0_1_1 f.ret world
   jaba 0_1 tas [task L4] hp:task m30s29r2>2 from 0_1 f.ret world
    |   run ends Flor::Logger 71700 domain0-u0-20231219.2145.fejigajaba
    |   { started: "2023-12-19T21:45:18.656800Z", took: 0.067260224 }
    |   { thread: 71740, consumed: 13, traps: 0 }
    |   { counters: { runs: 2, msgs: 30, omsgs: 0 }, nodes: 2, execution_size: 0.62k }
    \--- .
    /--- run starts Flor::UnitExecutor 71760 domain0-u0-20231219.2145.fejigajaba
    |   { thread: 71780 }
    |   { counters: { runs: 2, msgs: 30, omsgs: 0 }, nodes: 2, execution_size: 0.62k }
   jaba 0_1 ret [task L4] hp:task m31s_ f.ret world
   jaba 0_1 rec [task L4] hp:task m32s_r3>3 f.ret world
   jaba 0 rec [sequence L2] hp:sequence m33s32r3>3 from 0_1 f.ret world vars:
   jaba rec m34s33r3>3 from 0 f.ret world
   jaba ter m35s34r3>3 from 0 f.ret world
    |   run ends Flor::Logger 71700 domain0-u0-20231219.2145.fejigajaba
    |   { started: "2023-12-19T21:45:18.738931Z", took: 0.034531105 }
    |   { thread: 71780, consumed: 5, traps: 0 }
    |   { counters: { runs: 3, msgs: 35, omsgs: 0 }, nodes: 1, execution_size: 0.55k }
    \--- .
irb(main):025> 
irb(main):026> r = flor.wait(exid)
/home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/waiter.rb:109:in `block in wait': timeout for #<Flor::Waiter:0x000078446ecc5918{:exid=>"domain0-u0-20231219.2145.fejigajaba", :timeout=>5}>, msg_waiter? true, row_waiter? false (RuntimeError)
	from /home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/waiter.rb:101:in `synchronize'
	from /home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/waiter.rb:101:in `wait'
	from /home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/wlist.rb:111:in `wait'
	from /home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/flor-1.6.2/lib/flor/unit/wlist.rb:37:in `wait'
	from (irb):26:in `<main>'
	from /home/emogul/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/irb-1.10.1/exe/irb:9:in `<top (required)>'
	from /home/emogul/.rbenv/versions/3.2.2/bin/irb:25:in `load'
	from /home/emogul/.rbenv/versions/3.2.2/bin/irb:25:in `<main>'

OK, I get it.

I run the quick start with bundle exec ruby quickstart.rb while you run each step line by line in irb.

By the time you type r = flor.wait(exid) the workflow, as seen in the output above is already terminated, so the waiter just times out.

Quick note, #launch accepts a wait: option. There is a piece of documentation hinting at it at: https://github.com/floraison/flor/blob/master/doc/launching.md

Thanks will check that out.