jmettraux/ruote

Comparison between FlowExpressions subclasses sometimes fail.

jurisgalang opened this issue · 11 comments

I've seen it happen on instantiation of `` fetching the list of processes from the engine, so a call like:

engine.ps('some-wfid')

Would raise an exception similar to the ff:

ArgumentError: comparison of Ruote::Exp::SequenceExpression with Ruote::Exp::ParticipantExpression failed

I've traced it to the following line in ProcessStatus class initialization (starting from the ps method call):

# lib/ruote/engine/process_status.rb:66
@expressions.sort! { |a, b| a.fei.expid <=> b.fei.expid }

If a and b are different subclasses of FlowExpressions (eg: SequenceExpression vs ParticipantExpression) it's possible that the values returned by the calls tofei.expid will be of different types (eg: String vs Fixnum)

Hello Juris,

thanks for the report.

What version of Ruby are you using, what version of ruote, which storage implementation, what is the OS ?

FlowExpressionId#expid must return a String instance.

FlowExpression classes all share the same #fei implementation, the one found in the root class FlowExpression.

Thanks in advance for the context information, best regards,

John

Hey John,

We're using:

* ruby 1.9.2 p180
* ruote gems:
    ruote (2.2.0)
    ruote-couch (2.2.0)
    ruote-redis (2.2.0)
* ubuntu (32bit)

Thanks.

Juris

Hello,

who stores the expressions ? ruote-couch or ruote-redis ?

Would it be possible to reproduce the issue with this in :

@expressions.sort! { |a, b|
  puts '-' * 80
  p [ :a, a.class, a.fei.class, :b, b.class, b.fei.class ]
  a.fei.expid <=> b.fei.expid
}

Thanks in advance,

John

Hey,

It's redis - the code above is similar to the code that was raising the exception.

Juris

Hello Juris,

OK, let me rephrase it.

Could your please replace your lib/ruote/engine/process_status.rb:66 with

@expressions.sort! { |a, b|
  if (a.fei.expid.class != b.fei.expid.class)
    puts '-' * 80
    p [ :a, a.class, a.fei.class, :b, b.class, b.fei.class ]
  end
  a.fei.expid <=> b.fei.expid
}

And gist a log of the output as it is right before the problem ?

This modified piece of code will give us more information about what is going on in there. Else I won't be able to help you at all.

Best regards,

John

Hey John,

Here's the output:

--------------------------------------------------------------------------------
[:a, Ruote::Exp::SequenceExpression, Ruote::FlowExpressionId, :b, Ruote::Exp::ParticipantExpression, Ruote::FlowExpressionId]

Juris

OK, what does

@expressions.sort! { |a, b|
  if (a.fei.expid.class != b.fei.expid.class)
    puts '-' * 80
    p [ :a, a.class, a.fei.class, :b, b.class, b.fei.class ]
    p a.h; puts '---'; p b.h
    puts '-' * 80
  end
  a.fei.expid <=> b.fei.expid
}

say ?

My impression is that some data got misplaced in redis in lieu of a ruote expression. Running the code above will help us see what it could be.

It's a convention throughout ruote that expid should be strings. There is something wrong with your ruote-redis returning an expression that has a Fixnum expid.

I could tell you to clean your redis, but what put the stuff there in the first place. At least the code in this comment would show us what the faulty stuff looks like.

After that we have to find the root cause, you have to understand that it's not necessarily a ruote bug that triggered the issue.

Thanks in advance,

John

Hey John,

This is what it says:

--------------------------------------------------------------------------------
[:a, Ruote::Exp::SequenceExpression, Ruote::FlowExpressionId, :b, 
Ruote::Exp::ParticipantExpression, Ruote::FlowExpressionId]
{"name"=>"sequence", "original_tree"=>["define", {"name"=>"test"}, 
[["one", {"task"=>"a"}, []], ["one", {"task"=>"b"}, []]]], 
"variables"=>{"test"=>["0", ["define", {"name"=>"test"}, [["one", 
{"task"=>"a"}, []], ["one", {"task"=>"b"}, []]]]]}, "on_timeout"=>nil, 
"put_at"=>"2011-07-29 18:56:26.213379 UTC", "_id"=>"0!!20101115-nujogimo", 
"applied_workitem"=>{"fields"=>{}, "fei"=>{"sub_wfid"=>nil, 
"wfid"=>"20101115-nujogimo", "engine_id"=>"engine", "expid"=>"0"}},
"type"=>"expressions", "created_time"=>"2010-11-15T10:06:09-08:00", 
"on_cancel"=>nil, "parent_id"=>nil, "forgotten"=>nil, 
"children"=>[{"sub_wfid"=>nil, "wfid"=>"20101115-nujogimo", 
"engine_id"=>"engine", "expid"=>1}], "tagname"=>nil, 
"fei"=>{"sub_wfid"=>nil, "wfid"=>"20101115-nujogimo", 
"engine_id"=>"engine", "expid"=>"0"}, "on_error"=>nil, "_rev"=>"1"}
---
{"name"=>"participant", "participant"=>["Ruote::StorageParticipant", {}], 
"participant_name"=>"one", "put_at"=>"2011-07-29 18:56:26.216888 UTC", 
"on_timeout"=>nil, "variables"=>nil, "original_tree"=>["participant", 
{"task"=>"b", "ref"=>"one"}, []], "_id"=>"0_1!!20101115-nujogimo", 
"on_cancel"=>nil, "created_time"=>"2010-11-15T10:06:09-08:00", 
"type"=>"expressions", "applied_workitem"=>{"participant_name"=>"one", 
"wfid"=>"20101115-nujogimo", "put_at"=>"2010-11-15T10:06:09-08:00", 
"_id"=>"wi!0_0!!20101115-nujogimo", "fields"=>{"params"=>{"task"=>"b", 
"ref"=>"one"}}, "type"=>"workitems", "fei"=>{"sub_wfid"=>nil, 
"wfid"=>"20101115-nujogimo", "engine_id"=>"engine", "expid"=>1}}, 
"parent_id"=>{"sub_wfid"=>nil, "wfid"=>"20101115-nujogimo", 
"engine_id"=>"engine", "expid"=>"0"}, "forgotten"=>nil, "tagname"=>nil, 
"children"=>[], "on_error"=>nil, "fei"=>{"sub_wfid"=>nil, 
"wfid"=>"20101115-nujogimo", "engine_id"=>"engine", "expid"=>1}, 
"_rev"=>"1"}
--------------------------------------------------------------------------------

I did a FLUSHALL on redis before I ran the spec.

Juris

Hello Juris,

so the ParticipantExpression has this { 'expid' => 1 }. Weird.

May I see the spec ? How can I reproduce the issue ? I know the versions, anything special I should know ?

My goal is to have the smallest piece of code that reproduce the behaviour.

Thanks in advance,

John

Hey John,

Haven't resolved this issue yet.
Let me see if I could reduce that spec down to as few moving parts before I send it to you.

A couple of things:

  • The behavior above is not consistent across developer machines. But we were able to make it happen on a brand new machine/VM
  • We don't mess around with the expid values, but the tests/specs rely on some fixtures that might - I'll be combing through those today or tomorrow to see any mention of expid

A hack/workaround that we've employed here, is to force the type returned by expid to String

Anyway, I'll keep you posted.
Thanks.

Juris

no news, closing.