luislavena/radix

Named parameters are broken when inserted into Tree

luislavena opened this issue ยท 9 comments

Ref #4:

Right now named parameters (identified with :) are being split incorrectly:

require "./src/radix"

module Radix
  class Node
    def to_s(io, level = 0)
      level.times { io << " " }
      io.puts key
      children.each do |child|
        child.to_s(io, level + key.size)
      end
    end
  end
end

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:foo", :foo
tree.add "/:bar", :bar

puts tree.root
# /
#  :
#   bar
#   foo

Which is causing issues to identify two named parameters since the key is now incomplete:

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:post", :post
tree.add "/:category/:post", :category_post

puts tree.root
# /
#  :
#   post
#   category/:post

result = tree.find "/first-post"
pp result.found? # => false

The solution suggested in #4 provides an approach to it to solve it:

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:post", :post
tree.add "/:category/:post", :category_post

puts tree.root
# /
#  :category/:post
#  :post

However still fails on simple lookup:

result = tree.find "/first-post"
pp result.found? # => false

And causes some of the entries in the tree to repeat:

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:category", :category
tree.add "/:category/:subcategory", :subcategory

puts tree.root
# /
#  :category/:subcategory
#  :category

(subcategory should be a child of category node)

Also breaks on lookup:

result = tree.find "/furniture/chairs"
pp result.found? # => true

result = tree.find "/furniture"
pp result.found? # => false

I'm opening this here to expose the different issues presented in #4 and workout alternate solutions.

Another issue is that /:category/:post is now above /:post which causes #find not to work.

This is caused by Node's priority, but at the same time, is a bigger problem (bear with me while I go over the issue)

With current implementation, both /:category/:post and /:post set priority at 1, so Node#sort! might place them above or bellow each other.

Now, when walking the path during search, it works by finding the first branch of the nodes and when detected a named parameter, it will extract the value using the key in the node.

Then it will detect if there is more path to consume and if so, will lookup for the node that matches that.

tree = Radix::Tree.new
tree.add "/", :root
tree.add "/:post", :post
tree.add "/:category/:post", :category_post
# /
#  :category/:post
#  :post

tree.root.sort!
# /
#  :post
#  :category/:post

result = tree.find "/example"
pp result.found? # => false
pp result.params # => {"post" => "example"}

result = tree.find "/news/first-post"
pp result.found? # => false
pp result.params # => {"post" => "news"}

Is not possible to disambiguate two named parameters at the same level on a Radix tree without incurring on look ahead implementation, which could result in performance slowdown.

Because of this, insertion on the tree with these conditions should be discouraged.

f commented

You're right but actually,

:post
:category/:post

These both are intuitive in web development and I think we cannot block users to write these kind of URLs. And they mostly won't understand why it's impossible :/

It will reduce Kemal's simplicity.

Aren't there any workaround to make it possible?

WDYT @sdogruyol?

f commented

If I do understand correctly, these will also fail:

/posts/:id
/comments/:id

Am I right?

If I do understand correctly, these will also fail:

No, these will work. The only ones that will fail are when two named parameters (:post and :category) are placed at the same level, like the following:

/:foo
/:bar

Or:

/:foo/:baz
/:bar

This cannot be solved without lookahead against children of the node being traversed and if not match, move upper and test against other nodes (and their children) prior discarding current node.

Node's key is not split at regular separator (/) but can be split at any point of the key, aiming to accommodate multiple children and reduce lookup.

f commented

Then as far as I understand these will fail:

/posts/:id
/posts/:id/edit
/posts/:id/new

or these will fail:

/posts/:id
/posts/:id/:action

๐Ÿค”

@f no, they will only fail if the named parameters at the same level are different, here is a working example:

require "./src/radix"

tree = Radix::Tree.new
tree.add "/posts/:id", :post
tree.add "/posts/:id/edit", :edit_post
tree.add "/posts/:id/new", :new_post # (new inside the :id? anyway)

%w(/posts/10 /posts/1000/new /posts/99/edit).each do |path|
  puts "Check '#{path}'"
  result = tree.find(path)
  pp result.found?
  if result.found?
    pp result.payload
    pp result.params
  end
end
Check '/posts/10'
result.found? = true
result.payload = :post
result.params = {"id" => "10"}

Check '/posts/1000/new'
result.found? = true
result.payload = :new_post
result.params = {"id" => "1000"}

Check '/posts/99/edit'
result.found? = true
result.payload = :edit_post
result.params = {"id" => "99"}

And also works for the other example:

Check '/posts/10'
result.found? = true
result.payload = :post
result.params = {"id" => "10"}

Check '/posts/1000/new'
result.found? = true
result.payload = :action_post
result.params = {"id" => "1000", "action" => "new"}

Check '/posts/99/edit'
result.found? = true
result.payload = :action_post
result.params = {"id" => "99", "action" => "edit"}

However, it will not work if at the same level two different named parameters are placed:

/posts/:id
/posts/:gallery_id/gallery

Makes sense?

f commented

Now I completely understand it! :)

# (new inside the :id? anyway) is my fault ๐Ÿ˜„ It sounds wrong already. Was shooting examples.

@f excellent! ๐Ÿ˜„

Average app will not be affected by this, but only found that Kamber presents the scenario I described in this issue ๐Ÿ˜ข

https://github.com/f/kamber/blob/master/src/kamber.cr#L41-L49

Which is why I've added to README in master a suggestion for dealing with it (plus, reduce duplicated content penalty from search engines ๐Ÿ˜‰)

Cheers and have a nice weekend!

f commented

We can change Kamber's URL structure. It won't be a problem.