test-kitchen/vmware-vra-gem

bug in pagination at path "/catalog-service/api/consumer/resources?limit=20" duplicate machines returned

Opened this issue · 11 comments

This is tripping me now all the time. It looks like the bug in Vrealize API but I want to report it here for those using all_resources method. (I am using VMware vRealize Automation Appliance 6.2.1.0-2553372)

My bug report

I have 108 machines to administer. When I make a call client.resources.all_resources which uses def http_get_paginated_array!(path, limit=20) method I get 108 results however 22 of those machines are duplicates (so I am missing 22 machines from my expected set).

When I debugged the issue by stepping over each page in def http_get_paginated_array!(path, limit=20) method I see 6 pages of results.

Starting with page 3 I start seeing duplicates of machines I already got on page 1. The initial 108 machines becomes 86 unique machines, 22 duplicates

Here are my results for machines in 6 pages returned. Notice machine 190 had a dup in page3

    page1 = ["190", "144", "039", "050", "173", "092", "130", "192", "059", "181", "156", "152", "155", "054", "175", "139", "125", "141", "146", "X001"]

    page2 = ["X005", "137", "136", "167", "170", "185", "158", "140", "049", "135", "119", "172", "169", "153", "044", "191", "091", "051", "126", "159"]

    page3 = ["059", "125", "176", "092", "161", "190", "133", "149", "168", "054", "164", "106", "139", "X002", "151", "040", "129", "135", "143", "044"]

    page4= ["142", "191", "163", "159", "182", "178", "137", "193", "047", "189", "175", "186", "194", "136", "058", "090", "146", "147", "184", "104"]

    page5= ["171", "148", "153", "131", "141", "X003", "145", "174", "068", "046", "116", "162", "118", "155", "050", "126", "X001", "173", "187", "127"]

    page6= ["060", "093", "067", "157", "150", "X005", "079", "042"]


    all     = page1 + page2 + page3 + page4 + page5 + page6

    repeats = Hash.new 0
    all.each { |e| repeats[e] +=1 }
    puts repeats.inspect
    puts repeats.values.uniq # 1, 2
    machines_fetched_twice = repeats.map { |k, v| k if v == 2 }.compact

    puts machines_fetched_twice.inspect
    # ["190", "050", "173", "092", "059", "155", "054", "175", "139", "125", "141", "146", "X001", "X005", "137", "136", "135", "153", "044", "191", "126", "159"]
    puts machines_fetched_twice.count # 22

    puts all.count #108
    puts all.uniq.count #86

some conclusion

Out of 108 unique machines (which is a correct count) I get 22 duplicates. That means I have 22 machines unaccounted for in the result.
This looks like a bug in vRealize API pagination.

when I use limit=200 I get all my 108 unique machines as expected.

My recommendation is to setup limit to 200 as default and if you have more machines to fetch perhaps make the limit configurable for the all_resources method.

I feel like a bean counter but for my set of 108 machines if I set limit=46 per page I get my entire set without duplicates. with limit 45 and below all breaks down.
YMMV if you have a different set to fetch and your pagination limit.
I would like to find out if anybody else experiences this bug.

extra

I even hacked this method to see if using rel link will be different. Foolish me.

    def http_get_paginated_array!(path, limit=20)
      items     = []
      base_path = path + "?limit=#{limit}"
      response = FFI_Yajl::Parser.parse(http_get!("#{base_path}"))
      items    += response['content']
        loop do
          break unless !response['links'].empty? && (next_page_url = response['links'].find { |e| e['rel'] == 'next' }['href'])
          u = URI.parse(next_page_url)
          response      = FFI_Yajl::Parser.parse(http_get!("#{u.path}?#{u.query}"))
          items         += response['content']
        end
      items
    end

@RubyTester - thanks for your report. It does appear that this is an issue with vRA and not this gem. That's unfortunate, since I can't fix vRA 😄

That said, I'll look into making the limit value configurable and also see if we can't programmatically de-dup these arrays as well. I'll have more soon, and possibly a branch to try out if you'd be willing to test it out for me. I don't have an environment with enough items to completely reproduce this case.

Also @RubyTester I'd encourage you to log a support issue with VMware with the details of this bug if you haven't already since you can reproduce it. Thanks!

My immediate fix is this:

def http_get_paginated_array!(path, limit=100)
https://github.com/chef-partners/vmware-vra-gem/blob/master/lib/vra/client.rb#L141

can you make 100 a default value? I hope it's not too large.

To make it configurable a small setting on client. it's only used in one call anyway.

# Vra::Client

    class << self
      attr_writer :default_page_size
      def default_page_size
        @default_page_size ||= 20
      end
    end

def http_get_paginated_array!(path, limit=self.class.default_page_size)

# user can increase with 
Vra::Client.default_page_size = 100

I see 3 tests would need to be modified for that config.

For now I am going with immediate fix.

And I don't know where to submit the bug to vRA (busy busy busy).
how about tweet. https://twitter.com/rubytester/status/644959051072180224

I think it's a bad idea to increase the size so large when this could be used in people's environments where fetching such a large list could be resource intensive. The right way is to do as you suggested - make it an option on the client object, default it to 20, and allow users such as yourself to override. And dedup until VMware implements a fix.

I'll have a fix out for this next week.

@RubyTester would you be willing to try out branch adamleff/pagination-dedup?

This has two commits on it:

  • allow page_size to be set on the client object, defaults to 20
  • call #uniq on the return array of http_get_paginated_array! so any duplicates should be handled before the caller starts processing them into instances (such as a Vra::Resource instance)

If this looks good, I'll get a review of it done and merged/released. Thanks!

Hi, my current solution to this bug is this:

module Vra
  class Client
    # hack page limit bug: https://github.com/chef-partners/vmware-vra-gem/issues/10
    alias foobar http_get_paginated_array!
    def http_get_paginated_array!(path, limit=20)
      foobar(path, 100)
    end
  end
end

And I don't really need to go any further than that. This fixes my problem.

Without this bug I think page_size setting may be useful if you have more than 20 items and you want to fetch them in one http call rather several (with my 108 I get 7 rest calls which is not too bad). If there was no bug then sensible limit=20 is just fine and that is what VRealize also provides as default. (https://vdc-download.vmware.com/vmwb-repository/dcr-public/b996ad6c-d44c-45af-8a6f-5945296e4848/8d8e47c6-4bbc-4938-80d3-c758c4ac63b3/docs/catalog-service/api/docs/resource_Resource.html#path__api_consumer_resources.html)

The dedup for me is a waste of time. I should be guaranteed no duplicates in a paged set of items and that's that. - The fact that it returns dups is a bug on VRealize API.

I have no clue where I could actually file a bug with them for endpoint /catalog-service/api/consumer/resources?limit=20. I did manage to post at vmware forum: https://developercenter.vmware.com/forums?id=5098#520515|3053606 and link back to here

Thanks.

@RubyTester I don't see the harm in the dedup if it helps other users.

Regardless of the dedup, the first commit in this branch avoids the need for you to monkey-patch as you can set the page limit to 100 on the client instance when you create it. I'm going to get this reviewed and merged as-is unless you uncover any issues using it. Thanks for the feedback!

Hi @adamleff My post on VMWare developer center was deleted by ? so I have no clue how to interface with those guys.

If you go ahead with dedup in #11 I would add some Warning to the user that a duplicate was found in the result set so people are not surprised that they miss items they expected.

Thanks for your enthusiasm getting this gem going. Good times ahead!

If anybody has similar dups issues please write a bug report to them in a format and place that of their choosing, I don't have active relationship with our VMWare setups, I just consume that API.

I had a change of heart on this today. In my mind, incomplete data is worse than no data. So I'm going to change this to raise an exception if duplicates are detected (since that's an indication that actual data was omitted based on your report) and point users to the workaround. I'm traveling this week but I will make that change real soon and get this released.

@RubyTester I've passed this along to a contact at VMware asking for the best way to report this issue. In the meantime, I'm closing this issue in favor of PR #11 which I'm about to merge and release.

Thanks again for your report!

vmware-vra v1.2.0 has been released to Rubygems!