amazon-archives/aws-sdk-core-ruby

MaxResults combined with a filter gives inconsistent results


Using max_results with filters is allowed in some cases, and when it is allowed it produces inconsistent results. With a filter value that contains a wildcard, the response includes a next_token even though no results were returned. Without a wildcard, no next_token is returned.
When the tag value is exact (no wildcards) and matches an instance (I have one tagged with Name = o), the call raises an error.

2.1.5 :025 > Aws::EC2::Client.new.describe_instances(max_results: 5, :filters => [{:name=>"tag:Name", :values => ["Nonexisting value tag*"]}])
 => #<struct
 reservations=[],
 next_token=
  "eyJ2IjoiMSIsImMiOiI0d3p5bTMydDR2NC9VUERYY3dBb3lVd3RtSDlCZmhkaG0vZ1o1bHRyOGNPamI1UWY0T051THV5SERJcVFxak1WZFppYzc5Tk1kWFdBY3E2THkra1N5QUtSVCtNUDZ4aGh3b0ZwRmJHU29LdEdjams0d1pvaUhBdzE0MUF5ZXg5NnZBSGw5NHM5UHF6TGV6RWdwTUU3MUhURkVqL0VqQlNLUmpENlk4ZWxTTkE9IiwiaSI6IjVzNE1jUWhpRWdCNElBMmZobDRQS3c9PSIsInMiOiIxIn0=">

2.1.5 :026 > Aws::EC2::Client.new.describe_instances(max_results: 5, :filters => [{:name=>"tag:Name", :values => ["Nonexisting value tag"]}])
 => #<struct  reservations=[], next_token=nil>

2.1.5 :027 > Aws::EC2::Client.new.describe_instances(max_results: 5, :filters => [{:name=>"tag:Name", :values => ["o"]}])
Aws::EC2::Errors::InvalidParameterCombination: The parameter 'maxResults' cannot be used with tag filters in the parameter 'filterSet'. Remove either the tag filters from 'filterSet' or the 'maxResults' parameter and try again.

The Aws::EC2::Errors::InvalidParameterCombination in the third example is returned in the describe instances response from Amazon EC2. I do not know why the service does not return the same error in either of the preceding two scenarios.

I notice that in the second call the asterisk has been dropped from the filter value string. Was that intentional? What happens if you capture the response from the first call and then simply call #each on it? Does it loop infinitely?

The third example is indeed strange, though I get this error consistently when there is a Name tag with the value "o".
The asterisk is indeed dropped in the second call. This is intentional, to illustrate that the change produces completely different behaviour. I do not know why the filter value should influence this kind of behaviour.

If I capture the response from the first call and run each on it, it exits after a single iteration:

b = client.describe_instances(max_results: 5, :filters => [{:name=>"tag:Name", :values => ["Nonexisting value tag*"]}]).each { p 'result' }
"result"
 => nil

More weird behaviour: next_page says we have reached the last page even though there is a next_token.

2.1.5 :026 > b = client.describe_instances(max_results: 5, :filters => [{:name=>"tag:Name", :values => ["Nonexisting value tag*"]}])
 => #<struct
 reservations=[],
 next_token=
  "eyJ2IjoiMSIsImMiOiJvRzVsQURMUzlyWU52Yk9PajNucDdFUGJEMEZkSDQ1OExWL3JrcnU4dlRYMU8xaklNN1pieWZJbm5uSWFHL0tZdU1uZ0Vkc3RHWDVQVDdER05taE8vd1RIcmtvaUZqNnF4NnpXYU5wRVVOOUUza3dxczZLUW9BVEFEU0F1Y0R0eFdGUXo0bFMxT21kbEsrcjZRclpORzRqVXdjcDQ4V056OTk3NjZsVnBqZW89IiwiaSI6Ik8zMzMwTU1RVjNNNVQvdXRJOXJRZmc9PSIsInMiOiIxIn0=">

2.1.5 :027 > b.next_page
Aws::PageableResponse::LastPageError: unable to fetch next page, end of results reached

Based on your followup information, I went digging to see why the response would not fetch the next page when a paging token is present. The paginators were missing an entry for DescribeInstances. The commit above addresses that. Can you retry your failing examples against master?

Thanks for the quick commit, @trevorrowe!
Unfortunately, the first three tests produce the same results.

The last weird behaviour, from my later test, now produces three empty pages, which is consistent with the three pages I would expect if all my instances were returned.

client.describe_instances(max_results: 5, :filters => [{:name=>"tag:Name", :values => ["Nonexisting value tag*"]}]).each { |x|  p x }
#<struct
 reservations=[],
 next_token=
  "eyJ2IjoiMSIsImMiOiJPdTEyNnBrUUsxRy9aWnVpOVZVUVJ3Y3NUMEwzV0VVdEhXNXhEWGlSUjhBNm5XVUsxUWtzcmhRc01iZWlNRWdXZTN6MVhtRU1hdThacHVEUldmaWlhdFYxQVc5eHdINTcrdE5aak92aHJqRklwUGhXWE12cEJEWkpLek1tZVlTeGtlR01hZmJVbE8zYVlqOFJJR09CQlhUZjJxUjliWk81TzFpN0xsblNLWU09IiwiaSI6IjA0Z3pIRzNBTWRsMldlcHN0M2dkbGc9PSIsInMiOiIxIn0=">

#<struct
 reservations=[],
 next_token=
  "eyJ2IjoiMSIsImMiOiJQTGhaY21IRyt6WW05N2FGMEpkYzVQNUVqZnR0VGdKV0JtK0lBcXRtaFlGb3hvbHJvTk5oTGxtWjZ5VlgxSGNkanpnRUpFQWtIejVqY2EwU2lXZ3cybmxBMXNhNUQ1SVVGS01CRHE0dUU3SHErNVVDS0JRMlhJd1hxSjRSRGREc0wzN0h6Z21kQVJnV0Vvdm9zcmtjM3c9PSIsImkiOiJjblpFVDBMcWFZN29vWHpGcWY3Zk53PT0iLCJzIjoiMSJ9">

#<struct  reservations=[], next_token=nil>

Amazon EC2 is returning the empty response pages. This behaviour is not unique to EC2 within AWS. When paginating, EC2 may have scanned a particular number of records looking for matches. Failing to find any within a period of time, it returns an empty response with a paging token. Sending the follow-up request with the paging token allows it to continue scanning from where it left off.

If you had fewer records to scan, you might have just received an empty response with no paging token. If you have more, you could possibly expect many more empty response pages before the final page.
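As a minimal sketch of what this means for callers: a pagination loop should stop only when next_token is nil, not when a page comes back empty. The Page struct, the PAGES table and the each_reservation helper below are hypothetical stand-ins made up for illustration. They simulate the empty-page sequence described above and are not part of the SDK.

```ruby
# Page is a hypothetical stand-in for the SDK response struct.
Page = Struct.new(:reservations, :next_token)

# Returns an Enumerator over reservations. It keeps following next_token
# until the token is nil, so empty intermediate pages are simply skipped.
# `fetch` is any callable that takes a token (nil for the first request)
# and returns an object responding to #reservations and #next_token.
def each_reservation(fetch)
  Enumerator.new do |yielder|
    token = nil
    loop do
      page = fetch.call(token)
      page.reservations.each { |reservation| yielder << reservation }
      token = page.next_token
      break if token.nil?
    end
  end
end

# Simulated service: an empty page with a token, a page with a match,
# then a final empty page without a token.
PAGES = {
  nil  => Page.new([], "t1"),
  "t1" => Page.new(["r-1"], "t2"),
  "t2" => Page.new([], nil)
}.freeze

fetch = ->(token) { PAGES.fetch(token) }
p each_reservation(fetch).to_a # => ["r-1"]
```

With the real client, the callable could presumably be something like ->(token) { client.describe_instances(max_results: 5, filters: filters, next_token: token) }, assuming the filter is one the service accepts together with max_results. Because the helper returns an Enumerator, calling first(1) on it stops issuing requests as soon as one match has been found.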

OK, this is in line with behaviour where the second page would have results and all other pages had none. This makes pagination somewhat less useful for us, as we might logically be interested only in the first page of the results.

So to conclude, pagination is only useful when requesting all resources of a given query (i.e. all instances). While filter sets are allowed in some cases by the API (see the error), the results of those queries are not packed into full pages; they are paginated by the total number of resources that would be available if the filter set were not applied.
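For the "first page only" use case, one possible workaround is to keep requesting pages until enough matches have accumulated, with a cap on the number of requests. This is only a sketch under stated assumptions: FakePage, first_matches and the max_pages keyword are hypothetical names introduced here, and the stubbed pages stand in for real EC2 responses.

```ruby
# FakePage is a hypothetical stand-in for the SDK response struct.
FakePage = Struct.new(:reservations, :next_token)

# Collects up to `limit` reservations, following next_token across pages.
# Stops when enough matches are found, when the token is nil, or after
# `max_pages` requests, whichever comes first.
def first_matches(fetch, limit, max_pages: 10)
  matches = []
  token = nil
  max_pages.times do
    page = fetch.call(token)
    matches.concat(page.reservations)
    token = page.next_token
    break if matches.size >= limit || token.nil?
  end
  matches.first(limit)
end

# Simulated service: the first page is empty, later pages hold the matches.
stub_pages = {
  nil => FakePage.new([], "a"),
  "a" => FakePage.new(["r-1", "r-2"], "b"),
  "b" => FakePage.new(["r-3"], nil)
}
fetch = ->(token) { stub_pages.fetch(token) }

p first_matches(fetch, 1)               # => ["r-1"]
p first_matches(fetch, 5)               # => ["r-1", "r-2", "r-3"]
p first_matches(fetch, 1, max_pages: 1) # => []
```

The page cap trades completeness for a bounded number of requests: as the last call shows, a small cap can return nothing even though matches exist on later pages.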

So I take it the first few weird cases are correct behaviour? Should this be added to the documentation? And perhaps the use of filter sets combined with pagination should be restricted (always raise an error instead of only sometimes).

You are correct, that is expected behavior.

I'm going to pass along the documentation feedback to the EC2 team. We build our API reference documentation directly from theirs. I'm less inclined to attempt to validate this client-side. In the past, services have loosened restrictions on request parameters, and I do not want to force an SDK update just so users can benefit from a change that needs no code changes, should EC2 improve how filter sets work with pagination.