POST_Links does not return link information when following links recursively
mattjala opened this issue · 5 comments
When using POST_Links
to retrieve a link by name with follow_links
enabled, if the link is retrieved from a sub-group, the returned JSON is empty.
Implementing the follow_links
parameter for POST_Links
raises some questions - if multiple links are found by the same name, which should be returned? The first in iteration order? Do the domain crawlers guarantee any particular iteration order, and do we want to expose that behavior?
It might make sense to remove the follow_links
param from POST_Links
. Similarly, CreateOrder
, Limit
, and Marker
seem to be holdovers from GET_Links
and should probably be removed as well.
Test to elicit the bug:
def testPostLinksFollowLinks(self):
domain = self.base_domain + "/testPostLinksFollowLinks.h5"
helper.setupDomain(domain)
print("testPostLinksFollowLinks", domain)
headers = helper.getRequestHeaders(domain=domain)
req = helper.getEndpoint() + "/"
rsp = self.session.get(req, headers=headers)
self.assertEqual(rsp.status_code, 200)
rspJson = json.loads(rsp.text)
root_id = rspJson["root"]
# create group "g1" in root group
req = helper.getEndpoint() + "/groups"
body = {"link": {"id": root_id, "name": "g1"}}
rsp = self.session.post(req, data=json.dumps(body), headers=headers)
self.assertEqual(rsp.status_code, 201)
rspJson = json.loads(rsp.text)
group_id = rspJson["id"]
path = "/dummy_target"
# create link "link1" in g1
req = helper.getEndpoint() + "/groups/" + group_id + "/links/link1"
body = {"h5path": path}
rsp = self.session.put(req, data=json.dumps(body), headers=headers)
self.assertEqual(rsp.status_code, 201)
# make POST_Links request to root group, expect 404
body = {"titles": ["link1"]}
req = helper.getEndpoint() + "/groups/" + root_id + "/links"
rsp = self.session.post(req, data=json.dumps(body), headers=headers)
self.assertEqual(rsp.status_code, 404)
# make POST_Links request to root group with follow_links, expect to find link1
req = helper.getEndpoint() + "/groups/" + root_id + "/links?follow_links=1"
rsp = self.session.post(req, data=json.dumps(body), headers=headers)
self.assertEqual(rsp.status_code, 200)
rspJson = json.loads(rsp.text)
self.assertTrue("links" in rspJson)
links = rspJson["links"]
self.assertTrue(len(links) == 1) # fails due to empty return
link = links[0]
self.assertTrue("title" in link)
self.assertEqual(link["title"] == "link1")
Spoke to John about this - follow_links
should return all links that match the name in the provided groups. Limit
and CreateOrder
are used in the case where titles
is not provided, so they should stay. marker
isn't actually checked for by POST_Links
, so no change is needed there.
@jreadey I'm not sure I understand how this endpoint is supposed to work with the pattern
parameter. testGetPattern
expects group ids to be returned that don't contain any links that fit the pattern. Is it meant to return all searched group ids (empty or not) when using a pattern, and not without a pattern?
Here's the (expected) response in testGetPattern
with empty groups:
{'links':
{'g-f81913f9-909037d3-7091-9b7118-18bf5b': [],
'g-f81913f9-909037d3-0b41-8c15a0-ba4fad': [],
'g-f81913f9-909037d3-2304-9163c1-15d503': [
{'id': 'd-f81913f9-909037d3-cd25-bd6a92-8700e7', 'class': 'H5L_TYPE_HARD', 'created': 1707845214.4165003, 'title': 'dset2.1'},
{'id': 'd-f81913f9-909037d3-be98-2bdfa6-91fa48', 'class': 'H5L_TYPE_HARD', 'created': 1707845214.4451396, 'title': 'dset2.2'}
],
'g-f81913f9-909037d3-f6a8-00d8e6-4046b6': [
{'id': 'd-f81913f9-909037d3-39d1-9c7ecc-a8b46b', 'class': 'H5L_TYPE_HARD', 'created': 1707845214.270042, 'title': 'dset1.1.1'},
{'id': 'd-f81913f9-909037d3-697f-a976fe-b8c909', 'class': 'H5L_TYPE_HARD', 'created': 1707845214.2981899, 'title': 'dset1.1.2'}
],
'g-f81913f9-909037d3-6cba-d68216-f3d0b1': [],
'g-f81913f9-909037d3-cc63-497daf-2bf556': []
}
}
Yes, that is the intent. You could make the case to not return group ids that contain no links, but I like the explicit empty list.
I tried out titles with follow_links and it doesn't return a sensible result -- links to subgroups won't get followed if the subgroup link is not in titles. Rather than fix this up, I think it makes more sense to return a 400 if titles is used with follow_links -- I've submitted a PR for this change. The PR also raises 400 if CreateOrder is used with titles or Limit is used with titles.