Add support for bounding box in GeoSearchGenerator
zstadler opened this issue · 11 comments
The Wikimedia geosearch API supports a bounding box as an alternative to coordinates+radius as a geographic selector:
gsbbox: Bounding box to search in: pipe (|) separated coordinates of top left and bottom right corners.
and provides an example:
api.php?action=query&list=geosearch&gsbbox=37.8|-122.3|37.7|-122.4
Since the coordinates+radius approach is limited to a 10000 meter radius, combining multiple requests to cover a larger area is a challenge. A bounding-box search, on the other hand, is easier to aggregate and to integrate with other geographic systems.
Please consider adding support for search based on a bounding box.
See also this Wikimedia API bug report related to the use of gsbbox for geosearch.
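For illustration, the gsbbox format quoted above (top|left|bottom|right) could be assembled like this. This is only a sketch: GsBBox, Format and ToQueryValue are hypothetical helper names, not part of WikiClientLibrary or the MediaWiki API, and no validation of the coordinates is attempted.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of WikiClientLibrary.
public static class GsBBox
{
    // Joins the corners in the documented order: top | left | bottom | right,
    // i.e. top-left latitude, top-left longitude, bottom-right latitude, bottom-right longitude.
    public static string Format(double top, double left, double bottom, double right)
    {
        // InvariantCulture guarantees '.' as the decimal separator on every machine.
        return string.Join("|",
            top.ToString(CultureInfo.InvariantCulture),
            left.ToString(CultureInfo.InvariantCulture),
            bottom.ToString(CultureInfo.InvariantCulture),
            right.ToString(CultureInfo.InvariantCulture));
    }

    // Percent-encodes the value so it can be placed in a query string ('|' becomes %7C).
    public static string ToQueryValue(double top, double left, double bottom, double right)
    {
        return Uri.EscapeDataString(Format(top, left, bottom, right));
    }
}
```

For example, GsBBox.ToQueryValue(32.15, 34.75, 32, 34.9) produces the value that would follow gsbbox= in a request URL.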
Thanks for your links, @zstadler! I will check on this and work on the implementation after the holiday, which is tomorrow.
Published v0.7.0-int.6. You may now use GeoSearchGenerator.BoundingRectangle to specify a small rectangle with the left (longitude), top (latitude), width, height and search for the pages.
I'm planning to refactor the GeoSearch, GeoCoordinate and GeoCoordinateRectangle API. I'm going to extract Dimension and Global from the GeoCoordinate structure, and GeoCoordinateRectangle may need some polishing. If you have any more suggestions or feature requests regarding these APIs, feel free to open another issue and let me know.
Thanks for this! :-)
What's a small rectangle?
I'm getting the following error:
OperationFailedException: toobig: Bounding box is too big
- the exception should indicate which bbox I should be using I think...
Also toobig is missing a space :-)
I've tried this roughly, and ranges of less than 0.2 degrees in longitude and latitude seem okay.
WikiClientLibrary/UnitTestProject1/Tests/GeneratorTests.cs
Lines 413 to 422 in cafda1f
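My hypothesis is that on the MW API server you ultimately cannot bypass the Radius limitation of GeoSearch. 10 km is roughly 0.09 degrees of latitude, so a box about 0.2 degrees across is roughly as wide as the 20 km diameter of the largest allowed search circle.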
So if you are planning to scan a larger area of the earth, you may need to split your range into a grid and request the smaller tiles one by one from the client.
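The grid idea can be sketched as follows. This is a minimal illustration, not library code: BoundingBoxGrid and Split are hypothetical names, the maximum tile span is an input you choose (the 0.2 degrees mentioned in this thread is only an empirical observation), and boxes crossing the antimeridian are not handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of WikiClientLibrary.
public static class BoundingBoxGrid
{
    // Splits [left, right] x [bottom, top] (in degrees) into equally sized tiles
    // whose width and height do not exceed maxSpanDegrees.
    public static List<(double Left, double Bottom, double Right, double Top)> Split(
        double left, double bottom, double right, double top, double maxSpanDegrees)
    {
        if (maxSpanDegrees <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxSpanDegrees));
        if (right < left || top < bottom)
            throw new ArgumentException("Expected left <= right and bottom <= top.");

        // Number of tiles per axis, at least one even for a degenerate box.
        int columns = Math.Max(1, (int)Math.Ceiling((right - left) / maxSpanDegrees));
        int rows = Math.Max(1, (int)Math.Ceiling((top - bottom) / maxSpanDegrees));
        double width = (right - left) / columns;
        double height = (top - bottom) / rows;

        var tiles = new List<(double Left, double Bottom, double Right, double Top)>();
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < columns; c++)
            {
                tiles.Add((left + c * width, bottom + r * height,
                           left + (c + 1) * width, bottom + (r + 1) * height));
            }
        }
        return tiles;
    }
}
```

Each tile would then be sent as its own bounding-box request. Pages lying exactly on a shared tile edge may be returned twice, so the caller should deduplicate results, for example by page ID.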
And toobig is actually the error code from the MW API response, like permissiondenied or badtoken.
Thanks for the quick response!
This is what I do right now with the 10 km radius search, only the circles overlap, and I thought I'd be able to do it in one bbox call instead of around 1000.
Here's the relevant code I was hoping to simplify... :-/
https://github.com/IsraelHikingMap/Site/blob/5bf63fc2a0e2c1a22bf82d3f1175141b45c25356/IsraelHiking.API/Services/Poi/WikipediaPointsOfInterestAdapter.cs#L77
When using the GeoSearchGenerator, it seems that I can't get past the pagination size of 500 results.
The following returns 500 items, but I don't know how to continue to the next page:
var geoSearchGenerator = new GeoSearchGenerator(new WikiSite(wikiClient, new SiteOptions($"https://he.wikipedia.org/w/api.php")))
{
    BoundingRectangle = GeoCoordinateRectangle.FromBoundingCoordinates(34.75, 32, 34.9, 32.15),
    PaginationSize = 1000 // this is ignored
};
var results = await geoSearchGenerator.EnumItemsAsync().ToListAsync(); // this returns only 500...
Let me know if you want me to open a new issue for this, or if I'm missing something.
Same request from the browser:
https://he.wikipedia.org/w/api.php?action=query&maxlag=5&list=geosearch&gsradius=10&gsprimary=primary&gslimit=500&gsbbox=32.15%7C34.75%7C32%7C34.9
Seems like the response doesn't have a continuation parameter? not sure...
It seems so. GeoSearch does not support pagination for now. Example response of https://en.wikipedia.org/w/api.php?action=query&maxlag=5&list=geosearch&gsradius=10&gsprimary=primary&gslimit=2&gsbbox=32.15%7C34.75%7C32%7C34.9
{
    "batchcomplete": "",
    "query": {
        "geosearch": [
            {
                "pageid": 18328987,
                "ns": 0,
                "title": "Beit Zvi",
                "lat": 32.078408333333336,
                "lon": 34.821713888888894,
                "dist": 489.4,
                "primary": ""
            },
            {
                "pageid": 46324352,
                "ns": 0,
                "title": "HaAliya HaShniya Garden",
                "lat": 32.0697,
                "lon": 34.8148,
                "dist": 1127.4,
                "primary": ""
            }
        ]
    }
}
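MediaWiki API responses normally carry a top-level "continue" object when more results are available, and the response above has none. A quick way to check a raw response for it might look like the sketch below. ContinuationCheck and HasContinuation are hypothetical names, and System.Text.Json is used here only for illustration; it is not necessarily what WikiClientLibrary uses internally.

```csharp
using System.Text.Json;

// Hypothetical helper, not part of WikiClientLibrary.
public static class ContinuationCheck
{
    // Returns true when the response JSON has a top-level "continue" member,
    // which is how the MediaWiki API signals that another page of results exists.
    public static bool HasContinuation(string responseJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(responseJson))
        {
            JsonElement root = doc.RootElement;
            return root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("continue", out JsonElement _);
        }
    }
}
```

Applied to the geosearch response shown above, this check would return false, which matches the observation that the results cannot be paged.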
I think the continuation problem is originally tracked with phab:T95241 and closed as duplicate of phab:T78703.
Unfortunately, I don't think T78703 is going to be resolved soon...