mff-uk/odcs-dpus

Ordered SPARQL queries should use Virtuoso's scrollable cursors

All DPUs that load data via SPARQL queries with ORDER BY (such as the XSLT DPU) should use Virtuoso's scrollable cursors (see Virtuoso's documentation, section "Example: Prevent Limits of Sorted LIMIT/OFFSET query"). When the number of rows to sort in an ordered SPARQL query (OFFSET plus LIMIT) exceeds Virtuoso's MaxSortedTopRows setting in virtuoso.ini (typically set to 10-20K rows), the query fails with an error message like the following:

Virtuoso 22023 Error SR353: Sorted TOP clause specifies more then 41000 rows to sort.
Only 40000 are allowed.
Either decrease the offset and/or row count or use a scrollable cursor

A temporary workaround for this issue is to increase the MaxSortedTopRows setting, but the proper solution is to wrap a sub-SELECT containing the ORDER BY in an outer SELECT query that carries the OFFSET and LIMIT. For example, the XSLT DPU uses the query:

SELECT ?s ?o
WHERE {
  ?s <http://linked.opendata.cz/ontology/odcs/xmlValue> ?o .
}
ORDER BY ?s ?o

Rewritten to use a scrollable cursor, which allows loading larger data, this query could look like the following:

SELECT ?s ?o
WHERE {
  {
    SELECT ?s ?o
    WHERE {
      ?s <http://linked.opendata.cz/ontology/odcs/xmlValue> ?o .
    }
    ORDER BY ?s ?o
  }
}
# Pagination goes here:
LIMIT 10000
OFFSET 1000000
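
To illustrate how a DPU might generate such queries page by page, here is a minimal sketch in Python. It only builds query strings and does not talk to Virtuoso. The helper names `wrap_ordered_query` and `page_queries` and the constant `XSLT_QUERY` are hypothetical and are not part of the ODCS code base.

```python
from typing import Iterator


def wrap_ordered_query(ordered_query: str, variables: str,
                       limit: int, offset: int) -> str:
    """Wrap an ORDER BY query in an outer SELECT that carries LIMIT/OFFSET."""
    if limit <= 0 or offset < 0:
        raise ValueError("limit must be positive and offset non-negative")
    # Indent the inner query so the generated text stays readable.
    inner = "\n".join("    " + line
                      for line in ordered_query.strip().splitlines())
    return (
        f"SELECT {variables}\n"
        "WHERE {\n"
        "  {\n"
        f"{inner}\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}\n"
        f"OFFSET {offset}"
    )


def page_queries(ordered_query: str, variables: str,
                 page_size: int, pages: int) -> Iterator[str]:
    """Yield one wrapped query per page, advancing OFFSET by page_size."""
    for page in range(pages):
        yield wrap_ordered_query(ordered_query, variables,
                                 page_size, page * page_size)


# The ordered query currently used by the XSLT DPU (from this issue).
XSLT_QUERY = """\
SELECT ?s ?o
WHERE {
  ?s <http://linked.opendata.cz/ontology/odcs/xmlValue> ?o .
}
ORDER BY ?s ?o"""

if __name__ == "__main__":
    for query in page_queries(XSLT_QUERY, "?s ?o", page_size=10000, pages=2):
        print(query)
        print()
```

A real loader would presumably not know the page count in advance and would instead keep requesting pages until one returns fewer than `page_size` rows.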