SemanticMediaWiki/SemanticGlossary

Large(ish?) Glossary is only partially loaded

hexmode opened this issue · 1 comments

Setup and configuration

  • MW version: 1.27.3 (fb5a1b7) / 1.34.0 (5e55aee)
  • PHP version: 5.6.27 (apache2handler) / 7.2.31 (fpm-fcgi)
  • DB: Percona XtraDB Cluster | 5.7.23-23-57-log
  • SMW version: 2.5.7 (3cd7d22) / 3.1.5 (ea0fd5d)
  • SG version: 3.0.0 (84f6f30)
  • Lingo version: 3.1.0 (21deae5) / 3.1.1 (5ca6e83)

(Where there are two versions above, the first is the production and testing system for 1.27 and the second is where we are staging an upgrade.)

Issue

I'm guessing this has something to do with the size of the Glossary.

I don't know how to count the number of entries, but a user reported that SG had stopped working in production (1.27), even though it was working correctly in an almost identically configured test system (1.27). It was also not working in our staged upgrade system (1.34).

After laboriously tracking down the problem, I found that the reason the 1.27 test system was working was that its caching was different.

Setting $wgexLingoCacheType = CACHE_NONE; and using the following diff against src/Cache/ElementsCacheBuilder.php led to it working in the production 1.27 system:

diff --git a/src/Cache/ElementsCacheBuilder.php b/src/Cache/ElementsCacheBuilder.php
index 6988cd8..819918f 100644
--- a/src/Cache/ElementsCacheBuilder.php
+++ b/src/Cache/ElementsCacheBuilder.php
@@ -62,7 +62,7 @@ class ElementsCacheBuilder {
 	 */
 	public function getElements() {
 
-		$ret = array();
+		static $ret = [];
 
  		if ( $this->queryResults === null ) {
 			$this->queryResults = $this->store->getQueryResult( $this->buildQuery() )->getResults();
@@ -71,7 +71,7 @@ class ElementsCacheBuilder {
 		// find next line
 		$page = current( $this->queryResults );
 
-		if ( $page && count( $ret ) == 0 ) {
+		while ( $page && count( $ret ) === 0 ) {
 
 			next( $this->queryResults );
 
@@ -98,6 +98,8 @@ class ElementsCacheBuilder {
 				wfDebug( "Cached glossary entry $cachekey.\n" );
 				$this->glossaryCache->getCache()->set( $cachekey, $ret );
 			}
+
+			$page = next( $this->queryResults );
 		}
 
 		return $ret;

(I think 1.34 will work, too, but I'm rebuilding data at the moment... will check back when it completes.)

I ended up using the above diff because I noticed the weird if ( $page ... above. I tracked down the static to the refactoring that removed SemanticGlossaryBackend. I'm not confident that my work is correct, but at least it allows the glossary to work when Lingo caching is turned off.

Oh... yeah, I say this is a problem with a large(ish) glossary because even though the original report was that it wasn't working at all, I noticed that some of the earlier entries were being loaded (up to the C's) but none of the later ones.
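
To illustrate how an if in place of a loop could produce this "loads up to the C's, then nothing" symptom, here is a minimal standalone sketch. It is not SemanticGlossary or Lingo code: PageWalker, nextWithIf, nextWithWhile, collect and the sample data are all hypothetical, and it assumes the caller treats an empty result as the end of the glossary, which I have not verified. It also only shows the if-versus-loop difference, not the static $ret part of the diff.

```php
<?php
// Hypothetical stand-in for the cache builder; NOT SemanticGlossary code.
class PageWalker {
	/** @var array[] Elements yielded per page; [] means the page yields nothing usable. */
	private $pages;

	public function __construct( array $pages ) {
		$this->pages = $pages;
	}

	// Shape of the original code: inspect exactly one page per call.
	public function nextWithIf() {
		$ret = [];
		$page = current( $this->pages );
		if ( $page !== false && count( $ret ) === 0 ) {
			next( $this->pages );
			$ret = $page;
		}
		return $ret;
	}

	// Loop variant: keep advancing until a page yields something
	// or the results are exhausted.
	public function nextWithWhile() {
		$ret = [];
		$page = current( $this->pages );
		while ( $page !== false && count( $ret ) === 0 ) {
			$ret = $page;
			$page = next( $this->pages );
		}
		return $ret;
	}
}

// ASSUMPTION: the consumer stops at the first empty result.
function collect( PageWalker $walker, $method ) {
	$all = [];
	while ( ( $elements = $walker->$method() ) !== [] ) {
		$all = array_merge( $all, $elements );
	}
	return $all;
}

// The third page yields no elements.
$pages = [ [ 'API' ], [ 'Cache' ], [], [ 'Diff' ], [ 'Zend' ] ];

$withIf = collect( new PageWalker( $pages ), 'nextWithIf' );
$withWhile = collect( new PageWalker( $pages ), 'nextWithWhile' );

echo implode( ', ', $withIf ), "\n";    // prints "API, Cache"
echo implode( ', ', $withWhile ), "\n"; // prints "API, Cache, Diff, Zend"
```

Under that assumption, a single page that yields no elements is enough to cut off everything after it in the if variant, while the loop variant skips over it. The sketch advances the array pointer exactly once per page visited.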