google/lovefield

Only certain fields are showing / queryable during simple .select()

fezzzza opened this issue · 6 comments

Hi,

I have a 1.5MB lovefield database in 22 tables, many of which are synchronised selectively from my MySQL server.

Here is the relevant portion of my schema. The schema code is dynamically generated so as to precisely match the schema on the server:

var db;
var db_reasons_fields_links;//repeated for each table
schemaBuilder = lf.schema.create('cl17-vx', 1);
schemaBuilder.createTable('reasons_fields_links')
.addColumn('id', lf.Type.STRING)
.addColumn('reason_link', lf.Type.STRING)
.addColumn('field_link', lf.Type.NUMBER)
.addColumn('ordinal', lf.Type.NUMBER)
.addColumn('ordinal_shortlist', lf.Type.NUMBER)
.addColumn('created_timestamp', lf.Type.NUMBER)
.addColumn('sync_timestamp', lf.Type.NUMBER)
.addColumn('modified_timestamp', lf.Type.NUMBER)
.addColumn('modified_session_link', lf.Type.NUMBER)
.addNullable(['reason_link','field_link','ordinal','ordinal_shortlist','created_timestamp','sync_timestamp','modified_timestamp','modified_session_link'])
.addPrimaryKey(['id'])
.addIndex('reason_link_idx',['reason_link'])
.addIndex('field_link_idx',['field_link'])
;
schemaBuilder
	.connect()
	.then(function(newDb){
		console.log('dbinit.connect(): - starting table allocation');
		db=newDb;
db_reasons_fields_links=db.getSchema().table('reasons_fields_links');//repeated for each table
...

...so all in all there are 22 blocks of the .createTable() command similar to the above and the final line is replicated 22 times for each table.

My test platform is running Chrome version 73.0.3683.103 (Official Build) (64-bit)
on Linux Mint 19.1
I got the same results after updating Chrome to version 74.0.3729.157 (Official Build) (64-bit)

The sync down looks complete, as when looking in the "Application" tab in DevTools, I can see records 0 through 283, with all the field detail I would normally expect to see. So far so good.

If I do this in the console:

var a;
db.select()
.from(db_reasons_fields_links)
.exec()
.then(function(r){a=r})

a holds 284 results, but each result only has three fields: reason_link, field_link, and id.

I rebooted and repeated the test (using copy-and-paste so the code is identical) and now I have 284 sets of five fields, namely:

field_link
id
ordinal
ordinal_shortlist
reason_link

After upgrading to Chrome 74, this reverted back to 3 fields

The following query:

var a;
db.select()
.from(db_reasons_fields_links)
.where(db_reasons_fields_links.modified_timestamp.gt(1))
.exec()
.then(function(r){a=r})

Results in an empty dataset. Similarly if I were to query:

var a;
db.select()
.from(db_reasons_fields_links)
.where(db_reasons_fields_links.modified_timestamp.eq(1547831833))
.exec()
.then(function(r){a=r})

a is an empty array;
(where I can see from DevTools that this value 1547831833 exists in the IndexedDB.)

I have tried the above with that field as an lf.Type.INTEGER and as lf.Type.NUMBER.

The database is not very big. None of the tables have over 1000 rows, and the Application/Clear Storage page in DevTools tells me that IndexedDB takes up 1.5MB. I am using 3.3MB out of 4399MB storage quota.

I seem to have plenty of memory available. window.performance.memory reports:
jsHeapSizeLimit: 2197815296
totalJSHeapSize: 45411643
usedJSHeapSize: 43563787

I hope you can replicate and fix.... or else point out where I'm going horrendously wrong!

a holds 284 results, but each result only has three fields: reason_link, field_link, and id.

This is quite strange. Could you export all the data of your db with the export() method, and attach it here? This would help reproduce, or further debug the problem. See docs for export() at https://github.com/google/lovefield/blob/master/docs/spec/03_life_of_db.md#37-importexport

The sync down looks complete, as when looking in the "Application" tab in DevTools, I can see records 0 through 283, with all the field detail I would normally expect to see

Perhaps also attach the code that adds data to the db? Also can you repro with a single table, or does this only happen when multiple tables are used?

One more thing to note, that might be related. Do you modify any results returned from Lovefield in place? If so, this would be corrupting the cache, and might explain why you get different results from select() and different showing in the DevTools. See note from docs at https://github.com/google/lovefield/blob/master/docs/spec/04_query.md#41-select-query-builder

Specifically the part that says:
"One important concept is to treat the returned results of select queries as read-only and do not modify it. Based on performance considerations, Lovefield does not actively clone/freeze the object retrieved from its internal cache. For this to work, the user is supposed to follow the rule of not altering results returned from select query. For example: ..."

I think you nailed it. I was aware of the cache corruption issue - more from getting burnt and learning the lesson the hard way, rather than seeing it in black-and-white in the documentation, and I had thought that I had isolated the result enough before operating on it. Obviously not. See comments in the code.

sql.exec().then(function(rows){
	for (var n = 0; n < rows.length; n++) {
		destObj[rows[n]['id']]=rows[n];//   <-- * this seems not to be enough to isolate the cache *
		
		if (sdbe && (typeof destObj[rows[n]['id']].created_timestamp!="undefined"))
			delete(destObj[rows[n]['id']].created_timestamp); // <-- * then this corrupts the cache *
		if (sdbe && typeof destObj[rows[n]['id']].sync_timestamp!="undefined")
			delete(destObj[rows[n]['id']].sync_timestamp);
		if (sdbe && typeof destObj[rows[n]['id']].modified_timestamp!="undefined")
			delete(destObj[rows[n]['id']].modified_timestamp);
		if (sdbe && typeof destObj[rows[n]['id']].modified_session_link!="undefined")
			delete(destObj[rows[n]['id']].modified_session_link);
	}
})

The workaround is to transfer every element of the promised rows array of objects individually to a separate array of objects, before operating on that completely isolated from the result, something like:

sql.exec().then(function(rows){
        var irows=isolate(rows); // <-- transfer to a separate container, avoiding javascript "pointers"
		for (var n = 0; n < irows.length; n++) {
			destObj[irows[n]['id']]=irows[n];
...etc

The workaround is to transfer every element of the promised rows array of objects individually to a separate array of objects

Even that does not sound 100% safe, because does not do a deep clone of the objects being returned. If you are only operating on the array (adding/removing elements) a shallow clone would be enough. But if you are modifying individual rows, cache corruption would still occur.

db.select().from(myTable).exec().then(results => {
  const newArray = results.slice(); // Create a shallow clone the array.
  newArray[0].someField = 'foo';
  // Just corrupted the Lovefield cache, because newArray[0] === results[0]
});

Ok I think we're on the same page here, and maybe our terminology is misaligned. By "element" I am referring to key/value pairs, whereas you appear to be interpreting "elements" as "rows", ie, objects of key/value pairs. My isolate() function recursively drills down to copy the individual key/value pairs, thereby performing a deep copy (and this solves my problem in this case), whereas in my first example above destObj[rows[n]['id']]=rows[n]; copies a row of key/value pairs with each operation, therefore performing a shallow copy, which (I'm sure you know this but this is for the benefit of anyone reading this) doesn't actually copy the row but installs a javascript "pointer" to that row, so any operations on the "copy" also affect the source and therefore affect the cache - and that's why I had the problem in the first place.

But thank you for the clarification about being able to delete entire shallow-copied rows without having to worry about cache corruption. That certainly makes certain design considerations a little easier.

Ok I think we're on the same page here,

Agreed, I think we are on the same page too. Closing this issue since it is working as designed. Thanks for your thorough analysis.