google/lovefield

Insert from JSON bulk

michel-reyes opened this issue · 6 comments

Hi,
Is there a option to insert into a table values from a big JSON file (20 MiB)?
the following option is too low

for(var i = 5; i < locations.length; i++) {		
	// create a row
	console.log('creating new row...');
	row = t.createRow({
	        "State": locations[i].State,
		"City": locations[i].City,
		"CountryCode": locations[i].CountryCode,
		"CurrCode": locations[i].CurrCode,
		"PayerCode": locations[i].PayerCode,
		"BranchCode": locations[i].BranchCode,
		"Street": locations[i].Street
	});
	db.insert().into(t).values([row]).exec();
	console.log('new row inserted.');
}

it took like 5 minutes...

Could you format the code snippet? It is fairly hard to read. Either way have a look at https://github.com/google/lovefield/blob/master/docs/spec/03_life_of_db.md#37-importexport for bulk import/export. Also be aware of the behavior described at #219 with regards to date/time fields.

I updated it for you. So besides the bulk import, the way you are inserting is sub-optimal. You are issuing an INSERT query for every single row. Why don't you create all rows and call the following once

db.insert().into(t).values(rows).exec();

As you already noticed the call to values() already accepts an array. Why passing arrays of one element multiple times?

I was trying to do like you say but I got a problem:

schemaBuilder.createTable('locations').
addColumn('State', lf.Type.STRING).
ddColumn('City', lf.Type.STRING).
addColumn('CountryCode', lf.Type.STRING).
addColumn('CurrCode', lf.Type.STRING).
addColumn('PayerCode', lf.Type.STRING).
addColumn('BranchCode', lf.Type.STRING).
addColumn('Street', lf.Type.STRING);

My file is an array of objects like this:

locations = [
{
"State":"X",
"City":"Caracas",
"CountryCode":"VE",
"CurrCode":"VEF",
"PayerCode":"0005500",
"BranchCode":"","Street":""
}]`

it has 40000 objects

When I try to use
db.insert().into(t).values(locations).exec();

Lovefield says:

Cannot read property 'State' of undefined

Something is wrong in your code, and can't tell from the provided snippets. Here is a working example, https://jsfiddle.net/jz739970/1/.

Excellent example, it works!

Can you explain why to use

var rows = locations.map(l => t.createRow(l));

In your code snippet you are creating an unnecessary intermediate object. This also hurts performance.

row = t.createRow({
	        "State": locations[i].State,
		"City": locations[i].City,
		"CountryCode": locations[i].CountryCode,
		"CurrCode": locations[i].CurrCode,
		"PayerCode": locations[i].PayerCode,
		"BranchCode": locations[i].BranchCode,
		"Street": locations[i].Street
	});

Why do this when locations[i] is already the object that you want? Either way, closing this bug. Please use the forum for usage questions that are not necessarily bugs.