orangehill/iseed

error when seeding from table with many records

skwl opened this issue · 19 comments

skwl commented

I created a seed from a table with 14 columns and over 5000 rows, but when I ran "php artisan db:seed", I got: General error: 1390 Prepared statement contains too many placeholders.

I edited the seed code to make two inserts instead of seeding all of the rows in one and then it worked.

\DB::table('users')->insert(array (

I could be wrong, but "too many placeholders" sounds like a query issue to me.

I've seen this issue before with a seed file where some entries have mismatched keys. That is to say, if you add a key to one entry, it must exist for every entry, otherwise the query won't work.

skwl commented

It is a query issue. Apparently, the number of placeholders is limited to about 65,536 per query. With 14 columns per row, I can only insert 4681 rows in one prepared statement. So if I start another prepared statement after row 4681 by adding "\DB::table('users')->insert(array (" with the appropriate commas and parentheses, I can get the seed to work.

I could not find the place to add code to iseeds that would limit the number of placeholders per prepared statement.
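The arithmetic above can be sketched in plain PHP. This is only an illustration of the manual workaround (the row data, column count, and `users` table are stand-ins); in a real seeder each chunk would go to its own \DB::table(...)->insert() call:

```php
<?php

// Illustrative workaround: split the seed data so each prepared
// statement stays under MySQL's 65,536-placeholder limit.
$columns = 14;                       // columns per row in this table
$maxRows = intdiv(65536, $columns);  // 4681 rows per statement

// $rows stands in for the array the generated seeder builds; in a real
// seeder each $chunk would be passed to \DB::table('users')->insert().
$rows   = array_fill(0, 5000, array_fill(0, $columns, 'value'));
$chunks = array_chunk($rows, $maxRows);

// 5000 rows at 4681 rows per statement => 2 insert statements.
echo count($chunks), "\n";                // 2
echo count($chunks[0]) * $columns, "\n";  // 65534 placeholders, under the limit
```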

I did some digging into this, and I found an answer on SO by Trent Lloyd:

"The maximum number of placeholders for values in a prepared statement is [limited to] the maximum value of a 16-bit unsigned integer, or specfically [sic]: 65,536."

If you take a look at the source code for the init_param_array function, specifically line 2124, you can see where this limit is imposed.

This issue is neither iSeed- nor Laravel-related.

skwl commented

The issue with iSeed is that it creates a seed that will fail because it exceeds 65,536 placeholders when creating a seed from a large table. I am not skilled enough to edit the code to create multiple prepared statements when necessary. If the number of columns times the number of rows exceeds 65,536, then it would be necessary to limit the number of rows in each statement to 65,536 divided by the number of columns.

I'm unsure how/if @tihomiro wants to tackle this.

Since it didn't feel like I was going to have free time to tackle this anytime soon, I asked my colleague Ivan to have a look.

Fixed by splitting input data into chunks of 500 elements per insert statement.
I tried seeding 15,000 rows with 20 columns, and it worked fine.
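The shape of the fix can be sketched as follows. This is a simplified illustration of the chunking idea, not iSeed's actual internals:

```php
<?php

// Simplified sketch of the chunked-insert fix: instead of emitting one
// giant insert, the generator emits one insert per 500-row chunk.
$rows = array_fill(0, 15000, ['col' => 'value']); // stand-in for table data
$insertStatements = 0;

foreach (array_chunk($rows, 500) as $chunk) {
    // In the generated seeder, each $chunk becomes its own
    // \DB::table('...')->insert(array(...)) statement.
    $insertStatements++;
}

echo $insertStatements, "\n"; // 30 statements for 15,000 rows
```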

A limit of 500 would only allow 131 columns. I'd suggest dropping the limit further, or making it dynamic or customizable?
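One way to make it dynamic would be to derive the chunk size from the column count, so wide tables still fit under the placeholder cap. A hypothetical helper, not part of iSeed:

```php
<?php

// Sketch: derive the chunk size from the column count instead of using
// a fixed 500, keeping every statement under 65,536 placeholders.
function chunkSizeFor(int $columnCount, int $maxPlaceholders = 65536): int
{
    return max(1, intdiv($maxPlaceholders, $columnCount));
}

echo chunkSizeFor(14), "\n";  // 4681
echo chunkSizeFor(131), "\n"; // 500
echo chunkSizeFor(200), "\n"; // 327
```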

Everything worked fine for me out of the box, until I tried to run a 61MB file (1.3MM rows). With that, I get a memory failure. I tried increasing memory limit to as high as I could, but it still failed. Is there a way to rewrite this so it doesn't try to get everything in a table into memory?

@nateritter
It is certainly possible to rewrite the generator so it uses only a few megabytes of memory.
However, doing it in such way would be contrary to the concept of Laravel's seeds.

Running it on 1.3M rows sounds like you are trying to use iSeed as a DB backup/migration tool, and I must suggest using the native mysql/mysqldump commands, which have minimal memory impact and maximal speed and reliability.

I'm not using it as a backup/migration tool. This is static data which is needed for my app to work (i.e., cities and their relevant data points, attributes of properties, amenities for locations, etc.).

Kinda bummed about this, as it's not unreasonable in this day and age to be working with an application with a lot of data. The only other workaround is to manually extract certain rows that I know will relate properly via foreign keys, or to not test the features which use this data (not an option). Both of those workarounds suck.

.... aaaand I'm back. Same issue as 2014 on a new application, and I see it's unfortunately not been modified to accommodate this use case. :/

@nateritter OK, I agree that this should be rewritten so that the data is split into chunks while reading from the DB, rather than only when writing to the file, piping the output directly to the seed file. I'll reopen the issue, though it might take a while before I have some free time to deal with this. I'll do my best.
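The streaming idea could look roughly like this. This sketch uses PDO with an in-memory SQLite database so it is self-contained; the table, file handle, and serialization format are illustrative, and iSeed's real generator would read from the configured connection instead:

```php
<?php

// Sketch of the streaming idea: rows are read one at a time and written
// to the seed file immediately, so the full table never sits in memory.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE users (id INTEGER, name TEXT)');
$pdo->exec("INSERT INTO users VALUES (1, 'a'), (2, 'b'), (3, 'c')");

$stmt = $pdo->query('SELECT * FROM users');

$out = fopen('php://memory', 'w+'); // stands in for the seed file
fwrite($out, "return array(\n");

// Each row is serialized and appended right away instead of being
// accumulated into one big in-memory array.
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    fwrite($out, var_export($row, true) . ",\n");
}

fwrite($out, ");\n");

rewind($out);
echo substr_count(stream_get_contents($out), 'array ('), "\n"; // 3 rows written
```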

I do not understand what they are referring to by splitting the input data into smaller chunks per insert statement. Could you give me some examples of how to generate a seed with artisan? My table has 75,859 records :(

@eliasgonzalezugalde sorry, iSeed won't be able to help. This feature is, sadly, still waiting for a couple of hours of my free time.

OK, thanks for answering.

Seeding an SQLite database will fail if too many parameters are present in the insert statement (SQLite's SQLITE_LIMIT_VARIABLE_NUMBER cap is far lower than MySQL's). See: https://www.sqlite.org/c3ref/c_limit_attached.html#sqlitelimitvariablenumber

Edit: the functionality is already present in Iseed.php, line 224, but it isn't being used. Tested and working.

PR -> #132