Extremely slow test on ci runner

Question

Extremely slow test on ci runner

Closed this issue 4 months ago · 21 comments

I'm trying to upgrade ecto and to myxql but for some reason on the ci runner the tests are extremely slow like 10x slower then normal.

Running test locally they perform fine with the ecto upgrade.

All the queries have around 40 ms delay like below. So originally I thought it's maybe because of the CI server not having the unix socket and it takes myxql some time to fallback. But even when setting protocol in test config for ecto on :tcp it still has the around 40 ms delay on everything.
QUERY OK source="applications" db=39.9ms
QUERY OK db=0.4ms queue=40.3ms

It's a kubernetes gitlab ci runner with mysql running a connected service.
With like the following ecto config

config :synapse, Synapse.Repo,
  username: "root",
  password: "",
  database: "synapse_test",
  hostname: "127.0.0.1",
  pool: Ecto.Adapters.SQL.Sandbox,
  charset: "utf8",
  collation: "utf8_general_ci",
  protocol: :tcp,
  prepare: :unnamed

nulian commented 5 years ago

0.3.1

Answer 1 · 2020-01-17T12:52:38.000Z

Could you paste the exact version numbers of ecto_sql and myxql you tried?

Answer 2 · 2020-01-17T13:08:27.000Z

ecto 3.3.1
ecto_sql 3.3.2
db_connection 2.2.0

Answer 3 · 2020-01-17T13:13:27.000Z

what's your myxql version?

Answer 4 · 2020-02-05T22:27:38.000Z

@nulian do they run faster if you use named prepared statements ? We're investigating a similar issue with 0.3.2 where tests are 4-5x slower with unnamed prepare.

Answer 5 · 2020-02-24T14:23:27.000Z

It does seem to fix test performance removing the unnamed statement flag.
So I could do that to improve test performance. A bit weird that on my local machine I didn't have the big penalty but the test server did.
Running it again with the unnamed flag enabled to be sure nothing else changed in the month.

Answer 6 · 2020-02-24T15:12:17.000Z

Retested them. The myxql without unnamed option performance I would be overall satisfied with.

Mariaex test run: 31.5s
Myxql without unnamed option: 42.7s
Myxql with unnamed option: 398.2s

Answer 7 · 2020-07-19T08:09:54.000Z

I encountered the same problem, but it was solved by setting nodelay: true.

config :synapse, Synapse.Repo,
  ...
  prepare: :unnamed,
  socket_options: [nodelay: true]

However, I didn't know why this option isn't needed in mariaex.

Answer 8 · 2020-07-21T02:55:02.000Z

Thank you @melpon! That change got our test execution time back down to where it should be.

Answer 9 · 2021-10-08T14:08:07.000Z

Hello @josevalim @wojtekmach. Maybe not related to the original problem (let me know if this should go to the separate issue) but I've noticed performance slow-down after 7a5363a. That commit leads to the case when:

Repo.insert(%Model{field1: "value1", field2: nil})
Repo.insert(%Model{field1: "value1", field2: "value2"})
Repo.insert(%Model{field1: "value1", field2: nil})

Will create 3 prepared statements. After first query - first prepared statement will be created and executed - as expected. After the second query the first prepared statement will be closed and second will be created. And after the third query - second prepared statement will be closed and new one created again.

Obviously that may significantly hurt perfomance. Especially if an application often executes insert/update for the same model. Tests of my application are executed almost two times slower after update of myxql. That commit was introduced in context of insert_all and it is kind of clear why to remember only last one insert_all, but why it was done in this way for insert/update (and seems even for select + limit for example)?

I am not really familar with postgex code, but it seems it does not do such things https://github.com/elixir-ecto/postgrex/blob/master/lib/postgrex/protocol.ex#L3363

Answer 10 · 2021-10-08T14:49:51.000Z

@0xAX the commit above should only reprepare the query if they are different. I would expect the three Repo.insert above to be the exact same statement. Can you check if that's the case or not for you?

I am not really familar with postgex code, but it seems it does not do such things https://github.com/elixir-ecto/postgrex/blob/master/lib/postgrex/protocol.ex#L3363

MySQL unfortunately has a limit on the number of prepared statements, which means we need to be more aggressive to purge them.

Answer 11 · 2021-10-08T14:52:09.000Z

@josevalim Yes, I checked and I see (in wireshark) that prepared statement is closed everytime after new such insert

Answer 12 · 2021-10-08T15:00:27.000Z

re-checked just for the case:

I've set pool_size of my Repo configuration to 1 to be sure that everything is related to one connection and executed:

App.Repo.insert(%MyModel{name: "test_1", field_id: nil})
App.Repo.insert(%MyModel{name: "test_2", field_id: "test"})
App.Repo.insert(%MyModel{name: "test_3", field_id: nil})

and I see following queries were sent to my db:

is it expected behaviour?

MySQL unfortunately has a limit on the number of prepared statements, which means we need to be more aggressive to purge them.

Thanks for explanation. But isn't it too agressive?

Answer 13 · 2021-10-08T15:14:47.000Z

Thanks for explanation. But isn't it too agressive?

Oh to be clear, it shouldn't be closing if the query (i.e. the SQL) is the same. So my question is if the statements that are logged are the same or if they have small differences between them.

Answer 14 · 2021-10-08T15:22:07.000Z

That is exactly the problem. They are different. First and third do not contain placeholder for nil/null value, so they look like:

INSERT INTO `my_model` (`name`,`inserted_at`,`updated_at`) VALUES (?,?,?)

and the second one:

INSERT INTO `my_model` (`name`,`field_id`,`inserted_at`,`updated_at`) VALUES (?,?,?,?)

I've found a ticket that describes that nil values will be skipped in insert since ecto 2.0 and it is kind of clear why. But combination of these things makes myxql very slow for such things.

Answer 15 · 2021-10-08T15:30:50.000Z

I see. So it is nothing to change here. The issue is that we are using the same name for different queries. If you change this line:

https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/myxql.ex#L259

To be:

opts = [{:cache_statement, "ecto_insert_#{source}_#{length(fields)}"} | opts]

It should introduce another degree of caching. Another option is to emit all fields and have the value be DEFAULT, this would keep the query the same.

Answer 16 · 2021-10-08T15:31:44.000Z

Another option is to emit all fields and have the value be DEFAULT, this would keep the query the same.

Nevermind, I believe we can't pass DEFAULT as parameter. :(

Answer 17 · 2021-10-08T15:58:36.000Z

@josevalim Yes, thank you for this, I saw cache_statement in ecto_sql but didn't think about touch it.

So I will prepare PR for ecto_sql if you don't have any objections.

But before I have some additional question/suggestions about cache_statement:

I would suggest to do the same for update and other operations in https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/sql.ex#L792? Something like you've suggested - cache_statement = "ecto_#{operation}_#{source}_#{length(params)}" (and for insert_all as well) to improve performance of update requests that are similar to the insert case described above. What do you think?
Second possible variant - what do you think about adding ability to pass own cache_statement (even for insert_all case)? Actually we can do it now but it will be ignored right now. Passing own cache_statement in some cases will provide us control on prepared statement cache. That is pretty useful in context of performance, for example for insert_all if you usually have batches of the same size no need to go though the full create-close-create cycle everytime

Answer 18 · 2021-10-08T16:09:14.000Z

Agreed on both notes. Thanks for the convo and I am looking forward to your PR!

Answer 19 · 2022-08-29T09:55:55.000Z

Weird I had that same delay issue trigger on my linux dev environment after installing updates. Which I could solve with the nodelay: true.
No clue why my dev environment suddenly needed that option because before it was fast.

Answer 20 · 2024-05-18T17:18:51.000Z

Closing this. It is stale and it may even have been fixed in Ecto (it has been a while, so I don’t fully remember).