google/ml-metadata

put_attributions_and_associations hangs forever with large arrays

drsagitn opened this issue · 6 comments

Hi,

In the Python MLMD client, when I call store.put_attributions_and_associations(attribution_arr, association_arr) with a large attribution array (about 1500 entries), the function hangs forever. Splitting the input into chunks of 100 works, but it is very slow. Is there a performance issue in that function?
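The chunking workaround mentioned above could look roughly like the sketch below. The helper names `chunked` and `put_in_chunks` are hypothetical and not part of MLMD, and the sketch assumes the store accepts an empty list for either argument. The put function is passed in as a parameter, so the helper runs without an MLMD installation.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_in_chunks(
    put_fn: Callable[[List, List], None],
    attributions: Sequence,
    associations: Sequence,
    chunk_size: int = 100,
) -> int:
    """Call `put_fn` once per chunk and return the number of calls made.

    `put_fn` is expected to have the same shape as
    store.put_attributions_and_associations(attributions, associations).
    Assumption: passing an empty list for one of the arguments is accepted.
    """
    calls = 0
    for batch in chunked(attributions, chunk_size):
        put_fn(list(batch), [])
        calls += 1
    for batch in chunked(associations, chunk_size):
        put_fn([], list(batch))
        calls += 1
    return calls


# Usage with a real store (not executed here):
# put_in_chunks(store.put_attributions_and_associations,
#               attribution_arr, association_arr, chunk_size=100)
```

This only bounds the size of each call; it does not remove the per-element cost, so the total time may still be high.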

@drsagitn thanks for letting us know about the issue. The call is not optimized for large arrays; currently it validates and inserts each element one at a time. Let's keep the issue open and fix it in the next release.

A couple of questions about the usage: what are the backend and deployment settings? And how many artifacts, executions, and contexts are in your instance? Curious to know the use case here too! :)

That was the very first artifact insertion into the db. I tried to insert 1500 artifacts describing the image metadata of a dataset. Only about 1 or 2 executions and contexts had been inserted before.

Interesting, previously I thought this was a big shared instance. What are the physical db and deployment settings here (a shared MySQL with a KFP server, an SQLite file on NFS)? Could this be caused by the deployment?

About the use case, quick question: have you considered having a single dataset as an artifact, where that artifact has 1500 properties, each describing an image?

The database is MySQL 8.0 hosted on GCP. It is a newly set up db and is only used for MLMD.

For the use case, I do have a dataset artifact as well. A dataset has versions (the context). Each version manages a list of committed images.

Image metadata includes annotations and other properties and is pretty long, so it cannot fit into a string_value property of the dataset, which is limited to 65535 bytes.
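A quick way to check whether a given image's metadata would fit under that limit is to measure its serialized size. This is only a sketch: the 65535-byte figure is the one quoted above, JSON is an assumed serialization format, and `fits_in_string_value` is a hypothetical helper, not an MLMD function.

```python
import json

# Limit quoted in the comment above for a string_value property.
MAX_STRING_VALUE_BYTES = 65535


def fits_in_string_value(metadata: dict) -> bool:
    """Return True if the JSON-encoded metadata is within the byte limit."""
    encoded = json.dumps(metadata).encode("utf-8")
    return len(encoded) <= MAX_STRING_VALUE_BYTES
```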

Got it, thanks for the info. One thing to check when tuning the deployment is whether native SQL insertion into that db has a latency issue. Let's also keep this issue open to optimize the call for large arrays.
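One way to get such a baseline is to time a bulk insert directly against the database. The sketch below uses an in-memory SQLite database from the Python standard library purely as a stand-in, so it runs anywhere; the table name `image_meta`, its columns, and the sample URIs are hypothetical. To measure the actual deployment, the same pattern would need to be run against the real MySQL instance with a MySQL driver.

```python
import sqlite3
import time


def time_bulk_insert(conn: sqlite3.Connection, rows: list) -> float:
    """Insert all rows in one transaction and return the elapsed seconds."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_meta "
        "(id INTEGER PRIMARY KEY, uri TEXT)"
    )
    start = time.perf_counter()
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO image_meta (uri) VALUES (?)", rows)
    return time.perf_counter() - start


# Stand-in database; a real test would connect to the MySQL instance instead.
conn = sqlite3.connect(":memory:")
rows = [(f"image_{i}.jpg",) for i in range(1500)]
elapsed = time_bulk_insert(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM image_meta").fetchone()[0]
print(f"inserted {count} rows in {elapsed:.3f}s")
```

If the raw insert is fast while the MLMD call is slow, the bottleneck is more likely in the per-element handling than in the database itself.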

/cc @BrianSong

I tested native SQL insertion; it takes only 3s to insert 1500 records.