writing XML footer is very slow
galfthan opened this issue · 17 comments
In large simulations close() is quite slow; on Hornet it used 45% of the total write time (2.9 GB VLSV file, 200 nodes).
We were misled by our tests on Hornet and Voima into assuming the results would be the same. On Voima (Lustre stripe count 1) the footer writing takes very little time, while on Hornet (stripe count 10) it takes ages.
I tested with a single OST on Hornet, and then the footer writing again takes only 0.02 seconds out of 80 s spent in I/O. So it seems the footer does not like being written to a striped file.
Could this footer be written using MPI I/O instead?
I can try to replicate this on Voima and see if I can make it faster.
I can replicate this on Voima.
I set up an electric sail simulation to use 130 nodes and 2600 MPI processes. Spatial grid size 480x480, velocity block bounding box 600x600x1.
- initial state: 115 GB in 84.7 s, 1.36 GB/s data rate (stripe=8)
- VLSV open: 0.13 s
- VLSV close: 28.1 s
- restart: 112.3 GB in 148.7 s, 1.39 GB/s
- VLSV open: 0.16 s
- VLSV close: 67.5 s
It seems like resizing the large file is what takes most of the time. It is possible to speed up the footer writing by about 50% by using MPI calls instead of the master-only fstream, but it's still not terribly fast.
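For reference, a minimal sketch of what such a change could look like, assuming the collective MPI file handle is still open when close() runs and that the footer is a plain XML string appended after the data section (the helper below is hypothetical, not the actual vlsv API):

```cpp
#include <mpi.h>
#include <string>

// Hypothetical helper, not the actual vlsv implementation: append the XML
// footer through the already-open MPI file handle instead of reopening the
// file with a master-only std::fstream after the MPI close.
void writeFooterAndClose(MPI_File& fileptr, MPI_Offset footerOffset,
                         const std::string& xmlFooter, int myRank) {
   if (myRank == 0) {
      // Only the master rank writes the footer, using an independent
      // explicit-offset write so no reopening or seeking is needed.
      MPI_File_write_at(fileptr, footerOffset, xmlFooter.data(),
                        static_cast<int>(xmlFooter.size()),
                        MPI_BYTE, MPI_STATUS_IGNORE);
   }
   // All ranks participate in the collective close.
   MPI_File_close(&fileptr);
}
```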
After the previous change,
- initial state:
- VLSV close: 9.9 s
- restart:
- VLSV close: 8.8 s
Changes are in the close-speedup branch; I'll try to make this even faster.
Hmm, apparently resizing the VLSV file to the correct size before writing the data doesn't help at all, so the slowness must really be due to striping or something similar.
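For the record, the pre-resize experiment mentioned above would look roughly like the sketch below; the helper name is made up, and the call is collective over the communicator the file was opened with:

```cpp
#include <mpi.h>

// Hypothetical helper illustrating the experiment: fix the final file size
// before any data is written, so the file never has to be extended later.
void preResizeFile(MPI_File fileptr, MPI_Offset dataBytes, MPI_Offset footerBytes) {
   // Collective, metadata-only resize of the file.
   MPI_File_set_size(fileptr, dataBytes + footerBytes);
   // MPI_File_preallocate(fileptr, dataBytes + footerBytes); // alternative
   // that actually reserves storage instead of just setting the size.
}
```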
Please try the close-speedup branch on Hornet.
- initial state: 115 GB in 83.5 s, 1.38 GB/s data rate, VLSV close 0 s
- restart: 112.3 GB in 82.2 s, 1.37 GB/s data rate, VLSV close 0 s
I can verify that this solved the issue on Hornet! With 2.9 GB files and a stripe count of 10, the effective data rate was 1.3 GB/s. This number also includes some computation in the datareducers.
Close had no impact on performance; it took 5 ms per file.
If the data in the output files looks good, I'll merge this to master.
The data looks good. Restart speed on Hornet has also improved a bit; the record is now ~3.9 GB/s with a stripe count of 42. All of that time (>99%) is spent writing the distribution data, so any further optimization would have to take place there.
Cray/Lustre has many of the same collective MPI optimizations in place that ADIOS has, so trying to tune the MPI hints might be the next step.
I tried a few (related to collective buffering) that were recommended in some Cray centers' documentation, but did not see much improvement. On the other hand, my testing was not very comprehensive, so it does not prove much.
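The kind of hints I mean look roughly like this; the names follow ROMIO/Cray MPICH conventions and the useful values are entirely system-specific, so treat the numbers as placeholders rather than recommendations:

```cpp
#include <mpi.h>

// Sketch of opening the output file with collective-buffering and striping
// hints. Hint names follow ROMIO/Cray MPICH conventions; values are placeholders.
MPI_File openWithHints(MPI_Comm comm, const char* fname) {
   MPI_Info info;
   MPI_Info_create(&info);
   MPI_Info_set(info, "romio_cb_write", "enable");  // collective buffering on writes
   MPI_Info_set(info, "cb_nodes", "10");            // number of aggregator nodes
   MPI_Info_set(info, "striping_factor", "10");     // Lustre stripe count (new files)
   MPI_Info_set(info, "striping_unit", "4194304");  // 4 MiB stripe size (new files)

   MPI_File fh;
   MPI_File_open(comm, fname, MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
   MPI_Info_free(&info);
   return fh;
}
```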
I also did an experiment where I changed the last file I/O in the VLSV write to collective calls, and then turned on the no_independent_io flag (that is not the exact name, see man intro_mpi). It supposedly helps at least with open performance. I saw no difference with this flag on, so I did not bother pushing that vlsv variant. On much larger core counts there might be some effect...
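In case someone wants to repeat the experiment: switching an independent explicit-offset write to its collective counterpart is just a matter of calling the _all variant on every rank of the file's communicator (sketch below; zero-size contributions are fine):

```cpp
#include <mpi.h>
#include <vector>

// Sketch: collective version of an explicit-offset write. Every rank in the
// communicator the file was opened with must make this call, even with an
// empty buffer, or the program will hang.
void writeChunkCollective(MPI_File fh, MPI_Offset myOffset,
                          const std::vector<char>& myData) {
   MPI_File_write_at_all(fh, myOffset, myData.data(),
                         static_cast<int>(myData.size()),
                         MPI_BYTE, MPI_STATUS_IGNORE);
}
```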
As mentioned on Flowdock, Vlasiator runs on Hornet fail to restart, giving a "Cell migration failed" error, so there is probably some catch still...
No, that was related to another issue with not updating the local cell cache.
This issue has been resolved, closing it.