GeertJohan/go.rice

embed-go performance and memory usage

nkovacs opened this issue · 9 comments

I've optimized embed-go a bit. The trick is that the file contents are no longer written into the source code directly. Instead, I write a placeholder for each file, format the code with the placeholders in place, and then use fasttemplate to replace each placeholder with the file's contents, streaming them directly from the original file to the destination Go file.

This means the files are never held in memory completely, so memory usage is much lower.
It also avoids running gofmt on a very large source file, which speeds things up a bit and lowers memory usage further.
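
To illustrate the idea, here is a minimal sketch of the placeholder approach using only the standard library. It is not the code from the branch: it stands in for fasttemplate with a small hand-written loop, the `{%` and `%}` tags and the names `expandPlaceholders` and `expandToString` are assumptions made for the example, and quoting of the file contents is left out entirely.

```go
package main

import (
	"fmt"
	"go/format"
	"io"
	"strings"
)

// expandPlaceholders copies src to w and replaces every "{%name%}" with the
// bytes read from open(name). Only the small, already formatted source is
// held in memory; file contents pass through io.Copy's fixed-size buffer.
func expandPlaceholders(w io.Writer, src string, open func(name string) (io.Reader, error)) error {
	const startTag, endTag = "{%", "%}" // assumed tags for this sketch
	for {
		i := strings.Index(src, startTag)
		if i < 0 {
			// No more placeholders: write the rest of the source and stop.
			_, err := io.WriteString(w, src)
			return err
		}
		nameStart := i + len(startTag)
		j := strings.Index(src[nameStart:], endTag)
		if j < 0 {
			return fmt.Errorf("unterminated placeholder")
		}
		if _, err := io.WriteString(w, src[:i]); err != nil {
			return err
		}
		r, err := open(src[nameStart : nameStart+j])
		if err != nil {
			return err
		}
		// Stream the file into the output without loading it completely.
		if _, err := io.Copy(w, r); err != nil {
			return err
		}
		src = src[nameStart+j+len(endTag):]
	}
}

// expandToString is a helper for the example; in-memory strings stand in for
// files. A real generator would pass os.Open and write to the output file.
func expandToString(src string, files map[string]string) (string, error) {
	var sb strings.Builder
	err := expandPlaceholders(&sb, src, func(name string) (io.Reader, error) {
		content, ok := files[name]
		if !ok {
			return nil, fmt.Errorf("no such file: %s", name)
		}
		return strings.NewReader(content), nil
	})
	return sb.String(), err
}

func main() {
	// Hypothetical generated skeleton: the literal holds a placeholder, so
	// gofmt only ever sees a few bytes per embedded file.
	skeleton := "package main\nvar  content=\"{%a.txt%}\"\n"
	formatted, err := format.Source([]byte(skeleton))
	if err != nil {
		panic(err)
	}
	out, err := expandToString(string(formatted), map[string]string{"a.txt": "hello"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The ordering is the point: formatting happens while the source is still tiny, and the large data is only touched in the final streaming pass.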

Whether gofmt aligns struct values depends on their length, so I had to add an empty line to make sure the code is always formatted correctly.

I also had to copy the code behind strconv.Quote. Unfortunately, the public API was not good enough, and performance was pretty bad with it.
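
As a rough sketch of what quoting in a streaming fashion can look like, the following wraps an io.Writer so that everything written to it is escaped for use inside a double-quoted Go string literal. The `quoteWriter` type and its helpers are hypothetical names for this example, not the code copied from strconv. Unlike strconv.Quote, it escapes every non-ASCII byte as `\xNN`, which is simpler but produces longer output.

```go
package main

import (
	"fmt"
	"io"
	"strconv"
	"strings"
)

const hexDigits = "0123456789abcdef"

// quoteWriter escapes each byte written to it so that the output can be
// placed between double quotes in Go source. Hypothetical type for this sketch.
type quoteWriter struct {
	w io.Writer
}

func (q quoteWriter) Write(p []byte) (int, error) {
	// One scratch buffer per call; every input byte expands to at most 4 bytes.
	buf := make([]byte, 0, 4*len(p))
	for _, b := range p {
		switch {
		case b == '"' || b == '\\':
			buf = append(buf, '\\', b)
		case b == '\n':
			buf = append(buf, '\\', 'n')
		case b >= 0x20 && b < 0x7f:
			buf = append(buf, b) // printable ASCII is copied unchanged
		default:
			buf = append(buf, '\\', 'x', hexDigits[b>>4], hexDigits[b&0x0f])
		}
	}
	if _, err := q.w.Write(buf); err != nil {
		return 0, err
	}
	return len(p), nil
}

// quoteString streams s through quoteWriter and returns the escaped body
// (without the surrounding quotes).
func quoteString(s string) (string, error) {
	var sb strings.Builder
	_, err := io.Copy(quoteWriter{w: &sb}, strings.NewReader(s))
	return sb.String(), err
}

// roundTrips reports whether the escaped form parses back to s.
func roundTrips(s string) bool {
	q, err := quoteString(s)
	if err != nil {
		return false
	}
	back, err := strconv.Unquote(`"` + q + `"`)
	return err == nil && back == s
}

func main() {
	q, err := quoteString("say \"hi\"\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(`"` + q + `"`)
}
```

Because the writer works on whatever chunk it is handed, it can sit at the end of an io.Copy from a file, so the input never has to exist as one complete string the way strconv.Quote requires.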

With a fairly large box of 387 files, mostly JavaScript, the numbers are:
before:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.18
Maximum resident set size (kbytes): 264668
after:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.56
Maximum resident set size (kbytes): 9144

With a single 80 MB file:
before:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.59
Maximum resident set size (kbytes): 2213800
after:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.03
Maximum resident set size (kbytes): 10192

I haven't tested it thoroughly and I still need to escape the fasttemplate placeholders in case they show up somewhere else in the generated code.

The code is in this branch: https://github.com/nkovacs/go.rice/tree/fasttemplate

What do you think?

Wow, that is pretty neat! I'm just wondering: isn't there a package that does the encoding into a string? It feels a bit weird to have that as part of the embedding flow instead of a separate package.

I couldn't find one. Since strconv.Quote is good enough for 99% of use cases, I guess no one needed to create one.

I've extracted it into a separate package and added some tests, including all the strconv.Quote tests from the standard library: https://github.com/nkovacs/streamquote

I'll update this PR tomorrow.

That's awesome, thanks!

I optimized streamquote by getting rid of a bunch of small slice allocations, which helps a bit with large files. In the tables below, v1 is the previous version from February and v2 is the version with the optimization.

one big file:

| metric | master | v1 | v2 |
| --- | --- | --- | --- |
| rice wall time | 0:06.72 | 0:02.90 | 0:02.11 |
| rice max resident (kbytes) | 1988876 | 10244 | 4720 |
| build wall time | 0:04.17 | 0:04.31 | 0:04.38 |
| build max resident (kbytes) | 1023208 | 1028712 | 1040584 |

many small files:

| metric | master | v1 | v2 |
| --- | --- | --- | --- |
| rice wall time | 0:00.40 | 0:00.20 | 0:00.22 |
| rice max resident (kbytes) | 104248 | 9232 | 9180 |
| build wall time | 0:01.63 | 0:01.63 | 0:01.63 |
| build max resident (kbytes) | 181652 | 181008 | 190088 |

Neat! Do you think this is ready for a PR, or do you still want to change some things?

I still need to fix filename escaping. If a filename contains {%%}, it might cause problems with fasttemplate.
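
One possible way to sidestep such a collision, sketched here purely as an illustration and not necessarily what the eventual fix does, is to pick tags that occur in none of the strings that end up in the generated code. The `pickTags` function and the `{%` / `%}` starting tags are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAny reports whether any of the texts contains any of the tags.
func containsAny(texts []string, tags ...string) bool {
	for _, t := range texts {
		for _, tag := range tags {
			if strings.Contains(t, tag) {
				return true
			}
		}
	}
	return false
}

// pickTags returns a start/end tag pair that appears in none of the given
// texts (for example, the quoted filenames written into the generated code).
// It widens the tags with extra '%' characters until there is no collision.
// Hypothetical helper for this sketch.
func pickTags(texts []string) (start, end string) {
	pad := "%"
	for {
		start, end = "{"+pad, pad+"}"
		if !containsAny(texts, start, end) {
			return start, end
		}
		// The loop terminates: the tags eventually outgrow every text.
		pad += "%"
	}
}

func main() {
	start, end := pickTags([]string{"app.js", "weird{%name%}.js"})
	fmt.Println(start, end)
}
```

With tags chosen this way, the only occurrences of the start and end tags in the generated source are the placeholders the generator wrote itself.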

Filename escaping has been merged. Thanks!

Do you want to add more code related to this feature, or can we close this issue?

I just need to fix the issue from my previous comment. I'll have a PR up soon.

Just merged the PR. Thanks!