lgraubner/sitemap-generator-cli

Allocation failed - process out of memory

superlowburn opened this issue · 5 comments

I'm running against a pretty large site, but the EC2 instance I'm running sitemap-generator-cli on has 15 GB of memory.

<--- Last few GCs --->

221192849 ms: Scavenge 1396.0 (1456.2) -> 1396.0 (1456.2) MB, 0.7 / 0 ms (+ 1.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
221194245 ms: Mark-sweep 1396.0 (1456.2) -> 1395.9 (1456.2) MB, 1395.1 / 0 ms (+ 1.0 ms in 1 steps since start of marking, biggest step 1.0 ms) [last resort gc].
221195632 ms: Mark-sweep 1395.9 (1456.2) -> 1395.9 (1456.2) MB, 1387.9 / 0 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0xa9bbe6b4629
1: update [/usr/local/lib/node_modules/sitemap-generator-cli/node_modules/simplecrawler/lib/queue.js:~211] [pc=0x171a01be07b2] (this=0x3f4a0fa7a179 ,id=799675,updates=0x2634006fbd39 <an Object with map 0x1bda30095361>,callback=0x2634006fbcf1 <JS Function (SharedFunctionInfo 0x66d3edfbcd1)>)
2: processReceivedData [/usr/local/lib/node_modules/sitemap-generator-cli/node_modules/...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Aborted (core dumped)
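
For what it's worth, the heap in the trace tops out around 1396 MB even though the box has 15 GB of RAM; that figure matches V8's default old-space limit on 64-bit Node, so the machine's memory is not the bottleneck. Launching the CLI entry script through node with a larger heap (e.g. node --max-old-space-size=4096 followed by the path to the script) would likely postpone the crash, but presumably won't fix whatever keeps accumulating in memory.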

Never ran into this before. Could you provide the URL you are trying to run the generator against?

Sure. Thanks for taking a look. http://gbiz.org/California-Yellow-Pages/

The sitemap-generator package needs a more performant way to persist the data. I'm looking into it.

Great!

Please check out the latest release (6.1.1). Sitemaps are now streamed directly to your hard drive instead of being kept in memory, which should solve your issue. If not, feel free to reopen this issue.
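
A minimal sketch of that streaming approach, assuming a crawler that reports discovered URLs one at a time (the function names below are illustrative, not sitemap-generator's actual API):

const fs = require('fs');

// Stream each sitemap entry to disk as it is discovered instead of
// accumulating every URL in memory. Heap usage stays roughly constant
// no matter how many pages the crawl finds.
const out = fs.createWriteStream('sitemap.xml');

out.write('<?xml version="1.0" encoding="UTF-8"?>\n');
out.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n');

// Called once per discovered page. A real implementation would also
// XML-escape the URL before writing it.
function addUrl(loc) {
  out.write('  <url><loc>' + loc + '</loc></url>\n');
}

// Closes the root element and flushes the file.
function finish() {
  out.end('</urlset>\n');
}

addUrl('http://gbiz.org/California-Yellow-Pages/');
finish();

Since each entry is written and then discarded, memory use no longer scales with the number of crawled pages, which is what caused the allocation failure above.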