Quil-C 1.27.0 crashes on circuit with parallel parts
mhodson-rigetti opened this issue · 4 comments
A 36Q circuit targeting Aspen-M-2, consisting of six independent, parallel 6Q circuits, fails to compile on Quil-C 1.27.0 as deployed in the new DockerHub image. A reproduction recipe in notebook form, depending only on pyquil, is provided:
In some cases "docker run" on the container exits to the command line with no messages. In others, the following log is emitted, and we observe high CPU load and steadily increasing memory usage prior to the crash:
<134>1 2023-02-09T03:24:34Z 5a8dc6ecdb0f quilc 1 - - Request 0033288d-dc2a-4bea-ac7e-afbb18673986 received for get_version_info
<134>1 2023-02-09T03:24:34Z 5a8dc6ecdb0f quilc 1 LOG0002 [rigetti@0000 methodName="get_version_info" requestID="0033288d-dc2a-4bea-ac7e-afbb18673986" wallTime="0.157" error="false"] Requested get_version_info completed
<134>1 2023-02-09T03:24:34Z 5a8dc6ecdb0f quilc 1 - - Request 64d69ce1-e3e0-4259-9660-daee2664334c received for quil_to_native_quil
Heap exhausted during garbage collection: 32 bytes available, 48 requested.
Gen Boxed Code Raw LgBox LgCode LgRaw Pin Alloc Waste Trig WP GCs Mem-age
2 48061 0 405 0 0 0 9 1584207120 3926768 623779512 48466 1 1.1910
3 78680 0 1242 6 0 0 67 2615207344 3873360 2000000 79928 0 0.7338
4 0 0 0 0 0 0 0 0 0 2000000 0 0 0.0000
5 0 0 0 0 0 0 0 0 0 2000000 0 0 0.0000
6 1874 7 662 110 0 25 0 85694528 2058176 2000000 2678 0 0.0000
Total bytes allocated = 4285108992
Dynamic-space-size bytes = 4294967296
GC control variables:
*GC-INHIBIT* = true
*GC-PENDING* = true
*STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 1(tid 0x7f4ba886f700):
Heap exhausted, game over.
0: CL-QUIL::ALGEBRAICALLY-REDUCE-INSTRUCTIONS, pc = 0x5240e13f, fp = 0x7f4ba886d980
1: CL-QUIL::COMPRESS-INSTRUCTIONS-IN-CONTEXT, pc = 0x5240f620, fp = 0x7f4ba886da38
2: CL-QUIL::COMPRESS-INSTRUCTIONS-WITH-POSSIBLY-UNKNOWN-PARAMS, pc = 0x5240fb08, fp = 0x7f4ba886dac0
3: (LABELS CL-QUIL::FLUSH-QUEUE :IN CL-QUIL::COMPRESS-INSTRUCTIONS), pc = 0x52322f07, fp = 0x7f4ba886db88
4: (LABELS CL-QUIL::PROCESS-INSTRUCTION :IN CL-QUIL::COMPRESS-INSTRUCTIONS), pc = 0x52322a94, fp = 0x7f4ba886dc50
5: SB-KERNEL::%MAP-FOR-EFFECT-ARITY-1, pc = 0x52857340, fp = 0x7f4ba886dca8
6: CL-QUIL::COMPRESS-INSTRUCTIONS, pc = 0x523223b3, fp = 0x7f4ba886dd80
7: (LABELS CL-QUIL::PROCESS-BLOCK :IN CL-QUIL::COMPILER-HOOK), pc = 0x52256bca, fp = 0x7f4ba886de60
8: CL-QUIL::COMPILER-HOOK, pc = 0x52255bfc, fp = 0x7f4ba886df60
9: QUILC::PROCESS-PROGRAM, pc = 0x524c0467, fp = 0x7f4ba886e070
10: QUILC::QUIL-TO-NATIVE-QUIL-HANDLER, pc = 0x53e33e73, fp = 0x7f4ba886e140
11: (FLET RPCQ::APPLY-HANDLER :IN RPCQ::%PROCESS-REQUEST), pc = 0x5284a2ff, fp = 0x7f4ba886e298
12: RPCQ::%PROCESS-REQUEST, pc = 0x52849a77, fp = 0x7f4ba886e3a0
13: RPCQ::%PROCESS-RAW-REQUEST, pc = 0x524bda0d, fp = 0x7f4ba886e4c8
14: RPCQ::%RPC-SERVER-THREAD-WORKER, pc = 0x5284a798, fp = 0x7f4ba886e5e8
15: (LAMBDA () :IN RPCQ::START-SERVER), pc = 0x5284bbca, fp = 0x7f4ba886e7c0
16: (LAMBDA () :IN BORDEAUX-THREADS::BINDING-DEFAULT-SPECIALS), pc = 0x525d32d0, fp = 0x7f4ba886e838
17: (FLET SB-UNIX::BODY :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE), pc = 0x52a3790d, fp = 0x7f4ba886e940
18: (FLET "WITHOUT-INTERRUPTS-BODY-4" :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE), pc = 0x52a37dcf, fp = 0x7f4ba886ea80
19: (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE), pc = 0x52a37498, fp = 0x7f4ba886ebf0
20: (FLET "WITHOUT-INTERRUPTS-BODY-1" :IN SB-THREAD::CALL-WITH-MUTEX), pc = 0x52872adf, fp = 0x7f4ba886ecb8
21: SB-THREAD::CALL-WITH-MUTEX, pc = 0x52872794, fp = 0x7f4ba886ed60
22: SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE, pc = 0x52a36f78, fp = 0x7f4ba886ee98
23: Foreign function call_into_lisp, pc = 0x43b9bf, fp = 0x7f4ba886eed0
24: Foreign function new_thread_trampoline, pc = 0x423a3b, fp = 0x7f4ba886eef0
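To put the two heap figures from the log side by side, here is a small sketch that parses them and reports how close the process was to its limit. The `parse_heap` helper and the `log_excerpt` variable are illustrative names, not part of Quil-C or SBCL; the two numbers are copied from the log above.

```python
import re

# Two lines copied verbatim from the crash log above.
log_excerpt = """\
Total bytes allocated = 4285108992
Dynamic-space-size bytes = 4294967296
"""


def parse_heap(log: str) -> tuple[int, int]:
    """Return (allocated_bytes, limit_bytes) from an SBCL heap-exhaustion report."""
    allocated = int(re.search(r"Total bytes allocated\s*=\s*(\d+)", log).group(1))
    limit = int(re.search(r"Dynamic-space-size bytes\s*=\s*(\d+)", log).group(1))
    return allocated, limit


allocated, limit = parse_heap(log_excerpt)
headroom = limit - allocated

print(f"heap limit : {limit / 2**30:.2f} GiB")
print(f"allocated  : {allocated / limit:.2%} of the limit")
print(f"headroom   : {headroom / 2**20:.1f} MiB")
```

The limit is exactly 4 GiB and the heap was over 99.7% full when the collector gave up, so the failure is plain heap exhaustion during `ALGEBRAICALLY-REDUCE-INSTRUCTIONS` rather than a small allocation spike.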
I have also attached the dictionary representation of the current compiler ISA, which is also sent to Quil-C, in case the issue depends on the current topology.
The issue does not occur if we roll back to 1.26.0; with the earlier version, compilation completes in a few seconds.
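For readers without the attached notebook, the shape of the circuit described above (six independent 6Q blocks) can be sketched roughly as follows. This is not the original reproduction: the qubit indices 0 to 35, the RY angle, and the choice of CZ as the two-qubit gate are placeholder assumptions and do not reflect the actual Aspen-M-2 topology or the gates in the failing program.

```python
import math


def block(qubits, theta=math.pi / 4):
    """One independent sub-circuit: an RY on each qubit, then a CZ chain.

    The gate choice is an assumption made for illustration only.
    """
    lines = [f"RY({theta}) {q}" for q in qubits]
    lines += [f"CZ {a} {b}" for a, b in zip(qubits, qubits[1:])]
    return lines


def parallel_program(n_blocks=6, block_size=6):
    """Build Quil text for n_blocks sub-circuits that share no qubits."""
    n_qubits = n_blocks * block_size
    lines = [f"DECLARE ro BIT[{n_qubits}]"]
    for b in range(n_blocks):
        # Placeholder contiguous indices; real device indices will differ.
        qubits = list(range(b * block_size, (b + 1) * block_size))
        lines += block(qubits)
    lines += [f"MEASURE {q} ro[{q}]" for q in range(n_qubits)]
    return "\n".join(lines)


quil_text = parallel_program()
print(quil_text.splitlines()[1])  # first gate of the first block
```

No gate spans two blocks, so the six sub-circuits are fully independent. Text of this kind could presumably be wrapped in a pyquil `Program` and sent to the compiler, which is the step that issues the `quil_to_native_quil` request seen in the log.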
Possibly related to #860 (as a reminder to myself)
@mhodson-rigetti A workaround is probably increasing the heap limit when building QUILC. But we can still look into it.
@stylewarning I watched process memory sail past 3.2GB. I can't imagine why it would need that much heap and more than five minutes to compile a circuit that is native bar the RY decompositions. 1.26.0 did it in seconds.
@mhodson-rigetti I don't think it should need that much memory either, of course.