mit-plv/fiat-crypto

Bedrock/End2End/RupicolaCrypto/Derive.v is too-memory heavy

JasonGross opened this issue · 5 comments

This is preventing us from being tested on Coq's CI, see coq/coq#16801. Last I checked, it takes about a minute to compile and about 4 GB of RAM (4384020 ko, as of 70f832b). This is up from 20s and about 1GB of RAM (1169688 ko as of 42cace1)

@andres-erbsen @DIJamner can one of you take care of this?

My guess would be around https://github.com/mit-plv/fiat-crypto/pull/1482/files#diff-076269a2427c8322a666f8b6d8d554d51d53706a7785ff1d7873c6282e30355bR173 for the cause. As for what to do about it, idk. In GarageDoor program-logic proofs, sprinkling Optimize Proof and Optimize Heap helped quite a bit...

The timing and memory log is

.607830900s +397672kb: Chars 0 - 32 [Require~Import~Rupicola.Lib.Api.] 0.494 secs (0.231u,0.193s)
.016281200s +13060kb: Chars 33 - 68 [Require~Import~Rupicola.Lib.Ar...] 0.008 secs (0.008u,0.s)
.004501300s +2308kb: Chars 69 - 103 [Require~Import~Rupicola.Lib.Lo...] 0. secs (0.u,0.s)
.004524400s +1844kb: Chars 104 - 139 [Require~Import~Rupicola.Lib.Ge...] 0. secs (0.u,0.s)
.004777000s +2048kb: Chars 140 - 185 [Require~Import~coqutil.Word.Li...] 0. secs (0.u,0.s)
2.400876900s +649160kb: Chars 186 - 243 [Require~Import~Crypto.Bedrock....] 2.42 secs (2.272u,0.147s)
.005078100s +0kb: Chars 244 - 286 [Require~Import~bedrock2.BasicC...] 0. secs (0.u,0.s)
.005360700s +0kb: Chars 287 - 350 [Require~Import~Crypto.Bedrock....] 0. secs (0.u,0.s)
.005389200s +0kb: Chars 352 - 369 [Section~Bedrock2.] 0. secs (0.u,0.s)
.004914000s +0kb: Chars 373 - 682 [Instance~spec_of_quarter~:~(sp...] 0.005 secs (0.005u,0.s)
.005121700s +0kb: Chars 686 - 876 [Derive~<genarg:identref>~SuchT...] 0.003 secs (0.003u,0.s)
.007705600s +0kb: Chars 879 - 885 [Proof.] 0. secs (0.u,0.s)
11.323829500s +65660kb: Chars 890 - 898 [compile.] 11.346 secs (11.333u,0.012s)
1.365369600s +8192kb: Chars 901 - 905 [Qed.] 1.308 secs (1.308u,0.s)
.010491100s +0kb: Chars 909 - 2008 [Lemma~compile_quarter~:~~~fora...] 0.009 secs (0.009u,0.s)
.005392300s +0kb: Chars 2011 - 2017 [Proof.] 0. secs (0.u,0.s)
.014554100s +0kb: Chars 2022 - 2042 [(repeat~straightline).] 0.019 secs (0.019u,0.s)
.048286300s +0kb: Chars 2047 - 2078 [(repeat~(eexists;~split;~eauto)).] 0.047 secs (0.047u,0.s)
.218557700s +0kb: Chars 2083 - 2102 [(handle_call;~eauto).] 0.218 secs (0.218u,0.s)
.048286600s +2048kb: Chars 2105 - 2109 [Qed.] 0.043 secs (0.043u,0.s)
.011697300s +0kb: Chars 2116 - 2920 [Instance~spec_of_chacha20_bloc...] 0.011 secs (0.011u,0.s)
.004657600s +0kb: Chars 2926 - 3176 [Derive~<genarg:identref>~SuchT...] 0.003 secs (0.003u,0.s)
.005260500s +0kb: Chars 3179 - 3185 [Proof.] 0. secs (0.u,0.s)
.060369600s +0kb: Chars 3190 - 3204 [compile_setup.] 0.065 secs (0.065u,0.s)
.005912400s +0kb: Chars 3209 - 3235 [(cbn[chacha20_block_init]).] 0.005 secs (0.005u,0.s)
.589596700s +8192kb: Chars 3240 - 3260 [(repeat~compile_step).] 0.589 secs (0.58u,0.009s)
.018440500s +0kb: Chars 3266 - 3304 [(simple~eapply~compile_nlet_as...] 0.018 secs (0.018u,0.s)
.702447400s +4096kb: Chars 3309 - 3366 [(simple~eapply~compile_buf_bac...] 0.702 secs (0.701u,0.s)
.007204800s +0kb: Chars 3371 - 3405 [instantiate~(~1~:=~(Naive.wrap...] 0. secs (0.u,0.s)
.007387400s +0kb: Chars 3410 - 3423 [(simpl;~eauto).] 0.013 secs (0.013u,0.s)
.026596600s +0kb: Chars 3429 - 3467 [(simple~eapply~compile_nlet_as...] 0.025 secs (0.025u,0.s)
.957970900s +6144kb: Chars 3472 - 3524 [(eapply~compile_buf_push_word3...] 0.957 secs (0.938u,0.019s)
.005490100s +0kb: Chars 3529 - 3580 [(subst~v;~cbn[buf_backed_by~Da...] 0.004 secs (0.004u,0.s)
.063464700s +0kb: Chars 3585 - 3629 [(subst~v;~unfold~buf_backed_by...] 0.063 secs (0.063u,0.s)
.020657400s +0kb: Chars 3635 - 3673 [(simple~eapply~compile_nlet_as...] 0.02 secs (0.02u,0.s)
1.016170000s +4096kb: Chars 3678 - 3730 [(eapply~compile_buf_push_word3...] 1.015 secs (1.015u,0.s)
.006738100s +0kb: Chars 3735 - 3765 [(subst~v;~unfold~buf_backed_by).] 0. secs (0.u,0.s)
.005752700s +0kb: Chars 3766 - 3782 [(unfold~buf_push).] 0. secs (0.u,0.s)
.009526000s +0kb: Chars 3783 - 3789 [eauto.] 0.002 secs (0.002u,0.s)
.005258800s +0kb: Chars 3790 - 3796 [(simpl).] 0. secs (0.u,0.s)
.006234500s +0kb: Chars 3797 - 3801 [lia.] 0.005 secs (0.005u,0.s)
.006195300s +0kb: Chars 3806 - 3836 [(subst~v;~unfold~buf_backed_by).] 0. secs (0.u,0.s)
.005234300s +0kb: Chars 3837 - 3853 [(unfold~buf_push).] 0. secs (0.u,0.s)
.007817500s +2048kb: Chars 3858 - 3864 [admit.] 0. secs (0.u,0.s)
.006488000s +0kb: Chars 3892 - 3930 [(simple~eapply~compile_nlet_as...] 0.027 secs (0.027u,0.s)
1.110026300s +4336kb: Chars 3935 - 3987 [(eapply~compile_buf_push_word3...] 1.13 secs (1.13u,0.s)
.006273400s +0kb: Chars 3992 - 3993 [{] 0. secs (0.u,0.s)
.006338900s +0kb: Chars 3994 - 4024 [(subst~v;~unfold~buf_backed_by).] 0. secs (0.u,0.s)
.005443500s +0kb: Chars 4025 - 4041 [(unfold~buf_push).] 0. secs (0.u,0.s)
.007070300s +0kb: Chars 4042 - 4048 [eauto.] 0.002 secs (0.002u,0.s)
.007558600s +0kb: Chars 4049 - 4055 [(simpl).] 0. secs (0.u,0.s)
.007202400s +0kb: Chars 4056 - 4060 [lia.] 0.006 secs (0.006u,0.s)
.005855500s +0kb: Chars 4061 - 4062 [}] 0. secs (0.u,0.s)
.005674600s +0kb: Chars 4067 - 4068 [{] 0. secs (0.u,0.s)
.006065600s +0kb: Chars 4069 - 4099 [(subst~v;~unfold~buf_backed_by).] 0. secs (0.u,0.s)
.007174800s +0kb: Chars 4100 - 4116 [(unfold~buf_push).] 0. secs (0.u,0.s)
.007685400s +0kb: Chars 4117 - 4140 [admit.] 0. secs (0.u,0.s)
.006266100s +0kb: Chars 4141 - 4142 [}] 0. secs (0.u,0.s)
.006820900s +0kb: Chars 4148 - 4186 [(simple~eapply~compile_nlet_as...] 0.029 secs (0.029u,0.s)
1.191647400s +6144kb: Chars 4191 - 4243 [(eapply~compile_buf_push_word3...] 1.234 secs (1.234u,0.s)
.006248500s +0kb: Chars 4248 - 4249 [{] 0. secs (0.u,0.s)
.004746000s +0kb: Chars 4250 - 4280 [(subst~v;~unfold~buf_backed_by).] 0. secs (0.u,0.s)
.004976300s +0kb: Chars 4281 - 4297 [(unfold~buf_push).] 0. secs (0.u,0.s)
.006189200s +0kb: Chars 4298 - 4304 [eauto.] 0.003 secs (0.003u,0.s)
.006485400s +0kb: Chars 4305 - 4311 [(simpl).] 0. secs (0.u,0.s)
.005110700s +0kb: Chars 4312 - 4316 [lia.] 0.005 secs (0.005u,0.s)
.005710900s +0kb: Chars 4317 - 4318 [}] 0. secs (0.u,0.s)
.006370900s +0kb: Chars 4323 - 4324 [{] 0. secs (0.u,0.s)
.005590400s +0kb: Chars 4325 - 4355 [(subst~v;~unfold~buf_backed_by).] 0. secs (0.u,0.s)
.007016900s +0kb: Chars 4356 - 4372 [(unfold~buf_push).] 0. secs (0.u,0.s)
.005874000s +0kb: Chars 4373 - 4397 [admit.] 0. secs (0.u,0.s)
.005873400s +0kb: Chars 4398 - 4399 [}] 0. secs (0.u,0.s)
.005920500s +0kb: Chars 4405 - 4443 [(simple~eapply~compile_nlet_as...] 0.019 secs (0.019u,0.s)
.206831300s +2048kb: Chars 4448 - 4510 [(simple~eapply~compile_w32s_of...] 0.25 secs (0.25u,0.s)
.005520600s +0kb: Chars 4516 - 4529 [compile_step.] 0.001 secs (0.001u,0.s)
.919506400s +4096kb: Chars 4534 - 4574 [(compile_step;~[~repeat~compil...] 0.922 secs (0.922u,0.s)
.005245400s +0kb: Chars 4580 - 4623 [(eapply~expr_compile_Z_literal...] 0. secs (0.u,0.s)
.005369000s +0kb: Chars 4628 - 4635 [shelve.] 0. secs (0.u,0.s)
.005087900s +0kb: Chars 4640 - 4653 [compile_step.] 0.001 secs (0.001u,0.s)
.016631400s +0kb: Chars 4833 - 4867 [(simple~eapply~compile_word_me...] 0.029 secs (0.029u,0.s)
.005655300s +0kb: Chars 4872 - 4903 [shelve.] 0. secs (0.u,0.s)
.109404900s +2048kb: Chars 4908 - 4928 [(repeat~compile_step).] 0.114 secs (0.114u,0.s)
.005093600s +0kb: Chars 4933 - 4934 [{] 0. secs (0.u,0.s)
.005694200s +0kb: Chars 4941 - 4957 [(unfold~buf_push).] 0. secs (0.u,0.s)
.005658200s +0kb: Chars 4964 - 4984 [(rewrite~!app_length).] 0.015 secs (0.015u,0.s)
.005092700s +0kb: Chars 4991 - 5012 [(unfold~buf_backed_by).] 0. secs (0.u,0.s)
.004838000s +0kb: Chars 5019 - 5040 [(cbn[length~Nat.add]).] 0.001 secs (0.001u,0.s)
.008032400s +0kb: Chars 5047 - 5109 [replace~(Datatypes.length~v3)~...] 0.003 secs (0.003u,0.s)
.005159600s +0kb: Chars 5116 - 5122 [admit.] 0. secs (0.u,0.s)
.005325100s +0kb: Chars 5127 - 5128 [}] 0. secs (0.u,0.s)
.007161000s +0kb: Chars 5133 - 5139 [admit.] 0. secs (0.u,0.s)
.004567400s +0kb: Chars 5144 - 5150 [admit.] 0. secs (0.u,0.s)
.006246400s +0kb: Chars 5155 - 5161 [admit.] 0. secs (0.u,0.s)
.006705900s +0kb: Chars 5166 - 5172 [admit.] 0. secs (0.u,0.s)
.004929600s +0kb: Chars 5177 - 5183 [admit.] 0. secs (0.u,0.s)
.006487000s +0kb: Chars 5188 - 5201 [compile_step.] 0. secs (0.u,0.s)
.005553600s +0kb: Chars 5206 - 5233 [(eapply~compile_skip_marker).] 0.018 secs (0.018u,0.s)
.005238100s +0kb: Chars 5238 - 5276 [(simple~eapply~compile_nlet_as...] 0.03 secs (0.03u,0.s)
1.070070700s +6144kb: Chars 5281 - 5331 [(eapply~compile_bytes_of_w32s;...] 1.087 secs (1.087u,0.s)
.005480700s +0kb: Chars 5336 - 5342 [admit.] 0. secs (0.u,0.s)
.014652500s +0kb: Chars 5347 - 5385 [(simple~eapply~compile_nlet_as...] 0.019 secs (0.019u,0.s)
3.106644600s +14292kb: Chars 5390 - 5440 [(eapply~compile_w32s_of_bytes;...] 3.106 secs (3.096u,0.01s)
.005278700s +0kb: Chars 5445 - 5451 [admit.] 0. secs (0.u,0.s)
.005132600s +0kb: Chars 5456 - 5462 [admit.] 0. secs (0.u,0.s)
.005236300s +0kb: Chars 5467 - 5473 [admit.] 0. secs (0.u,0.s)
.005427600s +0kb: Chars 5478 - 5484 [admit.] 0. secs (0.u,0.s)
.004970200s +0kb: Chars 5489 - 5516 [(eapply~compile_skip_marker).] 0.016 secs (0.016u,0.s)
.019295400s +0kb: Chars 5521 - 5559 [(simple~eapply~compile_nlet_as...] 0.028 secs (0.028u,0.s)
1.244258800s +6144kb: Chars 5564 - 5614 [(eapply~compile_bytes_of_w32s;...] 1.243 secs (1.243u,0.s)
.005586900s +0kb: Chars 5619 - 5625 [admit.] 0. secs (0.u,0.s)
.015259100s +0kb: Chars 5630 - 5668 [(simple~eapply~compile_nlet_as...] 0.02 secs (0.02u,0.s)
1.228919700s +6144kb: Chars 5673 - 5722 [(eapply~compile_buf_as_array;~...] 1.228 secs (1.228u,0.s)
.004897800s +0kb: Chars 5727 - 5733 [admit.] 0. secs (0.u,0.s)
.005016300s +0kb: Chars 5738 - 5744 [admit.] 0. secs (0.u,0.s)
.009262100s +0kb: Chars 5749 - 5787 [(simple~eapply~compile_nlet_as...] 0.018 secs (0.018u,0.s)
3.370538700s +12288kb: Chars 5792 - 5857 [(eapply~compile_buf_make_stack...] 3.369 secs (3.369u,0.s)
.026147600s +0kb: Chars 5862 - 5904 [(eapply~compile_buf_as_array;~...] 0.025 secs (0.025u,0.s)
.019078800s +0kb: Chars 5909 - 5936 [(eapply~compile_skip_marker).] 0.018 secs (0.018u,0.s)
.027620900s +0kb: Chars 5946 - 5984 [(simple~eapply~compile_nlet_as...] 0.027 secs (0.027u,0.s)
.028160900s +0kb: Chars 5989 - 6031 [(eapply~compile_buf_as_array;~...] 0.027 secs (0.027u,0.s)
1.030960900s +18432kb: Chars 6037 - 6219 [(repeat~~~lazymatch~goal~with~...] 1.03 secs (1.03u,0.s)
.025625200s +0kb: Chars 6224 - 6255 [(eapply~compile_nlet_as_nlet_eq).] 0.025 secs (0.025u,0.s)
.040247700s +0kb: Chars 6260 - 6316 [(eapply~compile_word_unsizedli...] 0.039 secs (0.039u,0.s)
.031640800s +0kb: Chars 6321 - 6352 [(eapply~compile_nlet_as_nlet_eq).] 0.031 secs (0.031u,0.s)
.041896400s +0kb: Chars 6357 - 6413 [(eapply~compile_word_unsizedli...] 0.041 secs (0.041u,0.s)
.035976500s +4096kb: Chars 6418 - 6449 [(eapply~compile_nlet_as_nlet_eq).] 0.035 secs (0.025u,0.009s)
.043245800s +0kb: Chars 6454 - 6510 [(eapply~compile_word_unsizedli...] 0.042 secs (0.042u,0.s)
.030644900s +0kb: Chars 6515 - 6546 [(eapply~compile_nlet_as_nlet_eq).] 0.03 secs (0.03u,0.s)
.049679100s +0kb: Chars 6551 - 6607 [(eapply~compile_word_unsizedli...] 0.049 secs (0.049u,0.s)
.030052600s +0kb: Chars 6612 - 6643 [(eapply~compile_nlet_as_nlet_eq).] 0.029 secs (0.029u,0.s)
.056636200s +6144kb: Chars 6648 - 6704 [(eapply~compile_word_unsizedli...] 0.056 secs (0.046u,0.009s)
.030334700s +0kb: Chars 6709 - 6740 [(eapply~compile_nlet_as_nlet_eq).] 0.029 secs (0.029u,0.s)
.039940200s +0kb: Chars 6745 - 6801 [(eapply~compile_word_unsizedli...] 0.039 secs (0.039u,0.s)
.041550200s +0kb: Chars 6806 - 6837 [(eapply~compile_nlet_as_nlet_eq).] 0.041 secs (0.041u,0.s)
.041452100s +0kb: Chars 6842 - 6898 [(eapply~compile_word_unsizedli...] 0.04 secs (0.04u,0.s)
.031009700s +0kb: Chars 6903 - 6934 [(eapply~compile_nlet_as_nlet_eq).] 0.03 secs (0.03u,0.s)
.050985800s +6144kb: Chars 6939 - 6995 [(eapply~compile_word_unsizedli...] 0.05 secs (0.05u,0.s)
.030098100s +0kb: Chars 7000 - 7031 [(eapply~compile_nlet_as_nlet_eq).] 0.029 secs (0.029u,0.s)
.044107100s +0kb: Chars 7036 - 7092 [(eapply~compile_word_unsizedli...] 0.043 secs (0.043u,0.s)
.047688700s +0kb: Chars 7097 - 7128 [(eapply~compile_nlet_as_nlet_eq).] 0.046 secs (0.046u,0.s)
.054298700s +0kb: Chars 7133 - 7189 [(eapply~compile_word_unsizedli...] 0.054 secs (0.054u,0.s)
.032654100s +0kb: Chars 7194 - 7225 [(eapply~compile_nlet_as_nlet_eq).] 0.032 secs (0.032u,0.s)
.053028600s +6144kb: Chars 7230 - 7286 [(eapply~compile_word_unsizedli...] 0.052 secs (0.052u,0.s)
.033668100s +0kb: Chars 7291 - 7322 [(eapply~compile_nlet_as_nlet_eq).] 0.033 secs (0.033u,0.s)
.046297800s +0kb: Chars 7327 - 7383 [(eapply~compile_word_unsizedli...] 0.045 secs (0.045u,0.s)
.037655400s +0kb: Chars 7388 - 7419 [(eapply~compile_nlet_as_nlet_eq).] 0.037 secs (0.037u,0.s)
.042930600s +0kb: Chars 7424 - 7480 [(eapply~compile_word_unsizedli...] 0.042 secs (0.042u,0.s)
.027067300s +0kb: Chars 7485 - 7516 [(eapply~compile_nlet_as_nlet_eq).] 0.026 secs (0.026u,0.s)
.048586700s +6260kb: Chars 7521 - 7577 [(eapply~compile_word_unsizedli...] 0.048 secs (0.048u,0.s)
.031047300s +0kb: Chars 7582 - 7613 [(eapply~compile_nlet_as_nlet_eq).] 0.03 secs (0.03u,0.s)
.046948800s +0kb: Chars 7618 - 7674 [(eapply~compile_word_unsizedli...] 0.046 secs (0.046u,0.s)
.031328800s +0kb: Chars 7679 - 7710 [(eapply~compile_nlet_as_nlet_eq).] 0.031 secs (0.031u,0.s)
.038865900s +0kb: Chars 7715 - 7771 [(eapply~compile_word_unsizedli...] 0.038 secs (0.038u,0.s)
.005893900s +0kb: Chars 7776 - 7789 [compile_step.] 0. secs (0.u,0.s)
.096731800s +4096kb: Chars 7795 - 7846 [(rewrite~Nat_iter_as_nd_ranged...] 0.101 secs (0.101u,0.s)
.016560700s +0kb: Chars 7851 - 7886 [(change~(0~+~Z.of_nat~10)~with...] 0.016 secs (0.016u,0.s)
.028282600s +0kb: Chars 7896 - 7934 [(simple~eapply~compile_nlet_as...] 0.028 secs (0.028u,0.s)
.091941300s +4096kb: Chars 7939 - 7978 [(eapply~compile_nd_ranged_for_...] 0.091 secs (0.091u,0.s)
.005426700s +0kb: Chars 7983 - 7990 [shelve.] 0. secs (0.u,0.s)
.004822500s +0kb: Chars 7995 - 8002 [shelve.] 0. secs (0.u,0.s)
.005529600s +0kb: Chars 8007 - 8014 [shelve.] 0. secs (0.u,0.s)
.005714900s +0kb: Chars 8019 - 8026 [shelve.] 0. secs (0.u,0.s)
.005490400s +0kb: Chars 8031 - 8038 [shelve.] 0. secs (0.u,0.s)
.005816800s +0kb: Chars 8043 - 8050 [shelve.] 0. secs (0.u,0.s)
.005335700s +0kb: Chars 8056 - 8057 [{] 0. secs (0.u,0.s)
.005005500s +0kb: Chars 8064 - 8077 [compile_step.] 0.004 secs (0.004u,0.s)
.005268900s +0kb: Chars 8084 - 8122 [(simple~eapply~compile_nlet_as...] 0.009 secs (0.009u,0.s)
.140791800s +2048kb: Chars 8129 - 8203 [(simple~eapply~compile_quarter...] 0.173 secs (0.173u,0.s)
.009321100s +0kb: Chars 8210 - 8248 [(simple~eapply~compile_nlet_as...] 0.008 secs (0.008u,0.s)
.184521000s +2048kb: Chars 8255 - 8329 [(simple~eapply~compile_quarter...] 0.183 secs (0.183u,0.s)
.010549900s +0kb: Chars 8336 - 8374 [(simple~eapply~compile_nlet_as...] 0.01 secs (0.01u,0.s)
.205846400s +0kb: Chars 8381 - 8455 [(simple~eapply~compile_quarter...] 0.205 secs (0.205u,0.s)
.009481100s +0kb: Chars 8462 - 8500 [(simple~eapply~compile_nlet_as...] 0.009 secs (0.009u,0.s)
.217535300s +2048kb: Chars 8507 - 8581 [(simple~eapply~compile_quarter...] 0.216 secs (0.216u,0.s)
.010149800s +0kb: Chars 8588 - 8626 [(simple~eapply~compile_nlet_as...] 0.009 secs (0.009u,0.s)
.223725100s +2048kb: Chars 8633 - 8707 [(simple~eapply~compile_quarter...] 0.223 secs (0.223u,0.s)
.011027000s +0kb: Chars 8714 - 8752 [(simple~eapply~compile_nlet_as...] 0.01 secs (0.01u,0.s)
.250134700s +2048kb: Chars 8759 - 8833 [(simple~eapply~compile_quarter...] 0.249 secs (0.249u,0.s)
.008515300s +0kb: Chars 8840 - 8878 [(simple~eapply~compile_nlet_as...] 0.008 secs (0.008u,0.s)
.275955300s +0kb: Chars 8885 - 8959 [(simple~eapply~compile_quarter...] 0.275 secs (0.275u,0.s)
.010253900s +0kb: Chars 8966 - 9004 [(simple~eapply~compile_nlet_as...] 0.009 secs (0.009u,0.s)
.288412300s +2048kb: Chars 9011 - 9085 [(simple~eapply~compile_quarter...] 0.288 secs (0.288u,0.s)
.005070300s +0kb: Chars 9092 - 9098 [admit.] 0. secs (0.u,0.s)
.005606600s +0kb: Chars 9103 - 9104 [}] 0. secs (0.u,0.s)
.005112900s +0kb: Chars 9109 - 9122 [compile_step.] 0. secs (0.u,0.s)
1.592484000s +6144kb: Chars 9132 - 9417 [(repeat~~~lazymatch~goal~with~...] 1.606 secs (1.575u,0.03s)
.009540200s +0kb: Chars 9423 - 9454 [(eapply~compile_nlet_as_nlet_eq).] 0.008 secs (0.008u,0.s)
.127917800s +0kb: Chars 9461 - 9830 [(lazymatch~goal~with~~|~|-~Wea...] 0.126 secs (0.126u,0.s)
.008674600s +0kb: Chars 9837 - 9838 [{] 0. secs (0.u,0.s)
.015578000s +0kb: Chars 9847 - 9873 [(eapply~broadcast_word_add).] 0.024 secs (0.024u,0.s)
.008924200s +0kb: Chars 9882 - 9889 [shelve.] 0. secs (0.u,0.s)
.019810300s +0kb: Chars 9898 - 9926 [(eapply~broadcast_id;~shelve).] 0.027 secs (0.027u,0.s)
.008375600s +0kb: Chars 9935 - 9995 [admit.] 0. secs (0.u,0.s)
.009499600s +0kb: Chars 10002 - 10003 [}] 0. secs (0.u,0.s)
.008678100s +0kb: Chars 10010 - 10023 [compile_step.] 0.001 secs (0.001u,0.s)
.008295800s +0kb: Chars 10028 - 10059 [(eapply~compile_nlet_as_nlet_eq).] 0.008 secs (0.008u,0.s)
.008136100s +0kb: Chars 10066 - 10109 [(eapply~compile_bytes_of_w32s;...] 0.007 secs (0.007u,0.s)
.008069300s +0kb: Chars 10116 - 10129 [compile_step.] 0. secs (0.u,0.s)
.008791100s +0kb: Chars 10136 - 10167 [(eapply~compile_nlet_as_nlet_eq).] 0.008 secs (0.008u,0.s)
.008999000s +0kb: Chars 10174 - 10216 [(eapply~compile_buf_as_array;~...] 0.004 secs (0.004u,0.s)
42.943439400s +2674652kb: Chars 10223 - 10236 [compile_step.] 42.978 secs (42.148u,0.829s)
.004786100s +0kb: Chars 10246 - 10252 [Abort.] 0. secs (0.u,0.s)
.004552800s -3978748kb: Chars 10256 - 10269 [End~Bedrock2.] 0.005 secs (0.005u,0.s)

Sorted with sed s'/^\([^ ]*\) \(.\)\([^k]*\)\(kb\)/\3 (\2) \4 \1/g' | sort -h gives

1844 (+) kb .004524400s: Chars 104 - 139 [Require~Import~Rupicola.Lib.Ge...] 0. secs (0.u,0.s)
2048 (+) kb .004777000s: Chars 140 - 185 [Require~Import~coqutil.Word.Li...] 0. secs (0.u,0.s)
2048 (+) kb .007817500s: Chars 3858 - 3864 [admit.] 0. secs (0.u,0.s)
2048 (+) kb .048286600s: Chars 2105 - 2109 [Qed.] 0.043 secs (0.043u,0.s)
2048 (+) kb .109404900s: Chars 4908 - 4928 [(repeat~compile_step).] 0.114 secs (0.114u,0.s)
2048 (+) kb .140791800s: Chars 8129 - 8203 [(simple~eapply~compile_quarter...] 0.173 secs (0.173u,0.s)
2048 (+) kb .184521000s: Chars 8255 - 8329 [(simple~eapply~compile_quarter...] 0.183 secs (0.183u,0.s)
2048 (+) kb .206831300s: Chars 4448 - 4510 [(simple~eapply~compile_w32s_of...] 0.25 secs (0.25u,0.s)
2048 (+) kb .217535300s: Chars 8507 - 8581 [(simple~eapply~compile_quarter...] 0.216 secs (0.216u,0.s)
2048 (+) kb .223725100s: Chars 8633 - 8707 [(simple~eapply~compile_quarter...] 0.223 secs (0.223u,0.s)
2048 (+) kb .250134700s: Chars 8759 - 8833 [(simple~eapply~compile_quarter...] 0.249 secs (0.249u,0.s)
2048 (+) kb .288412300s: Chars 9011 - 9085 [(simple~eapply~compile_quarter...] 0.288 secs (0.288u,0.s)
2308 (+) kb .004501300s: Chars 69 - 103 [Require~Import~Rupicola.Lib.Lo...] 0. secs (0.u,0.s)
4096 (+) kb .035976500s: Chars 6418 - 6449 [(eapply~compile_nlet_as_nlet_eq).] 0.035 secs (0.025u,0.009s)
4096 (+) kb .091941300s: Chars 7939 - 7978 [(eapply~compile_nd_ranged_for_...] 0.091 secs (0.091u,0.s)
4096 (+) kb .096731800s: Chars 7795 - 7846 [(rewrite~Nat_iter_as_nd_ranged...] 0.101 secs (0.101u,0.s)
4096 (+) kb .702447400s: Chars 3309 - 3366 [(simple~eapply~compile_buf_bac...] 0.702 secs (0.701u,0.s)
4096 (+) kb .919506400s: Chars 4534 - 4574 [(compile_step;~[~repeat~compil...] 0.922 secs (0.922u,0.s)
4096 (+) kb 1.016170000s: Chars 3678 - 3730 [(eapply~compile_buf_push_word3...] 1.015 secs (1.015u,0.s)
4336 (+) kb 1.110026300s: Chars 3935 - 3987 [(eapply~compile_buf_push_word3...] 1.13 secs (1.13u,0.s)
6144 (+) kb .050985800s: Chars 6939 - 6995 [(eapply~compile_word_unsizedli...] 0.05 secs (0.05u,0.s)
6144 (+) kb .053028600s: Chars 7230 - 7286 [(eapply~compile_word_unsizedli...] 0.052 secs (0.052u,0.s)
6144 (+) kb .056636200s: Chars 6648 - 6704 [(eapply~compile_word_unsizedli...] 0.056 secs (0.046u,0.009s)
6144 (+) kb .957970900s: Chars 3472 - 3524 [(eapply~compile_buf_push_word3...] 0.957 secs (0.938u,0.019s)
6144 (+) kb 1.070070700s: Chars 5281 - 5331 [(eapply~compile_bytes_of_w32s;...] 1.087 secs (1.087u,0.s)
6144 (+) kb 1.191647400s: Chars 4191 - 4243 [(eapply~compile_buf_push_word3...] 1.234 secs (1.234u,0.s)
6144 (+) kb 1.228919700s: Chars 5673 - 5722 [(eapply~compile_buf_as_array;~...] 1.228 secs (1.228u,0.s)
6144 (+) kb 1.244258800s: Chars 5564 - 5614 [(eapply~compile_bytes_of_w32s;...] 1.243 secs (1.243u,0.s)
6144 (+) kb 1.592484000s: Chars 9132 - 9417 [(repeat~~~lazymatch~goal~with~...] 1.606 secs (1.575u,0.03s)
6260 (+) kb .048586700s: Chars 7521 - 7577 [(eapply~compile_word_unsizedli...] 0.048 secs (0.048u,0.s)
8192 (+) kb .589596700s: Chars 3240 - 3260 [(repeat~compile_step).] 0.589 secs (0.58u,0.009s)
8192 (+) kb 1.365369600s: Chars 901 - 905 [Qed.] 1.308 secs (1.308u,0.s)
12288 (+) kb 3.370538700s: Chars 5792 - 5857 [(eapply~compile_buf_make_stack...] 3.369 secs (3.369u,0.s)
13060 (+) kb .016281200s: Chars 33 - 68 [Require~Import~Rupicola.Lib.Ar...] 0.008 secs (0.008u,0.s)
14292 (+) kb 3.106644600s: Chars 5390 - 5440 [(eapply~compile_w32s_of_bytes;...] 3.106 secs (3.096u,0.01s)
18432 (+) kb 1.030960900s: Chars 6037 - 6219 [(repeat~~~lazymatch~goal~with~...] 1.03 secs (1.03u,0.s)
65660 (+) kb 11.323829500s: Chars 890 - 898 [compile.] 11.346 secs (11.333u,0.012s)
397672 (+) kb .607830900s: Chars 0 - 32 [Require~Import~Rupicola.Lib.Api.] 0.494 secs (0.231u,0.193s)
649160 (+) kb 2.400876900s: Chars 186 - 243 [Require~Import~Crypto.Bedrock....] 2.42 secs (2.272u,0.147s)
2674652 (+) kb 42.943439400s: Chars 10223 - 10236 [compile_step.] 42.978 secs (42.148u,0.829s)
3978748 (-) kb .004552800s: Chars 10256 - 10269 [End~Bedrock2.] 0.005 secs (0.005u,0.s)

So it looks like we should insert Optimize Proof. Optimize Heap. around compile_step and compile. But what is actually going on with a single compile_step taking 42 seconds? Can that be improved?

Alternatively, that compile_step is right before an Abort, maybe it should just be dropped?

I did a bit of profiling, and maybe we can reduce memory consumption a bit on this example. See coq/coq#16809.

I'm planning to merge #1533 once it passes the CI, which should hopefully improve the situation here. I haven't done much in the way of profiling, so my subsequent additions might have degraded the performance in other areas, but the behavior of the last compile_step shouldn't be a problem.