Optimize build times
kvark opened this issue · 24 comments
Removing `failure` and `derivative` should be a solid step in this direction. See gfx-rs/gfx#2970 and also #198.
As already said in #210, removing `derivative` does not seem to have any major effect on build times. I've just tested how `failure` affects build times by removing it and replacing the `failure::Error`s with basic appropriate structs/enums. Given how sparsely it is actually used, it didn't seem to have any major effect on build times either, resulting in a 24.29s build time for just the rendy crates and 1m 04s for a full build, compared to the previous 24.53s and 1m 08s. The 4-second cut is most likely due to `failure` no longer being a dependency anywhere in the project thanks to the removal. (All the timings were taken with possible background noise, though, which is why I had one rendy-crates-only build at just 20s.)
Nevertheless, I'd say we should remove `failure`, given that it seems to have been used just as a convenient error wrapper for strings. One or two rendy crates even have it as a leftover dependency without using it in the source, `rendy-frame` for example.
So if the consensus is to remove `failure`, I'll be happy to clean up my test changes and make a PR for it.
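For illustration, replacing a string-wrapping `failure::Error` with a basic enum could look roughly like the sketch below. The type name `UploadError` and its variants are hypothetical and not taken from rendy; the point is only that `std::error::Error`, `Display` and `From` cover what a string-carrying error needs without any derive macro.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type; the name and variants are illustrative, not rendy's.
#[derive(Debug)]
pub enum UploadError {
    /// A plain message, the kind of thing previously wrapped in a `failure::Error`.
    Message(String),
    /// An underlying I/O error, exposed through `source()`.
    Io(std::io::Error),
}

impl fmt::Display for UploadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UploadError::Message(msg) => write!(f, "upload failed: {}", msg),
            UploadError::Io(err) => write!(f, "upload I/O error: {}", err),
        }
    }
}

impl Error for UploadError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            UploadError::Message(_) => None,
            UploadError::Io(err) => Some(err),
        }
    }
}

// Lets `?` convert I/O errors automatically.
impl From<std::io::Error> for UploadError {
    fn from(err: std::io::Error) -> Self {
        UploadError::Io(err)
    }
}
```

This needs only the standard library, so neither `failure` nor `failure_derive` (and the `syn` copy behind it) has to be compiled.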
#198 has largely removed `failure`; there are just a few places left before it can be expelled from the Cargo.toml. That, I think, explains why you aren't seeing much gain from finishing this work.
Overall, it would be great to see your PR doing this. Derivative can stay, for now :)
Edit: Reorganized data, added relative build time.
Edit 2: Measured with --all-features
Build Time: 10m49.902s (single thread on Intel i3-5010U).
And here are all the crates that take more than 5s:
% of Build Time | Build Time | Crate |
---|---|---|
7.53% | 48.90s | serde_derive |
5.56% | 36.09s | syn |
5.02% | 32.59s | winit |
4.98% | 32.36s | syn |
4.76% | 30.92s | derivative |
4.58% | 29.75s | wayland_protocols |
3.91% | 25.44s | smithay_client_toolkit |
3.88% | 25.21s | syn |
2.70% | 17.53s | synstructure |
2.38% | 15.47s | wayland_client |
2.32% | 15.10s | gfx_hal |
2.21% | 14.40s | wayland_sys |
2.02% | 13.13s | serde |
1.94% | 12.60s | x11_dl |
1.88% | 12.22s | wayland_scanner |
1.80% | 11.72s | nix |
1.77% | 11.51s | rendy_graph |
1.75% | 11.42s | cc |
1.52% | 9.87s | rendy_command |
1.51% | 9.85s | rustc_serialize |
1.31% | 8.55s | proc_macro2 |
1.31% | 8.51s | proc_macro2 |
1.28% | 8.36s | proc_macro2 |
1.28% | 8.33s | rendy_shader |
1.22% | 7.93s | xml |
1.20% | 7.85s | serde_json |
1.14% | 7.44s | rendy_memory |
1.12% | 7.32s | andrew |
1.07% | 6.98s | failure_derive |
1.04% | 6.76s | rendy_util |
1.00% | 6.49s | rendy_factory |
0.99% | 6.47s | rendy_chain |
0.94% | 6.15s | spirv_reflect |
0.91% | 5.96s | num_bigint |
Time measurements from cargo-bloat
The low-hanging fruit here is deriving `Deserialize` behind a `deserialize` feature that's not enabled by default. See how webrender does it to a large extent. Note: `Deserialize` is known to take significantly more time than `Serialize`.
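A minimal sketch of feature-gating the derives follows. It assumes the crate's Cargo.toml declares `serde` as an optional dependency and defines two features named `serialize` and `deserialize`. Those feature names and the `BufferInfo` type are hypothetical and chosen only for illustration.

```rust
// Assumed (hypothetical) Cargo.toml setup:
//   serde = { version = "1", optional = true, features = ["derive"] }
//   [features]
//   serialize   = ["serde"]
//   deserialize = ["serde"]
// With neither feature enabled, no serde derive code is generated at all.

/// Hypothetical type standing in for any structure worth serializing.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct BufferInfo {
    pub size: u64,
    pub usage: u32,
}

/// Reports which of the two derives were compiled in: (serialize, deserialize).
pub fn enabled_serde_derives() -> (bool, bool) {
    (cfg!(feature = "serialize"), cfg!(feature = "deserialize"))
}
```

Users who only need to write data out would enable `serialize` alone and skip the more expensive `Deserialize` expansion.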
Am I correct that removing the `syn` duplication would also save us ~30s?
Indeed!
`syn` is built 3 times (4 times with all features), with each build taking around 30s on my machine.
Duplicates origin
syn v0.12.15
└── num-derive v0.1.44
    └── spirv_headers v1.3.4
        └── spirv-reflect v0.2.1
            └── rendy-shader v0.4.0
syn v0.14.9
└── palette_derive v0.4.1
    └── palette v0.4.1
        └── rendy-texture v0.4.0
syn v0.15.44
├── derivative v1.0.3
│   ├── rendy-command v0.4.0
│   ├── rendy-descriptor v0.4.0
│   ├── rendy-factory v0.4.0
│   ├── rendy-frame v0.4.0
│   ├── rendy-graph v0.4.0
│   ├── rendy-memory v0.4.0
│   ├── rendy-resource v0.4.0
│   ├── rendy-shader v0.4.0
│   ├── rendy-texture v0.4.0
│   ├── rendy-util v0.4.0
│   ├── rendy-wsi v0.4.0
│   └── winit v0.20.0-alpha3
│       ├── gfx-backend-dx12 v0.3.0
│       │   └── rendy-util v0.4.0
│       ├── gfx-backend-empty v0.3.0
│       │   └── rendy-util v0.4.0
│       ├── gfx-backend-metal v0.3.2
│       │   └── rendy-util v0.4.0
│       ├── gfx-backend-vulkan v0.3.0
│       │   └── rendy-util v0.4.0
│       └── rendy-wsi v0.4.0
├── failure_derive v0.1.5
│   └── failure v0.1.5
│       ├── rendy v0.4.0
│       ├── rendy-command v0.4.0
│       ├── rendy-descriptor v0.4.0
│       ├── rendy-frame v0.4.0
│       ├── rendy-memory v0.4.0
│       ├── rendy-mesh v0.4.0
│       ├── rendy-shader v0.4.0
│       └── rendy-texture v0.4.0
├── num-derive v0.2.5
│   └── tiff v0.3.1
│       └── image v0.22.2
│           └── rendy-texture v0.4.0
└── synstructure v0.10.2
    └── failure_derive v0.1.5
syn v1.0.5
└── serde_derive v1.0.101
    ├── serde v1.0.101
    │   ├── gfx-hal v0.3.0
    │   │   ├── gfx-backend-dx12 v0.3.0
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── gfx-backend-empty v0.3.0
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── gfx-backend-metal v0.3.2
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── gfx-backend-vulkan v0.3.0
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── rendy v0.4.0
    │   │   ├── rendy-chain v0.4.0
    │   │   ├── rendy-command v0.4.0
    │   │   ├── rendy-descriptor v0.4.0
    │   │   ├── rendy-factory v0.4.0
    │   │   ├── rendy-frame v0.4.0
    │   │   ├── rendy-graph v0.4.0
    │   │   ├── rendy-memory v0.4.0
    │   │   ├── rendy-mesh v0.4.0
    │   │   ├── rendy-resource v0.4.0
    │   │   ├── rendy-shader v0.4.0
    │   │   ├── rendy-texture v0.4.0
    │   │   ├── rendy-util v0.4.0
    │   │   └── rendy-wsi v0.4.0
    │   ├── rendy-factory v0.4.0
    │   ├── rendy-memory v0.4.0
    │   ├── rendy-mesh v0.4.0
    │   ├── rendy-shader v0.4.0
    │   ├── rendy-texture v0.4.0
    │   ├── rendy-util v0.4.0
    │   ├── serde_bytes v0.11.2
    │   │   └── rendy-mesh v0.4.0
    │   ├── serde_json v1.0.40
    │   │   └── thread_profiler v0.3.0
    │   │       ├── rendy v0.4.0
    │   │       ├── rendy-chain v0.4.0
    │   │       ├── rendy-command v0.4.0
    │   │       ├── rendy-factory v0.4.0
    │   │       ├── rendy-frame v0.4.0
    │   │       ├── rendy-graph v0.4.0
    │   │       ├── rendy-texture v0.4.0
    │   │       └── rendy-util v0.4.0
    │   ├── smallvec v0.6.10
    │   │   ├── gfx-backend-dx12 v0.3.0
    │   │   ├── gfx-backend-metal v0.3.2
    │   │   ├── gfx-backend-vulkan v0.3.0
    │   │   ├── parking_lot_core v0.6.2
    │   │   │   └── parking_lot v0.9.0
    │   │   │       ├── gfx-backend-metal v0.3.2
    │   │   │       ├── rendy-factory v0.4.0
    │   │   │       ├── rendy-util v0.4.0
    │   │   │       └── winit v0.20.0-alpha3
    │   │   │           ├── gfx-backend-dx12 v0.3.0
    │   │   │           ├── gfx-backend-empty v0.3.0
    │   │   │           ├── gfx-backend-metal v0.3.2
    │   │   │           ├── gfx-backend-vulkan v0.3.0
    │   │   │           └── rendy-wsi v0.4.0
    │   │   ├── rendy-command v0.4.0
    │   │   ├── rendy-descriptor v0.4.0
    │   │   ├── rendy-factory v0.4.0
    │   │   ├── rendy-frame v0.4.0
    │   │   ├── rendy-graph v0.4.0
    │   │   ├── rendy-memory v0.4.0
    │   │   ├── rendy-mesh v0.4.0
    │   │   ├── rendy-resource v0.4.0
    │   │   ├── rendy-shader v0.4.0
    │   │   └── rendy-wsi v0.4.0
    │   └── spirv-reflect v0.2.1
    │       └── rendy-shader v0.4.0
    └── spirv-reflect v0.2.1
I have checked, and `syn` is the only duplicate that takes a significant time to build.
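As a rough back-of-the-envelope check, the single-threaded timings from the build-time table can be summed per crate. The sketch below assumes that after deduplication only the slowest copy of each crate would still be built, which is an assumption made to keep the estimate conservative. The function name `potential_savings` is hypothetical.

```rust
use std::collections::BTreeMap;

/// (crate name, build time in seconds) for crates listed more than once
/// in the build-time table.
pub const DUPLICATED: &[(&str, f64)] = &[
    ("syn", 36.09),
    ("syn", 32.36),
    ("syn", 25.21),
    ("proc_macro2", 8.55),
    ("proc_macro2", 8.51),
    ("proc_macro2", 8.36),
];

/// Seconds saved per crate if only its slowest copy were still built.
pub fn potential_savings(rows: &[(&str, f64)]) -> BTreeMap<String, f64> {
    // Per crate: (sum of all build times, slowest single build).
    let mut stats: BTreeMap<String, (f64, f64)> = BTreeMap::new();
    for &(name, secs) in rows {
        let entry = stats.entry(name.to_string()).or_insert((0.0, 0.0));
        entry.0 += secs;
        entry.1 = entry.1.max(secs);
    }
    stats
        .into_iter()
        .map(|(name, (sum, slowest))| (name, sum - slowest))
        .collect()
}
```

Under that assumption the estimate comes to about 57.6s of CPU time for `syn` and about 16.9s for `proc_macro2` on the measured machine. Wall-clock savings in a parallel build would be smaller.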
@malobre @omni-viral as I'm trying to vendor the `rendy-memory` and `rendy-descriptor` crates as 3rd party dependencies in Firefox, the extra code of Rendy dependencies is hard to justify.
Please consider removing `derivative`, `failure`, and any other dependencies that aren't required (or make them optional). This is an issue today for us.
I have time to work on that, but I have one question.
Should I remove the `serde` feature and create two new ones (one for serialize, one for deserialize), or should I keep the `serde` feature and have it enable both new features (for backward compatibility)?
@malobre good question!
First of all, `Deserialize` is significantly slower to generate than `Serialize`. That's the reason https://github.com/servo/webrender/ has separate features ("capture" vs "replay") for deriving these traits.
Rendy though would mostly care about what gfx-rs derives today, and that's just controlled by the `serde` optional dependency. Maybe in the future we'll split it like WebRender does, but it hasn't shown up as a problem yet.
When you measured the times, why did `serde` even show up? Is it accidentally enabled by default somewhere?
Another question - was the `syn` duplication problem resolved?
Oh I see what the problem is, the timings were with all features enabled. So the `serde` flag is probably sufficient.
As for the `syn` duplication, this is still not resolved.
To resolve:
- Waiting on gwihlidal/spirv-reflect-rs#10
- `palette_derive` needs to update `syn ^0.14` to `syn ^1`. This has already been done on the repo but a new version hasn't been published since the changes (Ogeon/palette#145)
- Either remove `derivative` (#210) or update its `syn ^0.15.10` dependency to `syn ^1`
After all these changes we will only be building `syn ^1` once.
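The reason each requirement pulls in its own copy is that Cargo treats versions as compatible only when their leftmost non-zero component matches, so `0.12.x`, `0.14.x`, `0.15.x` and `1.x` can never be unified. The sketch below is a simplified model of that rule, not Cargo's resolver. The function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// The compatibility "bucket" of a version under caret rules:
/// everything up to and including the leftmost non-zero component.
pub fn compat_bucket(major: u64, minor: u64, patch: u64) -> (u64, u64, u64) {
    if major > 0 {
        (major, 0, 0)
    } else if minor > 0 {
        (0, minor, 0)
    } else {
        (0, 0, patch)
    }
}

/// Number of distinct copies that must be built for a set of resolved versions,
/// assuming one copy per compatibility bucket.
pub fn copies_built(versions: &[(u64, u64, u64)]) -> usize {
    versions
        .iter()
        .map(|&(major, minor, patch)| compat_bucket(major, minor, patch))
        .collect::<BTreeSet<_>>()
        .len()
}
```

Feeding in the four resolved `syn` versions from the duplicates tree (0.12.15, 0.14.9, 0.15.44, 1.0.5) gives four buckets. Once every dependent accepts `^1`, a single bucket remains.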
So removing `derivative` is still an option.
Removing `derivative` from just the two crates mentioned (`rendy-memory` and `rendy-descriptor`) might also be an option for the time being, since only 3 derives go through `derivative` in those crates. So the maintenance argument from #210 (comment) wouldn't apply much to those two crates.
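For a handful of derives, the hand-written replacement is short. The sketch below assumes the typical reason for reaching for `derivative` in generic code: avoiding the `B: Debug` / `B: Clone` bounds that the built-in derives would add, as with `#[derivative(Debug(bound = ""))]`. The `Handle` type is hypothetical and not taken from rendy.

```rust
use std::fmt;
use std::marker::PhantomData;

/// Hypothetical handle type, generic over a backend `B` that need not be `Debug` or `Clone`.
pub struct Handle<B> {
    pub index: u32,
    marker: PhantomData<B>,
}

impl<B> Handle<B> {
    pub fn new(index: u32) -> Self {
        Handle { index, marker: PhantomData }
    }
}

// `#[derive(Debug)]` would require `B: Debug`; the manual impl has no such bound.
impl<B> fmt::Debug for Handle<B> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Handle").field("index", &self.index).finish()
    }
}

// Likewise, `#[derive(Clone)]` would require `B: Clone`.
impl<B> Clone for Handle<B> {
    fn clone(&self) -> Self {
        Handle::new(self.index)
    }
}

/// A backend marker that deliberately implements neither `Debug` nor `Clone`.
pub struct OpaqueBackend;
```

The cost is a few lines of boilerplate per type, which is easier to justify when only three derives are involved.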
TBH I'm for getting rid of `derivative` because it's slow and not all features described in its docs work.
> getting rid of derivative because it's slow
Is it though? From what we've seen before in #210, it isn't really that significant after all.
This might change if removing it gets rid of an extra copy of `syn`.
@malobre I don't know, tbh. I'm kinda starting to think that it might be a good idea to get rid of it, just to make things a bit easier for the Firefox dev team and the like. All the little libraries add up. My concern is that applying this logic to many other dependencies is dangerous. The concept of sharing code is useful after all; it can actually bring improvements to build times. We just need to keep things up to date.
Another, kinda separate point: I think we should focus a little bit more on making rendy's own code faster to compile. I'd love to have some kind of tool that tells me where the compiler spends most of its time; maybe we can help it a bit. Also, a big factor for sure is the proliferation of generics that get monomorphized in the user's code. It's really hard to get rid of some of them (notably the backend), but maybe we can at least monomorphize them in rendy's code already.
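One common way to limit how much generic code is instantiated in downstream crates is to keep the generic function a thin shim over a non-generic inner function. This is a general Rust technique rather than something proposed in this thread, and it does not help with the backend parameter itself. The function `shader_cache_path` is a hypothetical example.

```rust
use std::path::{Path, PathBuf};

/// Generic entry point: only this thin shim is instantiated per caller type.
pub fn shader_cache_path<P: AsRef<Path>>(root: P, name: &str) -> PathBuf {
    // Non-generic body: it is compiled once, in the crate that defines it,
    // no matter how many different `P` types callers use.
    fn inner(root: &Path, name: &str) -> PathBuf {
        let mut path = root.to_path_buf();
        path.push("shaders");
        path.push(name);
        path.set_extension("spv");
        path
    }
    inner(root.as_ref(), name)
}
```

Callers can still pass `&str`, `String` or `PathBuf`, but the body that does the work is no longer duplicated for each of those types.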
I agree about the dependency situation, this is all about finding an equilibrium. The `syn` situation will take some time to resolve.
About rendy code, I'll try to establish what takes time to compile and make a little summary so we have an idea of where optimization would be nice.
$ cargo +nightly build -Z timings
yields some amazing graphs
$ cargo +nightly rustc -p rendy* -- -Z self-profile
$ summarize summarize rendy* -p 2
Self-profile results (anything over 2%)
rendy
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 85.43ms | 75.998 | 63 | 0 | 0.00ns | 0.00ns |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| macro_expand_crate | 6.14ms | 5.461 | 1 | 0 | 0.00ns | 0.00ns |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| resolve_crate | 3.93ms | 3.500 | 1 | 0 | 0.00ns | 0.00ns |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_load_macro | 3.39ms | 3.018 | 1 | 0 | 0.00ns | 0.00ns |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 112.416667ms
Filtered results account for 87.977% of total time.
rendy-chain
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 2.09s | 36.314 | 71 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 887.33ms | 15.426 | 71 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 436.35ms | 7.586 | 266 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode | 338.74ms | 5.889 | 71 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 159.97ms | 2.781 | 5741 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 131.77ms | 2.291 | 266 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 129.02ms | 2.243 | 46383 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 119.44ms | 2.076 | 28 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 5.752254287s
Filtered results account for 74.606% of total time.
rendy-command
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 3.21s | 31.872 | 275 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 1.93s | 19.193 | 1646 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 1.76s | 17.454 | 301 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 1.68s | 16.670 | 406 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 10.067050635s
Filtered results account for 85.189% of total time.
rendy-core
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 1.84s | 25.269 | 84 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 804.89ms | 11.041 | 721 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 657.22ms | 9.015 | 84 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 426.66ms | 5.853 | 323 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 408.66ms | 5.606 | 5572 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 375.23ms | 5.147 | 721 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode | 320.65ms | 4.398 | 84 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 252.94ms | 3.470 | 322 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 220.19ms | 3.020 | 11136 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| optimized_mir | 173.09ms | 2.374 | 1380 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_crate | 168.91ms | 2.317 | 1 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 7.290118122s
Filtered results account for 77.510% of total time.
rendy-descriptor
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 731.89ms | 26.706 | 425 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 530.80ms | 19.368 | 78 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 433.91ms | 15.833 | 48 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 302.60ms | 11.042 | 62 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_ascribe_user_type | 102.22ms | 3.730 | 27 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 93.48ms | 3.411 | 40160 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 2.740566614s
Filtered results account for 80.089% of total time.
rendy-factory
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 1.97s | 30.391 | 150 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 1.01s | 15.637 | 225 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 866.56ms | 13.366 | 1889 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 748.60ms | 11.547 | 295 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 327.42ms | 5.050 | 44 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 182.95ms | 2.822 | 5934 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 156.44ms | 2.413 | 225 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 143.65ms | 2.216 | 44 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 6.483234307s
Filtered results account for 83.442% of total time.
rendy-frame
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 240.92ms | 28.216 | 52 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 84.17ms | 9.857 | 90 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 75.94ms | 8.894 | 39316 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 62.39ms | 7.307 | 51 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 46.53ms | 5.449 | 59 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 26.37ms | 3.089 | 1134 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| macro_expand_crate | 25.40ms | 2.974 | 1 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 22.04ms | 2.582 | 59 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| specialization_graph_of | 17.96ms | 2.103 | 12 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 853.838094ms
Filtered results account for 70.471% of total time.
rendy-graph
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 2.93s | 24.458 | 2794 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 2.27s | 18.948 | 178 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 1.76s | 14.694 | 416 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 1.27s | 10.650 | 313 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_trait_item_well_formed | 915.61ms | 7.652 | 51 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 360.54ms | 3.013 | 313 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 254.18ms | 2.124 | 8976 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 249.12ms | 2.082 | 38 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 11.965757753s
Filtered results account for 83.620% of total time.
rendy-memory
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 2.22s | 33.156 | 198 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 1.31s | 19.502 | 356 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 1.02s | 15.293 | 235 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 379.59ms | 5.672 | 56 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 273.60ms | 4.088 | 1031 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_trait_item_well_formed | 182.05ms | 2.720 | 19 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 157.98ms | 2.361 | 56 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 6.69243081s
Filtered results account for 82.791% of total time.
rendy-mesh
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 279.94ms | 14.926 | 29 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 207.61ms | 11.069 | 73 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 197.14ms | 10.511 | 52 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 174.39ms | 9.298 | 1175 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 140.00ms | 7.465 | 97 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 121.70ms | 6.489 | 29 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 71.56ms | 3.816 | 2351 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 71.44ms | 3.809 | 39971 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode | 61.80ms | 3.295 | 29 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 58.47ms | 3.118 | 73 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 58.36ms | 3.112 | 50 | 0 | 0.00ns | 0.00ns |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 1.875489349s
Filtered results account for 76.908% of total time.
rendy-shader
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 221.35ms | 14.599 | 44 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 209.79ms | 13.837 | 267 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 196.49ms | 12.960 | 54 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 160.51ms | 10.587 | 44 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 157.44ms | 10.384 | 23 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 66.17ms | 4.364 | 44844 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 58.81ms | 3.879 | 23 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 36.69ms | 2.420 | 50 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 32.50ms | 2.144 | 54 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 1.516147003s
Filtered results account for 75.175% of total time.
rendy-texture
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 148.73ms | 13.140 | 425 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 120.69ms | 10.663 | 407 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 95.98ms | 8.480 | 12 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 75.84ms | 6.700 | 405 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 60.21ms | 5.319 | 39355 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation | 58.17ms | 5.139 | 2270 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck | 47.48ms | 4.195 | 405 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 37.91ms | 3.349 | 50 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module | 35.70ms | 3.154 | 12 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| optimized_mir | 31.77ms | 2.807 | 535 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| specialization_graph_of | 30.30ms | 2.677 | 20 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| macro_expand_crate | 29.80ms | 2.632 | 1 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| resolve_crate | 24.97ms | 2.206 | 1 | 0 | 0.00ns | 0.00ns |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 1.131897315s
Filtered results account for 70.461% of total time.
rendy-wsi
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate | 611.53ms | 27.304 | 331 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 579.53ms | 25.875 | 31 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of | 395.32ms | 17.650 | 42 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed | 280.90ms | 12.541 | 41 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry | 50.91ms | 2.273 | 28656 | 0 | 0.00ns | 0.00ns |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 2.239738174s
Filtered results account for 85.644% of total time.
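For anyone reading these tables, the "% of total time" column appears to be each item's self time divided by the crate's "Total cpu time", and the "Filtered results" line appears to be the sum of the listed rows. Here is a minimal sketch that rechecks this against the rendy-wsi numbers. The helper name `percent_of_total` is made up for illustration and is not part of any profiling tool.

```rust
/// Share of the total CPU time taken by one item, in percent.
/// Assumption: "% of total time" = self time / total cpu time.
fn percent_of_total(self_time_ms: f64, total_cpu_time_s: f64) -> f64 {
    self_time_ms / (total_cpu_time_s * 1000.0) * 100.0
}

fn main() {
    // Values copied from the rendy-wsi table.
    let total_cpu_time_s = 2.239738174;
    let rows = [
        ("type_op_prove_predicate", 611.53),
        ("check_impl_item_well_formed", 579.53),
        ("typeck_tables_of", 395.32),
        ("check_item_well_formed", 280.90),
        ("metadata_decode_entry", 50.91),
    ];

    // Sum the per-row shares to compare with the "Filtered results" line.
    let mut filtered = 0.0;
    for (name, self_time_ms) in rows.iter() {
        let pct = percent_of_total(*self_time_ms, total_cpu_time_s);
        filtered += pct;
        println!("{:<28} {:>8.3}%", name, pct);
    }
    println!("filtered rows: {:.3}% of total", filtered);
}
```

The printed shares should land within rounding of the table values, which suggests the four type-checking items alone make up most of the rendy-wsi build.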
Whoa, why is `rendy-core` so slow? It's mostly a bunch of exported macros.
One indirect `syn` dependency was removed (gwihlidal/spirv-reflect-rs#10).
The second one should be resolved in the next few days. After that, `derivative` will be the only dependency depending on `syn` 🥳
(Edit: palette_derive will still require `syn`, but it will be the same version as derivative, so only one build is needed.)
(Summary: #203 (comment))
@omni-viral could you run `cargo clean && cargo +nightly build -Z timings` and upload the resulting graphs, so I can compare them with my results?
The current nightly compiler hits an ICE (internal compiler error) on rendy when no backend features are enabled.
Ogeon/palette#145 has been closed, and `palette_derive ^0.5` has been published on crates.io; it now requires `syn ^1`.
(Edit: derivative still uses an older version of `syn`, see mcarton/rust-derivative#43.)
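To illustrate why the `syn` version matters here: Cargo treats two versions as compatible when their left-most non-zero component matches, and it can share a single build between compatible versions. Incompatible versions are each compiled separately. Below is a rough sketch of that rule, not Cargo's actual resolver. The function names and the version numbers in `main` are hypothetical and only there for illustration.

```rust
use std::collections::BTreeSet;

/// Key that is equal for versions Cargo considers semver-compatible:
/// the left-most non-zero component, with everything after it zeroed.
fn compat_key(version: &str) -> Option<(u64, u64, u64)> {
    let mut parts = version.split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = parts.next().unwrap_or(Ok(0)).ok()?;
    let patch = parts.next().unwrap_or(Ok(0)).ok()?;
    Some(if major > 0 {
        (major, 0, 0)
    } else if minor > 0 {
        (0, minor, 0)
    } else {
        (0, 0, patch)
    })
}

/// Number of separate builds of one crate, given the version each
/// dependent pulls in (pairs of dependent name and version).
fn builds_needed(requirements: &[(&str, &str)]) -> usize {
    requirements
        .iter()
        .filter_map(|(_, version)| compat_key(version))
        .collect::<BTreeSet<_>>()
        .len()
}

fn main() {
    // Hypothetical syn versions, not taken from the real lockfile.
    let mixed = [("derivative", "0.15.44"), ("palette_derive", "1.0.5")];
    let unified = [("derivative", "1.0.5"), ("palette_derive", "1.0.17")];
    println!("mixed majors: {} build(s) of syn", builds_needed(&mixed));
    println!("same major:   {} build(s) of syn", builds_needed(&unified));
}
```

Under this rule, a dependency stuck on a pre-1.0 `syn` forces a second full build of `syn` next to the `^1` one, which is why getting every derive crate onto the same major version is worth chasing.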
Implemented dependabot to keep dependencies up to date; individual issues can be raised about time-consuming dependencies, duplicates in the tree, etc.
The palette upgrade already has a PR.
I would like to automate posting the timings results to PRs; will file an issue for that.