amethyst/rendy

Optimize build times

kvark opened this issue · 24 comments

kvark commented

Removing failure and derivative should be a solid step in this direction. See gfx-rs/gfx#2970 and also #198

As already said in #210, removing derivative does not seem to have any major effect on build times. I've just tested how failure affects build times by removing it and replacing the failure::Errors with basic, appropriate structs/enums. Given how sparsely it is actually used, it didn't seem to have any major effect on build times either: 24.29s for just the rendy crates and 1m 04s for a full build, compared to the previous 24.53s and 1m 08s. The 4-second reduction is most likely due to failure no longer being a dependency anywhere in the project after the removal. (All the timings were taken with possible background noise, though, which is why I had one rendy-crates-only build that took just 20s.)

Nevertheless, I'd say we should remove failure, given that it seems to have been used just as a convenient error wrapper for strings. One or two rendy crates, rendy-frame for example, even have it as a leftover dependency without using it in the source at all.
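For readers wondering what "basic appropriate structs/enums" could look like in place of failure::Error, here is a minimal sketch using only the standard library. The type and variant names (`UploadError`, `MissingResource`, `Io`) are hypothetical and are not taken from rendy's source.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type; the variant names are illustrative, not rendy's.
#[derive(Debug)]
pub enum UploadError {
    /// A required resource was not found (replaces a string-only error).
    MissingResource(String),
    /// A lower-level I/O failure, kept as the error source.
    Io(std::io::Error),
}

impl fmt::Display for UploadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UploadError::MissingResource(name) => write!(f, "missing resource: {}", name),
            UploadError::Io(err) => write!(f, "I/O error: {}", err),
        }
    }
}

impl Error for UploadError {
    /// Exposes the wrapped error so callers can walk the cause chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            UploadError::MissingResource(_) => None,
            UploadError::Io(err) => Some(err),
        }
    }
}

impl From<std::io::Error> for UploadError {
    /// Lets `?` convert I/O errors automatically.
    fn from(err: std::io::Error) -> Self {
        UploadError::Io(err)
    }
}
```

This needs no proc macro at all, so it pulls in neither failure_derive nor synstructure, at the cost of writing the Display and Error impls by hand.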

So if the consensus is to remove failure I'll be happy to clean up my test changes and make a PR for it.

kvark commented

#198 has largely removed failure; there are just a few places left before it can be expelled from Cargo.toml. That, I think, explains why you aren't seeing much gain from finishing this work.
Overall, it would be great to see your PR doing this. Derivative can stay, for now :)

Edit: Reorganized data, added relative build time.
Edit 2: Measured with --all-features

Build Time: 10m49.902s (single thread on Intel i3-5010U).
And here are all the crates that take more than 5s:

| % of Build Time | Build Time | Crate |
|---|---|---|
| 7.53% | 48.90s | serde_derive |
| 5.56% | 36.09s | syn |
| 5.02% | 32.59s | winit |
| 4.98% | 32.36s | syn |
| 4.76% | 30.92s | derivative |
| 4.58% | 29.75s | wayland_protocols |
| 3.91% | 25.44s | smithay_client_toolkit |
| 3.88% | 25.21s | syn |
| 2.70% | 17.53s | synstructure |
| 2.38% | 15.47s | wayland_client |
| 2.32% | 15.10s | gfx_hal |
| 2.21% | 14.40s | wayland_sys |
| 2.02% | 13.13s | serde |
| 1.94% | 12.60s | x11_dl |
| 1.88% | 12.22s | wayland_scanner |
| 1.80% | 11.72s | nix |
| 1.77% | 11.51s | rendy_graph |
| 1.75% | 11.42s | cc |
| 1.52% | 9.87s | rendy_command |
| 1.51% | 9.85s | rustc_serialize |
| 1.31% | 8.55s | proc_macro2 |
| 1.31% | 8.51s | proc_macro2 |
| 1.28% | 8.36s | proc_macro2 |
| 1.28% | 8.33s | rendy_shader |
| 1.22% | 7.93s | xml |
| 1.20% | 7.85s | serde_json |
| 1.14% | 7.44s | rendy_memory |
| 1.12% | 7.32s | andrew |
| 1.07% | 6.98s | failure_derive |
| 1.04% | 6.76s | rendy_util |
| 1.00% | 6.49s | rendy_factory |
| 0.99% | 6.47s | rendy_chain |
| 0.94% | 6.15s | spirv_reflect |
| 0.91% | 5.96s | num_bigint |

Time measurements from cargo-bloat

kvark commented

The low hanging fruit here is deriving Deserialize behind a deserialize feature that's not enabled by default. See how webrender does it to a large extent. Note: Deserialize is known to take significantly more time than Serialize.
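A minimal sketch of what gating the derives could look like, assuming two Cargo features named `serialize` and `deserialize` that each enable an optional serde dependency (with serde's `derive` feature). The feature names and the `HeapsConfig` struct are assumptions for illustration, not rendy's or WebRender's actual definitions.

```rust
/// Hypothetical config type. `serialize` and `deserialize` are assumed
/// Cargo feature names; neither is enabled by default in this sketch.
#[derive(Debug, Clone, PartialEq)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct HeapsConfig {
    pub linear_size: u64,
    pub dynamic_block_size: u64,
}

/// Reports which serde derives were compiled in, based on the active features.
pub fn enabled_serde_derives() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(feature = "serialize") {
        enabled.push("Serialize");
    }
    if cfg!(feature = "deserialize") {
        enabled.push("Deserialize");
    }
    enabled
}
```

With neither feature enabled, the `cfg_attr` lines expand to nothing, so the crate compiles without serde and without running serde_derive at all.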

Am I correct that removing syn duplication would also save us ~30s?

Indeed!
syn is built 3 times (4 times with all features) with each build taking around 30s on my machine.

Duplicates origin
syn v0.12.15
└── num-derive v0.1.44
    └── spirv_headers v1.3.4
        └── spirv-reflect v0.2.1
            └── rendy-shader v0.4.0

syn v0.14.9
└── palette_derive v0.4.1
    └── palette v0.4.1
        └── rendy-texture v0.4.0

syn v0.15.44
├── derivative v1.0.3
│   ├── rendy-command v0.4.0
│   ├── rendy-descriptor v0.4.0
│   ├── rendy-factory v0.4.0
│   ├── rendy-frame v0.4.0
│   ├── rendy-graph v0.4.0
│   ├── rendy-memory v0.4.0
│   ├── rendy-resource v0.4.0
│   ├── rendy-shader v0.4.0
│   ├── rendy-texture v0.4.0
│   ├── rendy-util v0.4.0
│   ├── rendy-wsi v0.4.0
│   └── winit v0.20.0-alpha3
│       ├── gfx-backend-dx12 v0.3.0
│       │   └── rendy-util v0.4.0
│       ├── gfx-backend-empty v0.3.0
│       │   └── rendy-util v0.4.0
│       ├── gfx-backend-metal v0.3.2
│       │   └── rendy-util v0.4.0
│       ├── gfx-backend-vulkan v0.3.0
│       │   └── rendy-util v0.4.0
│       └── rendy-wsi v0.4.0
├── failure_derive v0.1.5
│   └── failure v0.1.5
│       ├── rendy v0.4.0
│       ├── rendy-command v0.4.0
│       ├── rendy-descriptor v0.4.0
│       ├── rendy-frame v0.4.0
│       ├── rendy-memory v0.4.0
│       ├── rendy-mesh v0.4.0
│       ├── rendy-shader v0.4.0
│       └── rendy-texture v0.4.0
├── num-derive v0.2.5
│   └── tiff v0.3.1
│       └── image v0.22.2
│           └── rendy-texture v0.4.0
└── synstructure v0.10.2
    └── failure_derive v0.1.5

syn v1.0.5
└── serde_derive v1.0.101
    ├── serde v1.0.101
    │   ├── gfx-hal v0.3.0
    │   │   ├── gfx-backend-dx12 v0.3.0
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── gfx-backend-empty v0.3.0
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── gfx-backend-metal v0.3.2
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── gfx-backend-vulkan v0.3.0
    │   │   │   └── rendy-util v0.4.0
    │   │   ├── rendy v0.4.0
    │   │   ├── rendy-chain v0.4.0
    │   │   ├── rendy-command v0.4.0
    │   │   ├── rendy-descriptor v0.4.0
    │   │   ├── rendy-factory v0.4.0
    │   │   ├── rendy-frame v0.4.0
    │   │   ├── rendy-graph v0.4.0
    │   │   ├── rendy-memory v0.4.0
    │   │   ├── rendy-mesh v0.4.0
    │   │   ├── rendy-resource v0.4.0
    │   │   ├── rendy-shader v0.4.0
    │   │   ├── rendy-texture v0.4.0
    │   │   ├── rendy-util v0.4.0
    │   │   └── rendy-wsi v0.4.0
    │   ├── rendy-factory v0.4.0
    │   ├── rendy-memory v0.4.0
    │   ├── rendy-mesh v0.4.0
    │   ├── rendy-shader v0.4.0
    │   ├── rendy-texture v0.4.0
    │   ├── rendy-util v0.4.0
    │   ├── serde_bytes v0.11.2
    │   │   └── rendy-mesh v0.4.0
    │   ├── serde_json v1.0.40
    │   │   └── thread_profiler v0.3.0
    │   │       ├── rendy v0.4.0
    │   │       ├── rendy-chain v0.4.0
    │   │       ├── rendy-command v0.4.0
    │   │       ├── rendy-factory v0.4.0
    │   │       ├── rendy-frame v0.4.0
    │   │       ├── rendy-graph v0.4.0
    │   │       ├── rendy-texture v0.4.0
    │   │       └── rendy-util v0.4.0
    │   ├── smallvec v0.6.10
    │   │   ├── gfx-backend-dx12 v0.3.0
    │   │   ├── gfx-backend-metal v0.3.2
    │   │   ├── gfx-backend-vulkan v0.3.0
    │   │   ├── parking_lot_core v0.6.2
    │   │   │   └── parking_lot v0.9.0
    │   │   │       ├── gfx-backend-metal v0.3.2
    │   │   │       ├── rendy-factory v0.4.0
    │   │   │       ├── rendy-util v0.4.0
    │   │   │       └── winit v0.20.0-alpha3
    │   │   │           ├── gfx-backend-dx12 v0.3.0
    │   │   │           ├── gfx-backend-empty v0.3.0
    │   │   │           ├── gfx-backend-metal v0.3.2
    │   │   │           ├── gfx-backend-vulkan v0.3.0
    │   │   │           └── rendy-wsi v0.4.0
    │   │   ├── rendy-command v0.4.0
    │   │   ├── rendy-descriptor v0.4.0
    │   │   ├── rendy-factory v0.4.0
    │   │   ├── rendy-frame v0.4.0
    │   │   ├── rendy-graph v0.4.0
    │   │   ├── rendy-memory v0.4.0
    │   │   ├── rendy-mesh v0.4.0
    │   │   ├── rendy-resource v0.4.0
    │   │   ├── rendy-shader v0.4.0
    │   │   └── rendy-wsi v0.4.0
    │   └── spirv-reflect v0.2.1
    │       └── rendy-shader v0.4.0
    └── spirv-reflect v0.2.1

I have checked and syn is the only duplicate that takes a significant time to build.
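As a rough back-of-the-envelope check on the possible saving, the sketch below sums the three syn entries from the build-time table earlier in the thread and assumes, pessimistically, that the one remaining copy would cost as much as the slowest of them. The function names are made up for illustration, and the estimate only applies to the single-threaded, --all-features measurement quoted above.

```rust
/// Build times of the three `syn` entries from the table, in hundredths of a second
/// (36.09s, 32.36s, 25.21s).
pub const SYN_BUILDS_CS: [u64; 3] = [3609, 3236, 2521];

/// Total build time reported in the thread: 10m49.902s, rounded to hundredths.
pub const TOTAL_BUILD_CS: u64 = 64990;

/// Time saved if only one copy of `syn` were built, assuming the remaining
/// copy costs as much as the slowest one measured.
pub fn estimated_saving_cs(builds: &[u64]) -> u64 {
    let total: u64 = builds.iter().sum();
    let slowest = builds.iter().copied().max().unwrap_or(0);
    total - slowest
}

/// Expresses a saving as a percentage of the total build time.
pub fn saving_percent(saving_cs: u64, total_cs: u64) -> f64 {
    100.0 * saving_cs as f64 / total_cs as f64
}
```

Under these assumptions the saving comes to 57.57s, a little under 9% of the measured single-threaded build. On a multi-core build the wall-clock gain would likely be smaller, since the copies can compile in parallel.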

kvark commented

@malobre @omni-viral as I'm trying to vendor rendy-memory and rendy-descriptor crates as 3rd party dependencies in Firefox, the extra code of Rendy dependencies is hard to justify.

Please consider removing derivative, failure, and any other dependencies that aren't required (or make them optional). This is an issue today for us.

  • failure was removed in #211
  • derivative has also been removed in #210 (but not merged, see PR for discussion)

The next step is probably to hide Serialize and Deserialize behind different features as you previously suggested.

I have time to work on that, but I have one question.
Should I remove the serde feature and create two new ones (one for serialize, one for deserialize), or should I keep the serde feature and have it enable both new features (for backward compatibility)?

kvark commented

@malobre good question!
First of all, Deserialize is significantly slower to generate than Serialize. That's the reason https://github.com/servo/webrender/ has separate features ("capture" vs "replay") for deriving these traits.

Rendy, though, would mostly care about what gfx-rs derives today, and that's controlled just by the optional serde dependency. Maybe in the future we'll split it like WebRender does, but it hasn't shown up as a problem yet.

When you measured the times, why did serde even show up? Is it accidentally enabled by default somewhere?

Another question - was the syn duplication problem resolved?

Oh, I see what the problem is: the timings were taken with all features enabled. So the serde flag is probably sufficient.

As for the syn duplication this is still not resolved.
To resolve:

  • Waiting on gwihlidal/spirv-reflect-rs#10
  • palette_derive needs to update syn ^0.14 to syn ^1
    This has already been done on the repo but a new version hasn't been published since the changes (Ogeon/palette#145)
  • Either remove derivative (#210) or update its syn ^0.15.10 dependency to syn ^1

After all these changes we will be building syn only once, at ^1.

kvark commented

So removing the derivative is still an option.

@Frizi how do you feel about merging #210 ?
If we keep the derivative dependency, I will make a PR on their repo to update it to syn ^1.

Removing derivative from just the two crates mentioned (rendy-memory and rendy-descriptor) might also be an option for the time being, since only 3 derives in those crates go through derivative. So the maintenance argument from #210 (comment) wouldn't apply much to those two crates.
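To give an idea of the maintenance cost, here is a sketch of a hand-written Debug impl that replaces a derivative-based derive which skips one field. The `Block` and `RawHandle` types are hypothetical and do not come from rendy-memory or rendy-descriptor.

```rust
use std::fmt;

/// Hypothetical stand-in for a field type that does not implement `Debug`.
pub struct RawHandle(pub u64);

/// Hypothetical struct. With derivative this might have been written as
/// `#[derive(Derivative)]` + `#[derivative(Debug)]`, with
/// `#[derivative(Debug = "ignore")]` on the `handle` field.
pub struct Block {
    pub offset: u64,
    pub size: u64,
    pub handle: RawHandle,
}

impl fmt::Debug for Block {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `handle` is deliberately left out, mirroring the "ignore" attribute.
        f.debug_struct("Block")
            .field("offset", &self.offset)
            .field("size", &self.size)
            .finish()
    }
}
```

The trade-off is that a new field has to be added to the impl by hand, which is the maintenance concern raised in #210; for three derives that seems manageable.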

TBH I'm for getting rid of derivative because it's slow and not all the features described in their docs work.

Frizi commented

> getting rid of derivative because it's slow

Is it though? From what we've seen before in #210, it isn't really that significant after all.

That might change if removing it lets us get rid of an extra copy of syn.

@malobre I don't know tbh. I'm starting to think that it might be a good idea to get rid of it, just to make things a bit easier for the Firefox dev team and the like. All the little libraries add up. My concern is that applying this logic to many other dependencies is dangerous. The concept of sharing code is useful after all, and it can actually improve build times. We just need to keep things up to date.

Another, somewhat separate point: I think we should focus a little more on making rendy's own code faster to compile. I'd love to have some kind of tool that tells me where the compiler spends most of its time, so maybe we can help it a bit. A big factor, for sure, is the proliferation of generics that get monomorphized in the user's code. It's really hard to get rid of some of them (notably backend), but maybe we can at least monomorphize them in rendy's code already.
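One general technique for keeping generic code from being fully recompiled in every downstream crate is a thin generic wrapper around a non-generic inner function. The sketch below is only an illustration of that pattern with a made-up function, `shader_cache_path`; it does not address the backend type parameter, which is the hard case mentioned above.

```rust
use std::path::{Path, PathBuf};

/// Generic entry point: instantiated once per caller type, but only the
/// tiny `as_ref()` conversion is duplicated for each instantiation.
pub fn shader_cache_path<P: AsRef<Path>>(source: P) -> PathBuf {
    /// Non-generic body: type-checked and code-generated once,
    /// in the crate that defines it.
    fn inner(source: &Path) -> PathBuf {
        source.with_extension("spv")
    }
    inner(source.as_ref())
}
```

Calling it with a `&str`, a `String`, or a `PathBuf` creates three instantiations of the wrapper, but all of them share the single compiled `inner`.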

I agree about the dependency situation, this is all about finding an equilibrium. The syn situation will take some time to resolve.

As for rendy's own code, I'll try to establish what takes time to compile and write a short summary so we have an idea of where optimization would help.

$ cargo +nightly build -Z timings yields some amazing graphs

Graph: [screenshot of the Cargo Build Timings page for rendy 0.5.0, taken 2019-10-31]

$ cargo +nightly rustc -p rendy* -- -Z self-profile
$ summarize summarize rendy* -p 2
Self-profile results (anything over 2%)
rendy
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                    | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate | 85.43ms   | 75.998          | 63         | 0          | 0.00ns       | 0.00ns                |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| macro_expand_crate      | 6.14ms    | 5.461           | 1          | 0          | 0.00ns       | 0.00ns                |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| resolve_crate           | 3.93ms    | 3.500           | 1          | 0          | 0.00ns       | 0.00ns                |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_load_macro     | 3.39ms    | 3.018           | 1          | 0          | 0.00ns       | 0.00ns                |
+-------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 112.416667ms
Filtered results account for 87.977% of total time.

rendy-chain
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                             | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj     | 2.09s     | 36.314          | 71         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module                   | 887.33ms  | 15.426          | 71         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of                 | 436.35ms  | 7.586           | 266        | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode | 338.74ms  | 5.889           | 71         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation              | 159.97ms  | 2.781           | 5741       | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                     | 131.77ms  | 2.291           | 266        | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry            | 129.02ms  | 2.243           | 46383      | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate          | 119.44ms  | 2.076           | 28         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 5.752254287s
Filtered results account for 74.606% of total time.

rendy-command
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                        | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 3.21s     | 31.872          | 275        | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate     | 1.93s     | 19.193          | 1646       | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of            | 1.76s     | 17.454          | 301        | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed      | 1.68s     | 16.670          | 406        | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 10.067050635s
Filtered results account for 85.189% of total time.

rendy-core
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                             | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj     | 1.84s     | 25.269          | 84         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of                 | 804.89ms  | 11.041          | 721        | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module                   | 657.22ms  | 9.015           | 84         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed      | 426.66ms  | 5.853           | 323        | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate          | 408.66ms  | 5.606           | 5572       | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                     | 375.23ms  | 5.147           | 721        | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode | 320.65ms  | 4.398           | 84         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed           | 252.94ms  | 3.470           | 322        | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation              | 220.19ms  | 3.020           | 11136      | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| optimized_mir                    | 173.09ms  | 2.374           | 1380       | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_crate                    | 168.91ms  | 2.317           | 1          | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 7.290118122s
Filtered results account for 77.510% of total time.

rendy-descriptor
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                        | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate     | 731.89ms  | 26.706          | 425        | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed      | 530.80ms  | 19.368          | 78         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 433.91ms  | 15.833          | 48         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of            | 302.60ms  | 11.042          | 62         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_ascribe_user_type   | 102.22ms  | 3.730           | 27         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry       | 93.48ms   | 3.411           | 40160      | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 2.740566614s
Filtered results account for 80.089% of total time.

rendy-factory
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                         | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed  | 1.97s     | 30.391          | 150        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of             | 1.01s     | 15.637          | 225        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate      | 866.56ms  | 13.366          | 1889       | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed       | 748.60ms  | 11.547          | 295        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 327.42ms  | 5.050           | 44         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation          | 182.95ms  | 2.822           | 5934       | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                 | 156.44ms  | 2.413           | 225        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module               | 143.65ms  | 2.216           | 44         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 6.483234307s
Filtered results account for 83.442% of total time.

rendy-frame
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                        | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 240.92ms  | 28.216          | 52         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed      | 84.17ms   | 9.857           | 90         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry       | 75.94ms   | 8.894           | 39316      | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate     | 62.39ms   | 7.307           | 51         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of            | 46.53ms   | 5.449           | 59         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation         | 26.37ms   | 3.089           | 1134       | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| macro_expand_crate          | 25.40ms   | 2.974           | 1          | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                | 22.04ms   | 2.582           | 59         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| specialization_graph_of     | 17.96ms   | 2.103           | 12         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 853.838094ms
Filtered results account for 70.471% of total time.

rendy-graph
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                         | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate      | 2.93s     | 24.458          | 2794       | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed  | 2.27s     | 18.948          | 178        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed       | 1.76s     | 14.694          | 416        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of             | 1.27s     | 10.650          | 313        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_trait_item_well_formed | 915.61ms  | 7.652           | 51         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                 | 360.54ms  | 3.013           | 313        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation          | 254.18ms  | 2.124           | 8976       | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 249.12ms  | 2.082           | 38         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 11.965757753s
Filtered results account for 83.620% of total time.

rendy-memory
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                         | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed  | 2.22s     | 33.156          | 198        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed       | 1.31s     | 19.502          | 356        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of             | 1.02s     | 15.293          | 235        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 379.59ms  | 5.672           | 56         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate      | 273.60ms  | 4.088           | 1031       | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_trait_item_well_formed | 182.05ms  | 2.720           | 19         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module               | 157.98ms  | 2.361           | 56         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 6.69243081s
Filtered results account for 82.791% of total time.

rendy-mesh
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                             | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj     | 279.94ms  | 14.926          | 29         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of                 | 207.61ms  | 11.069          | 73         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed      | 197.14ms  | 10.511          | 52         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate          | 174.39ms  | 9.298           | 1175       | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed           | 140.00ms  | 7.465           | 97         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module                   | 121.70ms  | 6.489           | 29         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation              | 71.56ms   | 3.816           | 2351       | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry            | 71.44ms   | 3.809           | 39971      | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_make_bitcode | 61.80ms   | 3.295           | 29         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                     | 58.47ms   | 3.118           | 73         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate          | 58.36ms   | 3.112           | 50         | 0          | 0.00ns       | 0.00ns                |
+----------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 1.875489349s
Filtered results account for 76.908% of total time.

rendy-shader
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                         | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed  | 221.35ms  | 14.599          | 44         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate      | 209.79ms  | 13.837          | 267        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of             | 196.49ms  | 12.960          | 54         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed       | 160.51ms  | 10.587          | 44         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 157.44ms  | 10.384          | 23         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry        | 66.17ms   | 4.364           | 44844      | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module               | 58.81ms   | 3.879           | 23         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate      | 36.69ms   | 2.420           | 50         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                 | 32.50ms   | 2.144           | 54         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 1.516147003s
Filtered results account for 75.175% of total time.

rendy-texture
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                         | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed  | 148.73ms  | 13.140          | 425        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed       | 120.69ms  | 10.663          | 407        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| LLVM_module_codegen_emit_obj | 95.98ms   | 8.480           | 12         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of             | 75.84ms   | 6.700           | 405        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry        | 60.21ms   | 5.319           | 39355      | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| evaluate_obligation          | 58.17ms   | 5.139           | 2270       | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| mir_borrowck                 | 47.48ms   | 4.195           | 405        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_register_crate      | 37.91ms   | 3.349           | 50         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| codegen_module               | 35.70ms   | 3.154           | 12         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| optimized_mir                | 31.77ms   | 2.807           | 535        | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| specialization_graph_of      | 30.30ms   | 2.677           | 20         | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| macro_expand_crate           | 29.80ms   | 2.632           | 1          | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| resolve_crate                | 24.97ms   | 2.206           | 1          | 0          | 0.00ns       | 0.00ns                |
+------------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 1.131897315s
Filtered results account for 70.461% of total time.

rendy-wsi
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| Item                        | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| type_op_prove_predicate     | 611.53ms  | 27.304          | 331        | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_impl_item_well_formed | 579.53ms  | 25.875          | 31         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| typeck_tables_of            | 395.32ms  | 17.650          | 42         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| check_item_well_formed      | 280.90ms  | 12.541          | 41         | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
| metadata_decode_entry       | 50.91ms   | 2.273           | 28656      | 0          | 0.00ns       | 0.00ns                |
+-----------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
Total cpu time: 2.239738174s
Filtered results account for 85.644% of total time.
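
As a sanity check on how to read these tables, here is a minimal sketch that recomputes the "% of total time" column and the "Filtered results" line from the "Self time" cells and the "Total cpu time" value. It uses the rendy-wsi rows above as input. The helper names `parse_seconds` and `percent_of_total` are made up for this illustration and are not part of any profiling tool. The recomputed values should agree with the table only up to the rounding of the displayed times.

```rust
/// Parse a "Self time" cell such as "611.53ms" or "2.22s" into seconds.
/// Hypothetical helper: only the suffixes ns, us, ms and s are handled.
fn parse_seconds(cell: &str) -> Option<f64> {
    let cell = cell.trim();
    // Longer suffixes come first so that "ms" is not mistaken for "s".
    for &(suffix, scale) in [("ns", 1e-9), ("us", 1e-6), ("ms", 1e-3), ("s", 1.0)].iter() {
        if let Some(number) = cell.strip_suffix(suffix) {
            return number.trim().parse::<f64>().ok().map(|v| v * scale);
        }
    }
    None
}

/// Share of `total_seconds` taken by `self_seconds`, in percent.
fn percent_of_total(self_seconds: f64, total_seconds: f64) -> f64 {
    100.0 * self_seconds / total_seconds
}

fn main() {
    // Rows copied from the rendy-wsi table above.
    let rows = [
        ("type_op_prove_predicate", "611.53ms"),
        ("check_impl_item_well_formed", "579.53ms"),
        ("typeck_tables_of", "395.32ms"),
        ("check_item_well_formed", "280.90ms"),
        ("metadata_decode_entry", "50.91ms"),
    ];
    // "Total cpu time" reported for rendy-wsi.
    let total = 2.239738174;

    let mut filtered = 0.0;
    for (item, cell) in rows.iter() {
        let secs = parse_seconds(cell).expect("unrecognised duration");
        let pct = percent_of_total(secs, total);
        filtered += pct;
        println!("{:<28} {:>8.3}%", item, pct);
    }
    // Sum of the listed rows, comparable to the "Filtered results" line.
    println!("listed rows cover {:.3}% of total", filtered);
}
```

Read this way, the "Filtered results" line is simply the sum of the rows that made it into the table, so the remaining share of each crate's time is spread over items too small to be listed.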

Whoa, why is rendy-core so slow? It's mostly a bunch of exported macros.

One indirect syn dependency was removed. (gwihlidal/spirv-reflect-rs#10)

The second one should be resolved in the next few days. After that, derivative will be the only dependency depending on syn 🥳 (edit: palette_derive will still require syn, but it will be the same version as derivative, so only one build is needed)

(Summary: #203 (comment))

@omni-viral could you run cargo clean && cargo +nightly build -Z timings and upload the resulting graphs, so we can compare them with my results?

The current nightly compiler ICEs on rendy without backend features.

Ogeon/palette#145 has been closed and palette_derive^0.5 has been published on crates.io; it now requires syn^1.

(Edit: derivative still uses an older version of syn, see mcarton/rust-derivative#43)

Implemented dependabot to keep dependencies up to date. Individual issues can be raised about time-consuming dependencies, duplicates in the tree, etc.

The palette upgrade already has a PR.

I would like to automate posting the timings results to PRs; will file an issue.