Why are the reactive versions of Quarkus and Helidon so slow to start?
mraible opened this issue · 13 comments
Quarkus Reactive takes 7x longer to start: 353 ms vs 49.4 ms.
Helidon SE takes 5x longer: 303.6 ms vs 59.4 ms.
Startup time (in ms):
Imperative
M: (36 + 48 + 36 + 47 + 42) / 5 = 41.8
Q: (55 + 46 + 44 + 54 + 48) / 5 = 49.4
S: (67 + 70 + 65 + 66 + 54) / 5 = 64.4
H: (64 + 55 + 61 + 58 + 59) / 5 = 59.4
Reactive
M: (52 + 45 + 47 + 46 + 47) / 5 = 47.4
Q: (359 + 291 + 276 + 534 + 305) / 5 = 353
S: (55 + 72 + 62 + 52 + 52) / 5 = 58.6
H: (239 + 239 + 509 + 263 + 268) / 5 = 303.6
Imperative with Virtual Threads
Q: (44 + 38 + 46 + 43 + 45) / 5 = 43.2
S: (71 + 64 + 68 + 71 + 66) / 5 = 68
Steps to reproduce
Install GraalVM 21:
sdk install java 21.0.2-graalce
For imperative:
git clone https://github.com/oktadev/auth0-java-rest-api-examples
cd auth0-java-rest-api-examples
./build.sh
./start.sh
The last script takes an argument of micronaut, quarkus, spring-boot, or helidon.
For reactive:
git clone https://github.com/oktadev/auth0-java-reactive-examples
cd auth0-java-reactive-examples
./build.sh
./start.sh
For virtual threads:
cd auth0-java-rest-api-examples
git checkout virtual-threads
./build.sh
./start.sh
I only calculated the startup times for Quarkus and Spring Boot because they're the only apps modified in the virtual-threads branch.
Memory usage (in MB):
Imperative
Framework | 0 requests | 1 request | 10K requests |
---|---|---|---|
Micronaut | 53 | 63 | 102 |
Quarkus | 37 | 47 | 51 |
Spring Boot | 76 | 86 | 105 |
Helidon | 82 | 92 | 69 |
Imperative with Virtual Threads
Framework | 0 requests | 1 request | 10K requests |
---|---|---|---|
Quarkus | 37 | 48 | 51 |
Spring Boot | 76 | 87 | 102 |
Reactive
Framework | 0 requests | 1 request | 10K requests |
---|---|---|---|
Micronaut | 53 | 62 | 102 |
Quarkus | 49 | 50 | 53 |
Spring Boot | 75 | 107 | 174 |
Helidon | 44 | 45 | 68 |
The slow startup seems to be caused by the frameworks fetching the issuer or JWKS URI on startup. If I turn off my internet and run ./start.sh quarkus, it results in an error:
2024-01-31 14:13:43,816 WARN [io.qua.oid.com.run.OidcCommonUtils] (vert.x-eventloop-thread-1)
OIDC Server is not available:: java.net.UnknownHostException: dev-06bzs1cu.us.auth0.com:
nodename nor servname provided, or not known
Same with Helidon:
Exception in thread "main" io.helidon.common.configurable.ResourceException:
Failed to open stream to uri: https://dev-06bzs1cu.us.auth0.com/.well-known/jwks.json
Since these frameworks are supposed to be as fast as possible for serverless environments, I think they should do this processing on the first request rather than on startup.
For some background, when I first started this comparison a couple of years ago, Spring Security and Helidon did the same thing (processing on startup instead of first request). They both switched to processing on the first request.
- Spring Security issue: spring-projects/spring-security#9991
- Helidon PR: helidon-io/helidon#3742
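For illustration, deferring the lookup to the first request can be sketched in plain Java as below. This is not how Quarkus, Helidon, or Spring Security implement it. The class name LazyKeySet, the lazy helper, and the hard-coded JSON string standing in for the JWKS network call are all assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class LazyKeySet {

    /** Wraps an expensive lookup so it runs at most once, on first use. */
    public static <T> Supplier<T> lazy(Supplier<T> loader) {
        return new Supplier<T>() {
            private T value;
            private boolean loaded;

            @Override
            public synchronized T get() {
                if (!loaded) {
                    // If the loader throws, 'loaded' stays false and the
                    // next call tries again.
                    value = loader.get();
                    loaded = true;
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger fetches = new AtomicInteger();

        // Stand-in for the network call to the issuer's JWKS endpoint.
        Supplier<String> jwks = lazy(() -> {
            fetches.incrementAndGet();
            return "{\"keys\":[]}";
        });

        // "Startup" is done here and nothing has been fetched yet.
        System.out.println("Fetches at startup: " + fetches.get());

        jwks.get(); // first request pays for the lookup
        jwks.get(); // later requests reuse the cached value
        System.out.println("Fetches after two requests: " + fetches.get());
    }
}
```

The trade-off is that a misconfigured or unreachable issuer is only noticed when the first request arrives, not when the app starts.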
Not sure it makes any difference, but you could rewrite the Quarkus HelloResource class as:
@Path("/hello")
public class HelloResource {

    @Inject
    SecurityIdentity securityIdentity;

    @GET
    @Path("/")
    @Authenticated
    @Produces(MediaType.TEXT_PLAIN)
    @NonBlocking
    public String hello() {
        return "Hello, " + securityIdentity.getPrincipal().getName() + "!";
    }
}
You don't need to explicitly return a Uni. Adding @NonBlocking will tell Quarkus to run the method on the event loop as a non-blocking method.
Thanks for the code review, @edeandrea! I added your suggested change in ee7b4ce.
I discovered why Quarkus reactive is slow to start. It's because it's looking up the issuer rather than doing it lazily. Please take a look at my comment above for more information.
I discovered why Quarkus reactive is slow to start. It's because it's looking up the issuer rather than doing it lazily. Please take a look at #5 (comment) for more information.
I think I have a way around that. Try adding this to application.properties:
# https://quarkus.io/guides/all-config#quarkus-oidc_quarkus.oidc.jwks.resolve-early
quarkus.oidc.jwks.resolve-early=false
# https://quarkus.io/guides/all-config#quarkus-oidc_quarkus.oidc.discovery-enabled
quarkus.oidc.discovery-enabled=false
# https://quarkus.io/guides/all-config#quarkus-oidc_quarkus.oidc.jwks-path
quarkus.oidc.jwks-path=${quarkus.oidc.auth-server-url}/.well-known/jwks.json
This is what I get now:
╰─ ./start.sh quarkus
date (GNU coreutils) 9.4
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by David MacKenzie.
8.0.0
__ ____ __ _____ ___ __ ____ ______
--/ __ \/ / / / _ | / _ \/ //_/ / / / __/
-/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/
2024-01-31 16:45:01,816 INFO [io.quarkus] (main) quarkus 1.0.0-SNAPSHOT native (powered by Quarkus 3.6.7) started in 0.013s. Listening on: http://0.0.0.0:8080
2024-01-31 16:45:01,816 INFO [io.quarkus] (main) Profile prod activated.
2024-01-31 16:45:01,816 INFO [io.quarkus] (main) Installed features: [cdi, oidc, resteasy-reactive, security, smallrye-context-propagation, vertx]
App is available on port 8080. Duration to start: 30 milliseconds
2024-01-31 16:45:02,098 INFO [io.quarkus] (Shutdown thread) quarkus stopped in 0.001s
I also downgraded to 3.6.7 because 3.7 hasn't yet been released. That version switch may or may not have any effect.
Switching to 3.7.1 also seems to work fine and produces similar results, even though it hasn't been released yet.
Also, using JsonWebToken instead of SecurityIdentity probably doesn't matter from a performance perspective either. If you'd rather stick to a MicroProfile interface vs a Quarkus-specific one, that's probably fine.
@edeandrea @mraible thanks. quarkus.oidc.jwks.resolve-early=false is a fairly specific feature, meant for cases where, for example, authentication to the JWKS endpoint has to be performed at the time the token is available.
Skipping the discovery step is indeed one way to speed up startup.
Using a dynamic tenant resolver can complement it: https://quarkus.io/guides/security-openid-connect-multitenancy#tenant-config-resolver. For the single-tenant bearer case, one can just implement it as:
final OidcTenantConfig config = new OidcTenantConfig();
config.setTenantId("tenant-c");
config.setAuthServerUrl("http://localhost:8180/realms/tenant-c");
return Uni.createFrom().item(config);
The config resolution will then take its time on the first request instead of at startup.
Thanks for the fix, @edeandrea! Personally, I think these settings should be the default and you should have to turn on startup discovery. Regarding the 3.7.1 version, I get that when creating a new project with the Quarkus CLI:
sdk install quarkus
quarkus create app com.okta.rest:quarkus --extension="quarkus-oidc,resteasy-reactive" --gradle
@sberyozkin If I remove the resolve-early property, it starts OK but prints this to the console:
2024-01-31 16:06:48,551 WARN [io.qua.oid.run.OidcRecorder] (vert.x-eventloop-thread-1) OIDC server
is not availa2024-01-31 16:06:48,555 WARN [io.qua.oid.run.OidcRecorder] (vert.x-eventloop-thread-1)
Tenant 'Default': 'OIDC Server is not available'. OIDC server is not available yet, an attempt to connect
will be made during the first request. Access to resources protected by this tenant may fail if OIDC server
will not become available
Since printing this message will likely slow startup, I'll keep using the property.
Update: Start time for Quarkus: (45 + 41 + 42 + 44 + 49) / 5 = 44.2
Personally, I think these settings should be the default and you should have to turn on startup discovery.
I would tend to agree. @sberyozkin is the right person to answer as to why it is the way it is.
Reopening since Helidon is still slow to start, even after #6. (183 + 265 + 177 + 386 + 303) / 5 = 262.8
switching to 3.7.1 also seems to work fine and produce similar results, even if it hasn't yet been released.
For projects that always want to be on the bleeding edge, using
<quarkus.platform.artifact-id>quarkus-bom</quarkus.platform.artifact-id>
gets around this problem
@sberyozkin If I remove the resolve-early property, it starts OK but prints this to the console:...
So the OIDC server may indeed not be available at startup time? That is fine then: Quarkus will attempt to reconnect on the first request, but I guess at the demo level it may be of some concern.
quarkus.oidc.jwks.resolve-early=false was not introduced to deal with startup time optimizations but to address a concrete user requirement related to JWKS endpoint authentication, which is typically unnecessary. I'm happy that it can help with this optimization though. That said, it is a new property and may need some time to settle.
As I said earlier, you can try removing everything from the application properties and have the following bean in the project:
import jakarta.enterprise.context.ApplicationScoped;

import io.quarkus.oidc.OidcRequestContext;
import io.quarkus.oidc.OidcTenantConfig;
import io.quarkus.oidc.TenantConfigResolver;
import io.smallrye.mutiny.Uni;
import io.vertx.ext.web.RoutingContext;

@ApplicationScoped
public class Auth0TenantConfigResolver implements TenantConfigResolver {

    @Override
    public Uni<OidcTenantConfig> resolve(RoutingContext context, OidcRequestContext<OidcTenantConfig> requestContext) {
        final OidcTenantConfig config = new OidcTenantConfig();
        config.setTenantId("auth0");
        config.setAuthServerUrl("https://dev-06bzs1cu.us.auth0.com");
        // Optionally disable the discovery too
        config.setDiscoveryEnabled(false);
        config.setJwksPath("https://dev-06bzs1cu.us.auth0.com/.well-known/jwks.json");
        return Uni.createFrom().item(config);
    }
}
At least please keep this option in mind.
Personally, I think these settings should be the default and you should have to turn on startup discovery.
IMHO it is important to be able to know at the prod startup time if the OIDC server is actually available. We have 2 options, one which Matt is currently using, and the one I've just typed, that can give users all the flexibility. Cheers