Efficient webjars version resolution via `webjars-locator-lite`
dsyer opened this issue ยท 29 comments
Spring (MVC and Webflux) has a ResourceResolver
abstraction that can be used to resolve the versions in webjars, avoiding the need to maintain the version explicitly in 2 or more places (build file and HTML source). E.g. (from Petclinic):
<script src="/webjars/jquery/jquery.min.js"></script>
Resolves to classpath:/META-INF/resources/webjars/jquery/<version>/jquery.min.js
at runtime.
Spring Boot carries the responsibility of configuring the resource resolver, and currently it uses the webjars-locator-core
(https://github.com/webjars/webjars-locator-core) library to do that, so version resolution only works if that library is on the classpath. The WebJarsAssetLocator
from that library has a very broad and powerful API for locating files inside webjars, but there are some issues, namely:
- It is fairly inefficient, since it scans the whole
/META-INF/resources/webjars
classpath on startup (in a constructor!). - It has 2 awkward dependencies (github classpath scanner and jackson)
- It doesn't work in a native image (spring-attic/spring-native#157) because of the classpath scanning
But we don't need webjars-locator-core
to just do version resolution, which is all Spring Boot offers, because webjars have a very well-defined structure. They all have a pom.properties
with the version in it, and they only use a handful of well-known group ids, so they are easy to locate. It might be a good idea to implement it in Framework, since it is so straightforward and only depends on reading resources from the classpath.
All of the issues above could be addressed just by providing a simpler version resolver natively (and configuring the resource config in a native image with a hint).
Here's an implementation (with no caching or any optimizations):
public class WebJarsVersionResourceResolver extends AbstractResourceResolver {
private static final String PROPERTIES_ROOT = "META-INF/maven/";
private static final String NPM = "org.webjars.npm/";
private static final String PLAIN = "org.webjars/";
private static final String POM_PROPERTIES = "/pom.properties";
@Override
protected Resource resolveResourceInternal(@Nullable HttpServletRequest request, String requestPath,
List<? extends Resource> locations, ResourceResolverChain chain) {
Resource resolved = chain.resolveResource(request, requestPath, locations);
if (resolved == null) {
String webJarResourcePath = findWebJarResourcePath(requestPath);
if (webJarResourcePath != null) {
return chain.resolveResource(request, webJarResourcePath, locations);
}
}
return resolved;
}
@Override
protected String resolveUrlPathInternal(String resourceUrlPath,
List<? extends Resource> locations, ResourceResolverChain chain) {
String path = chain.resolveUrlPath(resourceUrlPath, locations);
if (path == null) {
String webJarResourcePath = findWebJarResourcePath(resourceUrlPath);
if (webJarResourcePath != null) {
return chain.resolveUrlPath(webJarResourcePath, locations);
}
}
return path;
}
@Nullable
protected String findWebJarResourcePath(String path) {
String webjar = webjar(path);
if (webjar.length() > 0) {
String version = version(webjar);
// A possible refinement here would be to check if the version is already in the path
if (version != null) {
String partialPath = path(webjar, version, path);
if (partialPath != null) {
String webJarPath = webjar + File.separator + version + File.separator + partialPath;
return webJarPath;
}
}
}
return null;
}
private String webjar(String path) {
int startOffset = (path.startsWith("/") ? 1 : 0);
int endOffset = path.indexOf('/', 1);
String webjar = endOffset != -1 ? path.substring(startOffset, endOffset) : path;
return webjar;
}
private String version(String webjar) {
Resource resource = new ClassPathResource(PROPERTIES_ROOT + NPM + webjar + POM_PROPERTIES);
if (!resource.isReadable()) {
resource = new ClassPathResource(PROPERTIES_ROOT + PLAIN + webjar + POM_PROPERTIES);
}
// Webjars also uses org.webjars.bower as a group id, so we could add that as a fallback (but not so many people use those)
if (resource.isReadable()) {
Properties properties;
try {
properties = PropertiesLoaderUtils.loadProperties(resource);
return properties.getProperty("version");
} catch (IOException e) {
}
}
return null;
}
private String path(String webjar, String version, String path) {
if (path.startsWith(webjar)) {
path = path.substring(webjar.length()+1);
}
return path;
}
}
and here's how to install it in a Spring Boot application:
@Bean
public WebMvcConfigurer configurer() {
return new WebMvcConfigurer() {
@Override
public void addResourceHandlers(ResourceHandlerRegistry registry) {
registry.addResourceHandler("/webjars/**").addResourceLocations("classpath:META-INF/resources/webjars/").resourceChain(true).addResolver(new WebJarsVersionResourceResolver());
}
};
}
(See it work in the Petclinic here: https://github.com/dsyer/spring-petclinic/blob/webjars/src/main/java/org/springframework/samples/petclinic/system/WebJarsVersionResourceResolver.java.)
I gave this solution a spin in one of my projects and so far the impressions are good.
Is it OK if I refine it a bit and provide a PR to replace the existing (WebJars Locator based) WebJarsResourceResolver
implementations?
@vpavic I don't think we should implement the resolution mechanism in Spring Framework directly, as this is webjars-locator's purpose in the first place. As far as I know there is no concrete specification here to follow, so we might derive from current or future behavior of the official library.
GraalVM native becoming an important topic in the Java community, I think we should instead think about proper native support in webjars. Frameworks could instead drive webjars-locator at build time; the library could resolve all available resources and dump them in an index and possibly GraalVM resource configuration, since those resources will be needed at runtime. At runtime, Frameworks could then query the locator library for resources and it could use its own index (no scanning involved!) to resolve resources.
I'm seeing that other frameworks are working on custom-made solutions to tackle this problem and I think it would be nice to consider that as a community.
I'm closing this issue as a result since this is not the path we're choosing, but we can keep using this to discuss possible plans. Thanks!
I'm a little bit disappointed as I don't think it's true that "this is webjars-locator's purpose in the first place" - that library has intentionally given itself a much larger surface area than we need for simply locating a version for a webjar, and it's a huge waste of resources to use webjars-locator if that's all you need (which I believe is 99-100% of Spring users).
The tagline of "webjars-locator-core" is "locate assets within WebJars" and that's exactly how we're using it. If the approach described here is much more efficient and compatible with native apps, this should be considered as the default in the library itself. I don't think that re-implementing this feature in Spring directly is doing much good to the webjars community.
I agree with @dsyer.
The implementation outlined here does things Spring way and is therefore simpler than anything 3rd party could be, with added benefits of not requiring any 3rd party dependencies (or its own transitive dependencies), and not having to manage dependency at Spring Boot level.
@bclozel you might want to also take a look at spring-projects/spring-boot#31560 (which is how I got here in the first place) - over there even @jamesward expressed preference for Spring Framework itself not having to rely on webjars-locator-core
.
The implementation outlined here does things Spring way and is therefore simpler than anything 3rd party could be, with added benefits of not requiring any 3rd party dependencies (or its own transitive dependencies), and not having to manage dependency at Spring Boot level.
Spring Framework is mostly about integration with 3rd party libraries. We usually roll our own implementations for well-known specs or when there's no support in the Java community. This is not the case here.
@bclozel you might want to also take a look at spring-projects/spring-boot#31560 (which is how I got here in the first place)
Thanks @vpavic I was very much aware of the Spring Boot issue. This decision is backed by both teams.
over there even @jamesward expressed preference for Spring Framework itself not having to rely on webjars-locator-core.
In the official Webjars documentation I'm seeing that many projects don't support version agnostic resolution. This is the feature shipped by the webjars-locator-core library. If this feature is not considered useful after all, we could use the existing infrastructure entirely and drop any webjar-specific implementation. Would this solve the problem?
The main purpose of webjars-locator-core
is to translate paths like /webjars/jquery/jquery.js
to the file in the classpath like jar:file:///foo/blah/jquery.jar!/META-INF/resources/webjars/org.webjars/jquery/1.9.0/jquery.js
There is some logic in that transform which could be pulled out into a new library (webjars-locator-core-common
for lack of a more terrible name idea) or could have a formal spec. For Spring, the classpath reading part should really be done in Spring since it has its own way to do this and doing it differently for just WebJars has a performance cost and is not compatible with Spring Native. Let me know if there is anything I can help with on this.
The main purpose of webjars-locator-core is to translate paths like /webjars/jquery/jquery.js to the file in the classpath like jar:file:///foo/blah/jquery.jar!/META-INF/resources/webjars/org.webjars/jquery/1.9.0/jquery.js
As far as I understand, this is the only feature we're using in Spring.
There is some logic in that transform which could be pulled out into a new library (webjars-locator-core-common for lack of a more terrible name idea) or could have a formal spec.
But how this library would know about which version string to use, doesn't that require looking into the classpath?
For Spring, the classpath reading part should really be done in Spring since it has its own way to do this and doing it differently for just WebJars has a performance cost and is not compatible with Spring Native.
Finding out about the version string for each JAR is really the important part and this doesn't depend on Spring. As for the performance cost, there isn't any if the scanning is performed at build time. During the AOT phase, Spring could trigger the scanning and the resolution of the "webjar name"/"version string" pairs required for runtime resolution.
Let me know if there is anything I can help with on this.
With the mechanism I've just described, not only GraaVM native support could be achieved for all consumers, but this could also be used for the vanilla JVM case and bring significant improvements as scanning would not be required during startup anymore. Should I open an issue against the webjars library to discuss this?
doesn't that require looking into the classpath?
Yes, but to get the version you only need to read a classpath resource and it's always in the same place - there's no need for scanning, which is where the extra baggage comes in webjars. Scanning isn't really required at all (as shown by the code snippet I provided originally), so it's a distraction. If @jamesward is open to make the scanning features optional in webjars (either by extracting another jar, or by making the existing dependencies optional), I can definitely help with that. Spring users just want those version-free resource paths.
One of the core goals of WebJars is making the assets easily cachable which is why the artifacts include the versions in the paths. The usage of webjars-locator-core
actually makes things not easily cachable (ETAG, far-future expires, etc). So really, if we are going to change things, I'd rather try to come up with something that fits with the easily cachable goal. With Play Framework we built a number of things that make it easy for users to use WebJars (versionless) but then resolve to the easily cachable artifacts. As an example, in Play users use a layer on top of webjars-locator-core
in their server-side rendering templates so they can do this:
@webJarsUtil.locate("bootstrap.min.css").css()
And that finds the resource and renders something like:
<link rel="stylesheet" href="/webjars/jquery/1.10.0/bootstrap.min.css">
In Play there is a whole JS / CSS pipeline that WebJars fits into as well and I'm not sure if there is something similar in Spring.
All that to say... Ultimately I'd rather help users move away from serving versionless paths to WebJar artifacts but I'm not sure how feasible that is in Spring.
This is exactly what we've been doing since Spring Framework 4.1 - see the reference documentation on static content. We also support appending a content hash to the path for immutable resources with CDNs.
The code snippet I pointed out above uses webjars-locator-core to resolve the correct version and rewrite links (resolutions are of course cached). As far as I understand, Play's WebJarsUtil
is also using WebJarAssetLocator
from webjar-locator-core. So we're essentially doing the same thing.
I think @dsyer 's point still stands.
Just to make sure everyone's on the same page in terms of the current state of WebJarsResourceResolver
vs what's being proposed here, I've pushed a branch with the WIP changes that I did last week.
@dsyer said "It is fairly inefficient", which is very polite of him but I want to stretch on that a bit:
I was profiling a test-suite the other day whose allocations flame-graphs have ~63%
of the frames only matched by classgraph scanning that is entirely caused by resolving/locating webjars.
Now of course this doesn't translate to CPU 1:1 where it's only ~10%, but notice how much is spent in G1 garbage collection on top of that (unsurprisingly).
A test-suite is obviously not a production environment where this isn't as noticeable. But tests usually start several contexts and the general startup routine is executed more often usually. So working on improving that inside Spring directly might be a tremendous boost in developer productivity for certain projects, because it will directly impact test suites, startups etc...
@dreis2211 If you have time and are willing to, maybe you could do the same profiling session against the WebJarsResourceResolver
implementation I've prepared in vpavic/spring-framework/tree/gh-27619? Just for sake of showing the impact of the proposed changes.
I still can't understand downside of having this natively in Spring as:
- it's just some 20 lines of code to maintain
- it removes the need for 3rd party dependency (and its own transitive dependencies)
- it's much more efficient
- it only uses Spring's own facilities to interact with class path resources (nothing more is really needed, because as @dsyer points out, WebJars have a well-defined structure)
@vpavic I've profiled against a slightly trimmed down version of @dsyer's proposal which looks very similar to yours. (But e.g. we don't need the NPM lookup in our projects). The impact on the flamegraphs is pretty obvious. No block of classpath scanning anymore (no purple block anymore on the left)
Unfortunately, the allocation profiles are almost 700MB large and have trouble to be rendered at the moment, but I assume the impact is of course also noticeable there. (I will add that once I have proof - might be next week though).
EDIT: Here we go. As expected no classgraph
frames anymore (no purple blocks).
The impact on GC didn't turn out to be that huge as you can tell from the CPU profile. Which is somewhat explainable by the fact that our environment is fairly memory restricted and probably busy anyway with GC. Even with the included improvement.
Now to the actual timing improvements: hard to say. My local tests (on a Macbook) showed 10-15% improvement in terms of times, tests on our CI environment (Linux Jenkins) only 5-10% (and unfortunately more to the 5% side). It also hardly depends on the test suite of course. E.g. how many test application contexts are started (where the classpath scanning is done) etc.. So some projects might benefit, others won't. For example and fairness reasons, I have some internal projects that are highly optimized to not start too many tests contexts, that will barely benefit, I suppose.
TL;DR: Actual impact on timing could be better - or at least I hoped for more - but it's definitely noticeable.
Optimizing the runtime approach is great but I'd still really like to see a compile time approach as well. For run time - @vpavic if that doesn't make sense in the spring project we could create a webjars-locator-spring project for it. For compile time, could we rely on the spring gradle & maven plugins to provide what we need (webjars on classpath to a mapping file)?
@dreis2211 Thanks for those useful feedback.
I strongly care about this use case, and I understand it can be tempting to solve it on Spring side as a workaround but I agree with @bclozel, we should not implement this logic in Spring Framework especially without any spec, this should be provided by a WebJars library integrated by frameworks like Spring, as that would also benefits to others.
@jamesward I am not aware of all the details of WebJars, but based on @dsyer and @vpavic experiments like https://github.com/vpavic/spring-framework/tree/gh-27619 I am not sure there is a need for a build-time approach. I think the proposed approach which consists in resolving at runtime the version from the embedded pom.properties
is efficient, does not require any scanning, and it also works on native (with the related resource hints). IMO the best outcome would be to have this approach contributed in a WebJars library, ideally directly in webjars-locator-core
, and we would update WebJarsResourceResolver
to leverage it. If it is not possible in WebJarAssetLocator
, could be in another class. If you are open to that, maybe you could provide some guidance in order to allow @dsyer, @vpavic or somebody else who would be interested to contribute it on https://github.com/webjars side?
In the meantime, since we need a short term solution, I have created #29322 to recommend using for the time being WebJars versioned URL + resourceLocations configuration without WebJarsResourceResolver
. Happy to refine it to recommend again using WebJarsResourceResolver
when there will be a way to use it without the scanning overhead and the impact on native image compatibility.
I'd happily contribute this wherever is deemed as best fit, naturally with some guidance if it's outside of Spring.
However, if lack of formal spec is what prevents this from landing in Spring, I really wonder if that route is preferred. Because as @jamesward mentioned in #27619 (comment), class path reading feels like something that should be done in Spring having in mind it has its own facilities to do so.
Looks pretty straightforward to implement without Spring using getResourceAsStream()
+ Properties
unless I miss something.
I also don't understand the reluctance to do something simple in Spring, but @sdeleuze is correct - there's nothing Spring-specific about the implementation, and you can do it simply using only JDK library APIs. I opened an issue over in Webjars in case we can agree on something sensible over there (webjars/webjars-locator-core#96). @jamesward will probably have noticed already.
Thanks all for the progress on this. I had assumed at some point Spring was doing some classpath scanning already that WebJars could potentially piggyback on to avoid double scanning. Has that thread of thought reached the end-of-the-line?
I think it makes sense to move the discussion about improving the runtime locator support to: webjars/webjars-locator-core#96
For build time locator support, a question... would it make any sense to explore doing this as part of Spring's existing Gradle & Maven plugins or should we really move that discussion out to somewhere Spring-less as well?
Thanks all for the progress on this. I had assumed at some point Spring was doing some classpath scanning already that WebJars could potentially piggyback on to avoid double scanning. Has that thread of thought reached the end-of-the-line?
Hey James. Typical Spring Boot application only performs scanning of the user package to discover classes so nothing reusable for WebJars use case.
I think it makes sense to move the discussion about improving the runtime locator support to: webjars/webjars-locator-core#96
I agree.
For build time locator support, a question... would it make any sense to explore doing this as part of Spring's existing Gradle & Maven plugins or should we really move that discussion out to somewhere Spring-less as well?
I don't think a build time approach makes sense here because scanning seems not needed for Spring (and probably most other frameworks) use case, so the extra complexity of a such approach would be hard to justify. I think exploring scanning-less at runtime implementation as part of webjars/webjars-locator-core#96 via Maven property metadata is really the best way to move forward.
As discussed in webjars/webjars-locator-lite#1, I have reopening this issue with the goal to provide an efficient and native-friendly support for WebJars in Spring Framework 6.2 leveraging webjars-locator-lite
dependency.
I've put together a WebJarsResourceResolver
variant based on webjars-locator-lite
in one of my projects shortly after the initial release of lite variant was published, and wanted to open a PR against Spring Framework with that change - should I proceed with that @sdeleuze?
Thanks @vpavic for the offer but I have already something locally so I should be good. But I will welcome early feedback on snapshots when pushed.
Waiting for feedback on webjars/webjars-locator-lite#1 (comment) before merging the feature.
Merged, feedback welcomed. Please use org.webjars:webjars-locator-lite:0.0.3
dependency with Spring Framework 6.2.0-SNAPSHOT
.
Until Spring Boot has updated ConditionalOnEnabledResourceChain
, you may have to add org.webjars:webjars-locator-core:0.58
to the classpath as well (webjars-locator-lite
is used when both webjars-locator-lite
and webjars-locator-core
are found in the classpath).
Sorry for the late feedback, but I only now realized I didn't post anything here.
It feels to me that support for webjars-locator-lite
could've been done in a simpler way, without rolling out new resolver implementations and deprecating existing ones. Since Spring's use of WebJars locator (either old one or new lite variant) is effectively a bi-function I see no reason for existing WebJarsResourceResolver
to not internally treat it as such and deprecating only a constructor that takes WebJarAssetLocator
.
In this case the default constructor does change the behavior since it will work with lite locator, but I'd argue this is in practice isn't an issue since ResourceChainRegistration
does classpath check before choosing what to use and I don't really expect anyone using the default constructor directly (but rather the one that accepts a custom locator instace).
I'll open a draft PR soon to clarify what I mean with some code (update: see #33495).