stanfordnlp/CoreNLP

Upgrade Apache Lucene to resolve vulnerability for consumers

ciscoo opened this issue · 8 comments

Currently, this project uses version 7.5.0 of Apache Lucene: https://github.com/stanfordnlp/CoreNLP/blob/main/pom.xml#L77

As a result, the following vulnerability is introduced into projects:

We use Sonatype IQ Server (NexusIQ) to scan for vulnerabilities in our dependencies, and that is how this was flagged.

As a workaround, we upgrade the dependencies:

# Version catalog (TOML)
[versions]
lucene = "9.8.0"

// Build script (Gradle Kotlin DSL)
configurations.configureEach {
    resolutionStrategy {
        dependencySubstitution {
            substitute(module("org.apache.lucene:lucene-analyzers-common"))
                    .using(module("org.apache.lucene:lucene-analysis-common:${libs.versions.lucene.get()}"))
                    .because("Module was renamed in 9.x release")
        }
        eachDependency {
            if (requested.group == "org.apache.lucene") {
                useVersion(libs.versions.lucene.get())
                because("""
                Resolves IQ issue.
                There does not exist a BOM either https://github.com/apache/lucene/issues/11422, so bump all
                lucene dependencies to keep them in sync rather than the single one.
            """.trimIndent())
            }
        }
    }
}

But as you can see, this adds quite a bit of ceremony.

It would be better if CoreNLP could upgrade Apache Lucene so that the above is not needed.

9.8.0 was the minimum non-vulnerable version. I help maintain some projects internally and I'm not familiar enough with the project, so I opted for the minimum version allowed by our internal Sonatype IQ Server.

Does that publish a snapshot somewhere such as Maven Central? If so, I can try it out Thursday. Otherwise I'd need to wait until a release is made.

Actually, I'm not sure I can update all the way to 9.9.1 w/o breaking Java 1.8 compatibility. Let me check which versions would actually work with Java 1.8, then hopefully there's one which has the necessary patch in it.

Honestly I think we're screwed here. The earliest version of Lucene which has this fix is 9.8.0, and it also targets Java 11. I'll bring it up with my PI in terms of possibly switching to Java 11 in the future.
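
For consumers who apply the Gradle substitution above in the meantime, the Java 11 requirement also applies to their own build. A minimal sketch, assuming a Gradle build (6.7 or later) with the `java` plugin applied; the toolchain block is not part of the original workaround and is only one way to pin the JDK:

```kotlin
// build.gradle.kts (assumed file name; requires the `java` plugin)
java {
    toolchain {
        // Compile, test, and run with a Java 11 JDK, since Lucene 9.x
        // no longer supports Java 8.
        languageVersion.set(JavaLanguageVersion.of(11))
    }
}
```

With this in place, the build uses a Java 11 JDK instead of whatever JDK launches Gradle. It does not help projects that must stay on Java 1.8.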

Right, most of the Java ecosystem is moving towards targeting more modern versions of Java.