internetarchive/heritrix3

cannot resolve these dependencies

Closed this issue · 10 comments

org.archive.heritrix:heritrix-commons:3.4.0-20210923
org.archive.heritrix:heritrix-modules:3.4.0-20210923
org.archive.heritrix:heritrix-engine:3.4.0-20210923

ato commented

I assume that's a screenshot from some IDE but some more context would be helpful. :-)
What are you trying to do? What software are you using?

That looks as if the extra repositories that Heritrix requires are not being searched. com.sleepycat:je is in https://download.oracle.com/maven and the other three are in http://builds.archive.org/maven2/ (which recent versions of Maven refuse to access by default without a workaround).

I'm using the IntelliJ IDEA IDE. It's a Spring Boot program that uses Heritrix 3.

The errors occurred when I tried to download these dependencies. How can I resolve them? Thanks!

ato commented

If you're including it as a dependency in a Maven project maybe try adding these repositories to your pom.xml file, although I would normally expect them to be included automatically. You may also need Andy's ~/.m2/settings.xml workaround.

    <repositories>
        <repository>
            <id>builds.archive.org,maven2</id>
            <url>http://builds.archive.org/maven2</url>
        </repository>
        <repository>
            <id>oracleReleases</id>
            <name>Oracle Released Java Packages</name>
            <url>https://download.oracle.com/maven</url>
        </repository>
    </repositories>
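
For reference, a widely used form of the ~/.m2/settings.xml workaround for Maven 3.8.1 and later looks roughly like the sketch below. This is an assumption about which workaround is meant, not a copy of it. It redefines Maven's built-in `maven-default-http-blocker` mirror so that it only matches a placeholder repository, which stops Maven from blocking http:// repositories such as builds.archive.org. The `dummy` repository name and the mirror's display name are illustrative.

    <settings>
        <mirrors>
            <!-- Same id as Maven's built-in blocker, so this entry takes its place. -->
            <mirror>
                <id>maven-default-http-blocker</id>
                <!-- "dummy" is a placeholder (assumption): it matches no real repository. -->
                <mirrorOf>dummy</mirrorOf>
                <name>Override of the default HTTP-blocking mirror</name>
                <url>http://0.0.0.0/</url>
            </mirror>
        </mirrors>
    </settings>

Note that this lifts the HTTP block for every repository, not just builds.archive.org, so an https:// repository URL is preferable wherever one is available.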

If you're using Gradle, I can't help as I've never used it.

However, I personally don't recommend embedding Heritrix inside another Java application, as it has a lot of dependencies that may cause conflicts and it also does some surprising things like globally setting the JVM's timezone to UTC. I recommend controlling it via the REST API if possible.

Thanks. When the crawl data comes back, I want to process it for other business logic. If I use the REST API, I would have to build another data table to translate the data. It would be complicated.