BoilerPipe for Android
GoogleCodeExporter opened this issue · 9 comments
GoogleCodeExporter commented
I have customized boilerpipe to run on Android. This is the jar (see attachment).
geppo1988@gmail.com
Original issue reported on code.google.com by geppo1...@gmail.com
on 27 Nov 2012 at 8:57
GoogleCodeExporter commented
You still need to add xerces and nekohtml to the Build Path.
geppo1988@gmail.com
Original comment by geppo1...@gmail.com
on 27 Nov 2012 at 9:00
GoogleCodeExporter commented
Hello. I really need to use this, but I still don't know how. I have had no
issues with the regular version, but I don't know how to use the Android
version.
I am getting the error:
java.lang.NoClassDefFoundError: de.l3s.boilerpipe.extractors.ArticleExtractor
I have placed the following into my JRE lib folder. Then I added them to the
project's build path as libraries using "Add external Jar":
boilerpipe-1.2.0-android.jar
xerces-2.9.1.jar
nekohtml-1.9.13.jar
The boilerpipe file is the one you linked. The two dependencies are the ones
that came with the original jar file that I got working in Java. Which
dependencies should I use? Are there Android-specific ones?
Any help would be greatly appreciated, thanks!
Original comment by m...@issist.com
on 13 Feb 2013 at 6:36
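Editorial note: a quick way to narrow down a NoClassDefFoundError like the one above is to check at runtime which of the expected classes can actually be loaded. The sketch below is only a diagnostic under assumptions: `ClasspathCheck` and `isLoadable` are hypothetical names, the two boilerpipe class names are taken from this thread, and the nekohtml and xerces class names are examples that are believed to ship in those jars.

```java
// Diagnostic sketch: reports whether each named class can be located at runtime.
// ClasspathCheck and isLoadable are hypothetical names used for illustration.
final class ClasspathCheck {

    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here as well
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "de.l3s.boilerpipe.extractors.ArticleExtractor", // from this thread
            "de.l3s.boilerpipe.sax.BoilerpipeHTMLParser",    // from this thread
            "org.cyberneko.html.HTMLConfiguration",          // assumed nekohtml class
            "org.apache.xerces.parsers.AbstractSAXParser"    // assumed xerces class
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

If a class is reported as missing inside the app even though the jar is on the build path, the jar is probably not being packaged into the APK, which would be consistent with the errors reported here.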
GoogleCodeExporter commented
I was getting an error during dex execution saying that HTMLElement$Element
is a duplicate. When I used your Android jar, the error went away.
I think you have removed the org.cyberneko.html package from boilerpipe.jar.
Original comment by chandu12...@gmail.com
on 3 Dec 2012 at 11:33
GoogleCodeExporter commented
:) Is everything OK? Is it working fine? Are there any bugs?
Original comment by geppo1...@gmail.com
on 4 Dec 2012 at 8:23
GoogleCodeExporter commented
It works!
You are a life saver. I spent an entire day uselessly poking around with build
paths, installing the boilerpipe source, and alternating between chandu's
problem and java.lang.NoClassDefFoundError. Don't know why I didn't find this
sooner. Thank you.
Original comment by wmarqua...@gmail.com
on 17 Dec 2012 at 12:08
GoogleCodeExporter commented
In fact, I posted it to save others the time of searching for a solution to the
problem. Of course, it was not anything difficult.
Original comment by geppo1...@gmail.com
on 18 Dec 2012 at 11:36
GoogleCodeExporter commented
Oh my gosh, thank you so much for this file. I was about to give up on my app.
Everybody, don't forget to use this in an async task.
Original comment by 96hud...@gmail.com
on 18 Jun 2013 at 3:01
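Editorial note: the async-task advice matters because `getText(url)` has to download the page, and Android does not allow network access on the main thread. A minimal sketch of moving the call to a worker thread follows. It uses a plain `ExecutorService` rather than Android's `AsyncTask` so that it compiles without the Android SDK; `BackgroundExtraction` and `extractAsync` are hypothetical names, and the lambda is only a stand-in for the real boilerpipe call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: BackgroundExtraction and extractAsync are hypothetical names.
final class BackgroundExtraction {

    // Submits the extraction to the worker and returns a Future for its result.
    static Future<String> extractAsync(ExecutorService worker, Callable<String> extraction) {
        return worker.submit(extraction);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for: ArticleExtractor.INSTANCE.getText(new URL("someurlhere"))
            Future<String> result = extractAsync(worker, () -> "extracted text placeholder");
            // Blocking on get() is fine in this console demo only.
            System.out.println(result.get());
        } finally {
            worker.shutdown();
        }
    }
}
```

In a real app the result would be handed back to the UI thread (for example in `AsyncTask.onPostExecute`) instead of blocking on `get()`.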
GoogleCodeExporter commented
Thanks a lot. I found multiple duplicate files in both jars, but they were
tedious to find and remove. Your jar file saved a lot of work.
Original comment by meetjas...@gmail.com
on 15 Sep 2013 at 11:22
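Editorial note: finding the duplicated classes by hand is tedious, but it can be automated. The sketch below lists the `.class` entries that two jars have in common, which is the kind of overlap the dex step rejects. `JarDuplicates`, `classEntries`, and `duplicates` are hypothetical names, and the jar paths are supplied on the command line rather than assumed.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Sketch only: all names in this class are hypothetical.
final class JarDuplicates {

    // Collects the names of all .class entries in the given jar.
    static Set<String> classEntries(String jarPath) throws IOException {
        Set<String> names = new TreeSet<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    // Returns the entry names present in both sets, in sorted order.
    static Set<String> duplicates(Set<String> first, Set<String> second) {
        Set<String> common = new TreeSet<>(first);
        common.retainAll(second);
        return common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarDuplicates <first.jar> <second.jar>");
            return;
        }
        for (String name : duplicates(classEntries(args[0]), classEntries(args[1]))) {
            System.out.println(name);
        }
    }
}
```

Running it against the original boilerpipe jar and the nekohtml jar should print any overlapping entries, such as classes under org/cyberneko/html, if the two jars really do share them.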
GoogleCodeExporter commented
I really need this to work. Has anybody solved the NoClassDefFoundError?
I included nekohtml and xerces as external jars and dropped the boilerpipe jar
right into my libs folder. I run:
URL url;
try {
    url = new URL("someurlhere");
    String text = ArticleExtractor.INSTANCE.getText(url);
} catch (MalformedURLException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (BoilerpipeProcessingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
And get the following trace:
06-19 22:52:24.748: E/AndroidRuntime(29595): FATAL EXCEPTION: main
06-19 22:52:24.748: E/AndroidRuntime(29595): java.lang.NoClassDefFoundError: de.l3s.boilerpipe.sax.BoilerpipeHTMLParser
06-19 22:52:24.748: E/AndroidRuntime(29595): at de.l3s.boilerpipe.sax.BoilerpipeSAXInput.getTextDocument(BoilerpipeSAXInput.java:51)
06-19 22:52:24.748: E/AndroidRuntime(29595): at de.l3s.boilerpipe.extractors.ExtractorBase.getText(ExtractorBase.java:69)
06-19 22:52:24.748: E/AndroidRuntime(29595): at de.l3s.boilerpipe.extractors.ExtractorBase.getText(ExtractorBase.java:87)
Original comment by wesbl...@gmail.com
on 19 Jun 2014 at 10:58