German examples
Is there any site that provides example code for German and explains which trained models to use?
The following site provides a Java example:
http://gromgull.net/blog/2010/01/noun-phrase-chunking-for-the-awful-german-language/
`java -cp $CP opennlp.tools.lang.german.SentenceDetector models/german/sentdetect/sentenceModel.bin.gz |
java -cp $CP opennlp.tools.lang.german.Tokenizer models/german/tokenizer/tokenModel.bin.gz |
java -cp $CP -Xmx100m opennlp.tools.lang.german.PosTagger models/german/postag/posModel.bin.gz |
java -cp $CP opennlp.tools.lang.english.TreebankChunker models/german/chunking/GermanChunk.bin.gz`
How could we do that using OpenNLP.NET?
I'd say download the German model files from the OpenNLP website: http://opennlp.sourceforge.net/models-1.5/
And use them along with the Tokenizer, SentenceDetector and POSTagger in the project.
I've never done that though - and I don't know German - so I can't tell you if it works well.
Keep me posted if you manage to do it.
I ended up using models-1.4 and ran modelConverter to convert the models to .nbin. The models-1.5 files (.bin format) do not work with modelConverter.
How can I build a parse tree for German, and which model files should I use? There are also no models for named entity recognition or coreference. Any suggestions?
Unfortunately, there are no German model files for the parser, NER, or coreference that I know of.
You can train your own models (look at the MaximumEntropyNameFinder.Train method, for instance), but you'll need annotated examples.
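To give an idea of what such examples might look like: the Apache OpenNLP 1.5 name finder documentation describes training data as one whitespace-tokenized sentence per line, with names wrapped in `<START:type> ... <END>` markers. Whether MaximumEntropyNameFinder.Train in OpenNLP.NET expects exactly this markup is an assumption you would need to check against the project. The sketch below does not call any OpenNLP API. It is a small standalone Java reader (the class `NameSampleSketch` and its `parse` method are hypothetical names) that shows how such a line breaks down into tokens and labeled spans, which can help when sanity-checking hand-annotated German data. The sample sentence is made up.

```java
import java.util.ArrayList;
import java.util.List;

public class NameSampleSketch {

    /** A labeled name covering tokens [start, end) of a sentence. */
    public static final class NameSpan {
        public final int start;
        public final int end;
        public final String type;

        NameSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /**
     * Splits one annotated line on whitespace, appends the plain tokens to
     * tokensOut (expected to be empty), and returns the spans marked by
     * <START> or <START:type> and closed by <END>.
     */
    public static List<NameSpan> parse(String line, List<String> tokensOut) {
        List<NameSpan> spans = new ArrayList<>();
        int spanStart = -1;      // token index where the open span began, or -1
        String spanType = null;

        for (String part : line.trim().split("\\s+")) {
            if (part.isEmpty()) {
                continue;        // blank input line
            }
            boolean untyped = part.equals("<START>");
            boolean typed = part.startsWith("<START:") && part.endsWith(">");
            if (untyped || typed) {
                if (spanStart >= 0) {
                    throw new IllegalArgumentException("Nested <START> in: " + line);
                }
                spanStart = tokensOut.size();
                // "default" is this sketch's label for untyped markers.
                spanType = untyped ? "default" : part.substring(7, part.length() - 1);
            } else if (part.equals("<END>")) {
                if (spanStart < 0) {
                    throw new IllegalArgumentException("<END> without <START> in: " + line);
                }
                spans.add(new NameSpan(spanStart, tokensOut.size(), spanType));
                spanStart = -1;
            } else {
                tokensOut.add(part);
            }
        }
        if (spanStart >= 0) {
            throw new IllegalArgumentException("Missing <END> in: " + line);
        }
        return spans;
    }

    public static void main(String[] args) {
        // Made-up German sentence, for illustration only.
        String line = "<START:person> Anna Schmidt <END> besuchte gestern "
                + "<START:location> Hamburg <END> .";
        List<String> tokens = new ArrayList<>();
        for (NameSpan s : parse(line, tokens)) {
            System.out.println(s.type + ": " + String.join(" ", tokens.subList(s.start, s.end)));
        }
    }
}
```

Running a check like this over an annotated file before training catches unbalanced markers early, which is cheaper than debugging a model trained on malformed lines.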