When uploading a document, the text is not saved in the full text field in the documents table
Closed this issue · 38 comments
Currently I also get the notification "Automatic summarization failed." when uploading a document in the master branch. @MBrouns, do you have an idea what is causing this?
Is the fulltext field filled in the database? Is the dbPath with which the jar is called set correctly to match the location on your PC?
No I meant this is the same issue (fulltext doesn't get put into DB)
@MBrouns @yetti4 Full text should be neatly put in the db. It's just not working for some files. I think it might have something to do with validation, but I can't really figure out how exactly. Do you have any ideas as to what could be causing this?
Maybe something to do with UTF/ANSI encoding?
Don't think so. It's no problem getting the file contents and/or writing it. It's just a normal string with any file. Some texts just disappear when trying to save the model
Is there a max size to a text field in sqlite?
Nope. Tested with much larger files that were OK. I think it has to do with characters in the document, like lists etc. I just need to figure out which ones or how to filter them out.
So the fulltext is still not saved sometimes, but for some reason the summary can be generated? I thought this wasn't possible
Shouldn't be possible. My summarizer gets the text to summarize from the database. It can still summarize when another document's fulltext is empty, though.
@MBrouns @yetti4 Problem seems to be solved by using a MySQL db. Fulltext is now saved in a BLOB column. This does however seem to break the summarization tool.
(int) 0 => 'Database connection established',
(int) 1 => 'java.lang.NullPointerException: null'
Makes sense, I use the sqlite connector. I'll make a new jar for the production environment. I can't test it myself though, so you'll have to do that.
Try the new version. Pass the extra command-line argument "mysql" at the end and use dbpath in the following way:
jdbc:mysql://localhost/database?"
+ "user=sqluser&password=sqluserpw
sorry I made a mistake. try the new version again
Not sure I get the cmd right
java -jar C:\websites\crowd-summary\app../summarizers/Summarizer.jar 48 jdbc:mysql://localhost/database?" + "user=root&password=root mysql 2>&1?
(int) 0 => ''password' is not recognized as an internal or external command,',
(int) 1 => 'operable program or batch file.'
Try:
java -jar C:\websites\crowd-summary\app../summarizers/Summarizer.jar 48 "jdbc:mysql://localhost/database?user=root&password=root" mysql
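The earlier failure comes from the unquoted `&`, which the Windows command shell treats as a command separator, so `password=root` was run as a command of its own. Quoting the URL fixes that on the command line. When the jar is launched from a script, another option is to pass the arguments as a list so that no shell parses the URL at all. A minimal Python sketch of that idea follows; the helper names and the jar path are hypothetical, and only the argument order (document id, dbpath, "mysql") is taken from the commands in this thread.

```python
import subprocess

def build_summarizer_command(jar_path, document_id, db_url, db_type):
    """Build the argument list for the summarizer jar (hypothetical helper)."""
    # Each list element is passed as exactly one argument, so the '&' in the
    # JDBC URL is never interpreted as a shell command separator.
    return ["java", "-jar", jar_path, str(document_id), db_url, db_type]

def run_summarizer(command):
    """Run the jar without a shell and return its combined output lines."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as the 2>&1 redirect
        text=True,
        check=False,
    )
    return result.stdout.splitlines()

command = build_summarizer_command(
    "Summarizer.jar",  # assumed path, adjust to your checkout
    48,
    "jdbc:mysql://localhost/database?user=root&password=root",
    "mysql",
)
```

Calling `run_summarizer(command)` would then return the jar's output as a list of lines, similar to the arrays pasted in this thread.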
java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
Ah yeah, forgot to add the jar. Try again.
Getting closer
(int) 1 => 'com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[document_id] FROM sentences WHERE document_id = 52' at line 1'
Try again. And put the changes you made to get it working on MySQL in your gitignore so it keeps working for us.
Thought I put that in my gitignore already. Almost there.
(int) 1 => 'com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'fulltext FROM documents WHERE id = 53' at line 1'
Could you change fulltext to full_text? fulltext is a reserved keyword in MySQL.
In the new version I put backticks around fulltext if it uses mysql so you shouldn't need to change anything in your db
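Both syntax errors above come down to identifier quoting: MySQL does not accept the square brackets around `document_id`, and it needs the reserved word `fulltext` quoted before it can be used as a column name. A rough Python sketch of dialect-aware quoting is shown here; it is not the summarizer's actual code, and the function names are made up for illustration. The `?` placeholder is the JDBC and sqlite3 style.

```python
def quote_identifier(name, dialect):
    """Quote a column or table name for the given SQL dialect."""
    if dialect == "mysql":
        # MySQL quotes identifiers with backticks; double a literal backtick.
        return "`" + name.replace("`", "``") + "`"
    if dialect == "sqlite":
        # SQLite accepts standard double-quoted identifiers.
        return '"' + name.replace('"', '""') + '"'
    raise ValueError("unsupported dialect: " + dialect)

def build_fulltext_query(dialect):
    """Build the SELECT for the reserved-word column, with a bound id."""
    column = quote_identifier("fulltext", dialect)
    table = quote_identifier("documents", dialect)
    return "SELECT " + column + " FROM " + table + " WHERE id = ?"

mysql_query = build_fulltext_query("mysql")
sqlite_query = build_fulltext_query("sqlite")
```

Keeping the quoting in one place means the same query code can serve both the sqlite setup and the MySQL production setup.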
Next error
array(
(int) 0 => 'Database connection established',
(int) 1 => 'Apr 01, 2014 2:25:45 PM edu.stanford.nlp.process.PTBLexer next',
(int) 2 => 'WARNING: Untokenizable: ? (U+FFFD, decimal: 65533)',
(int) 3 => 'noOfLines: 15',
(int) 4 => 'Database insertion complete',
(int) 5 => 'Start generating keywords for document',
(int) 6 => 'java.lang.NullPointerException: null'
)
What happens if you use a longtext column instead of blob in MySQL?
It seems we're finally getting closer to the source of all these troubles. I could only change the column after emptying the table. But at least I now get SQL errors from cake and some signs to go on. I'll see if I can figure out how to sanitize the data. Let me know if you have any ideas.
SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\x93\x09But ...' for column 'fulltext' at row 1
It seems to have to do with encoding of the file after all.
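The byte `\x93` in the error is not valid as the start of a UTF-8 sequence; in Windows-1252 it is a left curly quotation mark. One possible way to sanitize uploads is to try UTF-8 first and fall back to a legacy encoding. The Python sketch below assumes the problem files are Windows-1252, which the thread does not confirm, and the function name is hypothetical.

```python
def to_unicode_text(raw_bytes, fallback_encoding="cp1252"):
    """Decode uploaded bytes as UTF-8, falling back to a legacy encoding."""
    try:
        return raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 uploads are Windows-1252. Bytes that are
        # undefined in that code page become U+FFFD instead of raising.
        return raw_bytes.decode(fallback_encoding, errors="replace")

sample = b"\x93\tBut"              # the byte sequence from the MySQL error
text = to_unicode_text(sample)     # curly quote, tab, then "But"
utf8_bytes = text.encode("utf-8")  # valid UTF-8, ready to store
```

Converting instead of dropping the unknown bytes keeps characters such as curly quotes in the stored text.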
By the way, the summarizer now gives the following error:
(int) 0 => 'Database connection established',
(int) 1 => 'noOfLines: 15',
(int) 2 => 'Database insertion complete',
(int) 3 => 'Start generating keywords for document',
(int) 4 => '0.003602441685296153 => inform',
(int) 5 => '0.003533163960578919 => retriev',
(int) 6 => '0.0027018312639721146 => document',
(int) 7 => '0.00187049856736531 => precis',
(int) 8 => '0.0011084435954757392 => system',
(int) 9 => '9.99340523088544E-4 => retrieval"',
(int) 10 => 'Keywords stored in database',
(int) 11 => 'Database connection in classifier established',
(int) 12 => 'training data created',
(int) 13 => 'Iter 1 []<> 0.000E0 (delta: 0.000E0)',
(int) 14 => 'java.lang.ArrayIndexOutOfBoundsException: 1'
@MBrouns I think I got conversion of documents figured out now. The text is correctly added to the db, and also to elasticsearch. I am however still getting the above error (for the weird documents, so it probably still has to do with that).
Maybe you can investigate what's causing this error, because I can't find any anomalies in the database text anymore. It's currently UTF-8. Anything that can't be converted is just ignored.
Different file (that worked before). Seems to be caused by the fact there were no proper sentences in there. Inputting a "." seemed to solve the problem (and recreate the previous one)
array(
(int) 0 => 'Database connection established',
(int) 1 => 'noOfLines: 1',
(int) 2 => 'Database insertion complete',
(int) 3 => 'Start generating keywords for document',
(int) 4 => '0.14527779480227473 => ametlorem',
(int) 5 => 'Keywords stored in database',
(int) 6 => 'Database connection in classifier established',
(int) 7 => 'training data created',
(int) 8 => 'Iter 1 []<> 0.000E0 (delta: 0.000E0)',
(int) 9 => 'java.lang.ArithmeticException: / by zero',
(int) 10 => ' at main.ClassifierSentence.(ClassifierSentence.java:65)',
(int) 11 => ' at main.Summarizer.main(Summarizer.java:221)',
(int) 12 => ' at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)',
(int) 13 => ' at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)',
(int) 14 => ' at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)',
(int) 15 => ' at java.lang.reflect.Method.invoke(Unknown Source)',
(int) 16 => ' at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)',
(int) 17 => 'java.lang.ArrayIndexOutOfBoundsException: 1'
Wait, I might know what the problem is. Are there already personal summaries (entries in users_sentences where user_id != 0)?
No. I emptied the db
Then the classifier has no data to train on and will probably fail. Can you add a document manually or reimport a document from the old db?
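A guard of this kind could make the failure explicit instead of surfacing as a division by zero or an out-of-bounds index. The Python sketch below only illustrates the check against an in-memory SQLite table; it is not the summarizer's code, the function names are hypothetical, and the `sentence_id` column is an assumed schema detail. Only the table name and the `user_id != 0` condition come from this thread.

```python
import sqlite3

def count_personal_summary_rows(connection):
    """Count users_sentences rows that belong to a real user (user_id != 0)."""
    cursor = connection.execute(
        "SELECT COUNT(*) FROM users_sentences WHERE user_id != 0"
    )
    return cursor.fetchone()[0]

def can_train_classifier(connection):
    """Return True only when at least one personal summary sentence exists."""
    return count_personal_summary_rows(connection) > 0

# In-memory demo database; the sentence_id column is an assumed schema detail.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE users_sentences (user_id INTEGER, sentence_id INTEGER)")
empty_ok = can_train_classifier(demo)    # no rows yet, training should be skipped
demo.execute("INSERT INTO users_sentences VALUES (1, 16510)")
filled_ok = can_train_classifier(demo)   # one personal summary sentence exists
```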
Just the users_sentences table?
The algorithm needs a document that is split in sentences and some of these
sentences should be selected in personal summaries (users_sentences)
If you want and if you have phpmyadmin available publicly I could set it up
as well
Haven't got it publicly available. IP-based vhosts need multiple instances of Apache. Anyway, it seems you were right. Next error:
array(
(int) 0 => 'Database connection established',
(int) 1 => 'noOfLines: 15',
(int) 2 => 'Database insertion complete',
(int) 3 => 'Start generating keywords for document',
(int) 4 => '0.010981828582962786 => retriev',
(int) 5 => '0.005813909249803828 => precis',
(int) 6 => '0.002583959666579479 => vector',
(int) 7 => '0.0019379697499346095 => object',
(int) 8 => '0.0017226397777196528 => query.',
(int) 9 => '0.0015073098055046962 => result',
(int) 10 => 'Keywords stored in database',
(int) 11 => 'Database connection in classifier established',
(int) 12 => 'Added relevant sentence to training: ClassifierSentence [sentenceID=16510, content=It's also Google's approach to the endeavor--its willingness to let third-party developers deeper into the stack and, potentially, to let users define the experience for themselves--that could help make it a hit., length=32, posInDocument=13, keywordSimilarity=0.0hasNote=false]',
(int) 13 => 'Added relevant sentence to training: ClassifierSentence [sentenceID=16529, content=',
(int) 14 => '',
(int) 15 => ' This is where a lightweight user interface is key, and it seems like Google's got a promising foundation, mixing concise, swipe-able cards with optional voice commands., length=27, posInDocument=49, keywordSimilarity=0.0hasNote=false]',
(int) 16 => 'Added relevant sentence to training: ClassifierSentence [sentenceID=16534, content=These could include urgent notifications, like text messages, that buzz your wrist when they come in, or morsels of data that get silently added to your stack, like scores of sports games., length=32, posInDocument=58, keywordSimilarity=0.0hasNote=false]',
(int) 17 => 'training data created',
(int) 18 => 'Iter 1 []<> 0.000E0 (delta: 0.000E0)',
(int) 19 => 'java.lang.ArrayIndexOutOfBoundsException: 1'
That still seems to go wrong while generating the classifier. Maybe more sentences in users_sentences are needed?
BOOM! Finally closed this one. Summarization seems to be working and indexing as well. We can start adding training data now.