fozziethebeat/S-Space

File text is not readable?

Closed this issue · 4 comments

Dear all,
I tested the "TEXT" and "SPARSE_TEXT" output formats, but both produce files that are not human-readable. They look the same as the binary files.
I tried going through the code in the file SemanticSpaceIO, but that did not help.
Could you please tell me how to view the contents of these files?
Thanks a lot.

Hi,

Both formats should produce human-readable files. Could you please give
an example of how you're trying to create these files and what errors you
see if you try to read them?

Thanks,
David

In reply to hbthien's message of Wed, May 13, 2015 at 11:25 AM.

Hi David,
I already tried both text formats, using "--outputFormat=TEXT" and "--outputFormat=SPARSE_TEXT", when running LSAMain.java in Eclipse with these arguments:

-d corpus.txt -F --outputFormat=TEXT exclude=stopwords.txt my-lsa-output-no-stopwords.sspace

The resulting ".sspace" file is not readable.
I don't know why. Please tell me how to fix this.

In addition, the file "corpus.txt" contains punctuation (such as "." and ","), but when I debug the program I can see that it is not removed. I looked for a function that removes it but could not find one.

Your options look correct, so the file should be in the correct format. What
error do you see when you try to read it? Also, what program are you
using to view it?

In reply to hbthien's message of Wed, May 13, 2015 at 1:41 PM.

Hi,
I don't know why the file cannot be read on Ubuntu, but it opens fine on Windows.
Thank you.
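
For anyone who runs into the same symptom, one way to check whether the viewer or the file is at fault is to look at the first few bytes of the output directly. The following is a minimal sketch in plain Java, not part of S-Space: the class name `TextFileProbe` and its helper methods are hypothetical, and it only assumes that a text-format file should consist mostly of printable ASCII. Some editors refuse to display a file when they detect even a few non-text bytes, which may be one reason a file opens in one program and not in another.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of the S-Space code base.
public class TextFileProbe {

    /** Counts bytes that are neither printable ASCII nor tab, LF, or CR. */
    static int countNonText(byte[] data, int length) {
        int suspicious = 0;
        for (int i = 0; i < length; i++) {
            int b = data[i] & 0xFF;
            boolean whitespace = b == '\t' || b == '\n' || b == '\r';
            boolean printable = b >= 0x20 && b <= 0x7E;
            if (!whitespace && !printable) {
                suspicious++;
            }
        }
        return suspicious;
    }

    /** Renders the first {@code length} bytes as space-separated hex pairs. */
    static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TextFileProbe <file>");
            return;
        }
        byte[] buf = new byte[64];
        int n = 0;
        // Read up to 64 bytes; a single read() call may return fewer.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        }
        System.out.println("first " + n + " bytes: " + toHex(buf, n));
        System.out.println("non-text bytes among them: " + countNonText(buf, n));
    }
}
```

Running it as `java TextFileProbe my-lsa-output-no-stopwords.sspace` prints the leading bytes in hex and how many of them fall outside printable ASCII. A small count near the start of the file would point to a few header bytes confusing the viewer, while a large count would suggest the file really was written in a binary format. Note that non-ASCII words in the corpus (for example UTF-8 encoded text) are also counted as non-text by this simple check.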