A question: there is a part I don't understand
Thoye opened this issue · 6 comments
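Why are the files word2vec.txt, bags_train.txt, and bags_test.txt that extract.cpp refers to not in the dataset? Also, if I want to switch to a different dataset, do I have to convert it into the format of train.txt? The code does not seem to include a step that produces the train.txt format. Thanks for your help!
First unzip the NYT_data.zip file: unzip NYT_data/NYT_data.zip
Yes, I did unzip it. The errors below appear when I run bash preprocess.sh myself. Could you take a look? Thanks!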
Init Begin.
wordTotal= 114042
Word dimension= 50
preprocess.sh: line 2: 12694 Segmentation fault (core dumped) ./extract
bags_train.txt
Traceback (most recent call last):
File "data2pkl.py", line 83, in <module>
data2pickle('bags_train.txt','train_temp.pkl',1)
File "data2pkl.py", line 75, in data2pickle
data = readData(input, mode)
File "data2pkl.py", line 20, in readData
f = codecs.open(filename, 'r')
File "/home/mrc/anaconda3/envs/env3/lib/python3.6/codecs.py", line 895, in open
file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: 'bags_train.txt'
load test and train raw data...
Traceback (most recent call last):
File "pickledata.py", line 236, in <module>
testData = pickle.load(open('test_temp.pkl', 'rb'), encoding='utf-8')
FileNotFoundError: [Errno 2] No such file or directory: 'test_temp.pkl'
rm: cannot remove 'temp': No such file or directory
Line 267 of extract.cpp has fout.open(("word2id.txt"),ios::out); but there is no word2id.txt file either.
- Make sure the file structure looks like this:
Intra-Bag-and-Inter-Bag-Attentions
|-- figure
|   |-- CNNmethods.pdf
|   |-- PCNNmethods.pdf
|-- model
|   |-- embedding.py
|   |-- model_bagatt.py
|   |-- pcnn.py
|-- NYT_data
|   |-- relation2id.txt
|   |-- test.txt
|   |-- train.txt
|   |-- vec.bin
|-- preprocess
|   |-- data2pkl.py
|   |-- extract.cpp
|   |-- pickledata.py
|   |-- preprocess.sh
|-- plot.py
|-- README.md
|-- train.py
- Make sure the script is run from inside the preprocess folder:
bash preprocess.sh
cd preprocess; bash preprocess.sh; cd ..
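The layout check above can be partly automated. Below is a minimal sketch in Python that reports which of the files from the tree are missing. The list of paths is copied from the tree in this thread; whether extract.cpp reads exactly these files is an assumption, and `missing_inputs` and `EXPECTED_FILES` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Paths copied from the directory tree in this thread. That the
# preprocessing step needs exactly these files is an assumption.
EXPECTED_FILES = [
    "NYT_data/relation2id.txt",
    "NYT_data/test.txt",
    "NYT_data/train.txt",
    "NYT_data/vec.bin",
    "preprocess/extract.cpp",
    "preprocess/data2pkl.py",
    "preprocess/pickledata.py",
    "preprocess/preprocess.sh",
]


def missing_inputs(repo_root):
    """Return the expected files that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Hypothetical usage, from the repository root:
#   for rel in missing_inputs("."):
#       print("missing:", rel)
```

An empty result only says the files are in place. It does not rule out other causes of the segmentation fault.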
You can build and run extract.cpp on its own to generate the bags_train.txt and bags_test.txt files, then run data2pkl.py and pickledata.py manually.
How do I run it on its own? What do I use?
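One possible way to run the steps one at a time is sketched below in Python. It stops at the first step that fails, so a crash in ./extract is reported directly instead of surfacing later as a FileNotFoundError for bags_train.txt, as in the log above. The g++ command line is an assumption (the contents of preprocess.sh are not shown in this thread), and `build_steps` and `run_steps` are hypothetical helper names.

```python
import subprocess
import sys


def build_steps(python=sys.executable):
    """Commands for a manual run, in order. The compile flags are assumed."""
    return [
        ["g++", "extract.cpp", "-o", "extract"],  # compile (assumed command)
        ["./extract"],              # expected to write bags_train.txt / bags_test.txt
        [python, "data2pkl.py"],    # reads bags_*.txt, writes *_temp.pkl
        [python, "pickledata.py"],  # reads the *_temp.pkl files
    ]


def run_steps(steps, cwd="."):
    """Run each command in cwd and stop at the first failure.

    Returns None if every step succeeds, otherwise (command, returncode).
    On POSIX a negative return code means the process was killed by a
    signal, for example -11 for a segmentation fault.
    """
    for cmd in steps:
        result = subprocess.run(cmd, cwd=cwd)
        if result.returncode != 0:
            return cmd, result.returncode
    return None


# Hypothetical usage, from the repository root:
#   failure = run_steps(build_steps(), cwd="preprocess")
#   if failure is not None:
#       print("step failed:", failure)
```

If the ./extract step is the one reported as failing, the Python scripts are not the problem yet. The input files that extract.cpp opens are the first thing to check.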