tensorlayer/text-antispam

ValueError

hcolde opened this issue · 27 comments

Version 1
python3.5.2
tensorflow == 2.2.0
tensorlayer == 2.2.1

Version 2
python3.6.12
tensorflow == 2.2.0
tensorlayer == 2.2.3

Error message
root@948eae80bec6:/opt/text-antispam/network# python3 rnn_classifier.py
2020-12-01 01:34:08.114620: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session started.
2020-12-01 01:34:08.121494: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-12-01 01:34:08,265 INFO Input input_layer: [None, None, 200]
2020-12-01 01:34:08.266184: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-12-01 01:34:08.266226: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (948eae80bec6): /proc/driver/nvidia/version does not exist
2020-12-01 01:34:08.266494: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-12-01 01:34:08.271857: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3192000000 Hz
2020-12-01 01:34:08.272342: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4e4fbf0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-01 01:34:08.272372: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-01 01:34:08,280 INFO RNN rnn_1: cell: LSTMCell, n_units: 64
2020-12-01 01:34:08,387 INFO Dense dense: 2 softmax_v2
2020-12-01 01:34:08,417 INFO batch_size: 128
2020-12-01 01:34:08,417 INFO Start training the network...
Traceback (most recent call last):
  File "rnn_classifier.py", line 445, in <module>
    train(model)
  File "rnn_classifier.py", line 100, in train
    range(max_seq_len - len(d))]
ValueError: operands could not be broadcast together with shapes (1,200) (0,) (1,200)

Hi, I'm getting the same error. How did you end up solving it?

The error is caused by mismatched array shapes. I worked around it by skipping the shape=(0,) case, i.e. the empty padding list:

...
tmp = [tf.convert_to_tensor(np.zeros(200), dtype=tf.float32).reshape(1, 200) for i in range(max_seq_len - len(d))]
if tmp:
    batch_x[i] += tmp
...

Even with this workaround for the current error, running cnn_classifier.py raises the same error.
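As a rough illustration of why the shape (0,) shows up in the message, the sketch below assumes that the failing sample is stored as a NumPy array of shape (1, 200) rather than as a Python list of word vectors. That is an assumption based on the shapes in the traceback, not something confirmed from rnn_classifier.py. Under that assumption, `+=` with an empty padding list is an element-wise addition rather than a list extension, and NumPy rejects it. The helper name `try_inplace_add` is hypothetical.

```python
import numpy as np

EMBEDDING_DIM = 200  # matches the (1, 200) shape reported in the traceback


def try_inplace_add(sample, padding):
    """Run `sample += padding` and return the ValueError message, or None."""
    try:
        sample += padding
    except ValueError as exc:
        return str(exc)
    return None


# Sample stored as a Python list of word vectors: += extends the list,
# so an empty padding list is harmless.
as_list = [np.zeros(EMBEDDING_DIM)]
list_error = try_inplace_add(as_list, [])

# Sample stored as a (1, 200) ndarray: += is element-wise addition, and the
# empty list is converted to an array of shape (0,), which cannot broadcast.
as_array = np.zeros((1, EMBEDDING_DIM))
array_error = try_inplace_add(as_array, [])

print("list sample :", list_error)
print("array sample:", array_error)
```

If this is what happens, the `if tmp:` guard only hides the empty-padding case. A non-empty padding list would still be added element-wise to an array-typed sample.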

Ah, so does it actually run end to end? Are you also working in this area? Would you be open to keeping in touch? T.T

No, it doesn't run. This isn't my field; I'm just a hobbyist beginner.

tensorflow == 2.2.0
tensorlayer == 2.2.3
This combination runs fine.

OK, I'll give it a try. Thanks a lot!

ValueError: non-broadcastable output operand with shape (1,200) doesn't match the broadcast shape (19,200)
How can this error in cnn_classifier.py be resolved? There still seem to be some errors, and the accuracy is only around 40-50% T.T
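The broadcast shape (19,200) in this second message would be consistent with 19 padding vectors being added in place to a sample held as a (1, 200) array, although the thread does not confirm this. One way to avoid in-place `+=` entirely is to copy each sample into a preallocated zero array. The following is a minimal sketch under that assumption, not the padding code used in the repository; the function name `pad_batch` and its parameters are hypothetical.

```python
import numpy as np


def pad_batch(batch, dim=200):
    """Zero-pad variable-length sequences of word vectors to a common length.

    `batch` is a sequence of samples; each sample is a list (or array) of
    `dim`-dimensional vectors and may be empty.
    Returns a float32 array of shape (len(batch), max_len, dim).
    """
    max_len = max((len(seq) for seq in batch), default=0)
    padded = np.zeros((len(batch), max_len, dim), dtype=np.float32)
    for i, seq in enumerate(batch):
        # reshape(-1, dim) also turns an empty sample into shape (0, dim)
        arr = np.asarray(seq, dtype=np.float32).reshape(-1, dim)
        padded[i, :arr.shape[0]] = arr
    return padded


# Samples of length 1, 19 and 0, mirroring the shapes seen in the errors.
example = [[np.ones(200)], [np.ones(200)] * 19, []]
print(pad_batch(example).shape)
```

Because every sample is written into a fresh array, the result does not depend on whether the loaded samples are Python lists or NumPy arrays.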

Which Python version are you using?

I'm on 3.7, but I think the cnn_classifier.py failure is still a dimension problem. Could it be related to the first ValueError?

I'd suggest trying Python 3.8.

Sure: xiongtianyu@mgv.ai. Email me your WeChat ID and I'll add you.

I also tried Python 3.8 on macOS and got the same error.

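Hi, you can open sample_seq_pass.npz in Notepad and take a look (it will display as garbled binary). If every row shows the same garbled bytes, then the text_features program, which converts each line of text into a sequence of word vectors, has gone wrong: it produces a single UNK word vector for every sentence, which is what leads to the dimension error.
Possible fix: the cause may be the last few lines of the text_regularization program:
if match:
    text = "".join(match)
else:
    text = ""
The missing spaces here may be the culprit. Try changing the empty strings to spaces, then convert the text to word vectors again.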
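Instead of eyeballing the file in a text editor, the same check can be done by loading it with NumPy. The sketch below makes no assumption about which keys the .npz file contains and simply reports every stored array; the helper names `fraction_identical_to_first` and `inspect_npz` are hypothetical. `allow_pickle=True` is only needed if the samples were saved as variable-length Python lists, and should only be used on files you generated yourself.

```python
import numpy as np


def fraction_identical_to_first(samples):
    """Return the fraction of samples whose vectors equal the first sample's."""
    if len(samples) == 0:
        return 0.0
    first = np.asarray(samples[0], dtype=np.float32)
    same = 0
    for sample in samples:
        arr = np.asarray(sample, dtype=np.float32)
        if arr.shape == first.shape and np.array_equal(arr, first):
            same += 1
    return same / len(samples)


def inspect_npz(path):
    """Print key, shape, dtype and repetition ratio for each array in a file."""
    with np.load(path, allow_pickle=True) as data:
        for key in data.files:
            arr = data[key]
            if arr.ndim == 0:
                continue  # scalar entries carry no per-sample information
            print(key, arr.shape, arr.dtype, fraction_identical_to_first(arr))


# Usage (path taken from the thread): inspect_npz("sample_seq_pass.npz")
```

A ratio close to 1.0 would match the symptom described above, where every sentence is mapped to the same single vector.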

So should it be changed to
if match:
    text = " ".join(match)
else:
    text = " "
that is, add a space in both places and then retrain?

Yes, that's right. When you open that file, do you see the same garbled bytes on every row?

Yes.
[pasted binary dump of sample_seq_pass.npz: the same garbled byte sequence repeated]

So the problem is in text_features converting the text into word vectors?

Yes. After changing the code there are no more errors. Thanks!
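To make the effect of the separator concrete, here is a small sketch of the two variants discussed in this thread. The regular expression and the function name `regularize` are hypothetical stand-ins, since the actual pattern used by text_regularization is not shown here; only the `join` behaviour is the point.

```python
import re

# Hypothetical pattern: runs of CJK characters, ASCII letters or digits.
TOKEN_PATTERN = re.compile(r"[\u4e00-\u9fa5A-Za-z0-9]+")


def regularize(text, sep=" ", empty=" "):
    """Keep only the matched fragments of `text`, joined by `sep`.

    sep="" and empty="" reproduce the original behaviour from the thread;
    the defaults correspond to the change that was reported to work.
    """
    match = TOKEN_PATTERN.findall(text)
    if match:
        return sep.join(match)
    return empty


print(repr(regularize("hello, world!", sep="", empty="")))
print(repr(regularize("hello, world!")))
```

With an empty separator the fragments are fused into one string, so the boundaries between them are lost before the text reaches the word-vector step.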

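What do your results look like after running it? I tried all three classifiers. Although they report high accuracy, in practice the prediction for long texts is almost always 1, and predictions for short texts are also inaccurate, mostly 0. After running the RNN example, predict also returns 1.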
image
image
All three classifiers seem to have this problem T.T

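These look like texts you wrote yourself. Try feeding in texts from the training and validation sets and check the accuracy. If that is fine, the model isn't the problem; the dataset is biased or too small. If that also fails, the problem lies in model training or deployment.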
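One simple way to run this check is to compare accuracy with the distribution of predicted labels on held-out data. The sketch below assumes you already have arrays of true labels and predicted labels from any of the classifiers; the function name `summarize_predictions` is hypothetical and does not come from the repository.

```python
import numpy as np


def summarize_predictions(y_true, y_pred):
    """Return (accuracy, {predicted label: count}) for two label sequences."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))
    labels, counts = np.unique(y_pred, return_counts=True)
    distribution = {int(label): int(n) for label, n in zip(labels, counts)}
    return accuracy, distribution


# Toy example: a classifier that always answers 1.
acc, dist = summarize_predictions([0, 1, 1, 0], [1, 1, 1, 1])
print(acc, dist)
```

If the predicted-label distribution on validation texts is reasonably balanced but collapses to a single label on your own texts, that points toward the data rather than the model.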

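It may well be the dataset. The project seems to target spam SMS classification? As far as I can see, the dataset is almost entirely promotional SMS, so nothing is trained to detect sensitive or prohibited content such as terrorism, subversive material, or pornography.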

What prediction accuracy do you get for the three classifiers?

RNN:
image
CNN:
image
MLP:
image

OK, got it. Thanks!