TypeError: 'NoneType' object is not iterable for a string object
farazk86 opened this issue · 11 comments
Hi,
I am getting the above error for a string that is definitely not NoneType.
I have a list of 8000 strings that I want to analyze with pke. I am using the below minimal code to do so:
!pip install git+https://github.com/boudinfl/pke.git
import nltk
nltk.download('stopwords')
import pke
total_rows = len(quick_list)
mainlist = []
mainlist_str = []
for text in quick_list:
    print(text)
    print(type(text))
    pos = {'NOUN', 'PROPN', 'ADJ'}
    extractor = pke.unsupervised.SingleRank()
    extractor.load_document(input=text,
                            language='en',
                            normalization=None)
    extractor.candidate_selection(pos=pos)
    extractor.candidate_weighting(window=10,
                                  pos=pos)
    keyphrases = extractor.get_n_best(n=3)  # 3 keywords from each ticket
    sublist = []
    for k in keyphrases:
        sublist.append(k[0])
    for j in range(0, total_rows, total_rows):
        mainlist_str = ', '.join(map(str, sublist))
        mainlist.append(mainlist_str)
pke_list = mainlist
Below is the output and stack trace:
ERROR:root:No spacy model for 'en' language.
ERROR:root:A list of available spacy models is available at https://spacy.io/models.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
business world - user unable to log in
<class 'str'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
[<ipython-input-7-027265c91041>](https://localhost:8080/#) in <module>()
19 extractor.load_document(input=text,
20 language='en',
---> 21 normalization=None)
22 extractor.candidate_selection(pos=pos)
23 extractor.candidate_weighting(window=10,
[/usr/local/lib/python3.7/dist-packages/pke/base.py](https://localhost:8080/#) in load_document(self, input, language, stoplist, normalization, spacy_model)
121
122 else:
--> 123 for i, sentence in enumerate(self.sentences):
124 self.sentences[i].stems = [w.lower() for w in sentence.words]
125
TypeError: 'NoneType' object is not iterable
As can be seen above, the text printed is not None and is of type str.
I even ensured that not a single Null or None exists within my list by iterating over all elements:
NoneType = type(None)
for text in quick_list:
    if type(text) == NoneType:
        print('We have a NoneType')
The above loop does not print anything.
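The traceback above suggests that the None being iterated is the extractor's own self.sentences attribute, not the input text. The following is a minimal sketch of that failure mode using a hypothetical ToyExtractor class (an illustration only, not pke's actual code): a valid string still triggers the same TypeError when no model is available to fill in the sentences.

```python
class ToyExtractor:
    """Hypothetical stand-in for an extractor; not pke's actual code."""

    def __init__(self):
        self.sentences = None  # stays None if parsing never happens

    def load_document(self, text, model=None):
        if model is not None:
            # Only populated when a model is available to split the text.
            self.sentences = model(text)
        # Iterating here fails when self.sentences is still None,
        # regardless of whether `text` is a valid string.
        return [s.lower() for s in self.sentences]


def try_load(text, model=None):
    """Return the error message raised by load_document, or None on success."""
    try:
        ToyExtractor().load_document(text, model)
    except TypeError as err:
        return str(err)
    return None


text = "business world - user unable to log in"
error_without_model = try_load(text)  # model missing
error_with_model = try_load(text, model=lambda t: t.split(" - "))  # model present
print(error_without_model)
print(error_with_model)
```

Under this assumption, checking quick_list for None values cannot find the problem, because the missing value lives inside the extractor.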
P.S. Maybe this is related to version 2.0 as I did not have this problem a couple of months ago.
Hi,
I think your issue is related to spacy models, please check that you have downloaded the en spacy model using python -m spacy validate (and ensure that python is the same that is running pke).
Please update this thread as necessary :)
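One way to check the "same python" point from inside the notebook or script itself is sketched below. It relies only on the standard library and on the fact that spaCy models installed via pip are importable packages; the helper name model_installed is made up for this example.

```python
import importlib.util
import sys


def model_installed(package_name):
    """Return True if `package_name` is importable by the running interpreter."""
    return importlib.util.find_spec(package_name) is not None


# The interpreter printed here is the one that must have the model installed.
print("Interpreter:", sys.executable)
print("en_core_web_sm importable:", model_installed("en_core_web_sm"))
```

If the second line prints False while python -m spacy validate lists the model, the two commands are probably running under different Python installations.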
Hi @ygorg
I am still getting the same error. Below is after manually downloading the spacy model:
Installing collected packages: en-core-web-sm
Attempting uninstall: en-core-web-sm
Found existing installation: en-core-web-sm 2.2.5
Uninstalling en-core-web-sm-2.2.5:
Successfully uninstalled en-core-web-sm-2.2.5
Successfully installed en-core-web-sm-3.2.0
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
ERROR:root:No spacy model for 'en' language.
ERROR:root:A list of available spacy models is available at https://spacy.io/models.
business world - user unable to log in
<class 'str'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
[<ipython-input-27-07fb979eea6a>](https://localhost:8080/#) in <module>()
26 extractor.load_document(input=text,
27 language='en',
---> 28 normalization=None)
29 extractor.candidate_selection(pos=pos)
30 extractor.candidate_weighting(window=10,
[/usr/local/lib/python3.7/dist-packages/pke/base.py](https://localhost:8080/#) in load_document(self, input, language, stoplist, normalization, spacy_model)
121
122 else:
--> 123 for i, sentence in enumerate(self.sentences):
124 self.sentences[i].stems = [w.lower() for w in sentence.words]
125
TypeError: 'NoneType' object is not iterable
And below is the output of python -m spacy validate
✔ Loaded compatibility table
================= Installed pipeline packages (spaCy v3.2.3) =================
ℹ spaCy installation: /usr/local/lib/python3.7/dist-packages/spacy
NAME SPACY VERSION
en_core_web_sm >=3.2.0,<3.3.0 3.2.0 ✔
Hm, please try loading the spacy model and parsing the text beforehand, to see whether this is linked to spacy or pke.
I'm sorry that you encountered this problem.
import spacy
import pke

nlp = spacy.load('en_core_web_sm')
for text in quick_list:
    doc = nlp(text)  # parse with the already-loaded spacy model
    extractor = pke.unsupervised.SingleRank()
    extractor.load_document(
        input=doc, language='en',
        normalization=None
    )
Hi, loading spacy beforehand appears to be the solution. The extractor is working now.
Thanks :)
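For reference, the original loop with this workaround applied might look like the sketch below. The function names extract_all and join_keyphrases are invented for this example; the pke calls are the ones already used in this thread, and the imports are done lazily so the small helper can be used without spacy or pke installed.

```python
def join_keyphrases(keyphrases):
    """Turn [(phrase, score), ...] into 'phrase1, phrase2, ...'."""
    return ', '.join(str(phrase) for phrase, _score in keyphrases)


def extract_all(texts, n=3, model_name='en_core_web_sm'):
    """Return one comma-separated keyphrase string per input text."""
    # Imported lazily so join_keyphrases works without spacy/pke installed.
    import spacy
    import pke

    nlp = spacy.load(model_name)  # load once, reuse for every text
    pos = {'NOUN', 'PROPN', 'ADJ'}
    results = []
    for text in texts:
        extractor = pke.unsupervised.SingleRank()
        # Pass the parsed spacy document instead of the raw string.
        extractor.load_document(input=nlp(text), language='en',
                                normalization=None)
        extractor.candidate_selection(pos=pos)
        extractor.candidate_weighting(window=10, pos=pos)
        results.append(join_keyphrases(extractor.get_n_best(n=n)))
    return results


# Usage (requires spacy, pke and the model to be installed):
# pke_list = extract_all(quick_list)
```

Loading the model once outside the loop also avoids reloading it for each of the 8000 strings.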
Hi @ygorg, I also want to mention that I have been getting the same error since the new version (2.0). I installed pke using the instructions (spacy model == 3.2.3) in my new virtual machine, but I am getting the error "'NoneType' object is not iterable". Loading the spacy model beforehand makes it work.
Note that I have an older virtual machine where I installed pke 4 months ago; it does not give the error and works fine without loading the spacy model beforehand.
Hello,
I can't reproduce your issue on my machine. Can you provide me with a minimal working example so I can examine further?
f.
Hi, I also tried in a virtual environment and encountered no problems (see below).
Reproducing issue
python3.7 -m venv test
source test/bin/activate
pip install -U setuptools wheel pip # problem when installing spacy
pip install git+https://github.com/boudinfl/pke.git
python -m spacy download en_core_web_sm
python <<< """
import nltk
nltk.download('stopwords')
import pke
quick_list = ['business world - user unable to log in']
for text in quick_list:
    print(text)
    print(type(text))
    pos = {'NOUN', 'PROPN', 'ADJ'}
    extractor = pke.unsupervised.SingleRank()
    extractor.load_document(input=text,
                            language='en',
                            normalization=None)
    extractor.candidate_selection(pos=pos)
    extractor.candidate_weighting(window=10,
                                  pos=pos)
"""
Could you please provide the output of the following code? If spacy.util.get_installed_models does not output the right thing, then my guess is that there might be some spacy model linking issue between different versions of python/spacy.
import spacy
print(spacy.info())
print(spacy.util.get_installed_models())
nlp = spacy.load('en_core_web_sm')
print(nlp._path) # should be the same as spacy.info()['location']
Hi, this error didn't occur when I tried 3 weeks ago. Did anything change in the new update? And yes, I tried loading spacy beforehand, which solves this error. But a new problem arises:
candidate_selection() got an unexpected keyword argument 'stoplist'
It only accepts the pos parameter. Is there any way to pass stoplist?
The documentation still shows the old implementation.
TIA
same here
Hello @ygorg , this is the output of your sample code:
{'spacy_version': '3.2.3', 'location': '/home/ubuntu/.local/lib/python3.6/site-packages/spacy', 'platform': 'Linux-5.4.0-1071-aws-x86_64-with-Ubuntu-18.04-bionic', 'python_version': '3.6.9', 'pipelines': {'en_core_web_sm': '3.2.0', 'en_core_web_trf': '3.2.0'}}
['en_core_web_sm', 'en_core_web_trf']
/home/ubuntu/.local/lib/python3.6/site-packages/en_core_web_sm/en_core_web_sm-3.2.0
Hi,
I just updated the docs for pke v2.0.
@EdwardChan5000 and @Karthick47v2, from v2.0 the stoplist parameter (e.g. list of stopwords) should be passed to the load_document() function.
Here is a minimal example:
extractor = pke.unsupervised.FirstPhrases()
extractor.load_document(input='input', language='en', stoplist=["list", "of", "stopwords"])
extractor.candidate_selection()
...
By default the stoplist from spacy that corresponds to the language parameter is used.
f.
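As a rough sketch of migrating older code that built a stoplist for candidate_selection(), the list can be assembled first and then handed to load_document(). The helper names build_stoplist and first_phrases below are hypothetical, and combining punctuation with custom words is just one possible choice; only the load_document(..., stoplist=...) call comes from the comment above.

```python
import string


def build_stoplist(extra_words):
    """Return a de-duplicated stoplist: punctuation plus lowercased custom words."""
    seen = set()
    stoplist = []
    for token in list(string.punctuation) + [w.lower() for w in extra_words]:
        if token not in seen:
            seen.add(token)
            stoplist.append(token)
    return stoplist


def first_phrases(text, stoplist, n=3):
    """Run FirstPhrases with a custom stoplist (requires pke and a spacy model)."""
    import pke  # imported lazily; build_stoplist works without it

    extractor = pke.unsupervised.FirstPhrases()
    # From v2.0 the stoplist goes to load_document(), not candidate_selection().
    extractor.load_document(input=text, language='en', stoplist=stoplist)
    extractor.candidate_selection()
    extractor.candidate_weighting()
    return extractor.get_n_best(n=n)


stoplist = build_stoplist(["list", "of", "stopwords", "Of"])
print(stoplist[-3:])
```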