JohnSnowLabs/spark-nlp-workshop

AnalysisException: Reference 'bert-embedding' is ambiguous, could be: bert-embedding, bert-embedding.

piyu18 opened this issue · 4 comments

AnalysisException Traceback (most recent call last)
in ()
----> 1 predictions = ner_model_bert.transform(test_data)

5 frames
/usr/local/lib/python3.7/dist-packages/pyspark/sql/utils.py in deco(*a, **kw)
115 # Hide where the exception came from that shows a non-Pythonic
116 # JVM exception message.
--> 117 raise converted from None
118 else:
119 raise

AnalysisException: Reference 'bert-embedding' is ambiguous, could be: bert-embedding, bert-embedding.
I am getting this error when I run NER using BertEmbeddings with pyspark==3.1.1 and spark-nlp==3.0.2. My code works fine with the previous version of pyspark. Can you please help me out?

Please share your full code and information about your environment, including your OS, where you run this, how you start the SparkSession, etc. The more information and steps to reproduce, the better. (Right now, with one line, it is impossible to say anything.)

Hi, can you please have a look at the Colab link?

@piyu18

You already have a BertEmbeddings stage in your pipeline, so you cannot transform the DataFrame with BertEmbeddings separately and then pass the result to your pipeline. You will end up with two identical embeddings columns, which is the issue:

Remove this part:

test_data = bert_annotator.transform(test_data)

And just do this:

# Read the CoNLL (Conference on Natural Language Learning) test file into a DataFrame
from sparknlp.training import CoNLL

test_data = CoNLL().readDataset(spark, '/content/test.txt')
test_data.show(5)

# Pass the raw CoNLL DataFrame; the pipeline computes the BERT embeddings itself
test_predictions = ner_model_bert.transform(test_data)
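
If you want to catch this kind of clash before calling transform, a small check along these lines may help. This is only a sketch: `find_column_clashes` is a hypothetical helper (not part of Spark NLP or PySpark), and the column names in the example are assumptions chosen for illustration. It compares the column names already in the DataFrame with the output column names that the pipeline stages are going to add.

```python
def find_column_clashes(input_columns, stage_output_columns):
    """Return the stage output column names that already exist in the input.

    Hypothetical helper for illustration only; it works on plain lists of
    column names, so it does not need a running SparkSession.
    """
    existing = set(input_columns)
    return [name for name in stage_output_columns if name in existing]


# Assumed column names: a DataFrame that was already passed through
# bert_annotator.transform(...) carries a 'bert-embedding' column.
precomputed_columns = ["text", "document", "sentence", "token", "pos", "label", "bert-embedding"]

# Assumed output columns of the stages inside the pipeline.
pipeline_output_columns = ["bert-embedding", "ner"]

clashes = find_column_clashes(precomputed_columns, pipeline_output_columns)
print(clashes)  # ['bert-embedding']
```

With a real DataFrame you would pass `test_data.columns` as the first argument. A non-empty result means the pipeline would add a column whose name is already present, which is the situation that produces the "ambiguous reference" error above.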