run issue

Question

Closed this issue 4 years ago · 5 comments

rajae-Bens commented 4 years ago

Hi,

I tried to execute your code, but I got an error when fitting the model:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Any ideas please?

Thanks

Answer 1 · 2020-08-29T11:59:20.000Z

Hi,
Because of the licensing terms of Twitter data, we shared only the tweet IDs on Google Drive:
https://drive.google.com/file/d/1jnIoobE2qHDO0FtveWjPIx2KSji35thI/view?usp=sharing

You have to download the tweets for these IDs using the Twitter API. Right now the repo contains only dummy data, which might be the cause of your problem. Can you also share the name of the notebook you tried?
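For readers who want to script the download step, here is a minimal sketch of batched lookup by ID. It makes several assumptions: `hydrate_ids` and `fake_lookup` are hypothetical names, the `lookup` callable stands in for whatever client you use (for example a thin wrapper around tweepy's status lookup call), and the batch size of 100 reflects the per-request limit of the v1.1 lookup endpoint as I understand it. IDs that do not come back are collected instead of raising an error.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def hydrate_ids(
    ids: Iterable[str],
    lookup: Callable[[List[str]], Dict[str, str]],
    batch_size: int = 100,
) -> Tuple[Dict[str, str], List[str]]:
    """Fetch tweet texts in batches and report which IDs returned nothing.

    `lookup` is a caller-supplied function (an assumption of this sketch)
    that takes a list of ID strings and returns {id: text} for the tweets
    that are still available.
    """
    id_list = list(ids)
    found: Dict[str, str] = {}
    for start in range(0, len(id_list), batch_size):
        batch = id_list[start:start + batch_size]
        found.update(lookup(batch))
    # Anything not returned is unavailable through the API.
    missing = [tweet_id for tweet_id in id_list if tweet_id not in found]
    return found, missing


# Stand-in for a real API client: pretends tweet "3" is no longer available.
def fake_lookup(batch: List[str]) -> Dict[str, str]:
    return {tweet_id: f"text of {tweet_id}" for tweet_id in batch if tweet_id != "3"}


found, missing = hydrate_ids(["1", "2", "3", "4", "5"], fake_lookup, batch_size=2)
print(len(found), missing)  # 4 ['3']
```

Keeping the IDs as strings throughout avoids any numeric conversion on the way to the API.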

Answer 2 · 2020-08-29T14:27:38.000Z

Hello Abdullatif,

First of all, thank you and your team very much for sharing such valuable work. However, I ran into a few problems that I wanted to report. I used the Twitter API (through the tweepy library) to retrieve the tweets for the IDs you shared, but I could not retrieve all of them.

Number of tweets you shared:

negative-test: 279
negative-train: 1120
notr-test: 843
notr-train: 3372
positive-test: 469
positive-train: 1877

Number of tweets I could retrieve:

negative-test: 177
negative-train: 809
notr-test: 724
notr-train: 2988
positive-test: 356
positive-train: 621

What do you think is causing these losses?

In addition, the "positive-train" file contains the ID "5.76859E+17", that is, a value written in scientific notation. This caused a problem for me when reading the positive-train file with the Pandas library. Because a single value was in scientific notation, all of the IDs were read in that format. As a result, I could not retrieve any of the tweets in positive-train. After I removed that value, I could retrieve only 621 of the 1877 tweets.

I would like to retrieve all of the tweets whose IDs are given. What would you suggest?

Have a good day and good luck with your work,
E. Kaan Ülgen

Answer 3 · 2020-08-30T07:13:01.000Z

Hi Abdullatif,

Thank you for answering. I will try this data, but I think the error is related to the class_weights function.
The notebook I tried is BERT Features with Keras.ipynb.

Thanks

Answer 4 · 2020-08-30T10:42:23.000Z

@rajae-Bens
I couldn't replicate the error, but I changed the logic of the class_weight calculation. Instead of sklearn functions, I now use a simple numpy argmax with list iteration. I pushed a new version of BERT Features with Keras.ipynb. Hope this helps!
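For readers without the notebook at hand, a rough sketch of what a class-weight calculation based on argmax and plain iteration could look like. This is not the notebook's actual code. It assumes one-hot encoded labels and the common "balanced" heuristic `n_samples / (n_classes * count)`, and `class_weights_from_one_hot` is a hypothetical name.

```python
import numpy as np


def class_weights_from_one_hot(y_one_hot):
    """Return {class_index: weight} using n_samples / (n_classes * count).

    Assumes `y_one_hot` has shape (n_samples, n_classes) with one-hot rows.
    Only classes that actually occur in the labels receive a weight.
    """
    # Collapse each one-hot row to its integer class index.
    labels = np.argmax(np.asarray(y_one_hot), axis=1).tolist()
    classes = sorted(set(labels))
    n_samples = len(labels)
    n_classes = len(classes)

    weights = {}
    for cls in classes:
        # Plain iteration over the label list instead of an sklearn helper.
        count = sum(1 for label in labels if label == cls)
        weights[cls] = n_samples / (n_classes * count)
    return weights


# Toy labels: class 0 appears twice, classes 1 and 2 once each.
y = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
weights = class_weights_from_one_hot(y)
```

The resulting dictionary has plain Python integer keys, which is the shape Keras expects for the `class_weight` argument of `fit`.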

Answer 5 · 2020-08-30T10:47:51.000Z

Hello @kaanulgen,
Regarding the ID: that tweet's ID was written in scientific format in the first place, and that is what caused the problem. Since I cannot recover the original ID, I deleted it from the file and updated the Drive. There was only one such entry.
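For anyone who hits the same problem with another copy of the files, here is a minimal sketch of reading IDs as text so that a single value in scientific notation cannot turn the whole column into floats. It assumes a CSV with a column named `id` (a hypothetical layout; the real files may differ), and the two long IDs below are made up for illustration. Only "5.76859E+17" comes from this thread.

```python
import io

import pandas as pd

# Made-up file contents: two illustrative IDs plus the damaged value.
raw = "id\n576859012345678901\n5.76859E+17\n576860000000000002\n"

# Default parsing: one float-like value makes pandas infer float64 for the
# whole column, so the long integer IDs lose their trailing digits.
as_float = pd.read_csv(io.StringIO(raw))

# Reading the column as text keeps every ID exactly as written.
as_text = pd.read_csv(io.StringIO(raw), dtype={"id": str})

# Separate usable IDs from values that are not plain digit strings.
is_plain_id = as_text["id"].str.isdigit()
valid_ids = as_text.loc[is_plain_id, "id"].tolist()
damaged_ids = as_text.loc[~is_plain_id, "id"].tolist()
```

The damaged value itself cannot be repaired this way, because the digits dropped by the scientific notation are not stored anywhere in the file.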

The missing tweets are the result of users deleting their own tweets or making their accounts private. This is one of the reasons Twitter's license asks us not to share tweet texts openly. So if users delete their tweets, for example, those texts can no longer be accessed.

Good luck with your work too!