Copyright [2022] [AI Engineer: Ahmed]
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies (e.g. disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter.
This dataset was created by the company Figure Eight and originally shared on their ‘Data For Everyone’ website.
Tweet source: https://twitter.com/AnyOtherAnnaK/status/629195955506708480
I decided to use transfer learning in this project and fine-tune a pre-trained model, since I believe in not reinventing the wheel. We're going to use one of my favourite libraries, Hugging Face. It is the library to use when you want to initialize your model from a pre-trained model that has already been trained on billions of examples.
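As a rough illustration of what "starting from a pre-trained model" can look like, here is a minimal sketch using the Hugging Face `transformers` package. The checkpoint name `distilbert-base-uncased`, the label names, and the helper `load_pretrained_classifier` are assumptions made for this example; they are not necessarily what this project ends up using.

```python
# Label mapping for the binary task: 1 = real disaster, 0 = not.
# The label names themselves are hypothetical, chosen for readability.
ID2LABEL = {0: "not_disaster", 1: "disaster"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_pretrained_classifier(checkpoint: str = "distilbert-base-uncased"):
    """Return (tokenizer, model) ready to be fine-tuned on the tweets.

    `checkpoint` is an assumed default; any sequence-classification-capable
    checkpoint from the Hugging Face Hub could be substituted.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # The pre-trained encoder weights are reused; a new two-class
    # classification head is added on top and learned during fine-tuning.
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_pretrained_classifier()` downloads the checkpoint on first use, so it needs network access. The returned model still has to be fine-tuned on the labelled tweets before its predictions mean anything.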
Variables | Definition |
---|---|
id | a unique identifier for each tweet |
text | the text of the tweet |
location | the location the tweet was sent from (may be blank) |
keyword | a particular keyword from the tweet (may be blank) |
target | in train.csv only, this denotes whether a tweet is about a real disaster (1) or not (0) |
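
To make the schema above concrete, the following sketch parses rows with these columns using only the Python standard library. The helper `read_tweets` and the two sample rows are made up for illustration and are not taken from the dataset; blank `location` and `keyword` values are mapped to `None`, and `target` is treated as optional because it appears in train.csv only.

```python
import csv
import io
from typing import List, Optional


def _blank_to_none(value: Optional[str]) -> Optional[str]:
    """Map missing or empty CSV fields to None."""
    return value if value else None


def read_tweets(handle) -> List[dict]:
    """Parse CSV rows with the columns id, keyword, location, text, target."""
    tweets = []
    for row in csv.DictReader(handle):
        target = _blank_to_none(row.get("target"))
        tweets.append(
            {
                "id": int(row["id"]),
                "text": row["text"],
                "location": _blank_to_none(row.get("location")),
                "keyword": _blank_to_none(row.get("keyword")),
                # Absent when the file has no target column (unlabelled data).
                "target": int(target) if target is not None else None,
            }
        )
    return tweets


# Hypothetical sample rows, for illustration only.
SAMPLE_CSV = """id,keyword,location,text,target
1,,,Forest fire spreading near the highway,1
2,ablaze,London,This new album is ablaze,0
"""

tweets = read_tweets(io.StringIO(SAMPLE_CSV))
```

For the real data, the same function could be applied to an open file handle, for example `read_tweets(open("train.csv", newline="", encoding="utf-8"))`, assuming the file sits in the working directory.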