This repository contains the code and resources for scraping data from Telegram channels using `scrape.ipynb` and training the LLaMA_7B model using `Pipeline.ipynb`.
- `scrape.ipynb`: Jupyter Notebook that includes the code for scraping data from Telegram channels. It uses the Telegram API and Python libraries to collect the required data. The notebook provides step-by-step instructions and explanations of the scraping process.
- `Pipeline.ipynb`: Jupyter Notebook that contains the code for training the LLaMA_7B model. It demonstrates the pipeline for data preprocessing, model training, and evaluation. The notebook includes detailed explanations and comments to guide the user through the training process.
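
For orientation, here is a minimal sketch of what a scraping step of this kind might look like. It is not the code from `scrape.ipynb`. It assumes the Telethon library as the Telegram client, and the environment variable names `TELEGRAM_API_ID` and `TELEGRAM_API_HASH`, the session name `scraper_session`, the helper names, and the JSON Lines output format are all hypothetical choices made for illustration.

```python
import json
import os
from pathlib import Path


def load_credentials(env=os.environ):
    """Read Telegram API credentials from environment variables.

    TELEGRAM_API_ID and TELEGRAM_API_HASH are hypothetical variable names.
    """
    api_id = env.get("TELEGRAM_API_ID")
    api_hash = env.get("TELEGRAM_API_HASH")
    if not api_id or not api_hash:
        raise RuntimeError("Set TELEGRAM_API_ID and TELEGRAM_API_HASH first")
    return int(api_id), api_hash


def save_records(records, channel, out_dir="data"):
    """Write one JSON object per line to <out_dir>/<channel>.jsonl."""
    out_path = Path(out_dir) / f"{channel}.jsonl"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return out_path


async def scrape_channel(channel, limit=100):
    """Collect up to `limit` text messages from one channel (assumes Telethon)."""
    # Imported here so the helpers above work without Telethon installed.
    from telethon import TelegramClient

    api_id, api_hash = load_credentials()
    records = []
    async with TelegramClient("scraper_session", api_id, api_hash) as client:
        async for message in client.iter_messages(channel, limit=limit):
            if message.text:  # skip media-only and service messages
                records.append({
                    "id": message.id,
                    "date": message.date.isoformat(),
                    "text": message.text,
                })
    return save_records(records, channel)
```

In a notebook cell this could be invoked with `await scrape_channel("some_channel")`, where the channel name is a placeholder. Keeping credentials in environment variables rather than in the notebook avoids committing them to the repository by accident.
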
To use the code in this repository, follow these steps:
- Obtain API credentials for accessing the Telegram API. Refer to the Telegram API documentation for instructions on obtaining the necessary credentials.
- Run `scrape.ipynb` to scrape data from Telegram channels. Make sure to provide the required API credentials and specify the channels to scrape. The scraped data will be saved in the `data/` directory.
- Run `Pipeline.ipynb` to preprocess the scraped data, train the LLaMA_7B model, and evaluate its performance. The trained model will be saved in the `models/` directory.
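
As a rough illustration of the preprocessing stage mentioned in the last step, the sketch below cleans message text and builds a train/evaluation split. It is not taken from `Pipeline.ipynb`. It assumes the scraped data sits in `data/` as JSON Lines files with a `text` field, and the function names, the cleaning rules, and the thresholds are hypothetical.

```python
import json
import random
import re
from pathlib import Path


def clean_text(text):
    """Remove URLs and collapse runs of whitespace."""
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip()


def load_messages(data_dir="data"):
    """Yield the cleaned text of every record in <data_dir>/*.jsonl.

    The JSON Lines layout with a "text" field is an assumption.
    """
    for path in sorted(Path(data_dir).glob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield clean_text(json.loads(line).get("text") or "")


def build_splits(texts, min_chars=20, eval_fraction=0.1, seed=0):
    """Drop short and duplicate texts, then split into train and eval lists."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(t for t in texts if len(t) >= min_chars))
    random.Random(seed).shuffle(unique)
    # Hold out at least one example whenever any data survives filtering.
    n_eval = max(1, int(len(unique) * eval_fraction)) if unique else 0
    return unique[n_eval:], unique[:n_eval]
```

A call such as `train, evaluation = build_splits(load_messages("data"))` would then produce the two lists. The fixed seed keeps the split reproducible between runs, which makes evaluation results easier to compare.
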
Contributions to this project are welcome. If you encounter any issues or have suggestions for improvements, please open an issue in the GitHub repository.
We would like to acknowledge the contributions of the open-source community and the developers of the libraries and tools used in this project. Their efforts and dedication are greatly appreciated.