Persian Instruct Dataset

University of Tehran (NLP Lab)

Introduction

Welcome to the Semi-Alpaca Instruction Tuning Dataset repository for the Persian language. The goal of this repository is a high-quality collection of semi-alpaca instructions in Persian for instruction tuning and other natural language processing (NLP) tasks, such as machine translation and text generation.
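As a rough illustration of how instruction data of this kind is typically consumed, the sketch below turns one record into a training prompt. It assumes an Alpaca-style schema with `instruction`, `input`, and `output` fields. That schema, the `build_prompt` helper, the prompt template, and the sample record are all assumptions made for this example and are not confirmed by this repository, so check the actual files before relying on the field names.

```python
# Minimal sketch: render an Alpaca-style record as a prompt string.
# ASSUMPTION: records have "instruction", optional "input", and "output"
# keys. The real field names in this dataset may differ.

def build_prompt(record: dict) -> str:
    """Return the prompt part of one record (without the response)."""
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    if context:
        # Records with extra context get a separate "Input" section.
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


# Made-up sample record for illustration only (not taken from the dataset).
example = {
    "instruction": "جمله زیر را به انگلیسی ترجمه کن.",
    "input": "سلام دنیا",
    "output": "Hello, world",
}

prompt = build_prompt(example)
# For supervised fine-tuning, the target response is appended to the prompt.
full_text = prompt + example["output"]
print(full_text)
```

Keeping the prompt and the response separate until the last step makes it straightforward to mask the prompt tokens during training, if the training setup calls for that.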

Data Collection

This dataset was gathered collaboratively with contributions from students at the University of Tehran. We thank them for their help in curating and compiling it.

Contact

For questions or suggestions about this dataset, please contact us at mostafa.amiri@ut.ac.ir.

Happy semi-alpaca instruction research!

Citation

If you use this dataset in your research or applications, please cite this repository as follows:

@misc{semi-alpaca-instructions-persian,
  author = {Mostafa Amiri},
  title = {Semi-Alpaca Instruction Tuning Dataset (Persian)},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/mostafaamiri/Persian_instruct_dataset}},
}