QA-Portuguese-small

Preprocessed Portuguese split of the MQA dataset.
This repository contains a dataset of Portuguese question-answer pairs hosted on Hugging Face. It consists of 500,000 rows, each containing a question and its corresponding answer.

Usage

from datasets import load_dataset

# Download the dataset from the Hugging Face Hub (cached locally after the first call)
data = load_dataset("Jpzinn654/qa-portuguese-small")

Overview

The project involves:

Loading a large question-answer dataset from Hugging Face.
Selecting the first 500,000 rows of the dataset.
Saving the dataset in both CSV and JSON formats.
Pushing the processed dataset to the Hugging Face Hub for easy access and sharing.
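
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script that was actually used to build the dataset. The helpers `take_first`, `save_csv`, and `save_json` are hypothetical names that show the select-and-save logic on plain Python rows. `build_subset_with_datasets` shows what the same flow might look like with the `datasets` library. The `"train"` split name and the output file names are assumptions, and the source and target repository IDs are left as parameters because the original does not state them.

```python
import csv
import json
from itertools import islice


def take_first(rows, n):
    """Return the first n rows from any iterable of dict rows."""
    return list(islice(rows, n))


def save_csv(rows, path):
    """Write rows to CSV, using the first row's keys as the header."""
    if not rows:
        raise ValueError("no rows to write")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def save_json(rows, path):
    """Write rows as a JSON array, keeping accented characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False)


def build_subset_with_datasets(source_id, target_id, n=500_000):
    """Same flow with the `datasets` library (needs network and a Hub login).

    source_id and target_id are placeholders supplied by the caller;
    the "train" split name and the file names are assumptions.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    full = load_dataset(source_id, split="train")
    subset = full.select(range(min(n, len(full))))  # first n rows
    subset.to_csv("qa_subset.csv")
    subset.to_json("qa_subset.json")  # writes JSON Lines by default
    subset.push_to_hub(target_id)
    return subset
```

The plain-Python helpers run without network access, which makes the selection and export logic easy to check on a handful of rows before calling `build_subset_with_datasets` on the full data.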

Dataset Details

Name: qa-portuguese
Source: MQA dataset (Portuguese split), loaded from Hugging Face
Rows: 500,000 question-answer pairs
Languages: Portuguese

