TaskComplexity Dataset

Read the paper: https://arxiv.org/abs/2409.20189

This project addresses the challenge of classifying and assigning programming tasks. It introduces a new dataset of 4,112 programming tasks, extracted systematically from various websites by web scraping.

Dataset Overview

The TaskComplexity dataset is a collection of programming problems, each labeled with a problem class and a complexity level, so that models which predict task difficulty can be trained and evaluated on it.

Features

Each entry in the dataset includes the following attributes:

  • Task Title: The name or title of the programming problem.
  • Problem Description: A detailed description of the programming task.
  • Input/Output Specification: A breakdown of the expected input and output for the problem.
  • Examples: Sample cases to clarify the task requirements.
  • Problem Class: A categorical label indicating the general type of problem.
  • Complexity Score: A difficulty rating to help classify the problem’s complexity.
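
The attributes above could be represented in code roughly as follows. This is a minimal sketch: the field names, the `TaskRecord` class, and the `to_model_input` helper are hypothetical and are not the dataset's actual column names, and the sample values are made-up placeholders, not an entry from the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRecord:
    """One dataset entry. Field names are assumed, not the real column names."""
    title: str          # Task Title
    description: str    # Problem Description
    io_spec: str        # Input/Output Specification
    examples: str       # Examples
    problem_class: str  # Problem Class
    complexity: str     # Complexity Score, e.g. "Easy", "Medium", "Hard"


def to_model_input(task: TaskRecord) -> str:
    """Join the non-empty text fields into one string for a text classifier."""
    parts = [task.title, task.description, task.io_spec, task.examples]
    return "\n\n".join(p.strip() for p in parts if p.strip())


# Illustrative placeholder record (not taken from the dataset).
sample = TaskRecord(
    title="Sum of Two Numbers",
    description="Given two integers, print their sum.",
    io_spec="Input: two integers a and b. Output: a + b.",
    examples="Input: 2 3 / Output: 5",
    problem_class="math",
    complexity="Easy",
)
text = to_model_input(sample)
```

Keeping the labels (`problem_class`, `complexity`) out of the joined text avoids leaking the target into the classifier's input.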

Complexity Levels

The dataset categorizes tasks into three levels of complexity:

  • Easy
  • Medium
  • Hard
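
For training, the three levels usually need to be mapped to integers. One possible encoding is sketched below; the integer codes and the `parse_complexity` helper are assumptions for illustration, and the dataset itself may store the labels differently.

```python
from enum import IntEnum


class Complexity(IntEnum):
    """Ordered complexity levels; the integer codes are an assumed encoding."""
    EASY = 0
    MEDIUM = 1
    HARD = 2


def parse_complexity(label: str) -> Complexity:
    """Map a label such as 'Easy' or ' hard ' to its Complexity member."""
    try:
        return Complexity[label.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown complexity label: {label!r}") from None
```

Using an `IntEnum` keeps the natural ordering (Easy < Medium < Hard), which is convenient if a model treats difficulty as ordinal rather than purely categorical.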

Applications

This dataset is designed to support research and development in:

  • Machine Learning Models: Building classification models to assign complexity levels to programming tasks.
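
A classifier built on this dataset could be scored with accuracy and macro-averaged F1, two common metrics for three-class problems. The sketch below is plain Python; the choice of metrics and the function names are assumptions and are not prescribed by this README, and the label lists are toy values, not results.

```python
LABELS = ("Easy", "Medium", "Hard")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1; a class with no TP/FP/FN scores 0."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only.
y_true = ["Easy", "Medium", "Hard", "Easy"]
y_pred = ["Easy", "Hard", "Hard", "Easy"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Macro-F1 weights each level equally, so it exposes a model that ignores a rare class even when overall accuracy looks acceptable.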

Technical Details

  • Total Tasks: 4,112
  • Number of Classes: 3 (Easy, Medium, Hard)
  • Data Collection Method: Web scraping of programming problem websites

Citation

If you find this work useful for your research, please consider citing the paper:

@article{rasheed2024taskcomplexity,
  title={TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks},
  author={Areeg Fahad Rasheed and M. Zarkoosh and Safa F. Abbas and Sana Sabah Al-Azzawi},
  journal={arXiv preprint arXiv:2409.20189},
  year={2024},
  url={https://arxiv.org/abs/2409.20189}
}