This repository is mainly for cleaning, converting, and checking LLM training datasets.
New datasets cleaned and created by this project:
Copyright (c) 2024 Philip May
Licensed under the MIT License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License by reviewing the file LICENSE in the repository.