Stop words lists for languages around the world, made by contributors around the world. Start contributing now!

Primary language: Python. License: MIT.

Stop-Words-List

A beginner-friendly project to help you get started with open source contributions. It is an attempt to bring together stop words lists from all languages around the world.

What is a stop word?

In computing, stop words are words which are filtered out before or after processing of natural language data.

- Wikipedia -

In SEO terminology, stop words are the most common words that most search engines skip in order to save space and time when processing large amounts of data during crawling or indexing. For example, "at", "which", "is", "the", and "and" are some words categorized as stop words.
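To make the idea of filtering concrete, here is a minimal Python sketch. The `STOP_WORDS` set and the `remove_stop_words` function are hypothetical names used only for illustration, and the whitespace tokenizer is a simplifying assumption; the set just reuses the example words above.

```python
# Hypothetical stop word set for illustration; it reuses the example words above.
STOP_WORDS = {"at", "which", "is", "the", "and"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the tokens of `text` that are not stop words."""
    # Lowercase and split on whitespace; a real tokenizer would also handle punctuation.
    tokens = text.lower().split()
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("The cat is at the door and the dog is outside"))
# prints ['cat', 'door', 'dog', 'outside']
```

In practice the set would be loaded from one of the language files instead of being hard-coded.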

How to Contribute?

There are two ways to contribute to this repo:

  • Add a new stop words list file.
  • Edit and improve an existing stop words list.

Here are the steps to contribute to this repo:

  1. Fork this repository

  2. Clone the repository to your local machine: git clone https://github.com/<YOUR-USERNAME>/Stop-Words-List.git

  3. Create a .txt file in the list/ directory and name it in the following format: [YOUR_LANGUAGE_IN_ENGLISH].txt. For example:

    • english.txt
    • chinese.txt
    • arabic.txt

    Skip this step if a stop words list for your language already exists in this repo.

  4. Put the stop words in the file you created in step 3, or in the existing stop words list file. Place only one word per line! If you are editing an existing stop words list file, please DO NOT DELETE/EDIT anything that already exists. Please make sure the words you want to add are not already in the txt file.

  5. Don't forget to put your name in CONTRIBUTORS.md and follow the format there.

  6. Save the file, commit and push to your forked repository.

  7. Create a pull request.

  8. Congratulations! You have made a priceless contribution.
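Step 4 can be done by hand, but a small helper makes it harder to add a duplicate by accident or to touch existing entries. The following is only a sketch: `add_stop_words` is a hypothetical function, not a script shipped with this repo, and it assumes the list files are UTF-8 encoded.

```python
from pathlib import Path


def add_stop_words(list_file, new_words):
    """Append words that are not yet in `list_file`, one per line; return the words added."""
    path = Path(list_file)
    # Read the existing entries (if any) so nothing already contributed is changed.
    existing_text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = {line.strip() for line in existing_text.splitlines() if line.strip()}

    added = []
    for word in new_words:
        word = word.strip().lower()
        if word and word not in existing:
            existing.add(word)
            added.append(word)

    if added:
        with path.open("a", encoding="utf-8") as handle:
            # Make sure the first new word starts on its own line.
            if existing_text and not existing_text.endswith("\n"):
                handle.write("\n")
            handle.write("\n".join(added) + "\n")
    return added
```

For example, `add_stop_words("list/english.txt", ["the", "of", "and"])` would append only the words that are missing from that file and return them, which also makes it easy to see how many new lines you contributed.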

Contributing Rules

  • Place only one word per line in the stop words list txt file.
  • Only lowercase letters are accepted for languages that use alphabetic characters.
  • To be counted as a contribution, you need to add at least 10 lines to your respective language file.
  • Check the whole list and make sure there are no duplicate words.
  • DO NOT EDIT/DELETE anything already contributed by other users unless the words are not really considered stop words. In that case, please describe the issue in your pull request.
  • DO NOT DELETE the previous contributors' names in CONTRIBUTORS.md.
  • When filling in CONTRIBUTORS.md, please make sure the list is arranged in dictionary order based on the language name.
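Several of the rules above can be checked mechanically before you open a pull request. Here is a rough sketch of such a check; `check_stop_words` is a hypothetical helper, not part of this repo, and it covers only the one-word-per-line, lowercase, and no-duplicates rules.

```python
def check_stop_words(lines):
    """Return a list of (line_number, problem) pairs for a stop words list."""
    problems = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        word = line.strip()
        if not word:
            continue  # blank lines are ignored in this sketch
        if len(word.split()) > 1:
            problems.append((number, "more than one word on the line"))
        if word != word.lower():
            problems.append((number, "contains uppercase letters"))
        if word in seen:
            problems.append((number, "duplicate word"))
        seen.add(word)
    return problems


sample = ["the", "At", "which is", "the"]
for number, problem in check_stop_words(sample):
    print(f"line {number}: {problem}")
# prints:
# line 2: contains uppercase letters
# line 3: more than one word on the line
# line 4: duplicate word
```

To check a real file, pass its lines instead of `sample`, for example the result of `open("list/english.txt", encoding="utf-8").read().splitlines()`.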