Data source: Google ngrams English data set version 20120701, years 1950 to 2012.
This compilation is licensed under a Creative Commons Attribution 3.0 Unported License.
These collections can save you time and bandwidth if you don't want to download and process the Google ngram data yourself. The scripts used to generate the wordlists are also available. The wordlists may need additional filtering depending on your planned use: for example, the dataset contains some non-English characters and words, letters followed by numbers, symbols, www addresses, etc.
If you are looking for more sanitised data, for a spell checker for example, or want forms/variations of a word, check out SCOWL and friends.
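As a rough illustration of the additional filtering mentioned above, here is a minimal Python sketch. The rule it applies (keep only ASCII letters, optionally joined by internal apostrophes or hyphens) and the names `is_plain_word` and `filter_words` are assumptions made for this example, not part of the scripts that generated the wordlists. Adjust the pattern to suit your own use.

```python
import re

# Assumption: a "plain" word is made of ASCII letters only, optionally
# joined by single internal apostrophes or hyphens (e.g. "don't").
# This rejects digits, symbols, www addresses and non-ASCII characters.
_PLAIN_WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def is_plain_word(token):
    """Return True if the whole token matches the plain-word pattern."""
    return _PLAIN_WORD.fullmatch(token) is not None


def filter_words(tokens):
    """Keep only plain words, preserving the original (frequency) order."""
    return [t for t in tokens if is_plain_word(t)]


sample = ["the", "www.example.com", "a1", "naïve", "don't", "&", "well-known"]
print(filter_words(sample))  # ['the', "don't", 'well-known']
```

Note that this pattern is deliberately strict: it also drops legitimate words written with accented letters, so loosen it if you need those.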
The words are sorted so the most frequently used words appear first.
top_english_words_lower_1000000.txt
top_english_words_lower_500000.txt
top_english_words_lower_100000.txt
top_english_words_lower_50000.txt
top_english_words_lower_20000.txt
top_english_words_lower_10000.txt
top_english_words_mixed_1000000.txt
top_english_words_mixed_500000.txt
top_english_words_mixed_100000.txt
top_english_words_mixed_50000.txt
top_english_words_mixed_20000.txt
top_english_words_mixed_10000.txt
top_english_nouns_lower_500000.txt
top_english_nouns_lower_100000.txt
top_english_nouns_lower_50000.txt
top_english_nouns_lower_20000.txt
top_english_nouns_lower_10000.txt
top_english_nouns_mixed_500000.txt
top_english_nouns_mixed_100000.txt
top_english_nouns_mixed_50000.txt
top_english_nouns_mixed_20000.txt
top_english_nouns_mixed_10000.txt
top_english_verbs_lower_100000.txt
top_english_verbs_lower_50000.txt
top_english_verbs_lower_20000.txt
top_english_verbs_lower_10000.txt
top_english_verbs_mixed_100000.txt
top_english_verbs_mixed_50000.txt
top_english_verbs_mixed_20000.txt
top_english_verbs_mixed_10000.txt
top_english_adjs_lower_100000.txt
top_english_adjs_lower_50000.txt
top_english_adjs_lower_20000.txt
top_english_adjs_lower_10000.txt
top_english_adjs_mixed_100000.txt
top_english_adjs_mixed_50000.txt
top_english_adjs_mixed_20000.txt
top_english_adjs_mixed_10000.txt
top_english_advs_lower_10000.txt
top_english_advs_mixed_10000.txt
top_english_prons_lower_10000.txt
top_english_prons_mixed_10000.txt
top_english_nums_lower_500.txt
top_english_conjs_lower_500.txt
top_english_dets_lower_500.txt
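
Because the words are sorted with the most frequent first, a word's position in a file can serve as its frequency rank. The following Python sketch shows one way to read the first few entries and build a rank lookup. It assumes each line holds one entry and that the word is the first whitespace-separated field; check this against the actual files before relying on it. The function names `load_top_words` and `rank_of` are hypothetical.

```python
def load_top_words(lines, limit=None):
    """Collect up to `limit` words from an iterable of lines, in order.

    Assumption: one entry per line, word in the first field.
    """
    words = []
    for line in lines:
        if limit is not None and len(words) >= limit:
            break
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        words.append(fields[0])
    return words


def rank_of(words):
    """Map each word to its 1-based position; the first occurrence wins."""
    ranks = {}
    for position, word in enumerate(words, start=1):
        ranks.setdefault(word, position)
    return ranks


# Demo with in-memory lines standing in for a wordlist file.
demo_lines = ["the\n", "of\n", "and\n", "to\n"]
top = load_top_words(demo_lines, limit=3)
print(top)                 # ['the', 'of', 'and']
print(rank_of(top)["of"])  # 2
```

To use it on a real file, pass an open file handle instead of the demo list, for example `open("top_english_words_lower_10000.txt", encoding="utf-8")`.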