An informal analysis, but one of my favorites. It contains character frequency analyses both with and without code in the input corpus. It also has some word frequency AND symbol frequency! Symbol frequency can be surprisingly hard to find.
https://mdickens.me/typing/letter_frequency.html
Bigram frequencies, along with the code showing how he did it. Very useful.
https://gist.github.com/lydell/c439049abac2c9226e53
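If you want to run the same kind of count on your own text, here is a minimal sketch in Python. This is not the code from the gist, just one simple way to do it. The function name `bigram_counts` is made up for illustration, and it assumes bigrams are counted only inside runs of the letters a-z, so pairs never span spaces, digits, or punctuation.

```python
import re
from collections import Counter

def bigram_counts(text):
    """Count adjacent letter pairs, ignoring case.

    Assumption: only runs of the letters a-z count as words, so a pair
    never spans a space, digit, apostrophe, or other symbol.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        # Pair each letter with the one that follows it.
        counts.update(a + b for a, b in zip(word, word[1:]))
    return counts

if __name__ == "__main__":
    counts = bigram_counts("the theme of the thesis")
    total = sum(counts.values())
    for bigram, n in counts.most_common(3):
        print(f"{bigram}: {n} ({n / total:.1%})")
```

Swap the sample string for the contents of any text file to get counts for your own corpus. The result will depend heavily on what you feed it, which is why the corpus choice matters so much in the sources listed here.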
Many downloadable files containing info on how often different n-grams occur in the Google Books corpus.
An incredible amount of data, but presented very unintuitively. Largely unhelpful imo.
http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
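Part of what makes the raw files awkward is that each n-gram appears once per year, so you have to add the rows up yourself. Below is a rough sketch of that step. It assumes each line has four tab-separated fields (n-gram, year, match count, volume count), which is how I understand the version 2 files to be laid out; check the dataset page before relying on it. The function name `total_counts` and the sample rows are made up for illustration and are not real data.

```python
from collections import Counter

def total_counts(lines):
    """Sum match counts per n-gram across all years.

    Assumed line layout: ngram <TAB> year <TAB> match_count <TAB> volume_count
    """
    totals = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip anything that does not match the assumed layout
        ngram, _year, match_count, _volume_count = fields
        totals[ngram.lower()] += int(match_count)
    return totals

if __name__ == "__main__":
    # Made-up rows for illustration only, not real corpus values.
    sample_lines = [
        "th\t1999\t120\t10\n",
        "th\t2000\t80\t8\n",
        "he\t2000\t150\t9\n",
    ]
    for ngram, n in total_counts(sample_lines).most_common():
        print(ngram, n)
```

For a real file you would pass an open file handle instead of the sample list, since the function only needs something it can iterate over line by line.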
Peter Norvig took the Google Ngram data and made it useful. His page is basically a distillation of all the important stuff you would want to know, presented in a much more helpful format. Much better.
http://norvig.com/mayzner.html
Very interesting site with several frequency lists for both written and spoken English, broken down in several ways which are not commonly found in other data.
The simplest and most directly applicable to keyboard typing is the Word Frequency In Written English list.
https://ucrel.lancs.ac.uk/bncfreq/flists.html
What can we say about Vivian Cook? Idk who this person is, or why they put this data together. The website is terrible. But the data is incredible.
Frequency of letter position in word (this is very hard to find elsewhere!)
Bigram Contact Chart (how often two letters touch each other)
BONUS: I took the contact chart data and made this bigram heatmap out of it:
The English Corpora website lets you browse and download many different corpora, like the Wikipedia Corpus or the Corpus of American Soap Operas.
They don't have everything, and some cost money. So here are some other options: