/top_human_languages

Analysis and list of top 3 and top 10 human languages for product and service translations.

MIT License

TOP HUMAN LANGUAGES

When developing products or services destined for worldwide distribution, one must naturally consider which languages should be supported. There are over 6000 known languages, so supporting all of them would be nearly impossible. Even supporting 100 would be incredibly expensive and difficult.

So, which languages should be chosen?

Ultimately, such a decision is very likely unique to each product or service. But this project and repository are designed to give you a starting point: a list of languages to support, based on real-world numbers and a predictable formula.

To start with the results:

TOP 3 LANGUAGES

Language          Code  Score  Percent of World Reached
English           en    0.186  13%
Mandarin Chinese  zh    0.168  14%
Spanish           es    0.072  7%

TOP 10 LANGUAGES

Language          Code  Score  Percent of World Reached
English           en    0.186  13%
Mandarin Chinese  zh    0.168  14%
Spanish           es    0.072  7%
Arabic            ar    0.055  6%
Hindi             hi    0.055  6%
Russian           ru    0.041  4%
Portuguese        pt    0.033  3%
French            fr    0.030  3%
Bengali           bn    0.030  3%
German            de    0.029  2%
Japanese          ja    0.026  2%

THE DETAILS

Most of the details are spelled out in the spreadsheet labeled calculations.ods.

But, in general, each language is measured on four factors, and each factor carries a fixed weight. Each weight is then multiplied by the language's relative impact for that factor.

15% : native speaker count

Weighs the languages in favor of the number of native fluent speakers.

50% : total speaker count

Weighs the languages in favor of reaching the most people in general. This includes both native speakers and people who speak the language as a second language. The "Percent of World Reached" number comes from the total speaker count.

10% : web content

Weighs the languages in favor of web page language counts.

25% : PPP GDP

Weighs the languages in favor of the economic impact of the language. Specifically, the relative purchasing power of the populations of speakers.
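
The four weights above can be read as a weighted sum. The following is only a rough sketch of that reading, not the spreadsheet's actual formulas: it assumes that "relative impact" means a language's fraction of the world total for each factor, and that the four weighted values are added together. The function name, the factor keys, and the example shares are all hypothetical and are not taken from calculations.ods.

```python
# Weights from the list above; they sum to 1.0.
WEIGHTS = {
    "native_speakers": 0.15,
    "total_speakers": 0.50,
    "web_content": 0.10,
    "ppp_gdp": 0.25,
}


def language_score(shares, weights=WEIGHTS):
    """Return the weighted sum of a language's share in each factor.

    `shares` maps each factor name to the language's fraction of the
    world total for that factor (an assumed meaning of "relative impact").
    """
    return sum(weights[factor] * shares[factor] for factor in weights)


# Placeholder shares for an imaginary language.
# These are illustrative values only, NOT numbers from the spreadsheet.
example_shares = {
    "native_speakers": 0.05,
    "total_speakers": 0.10,
    "web_content": 0.20,
    "ppp_gdp": 0.08,
}

score = language_score(example_shares)
print(round(score, 4))
```

Because total speaker count carries half of the weight, a language's score under this reading tracks its overall reach more closely than any other single factor.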

NOTE

Please note that while the numbers are from good sources, the 15%/50%/10%/25% distribution of the importance of those values is an arbitrary decision of the author.

The spreadsheet is designed to make it easy to change that distribution if you would like to make your own TOP 3 and/or TOP 10 list.
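
If you would rather experiment outside the spreadsheet, the same idea can be sketched in a few lines of Python. This is a hedged illustration, not a port of calculations.ods: the function name `top_languages`, the language codes `aa`, `bb`, and `cc`, and every share value are made up, and the score is assumed to be a plain weighted sum.

```python
def top_languages(shares_by_language, weights, n):
    """Rank languages by weighted score and return the top n (code, score) pairs."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights must sum to 1.0, got {total}")
    scores = {
        code: sum(weights[factor] * shares[factor] for factor in weights)
        for code, shares in shares_by_language.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up shares for three imaginary languages (NOT real data).
shares_by_language = {
    "aa": {"native": 0.10, "total": 0.05, "web": 0.02, "gdp": 0.03},
    "bb": {"native": 0.02, "total": 0.12, "web": 0.05, "gdp": 0.04},
    "cc": {"native": 0.03, "total": 0.04, "web": 0.20, "gdp": 0.10},
}

# The author's 15%/50%/10%/25% distribution.
default_weights = {"native": 0.15, "total": 0.50, "web": 0.10, "gdp": 0.25}
# A hypothetical alternative that favors web content.
web_heavy_weights = {"native": 0.10, "total": 0.20, "web": 0.50, "gdp": 0.20}

default_top = top_languages(shares_by_language, default_weights, 2)
web_heavy_top = top_languages(shares_by_language, web_heavy_weights, 2)
print([code for code, _ in default_top])
print([code for code, _ in web_heavy_top])
```

With these made-up shares, changing the distribution changes which language ranks first, which is the kind of sensitivity worth checking before committing to a list.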