The phoneme inventory is based on an analysis of the phonemes of 50 languages, each of which has at least 20 million native speakers: Amharic, Arabic, Azerbaijani, Bengali, Burmese, Cantonese, Dutch, English, French, Gan, German, Gujarati, Hakka, Hausa, Hindi, Indonesian, Italian, Japanese, Javanese, Kannada, Korean, Kurdish, Malayalam, Mandarin, Marathi, Min, Oriya, Oromo, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbo-Croatian, Sindhi, Spanish, Sundanese, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Wu, and Yoruba.

There are five vowels, {a, e, i, o, u}, and eleven consonants, {g, h, k, l, m, n, p, s, t, v, y}. All letters have their standard IPA values, with the one exception that y represents [j]. Because there are few phonemes, quite a lot of free variation is permissible. Most notably:

  • G can be realized as either [g] or [ŋ]. Except for some dialects of Arabic, every language on the list uses at least one of these.
  • H can be realized as [h], [ɦ], [x], [f], or [ɸ]. This accommodates every language except Tamil and Telugu, and even Tamil uses [f] in some foreign loanwords.
  • L can be realized as [l] or, for the benefit of Japanese speakers, [ɺ].
  • P can be realized as either [p] or [b]. This accommodates Arabic, Hausa, Oromo, and Yoruba, which lack [p].
  • V can be realized as [v], [w], or [ʋ].
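
The free variation listed above can be sketched as a lookup table. This is only an illustration: the names ALLOPHONES, realizations, and to_letters are hypothetical, and the table assumes that the five letters listed are the only ones with more than one realization, which the text ("most notably") does not guarantee.

```python
# Permitted realizations for the letters that vary, taken from the list above.
# Assumption: every other letter has exactly one realization, its IPA value.
ALLOPHONES = {
    "g": ["g", "ŋ"],
    "h": ["h", "ɦ", "x", "f", "ɸ"],
    "l": ["l", "ɺ"],
    "p": ["p", "b"],
    "v": ["v", "w", "ʋ"],
    "y": ["j"],  # the one letter whose value differs from its IPA symbol
}

LETTERS = "aeiou" + "ghklmnpstvy"


def realizations(letter):
    """Return the phones that may realize a letter."""
    return ALLOPHONES.get(letter, [letter])


# Reverse map: each phone belongs to exactly one letter, so no two
# letters compete for the same sound.
PHONE_TO_LETTER = {
    phone: letter for letter in LETTERS for phone in realizations(letter)
}


def to_letters(phones):
    """Spell a sequence of phones using the 16 letters."""
    return "".join(PHONE_TO_LETTER[p] for p in phones)
```

For instance, to_letters(["b", "a", "w", "i"]) spells the word as pavi, since [b] is a realization of P and [w] of V.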


Every root has the canonical form CVCV, with the restriction that the syllables yi and vu are not allowed. Of the 11 × 5 = 55 conceivable CV syllables, this leaves 53 permissible ones, for a total of 53 × 53 = 2,809 possible roots.
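
The count can be checked by brute-force enumeration. The following is a minimal sketch; the variable names are illustrative only.

```python
from itertools import product

VOWELS = "aeiou"
CONSONANTS = "ghklmnpstvy"
FORBIDDEN = {"yi", "vu"}  # the two excluded syllables

# Every consonant-vowel pair except the forbidden ones.
syllables = [
    c + v for c, v in product(CONSONANTS, VOWELS) if c + v not in FORBIDDEN
]

# A root is any two permissible syllables in sequence (CVCV).
roots = [first + second for first, second in product(syllables, repeat=2)]

print(len(syllables), len(roots))  # 53 2809
```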

Many roots can be abbreviated by removing redundant letters. The rules for this are as follows:

  • If the two vowels are identical, it is always permissible to delete the second vowel. For example, yene can be shortened to yen.
  • If the two consonants are identical, the initial consonant may be deleted, but only when the root is word-initial. For example, momi can be shortened to omi when it stands alone or when it begins a word, as in (m)omituka; but when it appears elsewhere in a word, as in tukamomi, shortening is not allowed.
  • If both the two consonants and the two vowels are identical, the entire second syllable may be omitted. This is allowed only (a) when the root stands alone or (b) when it is immediately preceded and/or followed by a contracted CVC or VCV root. When these conditions are not met, the root may be abbreviated in one of the other two ways mentioned above. For example, koko when it stands alone can become ko (or kok or oko). Kokotene can become koten, koktene, kokten, okotene, or okoten, but not *kotene. Nanakoko could be abbreviated in even more ways: nanakok, nankoko, nankok, nanko, anakoko, anakok, anako, or nakok, but not *nakoko, *nanako, or *nako (the last being a root in its own right).
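
The three rules can be sketched as a small enumerator. This is one reading of the rules, not a definitive implementation: the function names are hypothetical, a word is passed in as a list of its CVCV roots, and it is assumed that at most one rule applies to any single root (so koko never becomes *ok).

```python
from itertools import product


def root_forms(root, word_initial):
    """Return (form, kind) pairs for a single CVCV root."""
    c1, v1, c2, v2 = root
    forms = [(root, "full")]
    if v1 == v2:
        forms.append((root[:3], "cvc"))  # rule 1: drop the second vowel
    if c1 == c2 and word_initial:
        forms.append((root[1:], "vcv"))  # rule 2: drop the initial consonant
    if c1 == c2 and v1 == v2:
        forms.append((root[:2], "cv"))   # rule 3: drop the second syllable
    return forms


def abbreviations(roots):
    """Return every permitted shortened spelling of a word."""
    options = [root_forms(r, i == 0) for i, r in enumerate(roots)]
    results = set()
    for combo in product(*options):
        kinds = [kind for _, kind in combo]
        valid = True
        for i, kind in enumerate(kinds):
            if kind != "cv" or len(kinds) == 1:
                continue  # only CV forms inside a compound need a check
            neighbours = kinds[max(i - 1, 0):i] + kinds[i + 1:i + 2]
            # A CV form needs an adjacent contracted CVC or VCV root.
            if not any(k in ("cvc", "vcv") for k in neighbours):
                valid = False
        if valid and any(k != "full" for k in kinds):
            results.add("".join(form for form, _ in combo))
    return results
```

Under these assumptions, abbreviations(["nana", "koko"]) yields exactly the eight forms listed above, and abbreviations(["tuka", "momi"]) is empty. For ["koko", "tene"] the sketch returns the five listed forms plus kokoten, which the first rule alone appears to permit.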
