The following table lists groups of IPA characters and the Unicode blocks in which they can be found. The U+ prefix is the conventional way to mark a Unicode code point; the digits that follow it are a 16-bit hexadecimal value.
| IPA Characters | Unicode block |
|---|---|
| Standard Latin | U+0041 -- U+00FF |
| European and Extended Latin | U+0100 -- U+01F0 |
| Standard phonetic characters | U+0250 -- U+02AF |
| Modifier letters (spacing) | U+02B0 -- U+02FF |
| Diacritical marks (nonspacing) | U+0300 -- U+036F |
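
As an illustration of the U+ notation, the following Python sketch converts a label such as `U+0259` into its character and reports which of the phonetic blocks listed above contains it. The helper names `parse_codepoint` and `block_of` are hypothetical and are not part of any speech API; only the three phonetic-specific ranges from the table are included.

```python
import unicodedata

# Inclusive ranges copied from the table above. Only the three
# phonetic-specific blocks are covered in this sketch.
IPA_BLOCKS = [
    (0x0250, 0x02AF, "Standard phonetic characters"),
    (0x02B0, 0x02FF, "Modifier letters (spacing)"),
    (0x0300, 0x036F, "Diacritical marks (nonspacing)"),
]


def parse_codepoint(label: str) -> str:
    """Turn a label such as 'U+0259' into the character it names."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"not a U+ label: {label!r}")
    return chr(int(label[2:], 16))


def block_of(ch: str):
    """Return the table's block name for ch, or None if it is outside the listed ranges."""
    cp = ord(ch)
    for low, high, name in IPA_BLOCKS:
        if low <= cp <= high:
            return name
    return None


schwa = parse_codepoint("U+0259")
print(schwa, unicodedata.name(schwa), block_of(schwa))
```

The standard `unicodedata` module supplies the full Unicode name, which can be compared against the abbreviated names used in the tables in this section.
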
The symbols used for American English phonemes are listed below. Each phoneme symbol is accompanied by an example, the IPA description, the Unicode name for the glyph shape used in the IPA standard phonetic charts, and the Unicode value. Some phonemic labels are described as diphthongs or affricate clusters. For these, it may be preferable to use the MS labels rather than the Unicode clusters of their component phonemes, because some TTS engines provide a single combined data point for each of these phonemes instead of synthesizing it from separately modeled phonemes. In the Unicode names, 'LATIN' means 'LATIN SMALL LETTER' and 'GREEK' means 'GREEK SMALL LETTER'.
| MS | Example | IPA Description | Unicode name | Unicode |
|---|---|---|---|---|
| iy | feel, eve, me | front close unrounded | LATIN I | U+0069 |
| ih | fill, hit, lid | front close unrounded (lax) | LATIN CAPITAL I | U+026A |
| ae | at, carry, gas | front open unrounded (tense) | LATIN AE | U+00E6 |
| aa | father, ah, car | back open unrounded | LATIN ALPHA | U+0251 |
| ah | cut, bud, up | open-mid back unrounded | LATIN TURNED V | U+028C |
| ao | dog, lawn, caught | open-mid back rounded | LATIN OPEN O | U+0254 |
| ay | tie, ice, bite | diphthong with quality: aa + ih | | |
| ax | ago, comply | central close mid (schwa) | LATIN SCHWA | U+0259 |
| ey | ate, day, tape | front close-mid unrounded (tense) | LATIN E | U+0065 |
| eh | pet, berry, ten | front open-mid unrounded | LATIN OPEN E | U+025B |
| er | turn, fur, meter | central open-mid unrounded rhoticized | LATIN SCHWA W/HOOK | U+025A |
| ow | go, own, tone | back close-mid rounded | LATIN O | U+006F |
| aw | foul, how, our | diphthong with quality: aa + uh | | |
| oy | toy, coin, oil | diphthong with quality: ao + ih | | |
| uh | book, pull, good | back close-mid unrounded (lax) | LATIN UPSILON | U+028A |
| uw | tool, crew, moo | back close rounded | LATIN U | U+0075 |
| b | big, able, tab | voiced bilabial plosive | LATIN B | U+0062 |
| p | put, open, tap | voiceless bilabial plosive | LATIN P | U+0070 |
| d | dig, idea, wad | voiced alveolar plosive | LATIN D | U+0064 |
| t | talk, sat | voiceless alveolar plosive | LATIN T | U+0074 |
| t | meter | alveolar flap | LATIN R W/FISHHOOK | U+027E |
| g | gut, angle, tag | voiced velar plosive | LATIN SCRIPT G | U+0067 |
| k | cut, oaken, take | voiceless velar plosive | LATIN K | U+006B |
| f | fork, after, if | voiceless labiodental fricative | LATIN F | U+0066 |
| v | vat, over, have | voiced labiodental fricative | LATIN V | U+0076 |
| s | sit, cast, toss | voiceless alveolar fricative | LATIN S | U+0073 |
| z | zap, lazy, haze | voiced alveolar fricative | LATIN Z | U+007A |
| th | thin, nothing, truth | voiceless dental fricative | GREEK THETA | U+03B8 |
| dh | then, father, scythe | voiced dental fricative | LATIN ETH | U+00F0 |
| sh | she, cushion, wash | voiceless postalveolar fricative | LATIN ESH | U+0283 |
| zh | genre, azure | voiced postalveolar fricative | LATIN EZH | U+0292 |
| l | lid | alveolar lateral approximant | LATIN L | U+006C |
| l | elbow, sail | velar lateral approximant | LATIN L W/MIDDLE TILDE | U+026B |
| r | red, part, far | retroflex approximant | LATIN R | U+0279 |
| y | yacht, onion, yard | palatal sonorant glide | LATIN J | U+006A |
| w | with, away | labiovelar sonorant glide | LATIN W | U+0077 |
| hh | help, ahead, hotel | voiceless glottal fricative | LATIN H | U+0068 |
| m | mat, amid, aim | bilabial nasal | LATIN M | U+006D |
| n | no, end, pan | alveolar nasal | LATIN N | U+006E |
| nx | sing, anger, drink | velar nasal | LATIN ENG | U+014B |
| ch | chin, archer, march | voiceless alveolar affricate: t + sh | | U+02A7 |
| jh | joy, agile, edge | voiced alveolar affricate: d + zh | | U+02A4 |
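
The table can be read as a lookup from MS labels to Unicode characters. The sketch below, which assumes a space-separated string of MS labels as input, converts such a string to IPA text for display. It covers only a subset of the table, and the names `MS_TO_IPA`, `DIPHTHONGS`, and `ms_to_ipa` are hypothetical. Diphthongs are expanded into their two component phonemes as the table describes them; this expansion is for display only, since an engine may model the MS label as a single unit.

```python
# Subset of the phoneme table above: MS label -> Unicode character.
MS_TO_IPA = {
    "iy": "\u0069", "ih": "\u026A", "ae": "\u00E6", "aa": "\u0251",
    "ah": "\u028C", "ao": "\u0254", "ax": "\u0259", "eh": "\u025B",
    "ow": "\u006F", "uh": "\u028A", "uw": "\u0075",
    "hh": "\u0068", "l": "\u006C", "t": "\u0074", "d": "\u0064",
    "sh": "\u0283", "zh": "\u0292", "ch": "\u02A7", "jh": "\u02A4",
}

# The table gives no single code point for diphthongs; it describes
# each one as a pair of component phonemes.
DIPHTHONGS = {
    "ay": ("aa", "ih"),
    "aw": ("aa", "uh"),
    "oy": ("ao", "ih"),
}


def ms_to_ipa(phonemes: str) -> str:
    """Convert space-separated MS labels to a string of IPA characters."""
    out = []
    for label in phonemes.split():
        if label in DIPHTHONGS:
            out.extend(MS_TO_IPA[part] for part in DIPHTHONGS[label])
        elif label in MS_TO_IPA:
            out.append(MS_TO_IPA[label])
        else:
            raise KeyError(f"label not in this sketch's subset: {label!r}")
    return "".join(out)


print(ms_to_ipa("hh eh l ow"))
print(ms_to_ipa("t ay"))
```
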
The following symbols can be used to construct phoneme strings and phonetic input to a TTS engine.
The precise effects may vary in different TTS engines.
| MS | Description | Unicode name | Unicode | Usage/Effect |
|---|---|---|---|---|
| - | syllable boundary | HYPHEN-MINUS | U+002D | separates syllables |
| # | word boundary | NUMBER SIGN | U+0023 | separates words |
| (space) | word boundary | SPACE | U+0020 | separates words |
| _ | silence | LOW LINE | U+005F | indicates silent period |
| 1 | primary stress | MODIFIER LETTER VERTICAL LINE | U+02C8 | precedes affected vowel |
| 2 | secondary stress | MODIFIER LETTER LOW VERTICAL LINE | U+02CC | precedes affected vowel |
| (blank) | word boundary | SPACE | U+0020 | separates words |
| . | period | FULL STOP | U+002E | pitch fall, pause |
| ? | question mark | QUESTION MARK | U+003F | pitch rise, pause |
| ! | exclamation | EXCLAMATION MARK | U+0021 | raised pitch range, pause |
| , | comma | COMMA | U+002C | continuation rise, pause |
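
To show how the boundary and stress symbols combine with phoneme labels, the following sketch translates a list of MS tokens into a Unicode string using the code points from the table above. The token order (a stress digit placed immediately before the vowel it affects) follows the "precedes affected vowel" note; whether a particular engine accepts exactly this sequence is an assumption. The names `MS_SYMBOLS`, `translate_symbols`, and `demo_phonemes` are hypothetical.

```python
# MS symbol -> Unicode character, taken from the table above.
MS_SYMBOLS = {
    "-": "\u002D",  # syllable boundary
    "#": "\u0023",  # word boundary
    "_": "\u005F",  # silence
    "1": "\u02C8",  # primary stress, precedes the affected vowel
    "2": "\u02CC",  # secondary stress, precedes the affected vowel
    ".": "\u002E",  # pitch fall, pause
    "?": "\u003F",  # pitch rise, pause
    "!": "\u0021",  # raised pitch range, pause
    ",": "\u002C",  # continuation rise, pause
}


def translate_symbols(tokens, phoneme_map):
    """Map each token through MS_SYMBOLS first, then through phoneme_map."""
    out = []
    for tok in tokens:
        if tok in MS_SYMBOLS:
            out.append(MS_SYMBOLS[tok])
        else:
            out.append(phoneme_map[tok])  # raises KeyError for unknown labels
    return "".join(out)


# A tiny phoneme map for the demonstration only.
demo_phonemes = {"hh": "\u0068", "ax": "\u0259", "l": "\u006C", "ow": "\u006F"}

# Two syllables with primary stress on the second vowel.
tokens = ["hh", "ax", "-", "l", "1", "ow"]
print(translate_symbols(tokens, demo_phonemes))
```
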
Use the Prn control tag to indicate how to pronounce text by passing the phonetic equivalent to the engine. For information about Prn, see "Text-to-Speech Control Tags."
Note these rules:
For more information about IPA characters and Unicode, see the following publications: