Combining diacritics

From DejaVu

Positioning

Vera comes with a lot of precomposed characters. The characters composed with combining diacritics should match those.

Basic scripts

Basic scripts are the most common ones, or those that share the same features and common diacritics with them. Latin, Cyrillic, Greek and Tifinagh share the same Mark anchors.

Mark to Base
  • above
  • below

These anchors have not yet been added to all potential base characters. Sans and Serif have better support than the Mono fonts. Other anchors still need to be added, such as:

  • middle for overstruck (overstriking) diacritics, even though their use is not advised
  • ogonek for ogonek
  • cedilla for cedilla
  • left for attaching diacritics on the left
  • hook for hook positioning

Mark to Mark
  • below-mark
  • above-mark

We still need to figure out how to compose polytonic Greek so that it matches the precomposed forms. The same mechanism could be used for Tuareg Tifinagh, which aligns diacritics above the base instead of superposing them.
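
The Mark to Base and Mark to Mark anchors listed above can be illustrated with a minimal Python sketch. It is not how the fonts are built: the Glyph class, the position_marks function and all coordinates are hypothetical, and advance widths are ignored, so every offset is simply relative to the base glyph origin.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Glyph:
    name: str
    # Anchors where marks attach onto this glyph, e.g. "above" on a base
    # or "above-mark" on a mark: anchor name -> (x, y) in font units.
    base_anchors: dict = field(default_factory=dict)
    # Anchors by which this glyph attaches to another glyph.
    mark_anchors: dict = field(default_factory=dict)


def position_marks(base, marks, anchor="above"):
    """Return (mark name, dx, dy) offsets relative to the base origin."""
    placed = []
    target = base.base_anchors[anchor]  # Mark to Base for the first mark
    for mark in marks:
        mx, my = mark.mark_anchors[anchor]
        dx, dy = target[0] - mx, target[1] - my
        placed.append((mark.name, dx, dy))
        # Mark to Mark: the next mark attaches to this mark's
        # "<anchor>-mark" anchor, shifted by the offset just applied.
        nxt = mark.base_anchors.get(anchor + "-mark")
        if nxt is not None:
            target = (nxt[0] + dx, nxt[1] + dy)
    return placed


# Hypothetical coordinates, not taken from the actual fonts.
a = Glyph("a", base_anchors={"above": (300, 1100)})
acute = Glyph(
    "acutecomb",
    base_anchors={"above-mark": (-300, 1700)},
    mark_anchors={"above": (-300, 1150)},
)
stacked = position_marks(a, [acute, acute])
```

With these made-up values the first acute is shifted so that its "above" anchor lands on the "above" anchor of the base, and the second acute lands on the "above-mark" anchor of the first.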

Lao

Lao was made into its own ScriptLang and thus has its own set of above and below diacritics.

RTL scripts

RTL scripts are right-to-left scripts. Arabic, Hebrew and N’ko share the same anchors. Do they share the same marks? Maybe they should be split?

Case Form

There is a ccmp lookup for contextual substitution of combining diacritics on uppercase letters as well as lowercase letters with ascenders.

The proposed list of base characters triggering the substitution is:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z b d f h k l t Agrave Aacute Acircumflex Atilde Adieresis Aring AE Ccedilla Egrave Eacute Ecircumflex Edieresis Igrave Iacute Icircumflex Idieresis Eth Ntilde Ograve Oacute Ocircumflex Otilde Odieresis Oslash Ugrave Uacute Ucircumflex Udieresis Yacute Thorn germandbls Amacron Abreve Aogonek Cacute Ccircumflex Cdotaccent Dcaron Dcroat Emacron Ebreve Edotaccent Eogonek Ecaron Gcircumflex Gbreve Gdotaccent Gcommaaccent Hcircumflex hcircumflex Hbar hbar Itilde Imacron Ibreve Iogonek Idotaccent IJ Jcircumflex Kcommaaccent Lacute lacute Lcommaaccent lcommaaccent Lcaron lcaron Ldot ldot Lslash lslash Nacute Ncommaaccent Ncaron Eng Omacron Obreve Ohungarumlaut OE Racute Rcommaaccent Rcaron Sacute Scircumflex Scedilla Scaron Tcommaaccent Tcaron Tbar Utilde Umacron Ubreve Uring Uhungarumlaut Uogonek Wcircumflex Ycircumflex Ydieresis Zacute Zdotaccent Zcaron longs uni0180 uni0181 uni0182 uni0183 uni0184 uni0185 uni0186 uni0187 uni0188 uni0189 uni018A uni018B uni018C uni018E uni018F uni0190 uni0191 florin uni0193 uni0194 uni0195 uni0196 uni0197 uni0198 uni0199 uni019A uni019B uni019C uni019D uni019F Ohorn uni01A2 uni01A4 uni01A5 uni01A6 uni01A7 uni01A9 uni01AA uni01AB uni01AC uni01AD uni01AE Uhorn uni01B1 uni01B2 uni01B3 uni01B5 uni01B7 uni01B8 uni01BB uni01BC uni01BE uni01C0 uni01C1 uni01C2 uni01C3 uni01CD uni01CF uni01D1 uni01D3 uni01D5 uni01D7 uni01D9 uni01DB uni01DE uni01E0 uni01E2 uni01E4 Gcaron uni01E8 uni01E9 uni01EA uni01EC uni01EE uni01F4 uni01F6 uni01F7 uni01F8 Aringacute AEacute Oslashacute uni0200 uni0202 uni0204 uni0206 uni0208 uni020A uni020C uni020E uni0210 uni0212 uni0214 uni0216 Scommaaccent uni021A uni021C uni021E uni0220 uni0221 uni0222 uni0223 uni0224 uni0226 uni0228 uni022A uni022C uni022E uni0230 uni0232 uni0234 uni0236 uni0238 uni023A uni023B uni023D uni023E uni0241 uni0243 uni0244 uni0245 uni0246 uni0248 uni024A uni024C uni024E uni0253 uni0256 uni0257 uni0260 uni0266 uni0267 uni026B uni026C uni026D 
uni026E uni0278 uni0283 uni0284 uni0286 uni0288 uni0294 uni0295 uni0296 uni0297 uni029B uni02A0 uni02A1 uni02A2 Alpha Beta Gamma uni0394 Epsilon Zeta Eta Theta Iota Kappa Lambda Mu Nu Xi Omicron Pi Rho Sigma Tau Upsilon Phi Chi Psi Omega Iotadieresis Upsilondieresis upsilondieresistonos beta delta zeta theta lambda xi uni03D0 theta1 Upsilon1 uni03D3 uni03D4 phi1 uni03D8 uni03DA uni03DC uni03DD uni03DE uni03DF uni03E0 uni03E2 uni03E4 uni03E6 uni03E8 uni03EA uni03EC uni03EE uni03F4 uni03F9 uni03FA uni03FD uni03FE uni03FF uni0400 uni0401 uni0402 uni0403 uni0404 uni0405 uni0406 uni0407 uni0408 uni0409 uni040A uni040B uni040C uni040D uni040E uni040F uni0410 uni0411 uni0412 uni0413 uni0414 uni0415 uni0416 uni0417 uni0418 uni0419 uni041A uni041B uni041C uni041D uni041E uni041F uni0420 uni0421 uni0422 uni0423 uni0424 uni0425 uni0426 uni0427 uni0428 uni0429 uni042A uni042B uni042C uni042D uni042E uni042F uni0431 uni0452 uni045B uni0460 uni0462 uni0463 uni0464 uni0466 uni0468 uni046A uni046C uni046E uni0470 uni0472 uni0474 uni0476 uni0478 uni047A uni047C uni047E uni0480 uni048A uni048C uni048D uni048E uni0490 uni0492 uni0494 uni0496 uni0498 uni049A uni049C uni049E uni04A0 uni04A2 uni04A4 uni04A6 uni04A8 uni04AA uni04AC uni04AE uni04B0 uni04B2 uni04B4 uni04B6 uni04B8 uni04BA uni04BB uni04BC uni04BE uni04C1 uni04C3 uni04C5 uni04C7 uni04C9 uni04CB uni04CD uni04CF uni04D0 uni04D2 uni04D4 uni04D6 uni04D8 uni04DA uni04DC uni04DE uni04E0 uni04E2 uni04E4 uni04E6 uni04E8 uni04EA uni04EC uni04EE uni04F0 uni04F2 uni04F4 uni04F6 uni04F8 uni04FA uni04FC uni04FE uni0500 uni0502 uni0504 uni0506 uni0508 uni050A uni050C uni050E uni0510 uni0512 uni1E00 uni1E02 uni1E03 uni1E04 uni1E05 uni1E06 uni1E07 uni1E08 uni1E0A uni1E0B uni1E0C uni1E0D uni1E0E uni1E0F uni1E10 uni1E11 uni1E12 uni1E13 uni1E14 uni1E16 uni1E18 uni1E1A uni1E1C uni1E1E uni1E1F uni1E20 uni1E22 uni1E23 uni1E24 uni1E25 uni1E26 uni1E27 uni1E28 uni1E29 uni1E2A uni1E2B uni1E2C uni1E2E uni1E30 uni1E31 uni1E32 uni1E33 uni1E34 uni1E35 
uni1E36 uni1E37 uni1E38 uni1E39 uni1E3A uni1E3B uni1E3C uni1E3D uni1E3E uni1E40 uni1E42 uni1E44 uni1E46 uni1E48 uni1E4A uni1E4C uni1E4E uni1E50 uni1E52 uni1E54 uni1E56 uni1E58 uni1E5A uni1E5C uni1E5E uni1E60 uni1E62 uni1E64 uni1E66 uni1E68 uni1E6A uni1E6B uni1E6C uni1E6D uni1E6E uni1E6F uni1E70 uni1E71 uni1E72 uni1E74 uni1E76 uni1E78 uni1E7A uni1E7C uni1E7E Wgrave Wacute Wdieresis uni1E86 uni1E88 uni1E8A uni1E8C uni1E8E uni1E90 uni1E92 uni1E94 uni1E96 uni1E97 uni1E9B uni1EA0 uni1EA2 uni1EA4 uni1EA6 uni1EA8 uni1EAA uni1EAC uni1EAE uni1EB0 uni1EB2 uni1EB4 uni1EB6 uni1EB8 uni1EBA uni1EBC uni1EBE uni1EC0 uni1EC2 uni1EC4 uni1EC6 uni1EC8 uni1ECA uni1ECC uni1ECE uni1ED0 uni1ED2 uni1ED4 uni1ED6 uni1ED8 uni1EDA uni1EDC uni1EDE uni1EE0 uni1EE2 uni1EE4 uni1EE6 uni1EE8 uni1EEA uni1EEC uni1EEE uni1EF0 Ygrave uni1EF4 uni1EF6 uni1EF8 uni2C60 uni2C61 uni2C62 uni2C63 uni2C64 uni2C66 uni2C67 uni2C68 uni2C69 uni2C6A uni2C6B uni2C75 Eng.alt J.alt uni0478.monograph I.alt
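
The effect of such a lookup can be sketched in Python on a list of glyph names. This is only an illustration under assumptions: the ".case" suffix for the substituted diacritics is hypothetical, the two sets are small excerpts of the lists on this page, and only the mark immediately following a triggering base is substituted.

```python
# Small excerpt of the triggering base characters listed above.
TALL_BASES = {"A", "B", "b", "d", "Agrave", "uni1E02"}
# Small excerpt of the combining marks above.
MARKS_ABOVE = {"gravecomb", "acutecomb", "uni0302"}
# Hypothetical naming convention for the case-form variants.
CASE_SUFFIX = ".case"


def apply_case_forms(glyphs):
    """Swap a mark above for its case form when it follows a tall base."""
    out = []
    for i, name in enumerate(glyphs):
        follows_tall = i > 0 and glyphs[i - 1] in TALL_BASES
        if name in MARKS_ABOVE and follows_tall:
            out.append(name + CASE_SUFFIX)
        else:
            out.append(name)
    return out
```

For example, apply_case_forms(["A", "acutecomb"]) gives ["A", "acutecomb.case"], while ["a", "acutecomb"] is left unchanged because a is not in the list of triggering bases.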

Contextual substitution

In some cases, contextual substitutions are necessary when a character is followed by a combining diacritic.

Dotted characters

A contextual substitution can take place when a combining mark above is attached to a variant of i or j. In that context the dot is dropped and replaced by the combining mark.

original | context | single substitution | example
i <U+0069> | i <U+0069> + mark_above | ı <U+0131> | i + U+0301 → í = NFD(í)
j <U+006A> | j <U+006A> + mark_above | ȷ <U+0237> | j + U+0301 → j́
ɨ <U+0268> | ɨ <U+0268> + mark_above | uni0268.dotless | ɨ + U+0301 → ɨ́
ɉ <U+0249> | ɉ <U+0249> + mark_above | uni0249.dotless | ɉ + U+0301 → ɉ́

original | context | multiple substitution | example
į <U+012F> | į <U+012F> + mark_above | ı <U+0131> + <U+0328> + mark_above | į + U+0301 → į́ = NFD(į́)
ḭ <U+1E2D> | ḭ <U+1E2D> + mark_above | ı <U+0131> + <U+0330> + mark_above | ḭ + U+0301 → ḭ́ = NFD(ḭ́)
ị <U+1ECB> | ị <U+1ECB> + mark_above | ı <U+0131> + <U+0323> + mark_above | ị + U+0301 → ị́ = NFD(ị́)

There are more complex cases when the i or j variant can have below marks, thus changing the context. A reasonable number of contexts can be added, for example i + mark_below + mark_above.
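
The substitutions above, including the i + mark_below + mark_above context, can be modelled at the character level with a short Python sketch. This is an approximation under assumptions, not the font lookup itself: it works on Unicode strings rather than glyphs, treats canonical combining class 230 as "mark above" and classes 202 and 220 as "mark below", and covers only i and j because ɨ and ɉ have no encoded dotless form. The function name drop_soft_dots is hypothetical.

```python
import unicodedata

# Only the pairs with an encoded dotless counterpart.
DOTLESS = {"i": "\u0131", "j": "\u0237"}
ABOVE = 230          # canonical combining class of marks above
BELOW = (202, 220)   # attached below (ogonek, cedilla) and below


def drop_soft_dots(text):
    """Replace i/j by their dotless forms when a mark above follows,
    skipping over any marks below in between."""
    chars = unicodedata.normalize("NFD", text)
    out = []
    for i, ch in enumerate(chars):
        if ch in DOTLESS:
            j = i + 1
            while j < len(chars) and unicodedata.combining(chars[j]) in BELOW:
                j += 1
            if j < len(chars) and unicodedata.combining(chars[j]) == ABOVE:
                ch = DOTLESS[ch]
        out.append(ch)
    return "".join(out)
```

Because the input is decomposed first, precomposed characters such as į <U+012F> are handled like the multiple substitutions in the table: drop_soft_dots("\u012f\u0301") yields ı <U+0131> + <U+0328> + <U+0301>.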

Marks above

gravecomb acutecomb uni0302 tildecomb uni0304 uni0305 uni0306 uni0307 uni0308 hookabovecomb uni030A uni030B uni030C uni030D uni030E uni030F uni0310 uni0311 uni0312 uni0313 uni0314 uni033D uni033E uni033F uni0340 uni0341 uni0342 uni0343 uni0344 uni0346 uni034A uni034B uni034C uni0351 uni0352 uni0357 uni0483 uni0484 uni0485 uni0486 uni20D0 uni20D1 uni20D6 uni20D7

Marks below

uni0316 uni0317 uni0318 uni0319 uni031C uni031D uni031E uni031F uni0320 uni0321 uni0322 dotbelowcomb uni0324 uni0325 uni0326 uni0327 uni0328 uni0329 uni032A uni032B uni032C uni032D uni032E uni032F uni0330 uni0331 uni0332 uni0333 uni0339 uni033A uni033B uni033C uni0345 uni0347 uni0348 uni0349 uni034D uni034E
