Second-Level Reference Label Generation Rules
ICANN has developed second-level Internationalized Domain Name (IDN) tables in a machine-readable format, known as Label Generation Rules (LGRs), that registry operators can reference when designing their IDN tables. ICANN org will use these reference LGRs when reviewing IDN tables submitted for use with generic top-level domains (gTLDs).
The reference LGRs were developed using guidelines that have been reviewed by the community. They are provided below in XML format, along with a more readable HTML version.
If you have questions or feedback regarding these reference LGRs, please send an email to IDNprogram@icann.org.
Current Version (25 October 2024)
The current version of the Second-Level Reference LGRs was developed in consultation with the respective script communities. Other resources were also consulted where available, e.g., the Root Zone Label Generation Rules (RZ-LGR). The LGRs are finalized after a Public Comment proceeding.
This version adds LGRs for the Balinese and Thaana scripts and for the Inuktitut language. The Common LGR has been updated to incorporate these new LGRs.
See the Overview and Summary document for further details about these LGRs. The package of all LGRs is available here [ZIP, 13.6 MB]. LGRs with normative updates are marked with an asterisk (*), and the changes are documented in the respective LGR document.
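For readers who want to inspect the XML versions programmatically, the following is a minimal sketch of reading a repertoire and its variants with Python's standard library. It assumes the files follow the LGR XML format of RFC 7940 (namespace `urn:ietf:params:xml:ns:lgr-1.0`, with `char` and `var` elements carrying space-separated hexadecimal `cp` attributes). The embedded sample, the `read_repertoire` helper, and the variant shown are illustrative assumptions, not content taken from any published reference LGR.

```python
import xml.etree.ElementTree as ET

# Namespace defined by RFC 7940 for LGR documents.
LGR_NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# Hypothetical, heavily reduced sample; real reference LGRs are much larger
# and also contain rules, actions, and richer metadata.
SAMPLE_LGR = """<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <meta>
    <version>1</version>
    <language>und-Latn</language>
  </meta>
  <data>
    <char cp="0061">
      <var cp="0430" type="blocked"/>
    </char>
    <char cp="0062"/>
    <char cp="0063 0068"/>
  </data>
</lgr>
"""


def parse_cp(value):
    """Turn a space-separated list of hex code points into a tuple of ints."""
    return tuple(int(part, 16) for part in value.split())


def read_repertoire(xml_text):
    """Return (language tags, {code point sequence: [(variant, type), ...]})."""
    root = ET.fromstring(xml_text)
    languages = [el.text for el in root.findall("lgr:meta/lgr:language", LGR_NS)]
    repertoire = {}
    for char in root.findall("lgr:data/lgr:char", LGR_NS):
        variants = [
            (parse_cp(var.get("cp")), var.get("type"))
            for var in char.findall("lgr:var", LGR_NS)
        ]
        repertoire[parse_cp(char.get("cp"))] = variants
    return languages, repertoire


languages, repertoire = read_repertoire(SAMPLE_LGR)
for sequence, variants in repertoire.items():
    label = "".join(chr(cp) for cp in sequence)
    print(label, variants)
```

This sketch ignores `range` elements, rules, and actions, so it is suited to a quick look at a file rather than to validating labels.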
Script-based LGRs
| Name | Language Tag¹ | LGR Document |
|---|---|---|
| Arabic | und-Arab | HTML, XML |
| Armenian | und-Armn | HTML, XML |
| Balinese | und-Bali | HTML, XML, Supporting Document |
| Bangla (Bengali) | und-Beng | HTML, XML |
| Cyrillic | und-Cyrl | HTML, XML |
| Devanagari | und-Deva | HTML, XML |
| Ethiopic | und-Ethi | HTML, XML |
| Georgian | und-Geor | HTML, XML |
| Greek | und-Grek | HTML, XML |
| Gujarati | und-Gujr | HTML, XML |
| Gurmukhi | und-Guru | HTML, XML |
| Hebrew | und-Hebr | HTML, XML |
| Japanese | und-Jpan | HTML, XML |
| Kannada | und-Knda | HTML, XML |
| Khmer | und-Khmr | HTML, XML |
| Lao | und-Laoo | HTML, XML |
| Latin | und-Latn | HTML, XML |
| Malayalam | und-Mlym | HTML, XML |
| Myanmar | und-Mymr | HTML, XML |
| Oriya | und-Orya | HTML, XML |
| Sinhala | und-Sinh | HTML, XML |
| Tamil | und-Taml | HTML, XML |
| Telugu | und-Telu | HTML, XML |
| Thaana | und-Thaa | HTML, XML, Supporting Document |
| Thai | und-Thai | HTML, XML |
1: The prefix 'und' (Undetermined) identifies linguistic content whose language is not determined. Please see RFC 5646 for details of the language tag syntax and the IANA Language Subtag Registry for the available language tags.
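The tags used in the tables on this page combine a language subtag with an optional four-letter script subtag. As a rough illustration, the sketch below splits such a tag into its two parts. It deliberately handles only this language-plus-script subset of the RFC 5646 syntax (no region, variant, or extension subtags), and the `split_tag` name is a hypothetical helper, not part of any ICANN tooling.

```python
import re

# Covers only "language" or "language-Script" tags, e.g. "bs" or "und-Arab".
# Full RFC 5646 syntax allows many more subtags than this pattern accepts.
TAG_RE = re.compile(r"^(?P<language>[A-Za-z]{2,3})(?:-(?P<script>[A-Za-z]{4}))?$")


def split_tag(tag):
    """Return (language, script) with conventional casing; script may be None."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported language tag: {tag!r}")
    language = match.group("language").lower()
    script = match.group("script")
    return language, (script.title() if script else None)


for tag in ["und-Arab", "bs-Cyrl", "bs", "cnr-Cyrl"]:
    language, script = split_tag(tag)
    kind = "script-based" if language == "und" else "language-based"
    print(tag, language, script, kind)
```

Treating `und` as the marker for a script-based LGR follows the convention described in the footnote above; the Common LGR, which carries multiple tags, would need separate handling.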
Full Variant Set LGRs and Common LGR
A set of "full-variant" LGRs has been defined that collectively contains the cross-script variants identified to mitigate whole-script homograph labels, mostly within related scripts.
| Name | Language Tag | Script Collection | LGR Document |
|---|---|---|---|
| Chinese (Full Variant Set) | und-Hani | Han used in Chinese, Korean, and Japanese scripts | HTML, XML |
| Devanagari (Full Variant Set) | und-Deva | Devanagari, Bengali, and Gurmukhi | HTML, XML |
| Korean (Full Variant Set) | und-Kore | Hangul and Han used in Chinese and Korean script | HTML, XML |
| Latin (Full Variant Set) | und-Latn | Armenian, Cyrillic, Greek, Hebrew, and Latin | HTML, XML |
| Myanmar (Full Variant Set)* | und-Mymr | Georgian, Latin, Malayalam, Myanmar, and Oriya | HTML, XML |
| Tamil (Full Variant Set) | und-Taml | Tamil and Malayalam | HTML, XML |
| Telugu (Full Variant Set) | und-Telu | Kannada and Telugu | HTML, XML |
| Common LGR | Multiple Tags | All scripts | HTML, XML |
Language-based LGRs
| Name | Language Tag² | LGR Document |
|---|---|---|
| Arabic | ar | HTML, XML |
| Belarusian | be | HTML, XML |
| Bosnian (Cyrillic) | bs-Cyrl | HTML, XML |
| Bosnian (Latin) | bs | HTML, XML |
| Bulgarian | bg | HTML, XML |
| Chinese | zh | HTML, XML |
| Danish | da | HTML, XML |
| English | en | HTML, XML |
| Finnish | fi | HTML, XML |
| French | fr | HTML, XML |
| German | de | HTML, XML |
| Hebrew | he | HTML, XML |
| Hindi | hi | HTML, XML |
| Hungarian | hu | HTML, XML |
| Icelandic | is | HTML, XML |
| Inuktitut | iu-Cans | HTML, XML, Supporting Document |
| Italian | it | HTML, XML |
| Japanese (Standalone) | ja | HTML, XML |
| Korean (Hangul) | ko | HTML, XML |
| Latvian | lv | HTML, XML |
| Lithuanian | lt | HTML, XML |
| Macedonian | mk | HTML, XML |
| Montenegrin | cnr-Cyrl | HTML, XML |
| Norwegian | no | HTML, XML |
| Polish | pl | HTML, XML |
| Portuguese | pt | HTML, XML |
| Russian | ru | HTML, XML |
| Serbian | sr-Cyrl | HTML, XML |
| Spanish* | es | HTML, XML |
| Swedish | sv | HTML, XML |
| Thai | th | HTML, XML |
| Ukrainian | uk | HTML, XML |
2: Where the default script is not identified, the script information is included to avoid ambiguity.
Full Variant Set LGRs for RSP Evaluation Program
All languages and scripts have associated "rsp-full-variant" LGRs which include the injected cross-repertoire variant sets. They are used as part of the Registry Service Provider (RSP) Evaluation Program. Further details are available at: https://newgtldprogram.icann.org/en/application-rounds/round2/rsp/full-variant-set-lgrs
