address_cjkvt grammar

The address_cjkvt grammar is found in the set PII.

entity description examples components checksum
address/CC A postal address.
In general, an address receives a score of 1 if it includes a numbered street address in a common format (for example, "23 North Road"), a known city (for example, "London"), and a postal code in a valid format for the country (for example, "SW1A 2AA"). Deviations from this form incur score penalties. The ordering of these elements varies by country. OpenText recommends that you use prefiltering to improve the performance of this grammar. See Configure Prefiltering. By default, this entity returns addresses in a normalized format. The normalized form standardizes apartment and house numbers, expands shortened forms of region names, removes additional punctuation, and converts the text to uppercase, for example "ABIDEI HURRIYET CD TANER PALAS APT 9, KAT:7, D:9, 34437 ISTANBUL". The exact order depends on the country. You can turn off normalization by setting normalize_addresses=false in the address_stoplist.lua script. Turning off normalization can improve performance when you do not need normalized output.
1-2-34, Yaesu 4-Chome, Nanae, Atsuta, Hekinan, Kagoshima, 123-4567, Japan
162-168 Regent Street, London, W1B 5TG
Abidei Hurriyet Cd Taner Palas Han 9 Kat:7 Dayre 9, 34437 Istanbul
Avenida Juan Xxiii 20, 41006, Sevilla
No.55, Sec. 2, Jinshan S. Rd., Daan Dist., Taipei City 10603
No.99 Xingjian Road, Cixi City, Zhejiang Province, China
Schlosshoferstrasse 20, 1210 Vienna
北京市房山区长阳镇北广阳城大街8号
日本、〒123-4567神奈川県津島市城南区月形町八重洲四丁目1番2-34号
10603台北市大安區金山南路2段55號
✔️
address/country/context/CC A country, with context.
国: 日本
Country: United Kingdom
国:中華民國
✔️
address/country/landmark/CC A country landmark. Country
address/country/nocontext/cjkvt/CC A country in CJKVT native script, without context. 中華民國
日本
✔️
address/country/nocontext/CC A country, without context. Japan
United Kingdom
日本
✔️
address/country/nocontext/latin/CC A country in romanized text, without context. Japan ✔️
address/landmark/CC A postal address landmark. 住所
Address
address/postcode/context/CC A postal code, with context. Postcode: 1234567
Postcode: CB4 0WZ
郵便番号: 123-4567
郵遞區號106-409
✔️
address/postcode/landmark/CC A postal code landmark. 郵便番号
Postcode
郵遞區號
address/postcode/nocontext/CC A postal code, without context. 123-4567
106-409
CB4 0WZ
✔️
address/region/context/CC A region, with context. For Taiwan, this is a county (縣 xiàn) or municipality (市 shì). 縣市:宜蘭縣
Prefecture: Kagoshima
都道府県: 神奈川県
✔️
address/region/landmark/CC A region landmark. Prefecture
縣市
都道府県
address/region/nocontext/cjkvt/CC A region in CJKVT native script, without context. 宜蘭縣
神奈川県
✔️
address/region/nocontext/CC A region in CJKVT native script or romanized text, without context. Kagoshima
新北市
神奈川県
✔️
address/region/nocontext/latin/CC A region in romanized text, without context. Kagoshima
Yilan County
✔️
address/settlement/context/CC A settlement, with context. For example, in Japan a town or city, or in Taiwan a district (區 qū) or township (鎮 zhèn/鄉 xiāng). City/Ward/Town/Village: Nanae, Atsuta, Hekinan
市区町村: 津島市城南区月形町
鄉鎮市區:板橋區
✔️
address/settlement/landmark/CC A settlement landmark. City/Ward/Town/Village
市区町村
鄉鎮市區
address/settlement/nocontext/cjkvt/CC A settlement in CJKVT native script, without context. 板橋區
津島市城南区月形町
✔️
address/settlement/nocontext/CC A settlement in CJKVT native script or romanized text, without context. Nanae, Atsuta, Hekinan
板橋區
津島市城南区月形町
✔️
address/settlement/nocontext/latin/CC A settlement in romanized text, without context. Banqiao District
Nanae, Atsuta, Hekinan
✔️
address/streetlocation/context/CC A street location (house number and street name), with context. Address: 123, Mill Road
Address: No.55, Sec. 2, Jinshan S. Rd.
住所: 八重洲 四丁目1番2 -34号
✔️
address/streetlocation/landmark/CC A street location landmark. Address
住址
住所
address/streetlocation/nocontext/cjkvt/CC An address first line in CJKVT native script, without context. 金山南路2段55號
八重洲四丁目1番2-34号
✔️
address/streetlocation/nocontext/CC A street location (house number and street name), without context. 1-2-34, Yaesu 4-Chome
123, Mill Road
八重洲四丁目1番2-34号
✔️
address/streetlocation/nocontext/latin/CC An address first line in romanized text, without context. 1-2-34, Yaesu 4-Chome
No.55, Sec. 2, Jinshan S. Rd.
✔️
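
The entity names in the table above follow a path-like pattern: an address prefix, an optional element (such as postcode or region), an optional context marker (context, landmark, or nocontext), an optional script qualifier (cjkvt or latin), and a final CC segment. As a minimal sketch, assuming you want to group or filter matches by these parts in your own post-processing, the following Python helper splits an entity name accordingly. The function parse_entity_name and the EntityName type are hypothetical and are not part of the product; the final segment is treated as an opaque placeholder.

```python
from typing import NamedTuple, Optional

# Segment vocabularies taken from the entity names listed in the table.
CONTEXT_MARKERS = {"context", "landmark", "nocontext"}
SCRIPT_QUALIFIERS = {"cjkvt", "latin"}


class EntityName(NamedTuple):
    element: Optional[str]  # for example "postcode"; None for a whole address
    marker: Optional[str]   # "context", "landmark", "nocontext", or None
    script: Optional[str]   # "cjkvt", "latin", or None
    suffix: str             # final segment, kept as-is (the "CC" placeholder)


def parse_entity_name(name: str) -> EntityName:
    """Split an entity name such as 'address/postcode/context/CC' (hypothetical helper)."""
    segments = name.split("/")
    if len(segments) < 2 or segments[0] != "address":
        raise ValueError(f"not an address entity name: {name!r}")
    element = marker = script = None
    # Classify every segment between the 'address' prefix and the final suffix.
    for segment in segments[1:-1]:
        if segment in CONTEXT_MARKERS:
            marker = segment
        elif segment in SCRIPT_QUALIFIERS:
            script = segment
        else:
            element = segment
    return EntityName(element, marker, script, segments[-1])


if __name__ == "__main__":
    for name in ("address/CC", "address/landmark/CC",
                 "address/region/nocontext/cjkvt/CC"):
        print(name, "->", parse_entity_name(name))
```

With a helper like this, you could, for instance, keep only matches whose marker is "context" when you want results that appear next to a landmark such as "Postcode" or 郵便番号.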

The following components are produced by the above entities.

components description examples context
ALLEY alley name SIETTA ALLEY Sietta Alley, Gana Lane, Hualian Road, Section 1, 3, Anji County, Huzhou, Zhejiang
APARTMENT apartment number APT 4 Apt 4, Nasmah Tower, Al Ittihad Road, Al Nahda 1, Dubai
APARTMENT_PENALISED INTERNAL
BLOCK postal block 110 гр. Варна, жк. Младост, бл.110, вх.4 ет. 2
BUILDING building name AURORA BUILDING Level 12, Aurora Building, 147 Pirie Street, Adelaide South Australia 5000
BU_BLOCK block number 1 1-2 Echizen, Nanao-shi, Ishikawa-ken 926-8611
CITY city name AL AIN Sheikh Zayed Road 123, Al Ain, United Arab Emirates
CITY_BLOCK city block name HONGO-DORI 8-1 Hongo-dori Higashi, Chuo-ku, Sapporo-shi
CITY_BLOCK_DIRECTION compass direction indicating subpart of city block KITA 4-2 Kita-1-jo Nishi, Chuo-ku, Sapporo-shi
CITY_BLOCK_LANDMARK (post-processing) KOTONI 2-3-4 Kotoni-1-jo, Nishi-ku, Sapporo-shi, Hokkaido
CITY_BLOCK_NUMBER city block number 1 4-2 Kita-1-jo Nishi, Chuo-ku, Sapporo-shi
COUNTRY country name UNITED ARAB EMIRATES Sheikh Zayed Road 123, Al Ain, United Arab Emirates
COUNTY county name VILJANDI Allika talu, Halliste alevik, 69501 VILJANDIMAA
DISTRICT district AL NAHDA 1 Apt 4, Nasmah Tower, Al Ittihad Road, Al Nahda 1, Dubai
DISTRICT_DIRECTION compass direction indicating a subpart of the district HIGASHI 8-1 Hongo-dori Higashi, Chuo-ku, Sapporo-shi
DISTRICT_NUMBER district number 日本、〒123-4567神奈川県津島市城南区月形町八重洲四丁目1番2-34号
DISTRICT_NUMBER_JIKKAN jikkan (十間) number 新潟県新潟市江南区楚川甲180-1
DISTRICT_PENALISED
FLOOR floor number LEVEL 12 Level 12, Aurora Building, 147 Pirie Street, Adelaide South Australia 5000
LANE lane name GANA Sietta Alley, Gana Lane, Hualian Road, Section 1, 3, Anji County, Huzhou, Zhejiang
NEIGHBORHOOD neighborhood name МЛАДОСТ гр. Варна, жк. Младост, бл.110, вх.4 ет. 2
NUMBER number of the match (for example, a house, site, or building number as part of a street address, or an ID number) 123 Sheikh Zayed Road 123, Al Ain, United Arab Emirates
POSTCODE postal code 1210 Schlosshoferstrasse 20, 1210 Vienna
POST_OFFICE post office name or code STN A 1425 JAMES ST, PO BOX 4001 STN A,VICTORIA BC V8X 3X4
PO_BOX postal box (PO Box) number 57 PO Box 57, Sharjah, United Arab Emirates
PREFECTURE prefecture name DONGGUAN No. 3 Gaolong Avenue, Huanzhuli, Changping Town, Dongguan, Guangdong Province
REGION region name 北京市 北京市房山区长阳镇北广阳城大街8号
ROOM room number 1315 上海漕宝路103号1315室
RURAL_AREA rural area ASHIGARASHIMO 6-22, Aza Uwadaira Oaza Nagahama, Kazamaura, Ashigarashimo, Toyama
RURAL_SUBUNIT rural subunit AZA-UWADAIRA-OAZA-NAGAHAMA 6-22, Aza Uwadaira Oaza Nagahama, Kazamaura, Ashigarashimo, Toyama
STREET street name (without the number) SHEIKH ZAYED ROAD Sheikh Zayed Road 123, Al Ain, United Arab Emirates
STREET_SECTION number of the subsection of a street 1 Sietta Alley, Gana Lane, Hualian Road, Section 1, 3, Anji County, Huzhou, Zhejiang
TOWN town name UMTATA 110101 Corana, Umtata, 5100
TOWNSHIP township name 长阳镇 北京市房山区长阳镇北广阳城大街8号
UNIT unit number 01-01 12 Mount Elizabeth #01-01 Singapore 228511
VILLAGE village name ANTRA СВ Slavyanska 29, с. Antra, Bulgaria
WARD ward name 城南区 日本、〒123-4567神奈川県津島市城南区月形町八重洲四丁目1番2-34号
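
To illustrate how the components relate to a matched address, the following sketch collects the component values that the table lists for one example address and checks each against the uppercased source text. The dictionary layout and the missing_components helper are assumptions made for illustration; how a match actually exposes its components depends on the product's output format, which is not shown here.

```python
# Component values as listed in the components table for one example address.
# The dict layout is an assumption for illustration only.
source_text = "Sheikh Zayed Road 123, Al Ain, United Arab Emirates"
components = {
    "STREET": "SHEIKH ZAYED ROAD",
    "NUMBER": "123",
    "CITY": "AL AIN",
    "COUNTRY": "UNITED ARAB EMIRATES",
}


def missing_components(text: str, parts: dict) -> list:
    """Return the names of components whose value is not found in the uppercased text."""
    upper = text.upper()
    return [name for name, value in parts.items() if value not in upper]


if __name__ == "__main__":
    # An empty list means every listed component value occurs in the text.
    print(missing_components(source_text, components))
```

The substring check is a simplification that holds for this example. Some normalized values in the table, such as AZA-UWADAIRA-OAZA-NAGAHAMA, do not appear literally in the source text, so a real comparison would need to account for normalization.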

In the tables, ✔️ indicates that the item is always used, and ❌ that the item is never used. The ✓ symbol indicates that the item is only used for some entities (that is, for some but not all countries or languages). For details, refer to the relevant entity page.