Version 1.0: 2001-12-26 Stefan Probst -- initial idea and first part
format of original file: "charset=ISO 8859-1"
Version 1.1: 2001-12-27 jDo -- added UTF-8 section, format of file therefore
"UTF-8"
Version 1.2: 2001-12-28 Stefan Probst -- minor editorial touch up and added
"Notes" for educational purposes.
Version 1.3: 2002-01-05 Stefan Probst -- added search test and test form for
feedback.
Version 1.4: 2002-03-22 Stefan Probst -- encoded all "&" as "&amp;"
to make it compliant with XHTML
A1) Fully Precomposed: Unicode Normalization Form C ("NFC"):
Việt Nam: Việt Nam | Việt Nam |
A2) "VN-Combining", i.e. characters precomposed, tone marks combining
(Note: there is no international standard for this behaviour):
Viê ̣t Nam: Việt Nam | Việt Nam |
A3) "VN-Canonical", i.e. only combining characters, tone marks sorted last
(Note: there is no international standard for this order):
Vie ^ ̣t Nam: Việt Nam | Việt Nam |
A4) Only combining characters, sorted by canonical order: Unicode Normalization Form D ("NFD"):
Vie ̣^t Nam: Việt Nam | Việt Nam |
B1) Fully Precomposed: Unicode Normalization Form C ("NFC"):
Việt Nam: Viá»t Nam | Việt Nam |
B2) "VN-Combining", i.e. characters precomposed, tone marks combining
(Note: there is no international standard for this behaviour):
Viê ̣t Nam: Việt Nam | Việt Nam |
B3) "VN-Canonical", i.e. only combining characters, tone marks sorted last
(Note: there is no international standard for this order):
Vie ^ ̣t Nam: VieÌÌ£t Nam | Việt Nam |
B4) Only combining characters, sorted by canonical order: Unicode Normalization Form D ("NFD"):
Vie ̣^t Nam: VieÌ£Ìt Nam | Việt Nam |
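The four forms listed above differ only in which code points spell the "e" with circumflex and dot below. As a minimal sketch (the code points are an assumption about what the four forms contain, not taken from the page source), Python's standard unicodedata module can show that all four normalize to the same NFC and NFD strings:

```python
import unicodedata

# Four spellings of the same word, written as explicit code points
# (assumed to correspond to forms 1 to 4 above).
# U+1EC7 = e with circumflex and dot below, U+00EA = e with circumflex,
# U+0302 = combining circumflex, U+0323 = combining dot below.
FORMS = {
    "1": "Vi\u1EC7t Nam",         # fully precomposed (NFC)
    "2": "Vi\u00EA\u0323t Nam",   # precomposed vowel + combining tone mark
    "3": "Vie\u0302\u0323t Nam",  # all combining, tone mark last
    "4": "Vie\u0323\u0302t Nam",  # all combining, canonical order (NFD)
}


def code_points(text):
    """Return the code points of text as a readable string."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for label, text in FORMS.items():
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    print(label, "raw:", code_points(text))
    print("   NFC:", code_points(nfc))
    print("   NFD:", code_points(nfd))
```

Forms 2 and 3 are neither NFC nor NFD, so normalizing them changes the string. Forms 1 and 4 are already in NFC and NFD respectively.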
Notes:
- Both file encodings (and a few more) can be used to represent "Unicode characters".
- NCRs can use decimal values (&#...;) or hex values (&#x...;).
- In HTML files, the charset specification in the header tells the browser
not only how to read the file, but also how to encode input from forms
when sending the data back to the server.
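The notes above can be illustrated with a short sketch. The helper name to_ncr is hypothetical, and the precomposed spelling is used as the sample. The sketch writes the word once with decimal and once with hex NCRs, then shows the UTF-8 bytes of the same word and what they look like when a program wrongly reads them as ISO 8859-1:

```python
import html

word = "Vi\u1EC7t Nam"  # precomposed spelling, U+1EC7 for the vowel


def to_ncr(text, hexadecimal=False):
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif hexadecimal:
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(f"&#{ord(ch)};")
    return "".join(out)


decimal_form = to_ncr(word)                # decimal NCR
hex_form = to_ncr(word, hexadecimal=True)  # hex NCR
print(decimal_form)
print(hex_form)

# Both NCR spellings decode back to the same character.
print(html.unescape(decimal_form) == html.unescape(hex_form) == word)

# The same word as UTF-8 bytes, and the result of misreading those bytes
# as ISO 8859-1 (one character per byte).
utf8_bytes = word.encode("utf-8")
misread = utf8_bytes.decode("latin-1")
print(utf8_bytes)
print(ascii(misread))
```

The NCR spellings are pure ASCII, so they survive in a file of any ASCII-compatible encoding. The UTF-8 bytes are only read correctly when the declared charset matches.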
Test Instructions:
Open this file in a web browser (e.g. Internet Explorer).
- do all the characters appear ok?
Print it.
- do all the characters appear ok?
Do a search test:
- copy all eight forms (from A1 to B4) of the Vietnamese word "Viet Nam" one by one,
paste each into your browser's "find/search" dialog box,
and check in each case which of the eight occurrences are found.
Then copy the whole page and paste it into your word processor (e.g. MS Word).
- are the characters ok?
Change the font size of the whole document, e.g. to "12", "14", etc.
- how are the characters rendered? (i.e. quality of the "ệ": is the dot exactly below it?)
Repeat the search test as you did for the browser.
Copy the following form into your eMail program,
fill it in as far as possible, and send it to Unicode-Tests@isoc-vn.org
*******************************************
Results of Unicode Tests
Used Testpage: 1.3

1) Platform:
OS (kind, version)     :
Browser (incl. version):
Wordprocessor          :
Printer                :

2) Results:
Display in Browser
  ok:
  not ok:
  comments:
Print from Browser
  ok:
  not ok:
  comments:
Find in Browser
  find version A1 finds:
  find version A2 finds:
  find version A3 finds:
  find version A4 finds:
  find version B1 finds:
  find version B2 finds:
  find version B3 finds:
  find version B4 finds:
Display in Wordprocessor
  ok:
  not ok:
  comments:
Print from Wordprocessor
  ok:
  not ok:
  comments:
Find in Wordprocessor
  find version A1 finds:
  find version A2 finds:
  find version A3 finds:
  find version A4 finds:
  find version B1 finds:
  find version B2 finds:
  find version B3 finds:
  find version B4 finds:

Other comments :
Tested by      :
*******************************************
Example:
*******************************************
Results of Unicode Tests
Used Testpage: 1.3

1) Platform:
OS (kind, version)     : Windows ME
Browser (incl. version): Internet Explorer 5.5
Wordprocessor          : MS Word 97
Printer                : HP Laserjet 4L

2) Results:
Display in Browser
  ok: all
  not ok:
  comments:
Print from Browser
  ok: A2, A3, A4, B2, B3, B4
  not ok: A1, B1 (prints question marks "?")
  comments:
Find in Browser
  find version A1 finds: A1, B1
  find version A2 finds: A2, B2
  find version A3 finds: A3, B3
  find version A4 finds: A4, B4
  find version B1 finds: A1, B1
  find version B2 finds: A2, B2
  find version B3 finds: A3, B3
  find version B4 finds: A4, B4
Display in Wordprocessor
  ok: A1, A2, B1, B2
  not ok: A3, A4, B3, B4 (squares)
  comments: dot in A2 and B2 in some font sizes not exactly below the "e", but far left.
Print from Wordprocessor
  ok: A1, A2, B1, B2
  not ok: A3, A4, B3, B4 (squares)
  comments: dot in A2 and B2 in some font sizes not exactly below the "e", but far left.
Find in Wordprocessor
  find version A1 finds: A1, B1
  find version A2 finds: A2, B2
  find version A3 finds: A3, B3
  find version A4 finds: A4, B4
  find version B1 finds: A1, B1
  find version B2 finds: A2, B2
  find version B3 finds: A3, B3
  find version B4 finds: A4, B4

Other comments :
Tested by      : Stefan Probst
*******************************************
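In the example results, each search finds only the occurrences that use the same spelling as the search term. That pattern is what a code-point-by-code-point comparison would produce. The following sketch contrasts it with a search that normalizes both sides to NFC first. The function names and the assumed code points are illustrative, and it does not claim to reproduce how any particular browser or word processor searches:

```python
import unicodedata

# Assumed code points for the four spellings (forms 1 to 4).
FORMS = {
    "1": "Vi\u1EC7t Nam",         # fully precomposed (NFC)
    "2": "Vi\u00EA\u0323t Nam",   # precomposed vowel + combining tone mark
    "3": "Vie\u0302\u0323t Nam",  # all combining, tone mark last
    "4": "Vie\u0323\u0302t Nam",  # all combining, canonical order (NFD)
}


def exact_find(needle, forms):
    """Return labels of the forms that contain needle code point by code point."""
    return [label for label, text in forms.items() if needle in text]


def normalized_find(needle, forms):
    """Return labels of the forms that contain needle after NFC normalization."""
    wanted = unicodedata.normalize("NFC", needle)
    return [
        label
        for label, text in forms.items()
        if wanted in unicodedata.normalize("NFC", text)
    ]


for label, text in FORMS.items():
    print("search with form", label,
          "exact:", exact_find(text, FORMS),
          "normalized:", normalized_find(text, FORMS))
```

With exact matching each spelling finds only itself. With normalization every spelling finds all four, which is the behaviour a reader would probably expect from a search dialog.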