
[aprssig] Please, standardize UTF-8 for APRS

Heikki Hannikainen hessu at hes.iki.fi
Fri Dec 18 06:48:03 UTC 2009


On Thu, 17 Dec 2009, Robert Bruninga wrote:

> To clarify things, I have now created a UTF-8 discussion
> document that summarizes any issues with UTF-8 and suggests
> bounds on where it can be used.  I welcome any suggestions or
> additions to this document to make sure we are all working from
> the same sheet.  It is the first link listed on:
> http://aprs.org/aprs12.html

I think you're making it too complicated on that page, for the casual 
developer, and mixing a lot of terminology. UTF-8 and ASCII are exactly 
THE SAME for values 0 to 127.

Bytes in the ASCII range of 0–127 represent themselves in UTF-8, thereby 
providing backward compatibility.

The whole APRS packet, and the mic-e packet, can be transmitted in UTF-8 
encoded format, as the UTF-8 encoded version of the APRS/mic-e protocol 
data is EXACTLY equal to ASCII. So most of the text on that page is 
actually rather unnecessary!
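
This is easy to check. A minimal Python sketch follows; the sample packet 
is made up for illustration and is not taken from any real station:

```python
# Hypothetical sample packet, invented for illustration only.
packet = "N0CALL>APRS,WIDE1-1:>Test status"

ascii_bytes = packet.encode("ascii")
utf8_bytes = packet.encode("utf-8")

# For text made only of code points 0..127, both encodings
# produce exactly the same bytes on the wire.
same_bytes = ascii_bytes == utf8_bytes

# Every single byte value 0..127 also decodes to the same
# character under either encoding.
all_equal = all(
    bytes([b]).decode("ascii") == bytes([b]).decode("utf-8")
    for b in range(128)
)

print(same_bytes, all_equal)
```

Both comparisons hold, which is why a receiver expecting plain ASCII is 
not disturbed by a sender that encodes the same text as UTF-8.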

I think the required specification boils down to:

-------------------- cut here ------------------

- Using UTF-8 is recommended for all free-form message data:
 	- APRS messaging data
 	- Comment field in compressed and uncompressed packets
 	- Mic-E Status Text Field
 	- Status messages
 	- Bulletin message content
 	- Beacon text

- Using 7-bit printable ASCII characters is mandatory for:
 	- All callsigns
 	- Bulletin / message destinations
 	- Object and item names

The whole APRS packet can be encoded as UTF-8, but character values above 
127 must not be used in callsign fields which have strict 
requirements for length (in bytes) in the encoded form.

-------------------- cut here ------------------

Could you please replace the utf-8.txt contents with that?
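
To illustrate why the ASCII-only rule matters for the fixed-length fields, 
here is a rough Python sketch. The helper names are my own, the example 
strings are invented, and I am assuming "7-bit printable ASCII" means the 
range 0x20 to 0x7E; the actual field length limits come from the APRS spec 
and are not checked here.

```python
def is_printable_ascii(field: str) -> bool:
    """True if every character is in 0x20..0x7E (assumed meaning of
    '7-bit printable ASCII'). An empty string also returns True."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in field)


def encoded_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Hypothetical examples, not real stations or objects.
callsign = "N0CALL-9"
object_name = "H\u00c4MEENL"  # 'A' with diaeresis, a non-ASCII letter

print(is_printable_ascii(callsign))     # ASCII only, safe in a callsign field
print(is_printable_ascii(object_name))  # contains a character above 127
print(len(object_name), encoded_length(object_name))
```

The last line shows the problem: the name is 7 characters long but takes 
8 bytes in UTF-8, so it no longer fits a field whose length is counted in 
bytes. Free-form text has no such fixed width, so UTF-8 is harmless there.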

Also, it is not necessary to rant about not using UTF-8 in international 
communications. I think it's rather obvious that when writing to you I 
should write in English, which uses only characters with values below 128. 
Those are EQUAL in UTF-8 and ASCII, so it will just work even if I am 
transmitting in UTF-8 and you are expecting ASCII. So that's covered 
automatically.

   - Hessu

