
[aprssig] I see several languages

Matti Aarnio oh2mqk at sral.fi
Sat Feb 14 21:04:01 UTC 2009


On Fri, Feb 13, 2009 at 11:33:33AM -0500, Mr Jeffrey L Ross wrote:
> 
> hi all, I see several languages being used on aprs. I know it's world wide 
> but is there a translator program to add on to say winaprs/uiview to be 
> able to read all the messages? would someone please make one. thank you,
>   kc8gkf
> 

That is not a trivial thing.

First of all, APRS messages are defined only for ASCII, but people outside
the continental USA have realized that it is insufficient for messaging use.

Luckily most sane i-gate systems are able to ignore the stupid original
specification from the TNC2 text dump interface and pass binary-transparent
frames over the KISS interface  -->  bytes with the 8th bit set are possible.
Even zero bytes traverse the APRS-IS.  Zero bytes are NOT very friendly to
several programming languages that use a zero byte as end-of-string, and
such bytes should never be sent.
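
To illustrate why zero bytes are unfriendly, here is a small Python sketch.
The payload is made up for the example, and ctypes is used only to mimic how
a C program would read the same buffer as a NUL-terminated string.

```python
import ctypes

# Hypothetical message payload with an embedded zero byte.
payload = b"hello\x00world"

# Python's bytes type keeps the full length of the buffer...
full_length = len(payload)

# ...but code that treats the buffer as a C string stops at the first NUL,
# so everything after the zero byte is silently lost.
as_c_string = ctypes.c_char_p(payload).value

print(full_length, as_c_string)
```

Any software along the path that handles text this way would drop the tail
of such a message without warning.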

A serious problem with this character set expansion is that everybody is using
whatever happens to be supported on their machine.  Most use some variation
of either PC-DOS or Windows character sets.  Some Japanese systems use
ISO-2022-JP, and some using AWG's software use UTF-16 two-byte characters WITH
a zero byte as every second byte whenever the text carries ASCII-compatible
characters intermixed with Kanji characters.
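
The UTF-16 zero-byte effect can be seen with a short Python sketch.  The
sample text is invented for the example, and little-endian byte order is an
assumption; I do not know which byte order the software in question emits.

```python
# Hypothetical message mixing ASCII with Kanji.
text = "OH2MQK 東京"

# Assumption: little-endian UTF-16 without a byte order mark.
utf16 = text.encode("utf-16-le")

# Every ASCII character becomes a pair of bytes, one of which is zero.
zero_count = utf16.count(0)
ascii_count = sum(1 for ch in text if ord(ch) < 128)

print(len(utf16), zero_count, ascii_count)
```

Here each of the seven ASCII characters contributes one zero byte, while the
two Kanji characters contribute none.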

This already causes incompatibility between Finnish users, depending upon
what software they are using and which 8-bit character set that program
happens to use.
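
A minimal Python sketch of that mismatch, assuming the sender's program uses
the Windows-1252 code page and the receiver's uses PC-DOS code page 437 (both
chosen only as examples of the two families mentioned above):

```python
# A Finnish greeting as the sender typed it.
word = "Hyvää päivää"

# Assumption: the sending program writes Windows-1252 bytes.
sent = word.encode("cp1252")

# Assumption: the receiving program interprets the bytes as code page 437.
seen = sent.decode("cp437")

# The byte 0xE4 is "ä" in Windows-1252 but a Greek capital sigma in CP437.
print(seen)
```

The bytes on the air are identical in both cases; only the interpretation
differs, and nothing in the packet says which one was meant.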

The PROBLEM here is that BOB does not see it as an issue and is toying with
nonsense like aprs-pagers and DTMF for message sending, which are all
ASCII-limited and therefore useless outside the USA.

Only after fixing the internationalization issue can there be meaningful
international use for messages -- not to mention a coherent market for devices,
so that there _could_ be any incentive for vendors to build aprs-messaging
support into their radios.

The use of completely unspecified 8-bit character sets MUST finally be stomped
out, and the specification must be amended to specify what the CORRECT CHARACTER
SET is.  Internet RFC writers became aware of this issue over 20 years ago and
developed solutions to it; however, email MIME mechanisms do not work in APRS.
There is no room to carry information like the message character set.

There will be a big problem with legacy software, some of which people insist
on using, even though its source code has deliberately been thrown away!


The "best" character set available at the moment is UNICODE with the UTF-8
serialization encoding.  CJK users will probably need 3 bytes per character,
but then they usually get a WORD per character, unlike us users of
phoneme-derived writing systems.

As people are aware, UTF-8 contains ASCII as its low 128 byte codes, and defines
a variable-length encoding scheme to handle character code points above 127.
More importantly:  IT DOES NOT PUT EXTRA ZERO BYTES INTO THE DATASTREAM !
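
These UTF-8 properties are easy to check with a short Python sketch; the
sample strings are invented for the example.

```python
# Bytes needed per character: ASCII, a Finnish letter, a Kanji character.
lengths = {ch: len(ch.encode("utf-8")) for ch in "Aä東"}

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_same = "OH2MQK".encode("utf-8") == "OH2MQK".encode("ascii")

# Mixed text never contains a zero byte unless the text itself has U+0000.
encoded = "OH2MQK päivää 東京".encode("utf-8")
has_zero = 0 in encoded

print(lengths, ascii_same, has_zero)
```

So existing ASCII-only traffic stays valid as it is, and non-ASCII text costs
two or three bytes per character in these cases without adding zero bytes.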


Now then, presuming a solution to this character set shambles can be
ratified, the next thing is that the character set does not tell the language...
(Which I take as a requirement for the "translation".)


> aprssig mailing list
> aprssig at tapr.org
> https://www.tapr.org/cgi-bin/mailman/listinfo/aprssig

73 de Matti, OH2MQK


