[BitPim-devel] Unicode in BitPim revisit
Joe Pham
2006-07-09 00:43:40 UTC
Even though we have a means to handle unicode data from the phone,
the basic issue still remains (to borrow a quote from this
article, http://www.joelonsoftware.com/articles/Unicode.html):

"It does not make sense to have a string without knowing what
encoding it uses. "

And that's the problem when BitPim gets a string from the phone.
Currently, we decode most, if not all, strings from the phone using
the iso8859_1 codec, which obviously does not work 100% of the time.
Some options to consider:

1. Do nothing and tell users to use nothing but basic ASCII.
2. Decode with 'replace' or 'ignore'.
3. Try to determine/guess the right codec (if there's such a thing)
for each phone model.
4. Iterate through all codecs until the decode is successful.
5. Allow users to specify which codec to use through some sort of
GUI. This codec info can be stored along with the data as well as
for a specific phone.
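
Options 2 and 4 could be sketched roughly as follows. This is a minimal Python 3 illustration, not BitPim code: decode_with_fallback and its candidate list are hypothetical names made up for this example.

```python
# Hypothetical helper for illustration only; not part of BitPim.
def decode_with_fallback(raw, candidates=("ascii", "utf_8", "iso8859_1")):
    """Return (text, codec) for the first candidate codec that decodes raw."""
    for codec in candidates:
        try:
            return raw.decode(codec), codec
        except UnicodeDecodeError:
            continue
    # No candidate worked: fall back to option 2 and mark the bad bytes.
    return raw.decode("ascii", errors="replace"), None


raw = b"Caf\xe9"  # 0xe9 is e-acute in ISO 8859-1

# Option 2: never raises, but loses information.
print(repr(raw.decode("ascii", errors="replace")))  # 'Caf\ufffd'
print(repr(raw.decode("ascii", errors="ignore")))   # 'Caf'

# Option 4: try codecs in order until one succeeds.
text, codec = decode_with_fallback(raw)
print(codec)  # iso8859_1
```

The order of the candidates matters, since the first codec that accepts the bytes wins.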

Comments and suggestions are welcome.

-Joe Pham



Simon C
2006-07-09 01:39:04 UTC
Post by Joe Pham
And that's the problem when BitPim gets a string from the phone.
Currently, we decode most, if not all, strings from the phone using
iso8859_1 codec, which obviously does not work 100% of the time.
No, we use ascii for most of the phones; iso8859_1 is only used for phones
that we know use this encoding. There have been no user complaints about the
phonebook/calendar/memo on the (newer) LG phones since they started using
the USTRING correctly; it works 100% AFAIK.
The only user reports that come through now are for phones where the
encoding has not been set correctly in the packet definitions, e.g. the
recent vx4650 calendar issue (fixed in rev. 3434).
A recent Sanyo 8100 issue occurs because the packet definition is wrong; the
fix is to change

    16 USTRING {'raiseonunterminatedread': False, 'raiseontruncate': False, 'terminator': None} name

to

    16 USTRING {'encoding': PHONE_ENCODING, 'raiseonunterminatedread': False, 'raiseontruncate': False, 'terminator': None} name

All filesystem brew commands use ascii, even on phones we know use iso8859_1.
We need to spend the time putting the correct encoding into the packets.
This is a little more involved because we cannot just assume all phones are
8859-1; pelephone, for instance, is 8859-3.

All that we need to do is put the correct encoding into the PACKETS.
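
To illustrate why the per-phone encoding matters, here is a small Python 3 sketch. The PHONE_ENCODINGS table and decode_field are hypothetical names for this example and are not BitPim's packet machinery; only the pelephone/8859-3 pairing comes from this thread, and the other entry is a made-up placeholder.

```python
# Hypothetical per-phone table; only the pelephone entry comes from this thread.
PHONE_ENCODINGS = {
    "pelephone": "iso8859_3",
    "example_latin1_phone": "iso8859_1",  # made-up placeholder name
}


def decode_field(raw, phone, default="ascii"):
    """Decode a raw field with the codec registered for this phone."""
    return raw.decode(PHONE_ENCODINGS.get(phone, default))


raw = b"\xa1"  # one byte, two different characters

latin1 = decode_field(raw, "example_latin1_phone")
latin3 = decode_field(raw, "pelephone")

print(hex(ord(latin1)))  # 0xa1  (inverted exclamation mark)
print(hex(ord(latin3)))  # 0x126 (capital H with stroke)
```

Both decodes succeed without an exception, yet they yield different characters, so the codec has to come from what is known about the phone.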

Simon
Joe Pham
2006-07-09 02:52:23 UTC
Post by Simon C
the recent vx4650 calendar issue (fixed in rev. 3434).
In this particular instance, using the 'iso8859_1' codec did not fix the
problem (it did not produce the correct string); it just no longer
raises the exception. If suppressing the exceptions were all we
wanted, that would have been much simpler.
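
A small Python 3 sketch of the general point: iso8859_1 maps every byte value to a character, so the absence of an exception says nothing about whether the result is right. The sample string and the choice of UTF-8 as the "real" encoding are assumptions purely for illustration; they are not a claim about what the vx4650 sends.

```python
text = "caf\u00e9"                 # hypothetical text entered on a phone
raw = text.encode("utf_8")         # assumption: the device stored it as UTF-8
decoded = raw.decode("iso8859_1")  # succeeds, but with the wrong codec

print(decoded == text)           # False
print(len(text), len(decoded))   # 4 5

# Every possible byte value decodes under iso8859_1 without raising.
all_bytes = bytes(range(256))
print(len(all_bytes.decode("iso8859_1")))  # 256
```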

-Joe Pham




Simon C
2006-07-09 04:48:05 UTC
Post by Joe Pham
the recent vx4650 calendar issue (fixed in rev. 3434).
In this particular instance, using 'iso8859_1' codec did not
fix the problem (it did not produce the correct string), it
just no longer raises the exception.
The trace from the user showed the character with a hex value (0xb0) that
mapped to the degree character in the 8859-1 spec.
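
The mapping can be checked with a few lines of Python 3. This is only an illustrative sketch using the standard library; it does not touch BitPim's packet code.

```python
import unicodedata

raw = b"\xb0"                  # the byte value reported in the trace
ch = raw.decode("iso8859_1")

print(unicodedata.name(ch))           # DEGREE SIGN
print(ch.encode("iso8859_1") == raw)  # True: round-trips for writing back

# The same byte is outside the ASCII range, so the ascii codec rejects it.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii cannot decode it:", exc.reason)
```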

I tested this on the vx8100 and vx6100 (which shares the same calendar code
with the 4650) and it decodes OK when reading and writing; the character
displays correctly in bitpim and on the phone.

I don't have a 4650, but the conversion code is the same as for the
8100/6100. Can you describe the difference between what the phone displays
and what bitpim shows, and copy the trace of the packet?

Simon
