[BitPim-devel] [Fwd: Re: DSV replacement with csv]

Discussion:

Mike Rovner

2005-05-22 21:40:15 UTC

How do you handle unicode? (The csv module barfs; I think DSV gets it
right by accident, not design.)

Unicode is outside of the scope of csv. As recently discussed on clp,
csv deals exclusively with bytes. However IIRC Unicode inside "" passes.
I have to do some extra tests.

The DSV module has two very nice features. One is that it guesses the
delimiter (and even gets it right sometimes :-) and the other is that
it guesses if the first row is a header row (and gets that right
sometimes as well :-)

Same for csv.
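
For reference, the two guesses mentioned above map onto csv.Sniffer in the
standard library. The following is only a minimal sketch, not BitPim code:
the sample data and the helper name sniff_and_read are made up for
illustration, and both guesses are heuristics that can be wrong on real
files.

```python
import csv
import io

# Hypothetical sample data: semicolon-delimited, with a header row.
SAMPLE = (
    "name;age;city\n"
    "Alice;30;Paris\n"
    "Bob;25;Berlin\n"
    "Carol;41;Madrid\n"
)


def sniff_and_read(text):
    """Guess the dialect and header presence, then parse all rows."""
    sniffer = csv.Sniffer()
    dialect = sniffer.sniff(text)          # guesses delimiter and quoting
    has_header = sniffer.has_header(text)  # heuristic, may be wrong
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, has_header, rows


delimiter, has_header, rows = sniff_and_read(SAMPLE)
print(delimiter, has_header, rows[1])
```

Since sniff() raises csv.Error when it cannot decide, a caller would
probably want to catch that and fall back to asking the user.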

BitPim currently has almost every possible Python/C integration method
in use! There is Swig for the usb stuff, half Swig/half C++ for Windows
Address Book (currently unused) and full custom code for APSW and string
matching.

I have experience with swig, boost and Pyrex.

I presume you saw the todo list - http://bitpim.org/todo.html

sure

There are several other items not on there as well such as tighter
integration on Linux and Mac. Are you interested in algorithms, user
interface, OS plumbing?

Currently I can only use Windows, and yes, I am interested, in that order.

Mike

Mike Rovner

2005-05-22 21:41:04 UTC


How do you handle unicode? (The csv module barfs; I think DSV gets it
right by accident, not design.)

Unicode is outside of the scope of csv. As recently discussed on clp,
csv deals exclusively with bytes. However IIRC Unicode inside "" passes.
I have to do some extra tests.

Same for csv.

I have experience with swig, boost and Pyrex.

I presume you saw the todo list - http://bitpim.org/todo.html

sure

There are several other items not on there as well such as tighter
integration on Linux and Mac. Are you interested in algorithms, user
interface, OS plumbing?

Currently I can only use Windows, and yes, I am interested, in that order.

Mike

Roger Binns

2005-05-22 22:48:59 UTC


Post by Mike Rovner
Unicode is outside of the scope of csv. As recently discussed on clp,
csv deals exclusively with bytes. However IIRC Unicode inside "" passes.
I have to do some extra tests.

We definitely need unicode handled correctly. Programs on Mac in particular
like to spew out unicode text files.
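
As a point of comparison only: in present-day Python 3, which postdates
this thread, the csv module works on text rather than bytes, so the
decoding happens when the file is opened. A minimal sketch, assuming the
encoding is already known (UTF-16 here is just an example, and
read_csv_text is a made-up helper name):

```python
import csv
import os
import tempfile


def read_csv_text(path, encoding):
    """Decode the file to text first, then let csv parse the text."""
    # newline="" leaves line-ending handling to the csv module.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle))


# Hypothetical data standing in for a UTF-16 export from another program.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    with open(path, "w", newline="", encoding="utf-16") as handle:
        csv.writer(handle).writerows(
            [["name", "city"], ["Zoë", "Zürich"], ["李", "北京"]]
        )
    rows = read_csv_text(path, "utf-16")
finally:
    os.remove(path)

print(rows)
```

Guessing the encoding of an arbitrary import file is a separate problem
that this sketch does not address.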

Post by Mike Rovner
Same for csv.

Cool. I had looked originally when the csv module was first introduced
but obviously wasn't paying enough attention. Other than potential
unicode issues, dropping DSV will be a really nice gain.

Post by Mike Rovner
Currently I can only use Windows, and yes, I am interested, in that order.

On the algorithms side, we desperately need to be able to do syncing.
In theory this is simple - record the state of information before and
after on a data source (Outlook, the phone etc) and then apply the diffs
to the data elsewhere (eg in BitPim). If smart enough you can allow for
updates in multiple sources and still get the results right.
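
The "record before and after, then apply the diffs" idea can be sketched
roughly as below. This is an illustration under assumptions, not BitPim's
design: records are assumed to carry stable ids, each record is a plain
dict, and the names diff_snapshots and apply_diff are hypothetical.

```python
def diff_snapshots(before, after):
    """Return the changes that turn `before` into `after`.

    Both arguments map a record id to a dict of fields.
    """
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = sorted(before.keys() - after.keys())
    changed = {
        k: after[k]
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def apply_diff(target, diff):
    """Apply a diff to another store and return the result as a new dict."""
    result = dict(target)
    result.update(diff["added"])
    result.update(diff["changed"])
    for key in diff["removed"]:
        result.pop(key, None)
    return result


# Hypothetical snapshots of one source before and after the user edited it.
before = {
    1: {"name": "Ann", "phone": "555-0100"},
    2: {"name": "Bob", "phone": "555-0101"},
}
after = {
    1: {"name": "Ann", "phone": "555-0199"},
    3: {"name": "Cy", "phone": "555-0102"},
}

# A second store that independently gained record 4 in the meantime.
elsewhere = {**before, 4: {"name": "Dee", "phone": "555-0103"}}
synced = apply_diff(elsewhere, diff_snapshots(before, after))
print(synced)
```

Record 4 survives because the diff only describes what changed on the
first source. Replacing a changed record wholesale would still clobber a
concurrent edit to the same record elsewhere, so a field-level diff would
be needed for the multiple-source case.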

The problem we have in BitPim is that the phones are lousy. They munge the
data, have restricted fields, lose information etc. Heck even Outlook has
arbitrary limitations (eg only 3 email addresses). So the question is how
do you do syncing despite all the sources of data messing with it? NB the
BitPim internal format is very strong and doesn't lose data (much).
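
To make the problem concrete, here is a small sketch of how a lossy source
produces phantom edits. The three-address limit is the Outlook one
mentioned above; the phone limits (8-character names, one address) and all
function names are invented for illustration. The last comparison shows
one possible mitigation, offered as an assumption rather than an answer.

```python
def outlook_round_trip(record, max_emails=3):
    """Simulate the three-address limit mentioned above."""
    return {"name": record["name"], "emails": record["emails"][:max_emails]}


def phone_round_trip(record, max_name=8, max_emails=1):
    """Simulate a hypothetical phone that truncates names and drops emails."""
    return {
        "name": record["name"][:max_name],
        "emails": record["emails"][:max_emails],
    }


original = {
    "name": "Alexandra Smith",
    "emails": ["a@example.com", "b@example.com", "c@example.com", "d@example.com"],
}

from_outlook = outlook_round_trip(original)
from_phone = phone_round_trip(original)

# Naive comparison: both copies look edited although nobody touched them.
naive_outlook_changed = from_outlook != original
naive_phone_changed = from_phone != original

# Comparing against what the phone could have stored hides the phantom edit.
projected_phone_changed = from_phone != phone_round_trip(original)

print(naive_outlook_changed, naive_phone_changed, projected_phone_changed)
```

This only works if each source's mangling can be modelled accurately,
which is exactly the hard part.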

Roger