Issue666484
Created on 2003-01-12 03:46 by suzuki_hisao, last changed 2022-04-10 16:06 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description |
ja-codecs-0.6.tar.bz2 | suzuki_hisao, 2003-01-12 03:49 |
Messages (6) | |||
---|---|---|---|
msg42401 - (view) | Author: SUZUKI Hisao (suzuki_hisao) | Date: 2003-01-12 03:46 | |
This is an implementation of a set of Japanese Unicode codecs for Python 2.2 and 2.3. Three major encodings are supported: EUC-JP, Shift_JIS and ISO-2022-JP. It is in pure Python, of a reasonable size (< 80KB), and with an effective means to modify the mapping tables. |
|||
msg42402 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2003-01-12 12:33 | |
Logged In: YES user_id=38388 Are you aware of the codecs written by Tamito KAJIYAMA? http://www.asahi-net.or.jp/~rd6t-kjym/python/ These are written in C and perform much better than Python-based ones. They cover the same set of encodings you have in your package and also include a complete test suite for the codecs. |
|||
msg42403 - (view) | Author: SUZUKI Hisao (suzuki_hisao) | Date: 2003-01-16 00:22 | |
Logged In: YES user_id=495142 Yes, I know KAJIYAMA's work from version 1.0 to version 1.4.9. Indeed, I contributed a patch to JapaneseCodecs-1.2. Please read the README file included in the tarball for the rationale behind ja-codecs. As for efficiency, ja-codecs is fairly fast and small in practice. In addition, its mapping has a good mathematical property, encode(decode(c)) == c for every valid character c, which is pragmatically useful for many applications. (The latest version (1.4.9) of KAJIYAMA's codecs has also fixed this for one particular character: REVERSE SOLIDUS. It seems to lack a validation test like that of ja-codecs-0.6/ja/map_jisx206.py, though.) As you know, KAJIYAMA's codec set also does not cover all the encodings used in Japan today. For example, it does not support those of the Macintosh. It might be almost impossible to make a perfect set of codecs in a realistic size. It would be best for the "standard library" to provide a few "standard" encodings (based on public specifications and in use across various platforms), which can be _easily_ modified by users/developers to adapt them to their specific platforms (in the spirit of "open source" ;-). So I think the Japanese codecs of the standard library must be written in clean and sufficiently efficient Python, or at least must effectively allow users to modify the character mappings. |
|||
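The round-trip property mentioned above, encode(decode(c)) == c for every valid character c, can be checked mechanically. The following is a minimal sketch that runs against the `euc_jp` codec shipped with current Python, not against ja-codecs itself; the function names are hypothetical, and the two-byte range is only the 94x94 grid of EUC-JP, not the full encoding.

```python
import codecs
from typing import Iterable, Iterator, List


def euc_jp_two_byte_candidates() -> Iterator[bytes]:
    """Yield every two-byte sequence in the 94x94 EUC-JP grid (0xA1-0xFE)."""
    for lead in range(0xA1, 0xFF):
        for trail in range(0xA1, 0xFF):
            yield bytes([lead, trail])


def find_round_trip_failures(codec: str, candidates: Iterable[bytes]) -> List[bytes]:
    """Return the candidates c for which encode(decode(c)) != c.

    Sequences the codec rejects are skipped, because the property is
    only claimed for valid characters.
    """
    failures = []
    for raw in candidates:
        try:
            text = codecs.decode(raw, codec)
        except UnicodeDecodeError:
            continue  # unassigned or invalid sequence: not a valid character
        try:
            ok = codecs.encode(text, codec) == raw
        except UnicodeEncodeError:
            ok = False  # decoded text cannot be encoded back at all
        if not ok:
            failures.append(raw)
    return failures


if __name__ == "__main__":
    bad = find_round_trip_failures("euc_jp", euc_jp_two_byte_candidates())
    print(f"{len(bad)} two-byte EUC-JP sequences do not round-trip")
```

An empty result means the property holds for the tested range; any returned byte sequences are the characters whose mapping is not reversible in that codec.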
msg42404 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2003-01-16 09:28 | |
Logged In: YES user_id=38388 Sorry for not having read the README earlier. You do have a point in that it is useful to be able to modify encodings in user-specific ways. Of course, this needs to be done by creating new codecs, and Python files sure make this process easier. Now, AFAIK, none of the current Python developers knows much about Japanese, so we'd need a maintainer for the codecs. If you would be able to take over this part, then I see a good chance of getting the codecs into the Python core (Tamito's codecs didn't get accepted for the core distribution because of their size). Perhaps you could team up with Tamito in this effort? |
|||
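Creating a new codec in a plain Python file, as discussed above, might look like the sketch below. It is not how ja-codecs modifies its mapping tables; it simply wraps the standard-library `shift_jis` codec and registers a variant under the hypothetical name `sjis_yen`, in which byte 0x5C is treated as YEN SIGN (U+00A5). It is stateless and provides no incremental or stream support.

```python
import codecs

BASE = "shift_jis"  # existing standard-library codec used as the base
NAME = "sjis_yen"   # hypothetical name for the customized codec


def _encode(text, errors="strict"):
    # Map YEN SIGN onto byte 0x5C before delegating to the base codec.
    data = codecs.encode(text.replace("\u00a5", "\\"), BASE, errors)
    return data, len(text)


def _decode(data, errors="strict"):
    # Decode with the base codec, then present U+005C as YEN SIGN.
    raw = bytes(data)
    text = codecs.decode(raw, BASE, errors)
    return text.replace("\\", "\u00a5"), len(raw)


def _search(name):
    # The codec registry calls this with a normalized (lowercase) name.
    if name == NAME:
        return codecs.CodecInfo(name=NAME, encode=_encode, decode=_decode)
    return None


codecs.register(_search)

if __name__ == "__main__":
    print(b"\\100".decode(NAME))
```

Note the trade-off: because both U+005C and U+00A5 encode to byte 0x5C in this variant, text containing a backslash comes back with a yen sign, so the variant gives up text-level round-tripping for that one character.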
msg42405 - (view) | Author: SUZUKI Hisao (suzuki_hisao) | Date: 2003-01-18 08:58 | |
Logged In: YES user_id=495142 It would be very nice if Japanese codecs got into the core. Nowadays even Perl 5.8 has them. I am very willing to help you and Tamito with codec maintenance. I am sorry, but I am so occupied with my work that I am afraid it might be difficult to take time off to do it every day. Perhaps I will be able to respond weekly rather than daily. |
|||
msg42406 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2004-02-06 03:22 | |
Logged In: YES user_id=55188 Python got Japanese codecs by importing CJK codecs. Thank you for your efforts anyway! |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:06:07 | admin | set | github: 37761 |
2003-01-12 03:46:21 | suzuki_hisao | create |