Issue943953
Created on 2004-04-28 19:43 by siva1311, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files
File name | Uploaded | Description
---|---|---
patch_str_maketrans | siva1311, 2004-04-28 19:43 | patch file to add maketrans method to str object
Messages (11) | |||
---|---|---|---|
msg45874 - (view) | Author: Gyro Funch (siva1311) | Date: 2004-04-28 19:43 | |
Added a maketrans method to the string object. This functionality is currently available only in the string module. Mapping from the string module to the str object: string.maketrans(from, to) -> from.maketrans(to). Attached is the diff for stringobject.c and string_test.py. I am not a proficient C coder, but things look okay to me and the tests pass. |
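For comparison, here is a minimal sketch of how the same operation looks in Python 3, where str.maketrans exists as a static method that takes both strings. Note that this is not the from.maketrans(to) signature proposed in the patch; the sample strings are made up for illustration.

```python
# Python 3: str.maketrans is a static method taking both strings,
# unlike the from.maketrans(to) form proposed in this patch.
table = str.maketrans("abc", "xyz")
translated = "aabbcc".translate(table)

# An optional third argument lists characters to delete.
table_del = str.maketrans("abc", "xyz", "-")
cleaned = "a-b-c".translate(table_del)

print(translated, cleaned)
```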
|||
msg45875 - (view) | Author: Denis S. Otkidach (ods) * | Date: 2004-05-14 12:59 | |
I think maketrans is a low-level function and should be hidden from the user. The traditional tr interface is much more tentative, and it can be the same for both str and unicode: s.translate((from, to) [, delete]). |
|||
msg45876 - (view) | Author: Gyro Funch (siva1311) | Date: 2004-05-14 13:51 | |
Although I think your suggestion would have merit if this method were new, in the current situation this change would break all code currently using the 'translate' method. I can't imagine that this would be acceptable. My suggested change would be backward compatible and would be consistent with bringing methods out of the string module and into str object methods. Since 'translate' has already been made a str method, why not make 'maketrans' a str method too? |
|||
msg45877 - (view) | Author: Denis S. Otkidach (ods) * | Date: 2004-05-14 14:05 | |
No, my suggestion doesn't interfere with the current interface, so it can be added without breaking compatibility. Something like the following.
For str:
    if isinstance(table, str): ...old behavior...
    else: t_from, t_to = table; ...
For unicode:
    if isinstance(table, dict): ...old behavior...
    else: t_from, t_to = table; ... |
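The dispatch described in this message can be sketched as a standalone helper. This is only an illustration in Python 3 terms, not the proposed C change: translate_pair is a hypothetical name, and the existing behavior is kept whenever the table is not a (from, to) tuple.

```python
def translate_pair(s, table, delete=""):
    """Hypothetical helper: accept a ready-made table or a (from, to) pair."""
    if isinstance(table, tuple):
        # New behavior: build the table from the pair on the fly.
        t_from, t_to = table
        table = str.maketrans(t_from, t_to, delete)
    # Old behavior: any other table is passed through unchanged
    # (in this sketch, `delete` is only honored for the pair form).
    return s.translate(table)


pair_result = translate_pair("hello", ("el", "ip"))
delete_result = translate_pair("hello world", ("lo", "01"), " ")
table_result = translate_pair("abc", {ord("a"): "A"})
```

Because the tuple case is checked first, existing callers that pass a prebuilt table are unaffected.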
|||
msg45878 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2004-05-14 14:13 | |
I'm -1 on adding such a method or trying to tweak .translate(): We should not make use of .translate() more widespread than it already is. It is easy enough to write a charset-based codec that fulfills the same need. See the codecs in the encodings package on how to use this codec for purposes very similar to those of .translate(). The advantage of this approach is not only that it makes the resulting translation easily available to the whole application; it also works for both Unicode and plain strings by virtue of the charset codec. |
|||
msg45879 - (view) | Author: Gyro Funch (siva1311) | Date: 2004-05-14 14:30 | |
Okay. I wasn't even aware of the encodings package (blush). Should the docs be updated to reflect the fact that users should consider using codecs instead of translate/maketrans? Perhaps an example of how maketrans/translate is subsumed by codecs would be helpful. |
|||
msg45880 - (view) | Author: Denis S. Otkidach (ods) * | Date: 2004-05-14 14:40 | |
> I'm -1 on adding such a method or trying to tweak .translate(): We should not make use of .translate() more widespread than it already is.
Then why isn't it deprecated yet? Currently translate has several disadvantages: it is difficult to use for str (it needs maketrans), and its interface differs between str and unicode. I suggested adding a unified interface that avoids the use of maketrans. I'm not the first who doesn't like maketrans: http://mail.python.org/pipermail/patches/2000-May/000781.html
> It is easy enough to write a charset-based codec that fulfills the same need.
This approach won't work for both str and unicode due to the over-restricted implementation of codecs: http://groups.google.com/groups?th=a68a7b5a2e1f294 . Moreover, AFAIK this requires registering the encoding (making it global), which is often a bad idea. |
|||
msg45881 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2004-05-14 14:40 | |
Probably... patches are welcome :-) Writing these codecs is really easy. Just have a look at e.g. rot13.py ... you basically copy the template to a new module, say rot14, edit the mapping dictionary, save it, and then use the module name in the string .encode() method: 'abc'.encode('rot14'). The codec registry will first look in the encodings package for the codec and then continue the search on the PYTHONPATH, so you may have to provide the complete package name if you place the codec into a package, e.g. 'abc'.encode('my.new.app.rot14'). That's it. |
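A minimal sketch of the rot14 idea in Python 3 terms follows. It makes several assumptions: instead of a module placed in the encodings package, it registers a search function with codecs.register; the helper names (_encode, _decode, _search) are made up for illustration; and it calls codecs.encode() rather than 'abc'.encode('rot14'), because in Python 3 str.encode() only accepts codecs that return bytes.

```python
import codecs
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase

# Rotate each ASCII letter by 14 positions; everything else is left alone.
_ENC = str.maketrans(
    _LOWER + _UPPER,
    _LOWER[14:] + _LOWER[:14] + _UPPER[14:] + _UPPER[:14],
)
# Invert the mapping for decoding (ordinal -> ordinal).
_DEC = {v: k for k, v in _ENC.items()}


def _encode(text, errors="strict"):
    # Codec functions return (result, number_of_input_items_consumed).
    return text.translate(_ENC), len(text)


def _decode(text, errors="strict"):
    return text.translate(_DEC), len(text)


def _search(name):
    # Called by the codec registry with the normalized codec name.
    if name == "rot14":
        return codecs.CodecInfo(_encode, _decode, name="rot14")
    return None


# Registration is process-wide, which is the "global" concern raised above.
codecs.register(_search)

encoded = codecs.encode("abc", "rot14")
decoded = codecs.decode(encoded, "rot14")
print(encoded, decoded)
```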
|||
msg45882 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2004-05-14 14:45 | |
>> It is easy enough to write a charset-based codec that fulfills the same need.
> This approach won't work for both str and unicode due to over-restricted implementation of codecs: http://groups.google.com/groups?th=a68a7b5a2e1f294 . Moreover, AFAIK this requires registering encoding (making it global), but this is often a bad idea.
Indeed. I've always argued for putting codecs into packages for this reason. About your note about the Unicode .decode() method: I completely agree. The codec was never designed to be Unicode vs. the rest of the world. It was designed as a general-purpose encoding and decoding system. However, a few python-dev'ers seem to have misunderstood this intention and still believe that codecs are only about Unicode. |
|||
msg45883 - (view) | Author: Denis S. Otkidach (ods) * | Date: 2004-05-14 15:07 | |
> About your note about the Unicode .decode() method: I completely agree. The codec was never designed to be Unicode vs. the rest of the world. It was designed as general purpose encoding and decoding system.
If so, we need to relax the mentioned restriction and allow a Codec instance as the argument of the encode/decode methods. Without this, codecs will never become a general-purpose encoding and decoding system. |
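The idea of using a Codec instance directly, without registering anything globally, can be sketched as follows. The string methods do not accept a Codec instance, so this illustration uses hypothetical free functions (encode_with, decode_with) and a hypothetical TableCodec class built on codecs.Codec; none of these names come from the thread.

```python
import codecs


class TableCodec(codecs.Codec):
    """Hypothetical codec built from a (from, to) pair of characters."""

    def __init__(self, t_from, t_to):
        self._enc = str.maketrans(t_from, t_to)
        self._dec = str.maketrans(t_to, t_from)

    def encode(self, input, errors="strict"):
        # Follow the codec convention: (result, items_consumed).
        return input.translate(self._enc), len(input)

    def decode(self, input, errors="strict"):
        return input.translate(self._dec), len(input)


def encode_with(obj, codec):
    """Hypothetical stand-in for obj.encode(codec_instance)."""
    return codec.encode(obj)[0]


def decode_with(obj, codec):
    """Hypothetical stand-in for obj.decode(codec_instance)."""
    return codec.decode(obj)[0]


leet = TableCodec("aeo", "430")
encoded = encode_with("rodeo", leet)
decoded = decode_with(encoded, leet)
print(encoded, decoded)
```

Since the codec object is passed explicitly, nothing is added to the process-wide codec registry.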
|||
msg45884 - (view) | Author: Hyeshik Chang (hyeshik.chang) * | Date: 2004-07-17 06:04 | |
The unicode object has a .decode() method now. It seems the originator may feel happy to write his own codec package now. :) |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:03 | admin | set | github: 40193 |
2004-04-28 19:43:26 | siva1311 | create |