Issue505705
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2002-01-19 09:21 by loewis, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
pickle.diff | loewis, 2002-08-11 20:47 | |||
pickle.diff | loewis, 2002-08-12 23:40 | version 3 |
Messages (10) | |||
---|---|---|---|
msg38752 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-01-19 09:21 | |
This patch removes the use of eval in pickle and cPickle. It does so by: - moving the actual parsing from compile.c:parsestr to PyString_DecodeEscape - introducing a new codec string-escape - removing the code that checks that a string-to-unpickle is properly escaped throughout, and replaces this with a check whether it is properly quoted, - unquoting the string in load_string, then passing it to the codec. This fixes #502503. |
|||
msg38753 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-01-19 09:25 | |
Logged In: YES user_id=21627 BTW, this patch has #500002 as a prerequisite. |
|||
msg38754 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2002-08-09 04:12 | |
Logged In: YES user_id=6380 I'll review this. |
|||
msg38755 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2002-08-09 14:48 | |
Logged In: YES user_id=6380 I like this idea. But the patch is out of date. Can you rework the patch? How much faster does this make the test program from 1026940226.16076.python-list%40python.org">http://groups.google.de/groups?hl=en&lr=&ie=UTF-8&selm=mailman.1026940226.16076.python-list%40python.org ??? |
|||
msg38756 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2002-08-11 14:09 | |
Logged In: YES user_id=6380 This would fix bug #593656 too. |
|||
msg38757 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-08-11 20:47 | |
Logged In: YES user_id=21627 Updated to current CVS. |
|||
msg38758 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2002-08-12 01:51 | |
Logged In: YES user_id=6380 Closer. - Why bother stripping triple quotes in the pickle/cPickle load code? These will never happen as a result of a pickle dump AFAIK, and the code you are replacing doesn't accept these either AFAICT. - There's something missing (the previous version of the patch had it I believe) that's needed to register the codec; as a consequence, pickle.loads() doesn't work. - escape_encode() uses repr() of a string to do the work. But that means the outcome for embedding string quotes is confusing, because of the "smarts" in repr() that use " for surrounding quotes when there's a ' in the string, and vice versa. Thus, a single quote or a double quote is returned unquoted; but if they both occur in the same string, the single quote is quoted. I don't think that's particularly useful. Maybe there should be an underlying primitive operation that gives you a choice and which is invoked both by escape_encode() and string repr()? - I don't understand the recode_encoding stuff, but it looks like something like that was present before too. :-) |
|||
msg38759 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-08-12 23:40 | |
Logged In: YES user_id=21627 New version: - removed tripple quote support in pickle/cPickle - added Lib/encodings/string_escape.py again - added PyString_Repr, which takes a smartquotes argument - recode_encoding is for PEP 263: the parser generates UTF-8 in the abstract syntax, which needs to be re-encoded with the original encoding. Unfortunately, \-escaping and UTF-8 may interleave, hence the convoluted code. On the Sam Penrose article: Without patch: dumping list of 1000 dicts: dumped: 0.192386031151 loading list of 1000 dicts: loaded: 2.46496498585 dumping list of 10000 dicts: dumped: 1.92456102371 loading list of 10000 dicts: loaded: 24.6884089708 with patch: dumping list of 1000 dicts: dumped: 0.201091051102 loading list of 1000 dicts: loaded: 0.469774007797 dumping list of 10000 dicts: dumped: 1.94221496582 loading list of 10000 dicts: loaded: 4.8661159277 So loading speed is up by a factor of 5, for this benchmark. |
|||
msg38760 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2002-08-13 20:43 | |
Logged In: YES user_id=6380 This looks good. I measured a similar speedup. Go for it! |
|||
msg38761 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2002-08-14 08:14 | |
Logged In: YES user_id=21627 Applied as stringobject.h 2.36; pickle.py 1.69; string_escape.py 1.1; pickletester.py 1.18; _codecsmodule.c 2.14; cPickle.c 2.95; stringobject.c 2.177; compile.c 2.256; It turns out that #593656 isn't fixed: nobody checks the trailing \, so it will be unpickled as '\\\x00' (consistently in pickle and cPickle, though). |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:54 | admin | set | github: 35952 |
2002-01-19 09:21:24 | loewis | create |