Issue 1257525: Encodings iso8859_1 and latin_1 are redundant


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/42270

classification

Title:	Encodings iso8859_1 and latin_1 are redundant
Type:		Stage:
Components:	Unicode	Versions:	Python 2.4

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	lemburg	Nosy List:	exa, lemburg, liturgist
Priority:	normal	Keywords:

Created on 2005-08-12 12:22 by liturgist, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (7)
msg26033 - (view)	Author: liturgist (liturgist)	Date: 2005-08-12 12:22
./lib/encodings contains both iso8859_1.py and latin_1.py. Only one should be present. Martin says that latin_1 is faster. Using the 'iso' name would correlate better with the other ISO encodings provided. If the latin_1 code is faster, then it should be in the iso8859_1.py file. If an automated process produces the 'iso*' encodings, then it should either produce the faster code or stop producing iso8859_1. Regardless, one of the files should be removed.
msg26034 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2005-08-12 12:49
Good point. The iso8859_1.py codec should be removed, and iso8859_1 added as an alias for latin-1. Martin is right: the latin-1 codec is not only faster, but the Unicode integration code also has a lot of short-cuts for the "latin-1" encoding, so overall performance is better if you use that name for the encoding.
msg26035 - (view)	Author: liturgist (liturgist)	Date: 2005-08-12 13:12
Ok. How about if we specify iso8859_1 as "(see latin_1)" in the documentation? The code will work the same regardless of which encoding name the developer uses. Right?
msg26036 - (view)	Author: liturgist (liturgist)	Date: 2005-08-12 14:01
Where could one see some of the "shortcuts" in the Unicode integration code that make using "latin_1" faster at runtime? I grepped the .py and .c files, but could not readily identify any candidates.
msg26037 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2005-08-12 14:30
To answer your questions: Yes, the encoding is the same for both latin-1 and iso8859-1. Specifying latin-1 instead of iso8859-1 will allow the code to use short-cuts. You have to grep for 'latin-1'.
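
The claim above, that the name used to request the encoding can affect speed, could be probed with a rough timing sketch like the one below. It is only an illustration, not part of the original report: it assumes a current Python 3 interpreter rather than the Python 2.4 this issue was filed against, the helper name time_encode is hypothetical, and whether any difference shows up at all depends on the interpreter version and build.

```python
import timeit


def time_encode(name, text="caf\u00e9 " * 1000, number=2000):
    """Return the seconds taken to encode `text` `number` times as `name`.

    `time_encode` is a hypothetical helper for this sketch, not a stdlib API.
    """
    return timeit.timeit(lambda: text.encode(name), number=number)


# Compare several spellings of the same encoding. The absolute numbers
# are machine-dependent; only the relative ordering is of interest.
for name in ("latin-1", "iso8859-1", "iso8859_1"):
    print(f"{name:10s} {time_encode(name):.4f} s")
```

All three spellings produce identical bytes; only the lookup and dispatch path inside the interpreter may differ.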
msg26038 - (view)	Author: Eray Ozkural (exa)	Date: 2005-10-11 21:22
I understand that there ought to be one fast implementation, but I suppose both names should be available.
msg26039 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2005-10-21 14:03
I've added an alias entry from iso8859_1 to latin_1.
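
A minimal way to check the effect of such an alias is sketched below. It assumes a current Python 3 interpreter, where the alias table lives in encodings.aliases; the helper same_codec and the sample string are illustrative names, not taken from the fix itself.

```python
import codecs
from encodings import aliases


def same_codec(name_a, name_b):
    """Return True if both names resolve to the same registered codec.

    `same_codec` is a hypothetical helper for this sketch.
    """
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


sample = "na\u00efve caf\u00e9"  # every character fits in Latin-1

# What the alias table maps iso8859_1 to (None if no alias entry exists).
print(aliases.aliases.get("iso8859_1"))

# Both names should resolve to one codec and produce identical bytes.
print(same_codec("iso8859_1", "latin_1"))
print(sample.encode("iso8859_1") == sample.encode("latin_1"))
```

Comparing the canonical name from codecs.lookup avoids relying on the layout of the alias table, which is an implementation detail of the encodings package.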

History
Date	User	Action	Args
2022-04-11 14:56:12	admin	set	github: 42270
2005-08-12 12:22:28	liturgist	create