Issue 557946: Ebcdic compliancy in stringobject source

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/36617

classification

Title:	Ebcdic compliancy in stringobject source
Type:		Stage:
Components:	Interpreter Core	Versions:	Python 2.2

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	coli, jymen, loewis
Priority:	normal	Keywords:	patch

Created on 2002-05-19 14:20 by jymen, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name	Uploaded	Description
pyconfig.h.in.diff	jymen, 2002-05-26 16:38
stringobject.c.diff	jymen, 2002-05-26 16:42	using HAVE_EBCDIC define diff

Messages (10)
msg40044	Author: Jean-Yves MENGANT (jymen)	Date: 2002-05-19 14:20
The printable-character test made inside stringobject.c is not compliant with EBCDIC systems (OS390 or OS400).
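Editorial illustration: the sketch below shows, under stated assumptions, why a hard-coded printable test breaks on EBCDIC. It is not the code from stringobject.c or from the attached diffs; the function names and the 0x20..0x7e range are assumptions chosen to represent an ASCII-only test. In EBCDIC the space character is 0x40 and lowercase 'a' is 0x81, so a range test written for ASCII classifies ordinary letters as non-printable.

```c
#include <ctype.h>

/* ASCII-only assumption (hypothetical): bytes 0x20..0x7e are printable.
   On an EBCDIC system 'a' is 0x81, so this rejects plain letters. */
static int printable_by_range(unsigned char c)
{
    return c >= 0x20 && c < 0x7f;
}

/* Ask the C library instead; the answer follows the platform's
   character set and the current locale. */
static int printable_by_ctype(unsigned char c)
{
    return isprint(c) != 0;
}
```

On an ASCII platform in the default "C" locale the two functions should agree for every byte; they diverge only where the character set is not ASCII.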
msg40045	Author: Martin v. Löwis (loewis) *	Date: 2002-05-22 17:09
Is it really worth fixing this? Python assumes in many places that the character set of byte strings is an ASCII superset. If any change is made here, it should be based on C library functions rather than on static knowledge of the operating system.
msg40046	Author: Jean-Yves MENGANT (jymen)	Date: 2002-05-23 08:38
When porting to OS390 (an EBCDIC OS), the only place I found a bad ASCII assumption that leads to trouble during interpreter startup is the one pointed out here. Once I fixed it, I was able to use the Python interpreter kernel without trouble. Some modules, like xmllib, may make ASCII assumptions, but module portability is a different story, since those modules can be declared non-EBCDIC-compliant. On the second topic, using a C library function: I am 100% OK with it. My only concern is that using, for instance, the XPG C function isascii will generate more complex and slower code when trying to stay compliant with both EBCDIC and ASCII targets. Having a more generic define such as #define EBCDIC in config.h, set by ./configure when the platform is EBCDIC, is IMO the best compromise here.
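Editorial illustration: a configure-time define of the kind proposed above could be driven by a small probe that inspects the compiler's own character constants rather than the operating system name. The sketch below is an assumption about how such a probe might look; `charset_kind` and `detect_charset` are hypothetical names and do not come from the attached patches. It relies only on well-known code points ('A' is 0x41 in ASCII and 0xC1 in EBCDIC).

```c
/* Hypothetical probe: classify the execution character set by checking
   a few well-known code points. Casts guard against a signed char type. */
enum charset_kind { CHARSET_ASCII, CHARSET_EBCDIC, CHARSET_OTHER };

static enum charset_kind detect_charset(void)
{
    if ((unsigned char)'A' == 0x41 &&
        (unsigned char)'a' == 0x61 &&
        (unsigned char)'0' == 0x30)
        return CHARSET_ASCII;
    if ((unsigned char)'A' == 0xC1 &&
        (unsigned char)'a' == 0x81 &&
        (unsigned char)'0' == 0xF0)
        return CHARSET_EBCDIC;
    return CHARSET_OTHER;
}
```

A probe like this only identifies the character set; every place that needs different behaviour would still have to be guarded by the resulting define, which is the maintenance cost raised later in the thread.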
msg40047	Author: Martin v. Löwis (loewis) *	Date: 2002-05-23 09:54
I believe there are a number of places where the code assumes that 'a' .. 'z' covers all Latin letters, and only those, e.g. pypcre.c, regexpr.c, sre.py.
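Editorial illustration: the 'a' .. 'z' assumption can be made concrete with a short sketch. In EBCDIC the lowercase letters sit in three separate runs (a-i, j-r, s-z), so a range comparison also accepts code points that are not letters. The helper names below are hypothetical and are not taken from pypcre.c, regexpr.c, or sre.py.

```c
#include <ctype.h>

/* Range test: correct only if 'a'..'z' are contiguous code points,
   which holds for ASCII but not for EBCDIC. */
static int is_lower_by_range(unsigned char c)
{
    return c >= 'a' && c <= 'z';
}

/* Library test: defers to the C runtime, so it follows the
   platform's character set. */
static int is_lower_by_ctype(unsigned char c)
{
    return islower(c) != 0;
}

/* Count the byte values on which the two tests disagree. */
static int count_disagreements(void)
{
    int c, n = 0;
    for (c = 0; c <= 255; c++) {
        if (is_lower_by_range((unsigned char)c) !=
            is_lower_by_ctype((unsigned char)c))
            n++;
    }
    return n;
}
```

On an ASCII platform in the default "C" locale the count should be zero; a nonzero count on another platform points at exactly the kind of assumption described in this message.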
msg40048	Author: Jean-Yves MENGANT (jymen)	Date: 2002-05-23 11:47
I am still 100% with you on that; my only remark is that those are mainly either modules or Python library files, which are not part of the basic Python kernel. The idea here is to get a minimal Python kernel running on an EBCDIC machine. Once the basic kernel is up in EBCDIC mode, you'll need to deal with the EBCDIC portability of some modules/libs and decide whether or not to address them if you need to use them... But the important idea here is to have the Python kernel running, so that you are not obliged to use REXX if you prefer Python :=)
msg40049	Author: Jean-Yves MENGANT (jymen)	Date: 2002-05-26 16:38
The latest attached diff files contain a more robust patch, which defines HAVE_EBCDIC in pyconfig.h and uses it in stringobject.c.
msg40050	Author: Martin v. Löwis (loewis) *	Date: 2002-05-28 09:58
Modifying pyconfig.h.in (alone) is a mistake: this is a generated file, edit configure.in instead. When producing patches, please produce a single file containing all changes (e.g. with diff -r); this makes processing the patch simpler. I'm still opposed to singling out a specific encoding; instead, I believe that the approach taken in patch #479898 is more general and ought to solve your problem as well. Can you please study this patch, and see whether you can make it work on your system?
msg40051	Author: Jean-Yves MENGANT (jymen)	Date: 2002-06-02 18:32
I looked at the approach taken in patch #479898; it looks fine, so I made a quick test on the OS390 EBCDIC platform, extracting just the SINGLE_BYTE isprint-based change. It works well on OS390 too and is definitely the best approach to the problem. I also looked at the PRINT_MULTIBYTE_STRING approach based on iswprint. According to IBM's documentation it should work for OS390 EBCDIC as well, although I am not able to test it on my OS390 box.
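Editorial illustration: an isprint-based escaping routine, in the spirit of the single-byte approach described above, might look like the sketch below. This is a minimal sketch and not the code from patch #479898 or from stringobject.c; `escape_bytes`, its signature, and its buffer-size policy are assumptions made for the example. The only platform-dependent decision is delegated to isprint(), so no character-set define is needed.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: write a single-quoted, escaped form of the n bytes
   at s into out (capacity outsize, including the terminating NUL).
   Returns the number of characters written, or -1 if out is too small. */
static int escape_bytes(const char *s, size_t n, char *out, size_t outsize)
{
    size_t i, used = 0;

    if (outsize < 3)                 /* two quotes plus NUL */
        return -1;
    out[used++] = '\'';
    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)s[i];

        /* Worst case per byte: "\xHH" (4) + closing quote + NUL. */
        if (outsize - used < 6)
            return -1;
        if (c == '\'' || c == '\\') {
            out[used++] = '\\';
            out[used++] = (char)c;
        } else if (isprint(c)) {
            /* Printable according to the C library, whatever the charset. */
            out[used++] = (char)c;
        } else {
            used += (size_t)sprintf(out + used, "\\x%02x", (unsigned int)c);
        }
    }
    out[used++] = '\'';
    out[used] = '\0';
    return (int)used;
}
```

Because isprint() is locale-sensitive, the output for bytes outside the basic character set depends on the locale in effect when the routine runs.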
msg40052	Author: coleman corrigan (coli)	Date: 2002-07-30 15:19
This is an ugly patch. The pcre module elegantly avoids this issue by using isprint(); why not do the same thing here?
msg40053	Author: Jean-Yves MENGANT (jymen)	Date: 2002-07-30 15:40
> The pcre module elegantly avoids this issue by using isprint(),
> why not do the same thing here?

If you look at my answer dated 2002-06-02, I indicated that isprint is definitely the best approach to the problem; I tested it on OS390 and it works fine.

History
Date	User	Action	Args
2022-04-10 16:05:19	admin	set	github: 36617
2002-05-19 14:20:36	jymen	create