
classification
Title: unknown parsing error
Type:
Stage:
Components: None
Versions:

process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: loewis
Nosy List: gazum, loewis, nnorwitz
Priority: normal
Keywords:

Created on 2004-07-03 19:39 by gazum, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: coding.patch
Uploaded: nnorwitz, 2004-07-21 03:16
Description: patch 1 to address missing break, more strict -*- coding check
Messages (3)
msg21389 - Author: Igor Sidorenkov (gazum) Date: 2004-07-03 19:39
I am getting "unknown parsing error" when trying to run
a script with the following first line:
#@+leo-encoding=cp1251.

If I add a couple of empty lines or 
# -*- coding: cp1251 -*-
then everything is ok.

I am using ActiveState Python 2.3.3 on
a Win2K server.

---------- Python ----------
error=22
  File "test.py", line 1
SyntaxError: unknown parsing error

Output completed (0 sec consumed) - Normal 
Termination
------------------------------
#@+leo-encoding=cp1251.
#@+node:0::@file test.py
#@+body
for i in range(5):
	print i
#@-body
#@-node:0::@file test.py
#@-leo
msg21390 - Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2004-07-21 03:16
Martin, I hope you don't mind me assigning this to you.  I
think you implemented the coding spec.  I briefly read the
PEP and while the code does what the PEP states (i.e., use a
regex), the behaviour doesn't match the examples.  It also
seems like it could be error-prone to allow r'#.*coding[:=]'.

I think there are two issues.  
1) in pythonrun.c in E_DECODE there is a missing break
2) the check for # -*- coding is not strict enough
    The patch makes the check r'# (-\*-)? coding[:=]'

The attached patch addresses both issues, although I'm not
sure you will agree #2 is a problem.  

Feel free to check in, assign back to me or whatever.  I'm
not sure what the error message in pythonrun should be;
right now it's "unknown decode error."  Perhaps that should
be "invalid encoding" or something?
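
To make the difference between the two checks concrete, here is a minimal sketch using Python's re module. The two patterns are copied from the message above; the sample lines and the helper name `accepts` are illustrative assumptions, and the interpreter's real check lives in C, so this only approximates its behaviour.

```python
import re

# Patterns quoted in the message above; matching them with re.match
# (anchored at the start of the line) is an assumption of this sketch.
LOOSE = re.compile(r'#.*coding[:=]')
STRICT = re.compile(r'# (-\*-)? coding[:=]')

# Illustrative first lines: an Emacs-style declaration, the LEO line
# from the report, and a vi-style modeline.
SAMPLES = [
    "# -*- coding: cp1251 -*-",
    "#@+leo-encoding=cp1251.",
    "# vim: set fileencoding=cp1251 :",
]

def accepts(pattern, line):
    """Return True if the pattern treats the line as a coding declaration."""
    return pattern.match(line) is not None

for line in SAMPLES:
    print(f"{line!r}: loose={accepts(LOOSE, line)} strict={accepts(STRICT, line)}")
```

Under these assumptions the loose pattern accepts all three lines, while the strict one accepts only the Emacs-style declaration.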
msg21391 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-07-21 05:36
The patch is wrong. The PEP deliberately allows for
arbitrary occurrences of the substring "coding", in
particular inside "encoding". This was made so that other
editors, like vi or LEO, can continue to use their own
encoding declarations, and Python would recognize them.

Unfortunately, LEO decided to add a full stop at the end of
the line, so Python looks for an encoding named "cp1251.".
We agree with the LEO author that this is a problem in LEO,
and that it will be fixed. Alternatively, we could amend the PEP and
declare that trailing dots are not part of the encoding name.

The other part of the patch is correct; I have applied it as
pythonrun.c 2.195.6.6 and 2.207. It would be even better if
we could display the actual cause of the problem, but that
is currently not supported in the parser.
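
The trailing-dot problem can be reproduced with a short sketch. The pattern below is modelled on the one the PEP describes and is an assumption here, as are the helper names `declared_encoding` and `is_known_codec`; the final `rstrip('.')` only illustrates the possible PEP amendment mentioned above and is not something the interpreter does.

```python
import codecs
import re

# Pattern modelled on the PEP's description of a coding declaration;
# treat the exact expression as an assumption of this sketch.
CODING_RE = re.compile(r'coding[:=]\s*([-\w.]+)')

def declared_encoding(line):
    """Return the encoding name declared on a comment line, or None."""
    if not line.lstrip().startswith('#'):
        return None
    match = CODING_RE.search(line)
    return match.group(1) if match else None

def is_known_codec(name):
    """Return True if the codec registry can find an encoding by this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

leo_line = "#@+leo-encoding=cp1251."
name = declared_encoding(leo_line)
print(repr(name), is_known_codec(name))

# Hypothetical amendment: ignore trailing dots in the declared name.
stripped = name.rstrip('.')
print(repr(stripped), is_known_codec(stripped))
```

The dot is captured because the character class for the encoding name includes `.`, so the lookup is attempted with the full stop still attached.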
History
Date                 User   Action  Args
2022-04-11 14:56:05  admin  set     github: 40502
2004-07-03 19:39:22  gazum  create