This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Take advantage of BaseException/Exception split in cookielib
Type: Stage:
Components: Library (Lib) Versions: Python 2.5
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, georg.brandl, jjlee, nnorwitz
Priority: normal Keywords: patch

Created on 2006-04-29 16:43 by jjlee, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
cookielib_baseexception.patch jjlee, 2006-04-29 16:43
cookielib_baseexception_2.patch jjlee, 2006-04-30 14:35
Messages (11)
msg50131 - (view) Author: John J Lee (jjlee) Date: 2006-04-29 16:43
The patch takes advantage of the exception hierarchy
reorganisation to remove some ugly code in cookielib. 
It clarifies a couple of exception messages, too.
msg50132 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2006-04-29 21:12
Logged In: YES 
user_id=357491

Overall the patch looks fine (on vacation so not up for
applying and handling any possible failures so not going to
assign to myself).  But a question and a suggestion.

Why were the error strings changed to use the repr instead
of the string representation?  What does it buy you?

And if you are going to be changing the function name, you
might want to consider using a leading underscore to prevent
people from using it or getting exported.  Otherwise I would
define __all__ for the module.
msg50133 - (view) Author: John J Lee (jjlee) Date: 2006-04-30 00:27
Logged In: YES 
user_id=261020

I changed the exception detail strings to use %r to get
quotes around the filename and quoted "bad" line read from
the file.  That makes it clearer what is part of the
explanatory English text and what is part of the filename
(or part of the quoted bad line, as the case may be). 
Filenames can and do contain spaces, commas, etc.

Your other point stunned me a bit.  I don't think it had
ever even really *occurred* to me that stdlib users might
consider stdlib module globals that are not documented as
public.  Ironically, I think that's because the code from
which cookielib derives is much stricter about this, all
modules starting with '_' and package __init__ exporting a
short list of names -- I guess I thought I was following
stdlib conventions by *not* adding initial underscores all
over the place.  Looking at some other stdlib code, I see
that underscores would have been more conventional after all.

Searching for reassurance, I discovered this from one of
your old python-dev summaries that confirms that
undocumented stdlib module globals are not considered part
of the module public interface:

http://www.python.org/dev/summary/2004-07-16_2004-07-31/#use-the-docs-to-know-what-the-public-api-is-people

e.g. from Tim Peters:

"""
As you noted later, it wasn't part of keyword's documented
interface, and you *always* act at your own risk when you go
beyond the docs.
"""

However, I don't see that this is explicitly documented,
which seems unfortunate to me (even though Tim's statement
is true regardless of any convention Python might have).

So, I guess I should:

1. Write something explicit about this (along the lines of
"Use undocumented module globals at your own risk") for the
stdlib library docs -- perhaps starting from Tim's post --
and submit that as a doc patch.

2. Leave all module global names in cookielib unchanged (so
people using those functions don't suffer gratuitous
breakage, even though any such people are asking for trouble
in the long run).  However, in the thread above, Michael
Hudson disagrees with that, and suggests all such module
globals be renamed.  So suggestions are welcome here on the
best course of action.

3. As you suggest, submit a patch to add an __all__ to
cookielib.
msg50134 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-30 00:37
Logged In: YES 
user_id=33168

John, at this point (2.x) we should at least do no worse. 
Don't export unnecessary vars in any new code.  We should
also start documenting the situation and work towards
improving it.  For 3k, we should do better and solidify the
rules and do massive cleanup (module by module).  This will
probably involve some arm twisting of Guido. :-)
msg50135 - (view) Author: John J Lee (jjlee) Date: 2006-04-30 01:22
Logged In: YES 
user_id=261020

So, you (Neil) agree with my three numbered action points
below?  You repeat my suggestion that we document this as if
it were a new suggestion; did you read my comment?

Sigh, sorry for being a little grumpy about this, but it's
hard to "do no worse" for a project if that project doesn't
seem itself to be very sure what it considers "worse":

While I must say I *agree* with you that such practices are
not good, if I as somebody apparently unusually inclined to
heavy use of underscores (even in most of my module names,
in library code) actually thought, however foolishly, that I
was *following stdlib conventions* by using *fewer*
underscores (for reasons I'll try to refrain from debating
further here), it does indeed seem pretty clear we're in
need of explicit documentation on this!  So your advice to
"do no worse" is a little annoying at this point... :-)

OK, so, what should get documented, specifically?  And where
should documentation for module authors go?

a). Stdlib module authors should always use underscores for
non-public module globals.

b). Don't know about this one: should non-legacy stdlib
modules (viz, those that follow rule a)) define __all__? 
(perhaps a point against doing this is that it may encourage
import * ?).

c). Stdlib packages should use __init__ to export public names.

d). Any discrepancy between __all__ and the API
documentation is a bug.

e). Stdlib users should assume all non- (I guess this should
be go somewhere near the library reference introduction)

Finally, how about my point 2?  Should I add underscores to
cookielib module globals I consider non-public (== all
undocumented module globals), or not?

Thanks for the feedback, both of you!
msg50136 - (view) Author: John J Lee (jjlee) Date: 2006-04-30 01:26
Logged In: YES 
user_id=261020

Bleh.

e). Stdlib users should assume all undocumented module
globals are not part of the public API (I guess this should
be go somewhere near the library reference introduction)
msg50137 - (view) Author: John J Lee (jjlee) Date: 2006-04-30 01:31
Logged In: YES 
user_id=261020

Hmm, perhaps by "do no worse" you simply meant not to rename
the function in this tracker item to a name not beginning
with an initial underscore (since that would introduce a new
non-public module global that does not begin with an
underscore).

In which case, sorry for the rant. :-)

My questions after the rant still stand. ;-)
msg50138 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2006-04-30 02:23
Logged In: YES 
user_id=357491

You should use an underscore if you don't have an __all__
defined.  This is mostly for protection for ``from cookielib
import *`` code.  But if you define an __all__ I think you
are fine.

You do not need to document that undocumented globals should
not be relied upon.  Yes, people should know better (I
personally got nailed by the Debian folk for an undocumented
function in site.py that I changed the parameters of).  But
the suggestions Neal and I are making are to protect people
who do their doc checking from the command-line and thus
just do ``import cookielib; dir(cookielib)``.  I know Python
is for use by adults, but sometimes going a small step to
protect the adolescents is also okay.  =)
msg50139 - (view) Author: John J Lee (jjlee) Date: 2006-04-30 14:35
Logged In: YES 
user_id=261020

<shrug> I still think there should be documented guidelines
on this for both stdlib users and contributors (though they
should not live in the cookielib docs, of course!).

I've added a new patch to this tracker item,
cookielib_baseexception_2.patch, that adds an __all__ and
uses the name _warn_unhandled_exception (in addition to the
original patch contents).  I verified that the set of
documented module globals is identical to those listed in
the new __all__.  Let me know if I should also rename all
other non-public module globals defined in cookielib to have
initial underscores.

Just for the record, here is the list of all module globals
NOT listed in the new __all__ (as determined by doing
dir(cookielib) prior to applying the patch, and removing the
items I didn't add to __all__, so includes some noise from
things like 'import sys'):

['Absent', 'DAYS', 'DEFAULT_HTTP_PORT', 'EPOCH_YEAR',
'ESCAPED_CHAR_RE', 'HEADER_ESCAPE_RE',
'HEADER_JOIN_ESCAPE_RE', 'HEADER_QUOTED_VALUE_RE',
'HEADER_TOKEN_RE', 'HEADER_VALUE_RE', 'HTTP_PATH_SAFE',
'IPV4_RE', 'ISO_DATE_RE', 'LOOSE_HTTP_DATE_RE',
'MISSING_FILENAME_TEXT', 'MONTHS', 'MONTHS_LOWER',
'STRICT_DATE_RE', 'TIMEZONE_RE', 'UTC_ZONES', 'WEEKDAY_RE',
'__builtins__', '__doc__', '__file__', '__name__',
'_str2time', '_threading', '_timegm', 'copy', 'cut_port_re',
'debug', 'deepvalues', 'domain_match', 'eff_request_host',
'escape_path', 'http2time', 'httplib', 'is_HDN',
'is_third_party', 'iso2time', 'join_header_words',
'liberal_is_HDN', 'logging', 'lwp_cookie_str', 'month',
'offset_from_tz_string', 'parse_ns_headers', 're', 'reach',
'request_host', 'request_path', 'request_port',
'split_header_words', 'sys', 'time', 'time2isoz',
'time2netscape', 'timegm', 'unmatched',
'uppercase_escaped_char', 'urllib', 'urlparse',
'user_domain_match', 'vals_sorted_by_key',
'warn_unhandled_exception']
msg50140 - (view) Author: John J Lee (jjlee) Date: 2006-04-30 14:38
Logged In: YES 
user_id=261020

(...and the new patch makes a tiny fix to a
slightly-inaccurate statement in the module docstring)
msg50141 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-05-08 17:48
Logged In: YES 
user_id=849994

Applied in rev. 45940.
History
Date User Action Args
2022-04-11 14:56:17adminsetgithub: 43294
2006-04-29 16:43:23jjleecreate