classification
Title: urllib.url2pathname, pathname2url doc strings inconsistent
Type: Stage:
Components: Documentation Versions: Python 2.4
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: aimacintyre, facundobatista, georg.brandl, mike_j_brown
Priority: low Keywords:

Created on 2002-12-07 09:22 by mike_j_brown, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
nturl2path.py.diff mike_j_brown, 2004-12-27 07:05 doc strings for url2pathname, pathname2url
urllib.py.diff mike_j_brown, 2004-12-27 07:06 doc strings for url2pathname, pathname2url
macurl2path.py.diff mike_j_brown, 2004-12-27 07:07 doc strings for url2pathname, pathname2url
rourl2path.py.diff mike_j_brown, 2004-12-27 07:07 doc strings for url2pathname, pathname2url
Messages (10)
msg13553 - Author: Mike Brown (mike_j_brown) Date: 2002-12-07 09:22
The Unix version of urllib.url2pathname(), when given a 
file URL that contains a host part, returns a path with 
the host embedded in the URL, despite the fact that 
there is no convention for mapping the host into the 
URL. The resulting path is not usable.

For example, on Windows, there is a convention for 
mapping the host part of a URL to and from a NetBIOS 
name. url2pathname('//somehost/path/to/file') returns 
r'\\somehost\path\to\file', which is safe to pass into 
open() or os.access().

But on Unix, there is no such convention. 
url2pathname('//somehost/path/to/file') returns 
'//somehost/path/to/file', which means the same thing as 
'/somehost/path/to/file' -- somehost is just another path 
segment and does not actually designate a host.

In my opinion, an exception should be raised in this 
situation; url2pathname() should not try to produce an 
OS path for a remote machine when there is no 
convention for referencing a remote machine in that OS's 
traditional path format. This way, if no exception is 
raised, you know that it's safe to pass the result into 
open() or os.access().

And as noted in other bug reports, 'file://localhost/' is a 
special case that should be treated the same as 'file:///'.
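
A minimal sketch of the behavior proposed above, written for Python 3 (where these functions live in urllib.request rather than urllib). The helper name file_url_to_posix_path and the exception RemoteHostError are hypothetical and are not part of urllib; the sketch relies only on urllib.parse.urlsplit and urllib.parse.unquote, and it assumes a POSIX path format.

```python
from urllib.parse import urlsplit, unquote


class RemoteHostError(ValueError):
    """Hypothetical exception: the file URL names a non-local host."""


def file_url_to_posix_path(url):
    """Convert a file URL to a POSIX path, refusing remote hosts.

    Illustrative only; this is not how urllib's url2pathname behaves.
    """
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: %r" % url)

    # An empty host and 'localhost' both refer to the local machine,
    # so 'file://localhost/' is treated the same as 'file:///'.
    if parts.netloc.lower() not in ("", "localhost"):
        raise RemoteHostError(
            "no POSIX path convention for remote host %r" % parts.netloc
        )

    # Decode percent-escapes such as %20 in the path component.
    return unquote(parts.path)
```

With this shape, a caller that gets a return value (rather than an exception) knows the result refers to the local filesystem and can be passed to open() or os.access().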
msg13554 - Author: Mike Brown (mike_j_brown) Date: 2002-12-07 09:24
by 'host embedded in the URL' in the first sentence I 
meant 'host embedded in it' [the path]
msg13555 - Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2002-12-11 06:56
There is a sort of convention in Unix - 
  somehost:/path/to/file
which comes from NFS, but has been used by tar (for remote 
tapes via rsh) and ssh's scp, and I believe has been used by 
some ftp clients (ncftp?)

However as far as I know you can't pass such a path to 
open() or os.access(), so your basic point still has validity.
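
Because a host:/path string cannot be opened directly, a caller would have to recognize it and handle it separately. The following is a rough sketch of such a check; the function name split_remote_spec is hypothetical, and the rule it applies (a colon before the first slash, followed by an absolute path) is an assumption for illustration rather than the exact rule used by NFS, tar, or scp.

```python
def split_remote_spec(spec):
    """Split 'host:/path' into (host, path); return (None, spec) otherwise.

    Hypothetical helper; the matching rule is a simplification.
    """
    host, sep, path = spec.partition(":")
    # Treat the string as remote only if a non-empty host with no slash
    # comes before the colon and an absolute path follows it.
    if sep and host and "/" not in host and path.startswith("/"):
        return host, path
    return None, spec
```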
msg13556 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2004-12-26 14:52
Please, could you verify if this problem persists in Python 2.3.4
or 2.4?

If yes, in which version? Can you provide a test case?

If the problem is solved, from which version?

Note that if you fail to answer in one month, I'll close this bug
as "Won't fix".

Thank you! 

.    Facundo
msg13557 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2004-12-26 14:52
Could you please provide a test case?
msg13558 - Author: Mike Brown (mike_j_brown) Date: 2004-12-27 07:04
pathname2url and url2pathname are undocumented and are
urllib- and platform-specific. My complaints in this old bug
report are based on assumptions that these functions are
general-purpose public interfaces. Upon further
investigation, I see that they are not.

I suggest leaving the implementations unchanged for now;
there are too many issues with doing it 'right' to go into
here. But perhaps add documentation that is consistent and
indicates that the functions are limited in scope. Patches
attached.
msg13559 - Author: Mike Brown (mike_j_brown) Date: 2004-12-27 07:15
See also #649961, where I propose the same solution.
msg13560 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2004-12-28 00:32
The documentation for urllib states that:

Although the urllib module contains (undocumented) routines
to parse and unparse URL strings, the recommended interface
for URL manipulation is in module urlparse.

So, if you think that the files should also be modified,
change the group of this bug to 2.4. Otherwise it will be
closed as won't fix.
msg13561 - Author: Mike Brown (mike_j_brown) Date: 2004-12-28 05:18
OK. I changed the group to Python 2.4, changed the category
to Documentation, changed the summary, and lowered the priority.

Since there are doc strings for the non-posix versions of
url2pathname() and pathname2url(), please just consider the
patches I created to be just making all of the docs
consistent among each other and consistent with the
module-level docs you pointed out.

Thanks! -Mike
msg13562 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-12-26 22:54
Applied patches in revisions 41816 and 41817.
History
Date User Action Args
2022-04-10 16:05:58 admin set github: 37581
2002-12-07 09:22:07 mike_j_brown create