
classification
Title: Adding rsplit() to string and unicode objects.
Type: enhancement
Stage:
Components: Interpreter Core
Versions:
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: bob.ippolito, gvanrossum, hyeshik.chang, jafo, jemfinch, loewis, rhettinger
Priority: low
Keywords:

Created on 2003-09-07 00:52 by jafo, last changed 2022-04-10 16:11 by admin. This issue is now closed.

Files
File name Uploaded Description
Python-2.3-rsplit.diff jafo, 2003-09-07 00:52 Code patch against 2.3 release.
Python-CVS-docs-rsplit.diff jafo, 2003-09-07 00:55 Patch against CVS for the documentation.
Messages (24)
msg53994 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-07 00:52
I'm attaching patches to the library and documentation
for implementing rsplit() on string and unicode
objects.  It works like split(), but splits from the
right.

   ./python -c 'print u"foo, bar, baz".rsplit(None, 1)'
   [u'foo, bar,', u'baz']

This was supposed to be against the CVS code, but I've
had a heck of a time getting it checked out -- my
checkout has been hung for half an hour now.

The code patch is against the 2.3 release, the docs
patch is against the CVS.  My checkout got to docs, but
I didn't have the code to a point where I could build
and test it.

Sean
msg53995 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-09-07 19:49
Why is this function useful?
msg53996 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-08 00:56
Can you provide more details about why the usefulness of
this function is in question?

First I would like to tell you the story of it coming to be,
then I will answer your incomplete question with a
(probably) incomplete answer.  I had a device which sent me
comma-separated fields, but one of the fields in the middle
could contain a comma.  The answer that seemed obvious to me
was to use split with a maxsplit to get the fields up to
that field, and then a rsplit with a maxsplit on the
remainder.  When I mentioned on #python that I was
implementing rsplit, 4 other fellow python users replied
right away that they had been wanting it.

To answer your question, it's useful because people using
strings are used to having r*() functions like rfind and
rstrip.  The lack of rsplit is kind of glaring in this
context.  Really, though, it's useful because otherwise
people have to implement it themselves, often badly.

Sean
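
The approach described above can be sketched in present-day Python 3 as follows. This is only an illustration: the function name `split_record`, its parameters, and the sample record are assumptions and do not come from the attached patch. It takes a fixed number of fields from the left with split(), then a fixed number from the right with rsplit(), leaving the comma-bearing field in the middle intact.

```python
def split_record(line, fields_before, fields_after, sep=","):
    """Split a record whose one 'free' field may itself contain sep.

    fields_before: number of well-formed fields to the left of the free field.
    fields_after:  number of well-formed fields to the right of it.
    (Hypothetical helper, for illustration only.)
    """
    # Peel the leading fields off from the left.
    head = line.split(sep, fields_before)
    leading, rest = head[:-1], head[-1]
    # Peel the trailing fields off from the right; what is left over
    # at the front of `rest` is the field that may contain separators.
    tail = rest.rsplit(sep, fields_after)
    return leading + tail


print(split_record("id42,hello, world,ok,7", 1, 2))
```

As noted later in the thread, this only works when at most one field can contain the separator.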
msg53997 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-09-10 17:08
I questioned the usefulness because I could not think of a
meaningful application. Now I see what a potential
application could be, but I doubt its generality, because
that approach would break if there could be two fields that
have commas in them.

I also disagree that symmetry can motivate usefulness: I
also doubt that all of the r* functions are useful, but they
cannot be removed for backwards compatibility. The fact that
rsplit would fit together with the other r* functions
indicates that adding rsplit would provide symmetry, not
that it would provide usefulness.
msg53998 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-10 19:15
os.path.basename/os.path.dirname is an example of where you
could use rsplit. One of the other #python folks said he had
recently wanted rsplit for an application where he was
getting the domain name and user part from a list of e-mail
addresses, but found that some entries contained an "@" in
the user part.

Sean
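
A minimal sketch of those two use cases in present-day Python 3, using the rsplit() method as it was eventually accepted. The helper names and sample inputs are hypothetical, and `dir_and_base` is not a replacement for os.path, which handles more edge cases.

```python
def split_address(address):
    """Return (user, domain), splitting on the LAST '@' only."""
    user, domain = address.rsplit("@", 1)
    return user, domain


def dir_and_base(path):
    """Return (directory, basename) for a '/'-separated path."""
    parts = path.rsplit("/", 1)
    if len(parts) == 1:
        # No separator at all: the whole string is the basename.
        return "", parts[0]
    return parts[0], parts[1]


print(split_address("odd@user@example.com"))
print(dir_and_base("usr/lib/python"))
```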
msg53999 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-10 19:35
I would classify this more as a technique than a fundamental 
string operation implemented by all stringlike objects 
(including UserString).  Accordingly, I recommend that the 
patch be closed and a recipe posted in the ASPN cookbook - 
something along the lines of:

>>> def rsplit(s, sep=None, maxsplit=-1):
...     return [chunk[::-1] for chunk in s[::-1].split(sep, maxsplit)[::-1]]

>>> rsplit(u"foo, bar, baz", None, 1)
[u'foo, bar,', u'baz']
msg54000 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-10 20:40
I realize that rsplit() can be implemented, because, well, I
implemented it.

The standard library is there to provide ready-to-use
functionality so that users of python can concentrate on
their program instead of on re-inventing the
wheel.  find() can be implemented with a short loop, split()
can be implemented with find(), join() can be implemented
with a short loop.    Many things can be implemented with a
little additional effort on the part of the user to develop
or locate the code they're wanting.

These little things can add up quickly and can have quite a
dramatic impact on the programming experience in Python. 
Having to find or implement these functions will cause
distraction from the code at hand, time lost while finding,
implementing, testing, and maintaining the code in question.

One of Python's strengths is a rich standard library.  So,
what are the guidelines for determining when it's rich
enough?  Why is it ok to suggest that users should get
distracted from their code to go implement something else? 
Is there a policy that I'm not aware of that new
functionality should be put in the cookbook instead of the
standard library?  Why is it being ignored that some
programmers would find implementing rsplit() challenging?

I'm not trying to be difficult here, I honestly can't
understand the apparent change from having a rich library to
a "batteries not included" stance.  The response I got from
#python when I mentioned having submitted the patch
indicates to me that other experienced Python developers
expect there to be an rsplit().

So, why is there so much resistance to adding something to
the library?  What are the guidelines for determining if
something should be in the library?

Sean
msg54001 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-09-11 06:07
There is PEP 2, which suggests writing a library PEP for
proposals to extend the library. Now, this probably would be
overkill for a single string method. However, I feel that
there are already too many string methods, so I won't accept
that patch. I'm not rejecting it, either, because I see that
other maintainers might have a different opinion. In short,
you should propose your change to python-dev, finding out
what "a majority" of the maintainers thinks; you might also
propose it on python-list, trying to collect reactions from
users. It would then be good to summarize these discussions
here (instead of carrying them out here).
msg54002 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-11 23:55
Guido, do you care to pronounce on this one?
msg54003 - (view) Author: Jeremy Fincher (jemfinch) Date: 2003-09-22 13:10
As a comment on the ease with which a programmer can get rsplit 
wrong, note that rhettinger's rsplit implementation is not correct: 
compare rsplit('foobarbaz', 'bar') with 'foobarbaz'.split('bar'). 
 
He forgot to reverse the separator if it's not None. 
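
A sketch of the reversal recipe with that fix applied, written for present-day Python 3. The name `rsplit_by_reversal` is an assumption chosen to avoid confusion with the built-in method; it is not the code from the attached patch.

```python
def rsplit_by_reversal(s, sep=None, maxsplit=-1):
    """Emulate str.rsplit by reversing, splitting, and reversing back."""
    # The separator must be reversed as well, otherwise a multi-character
    # separator such as 'bar' never matches inside the reversed string.
    # sep=None (split on whitespace) needs no reversal.
    rsep = sep[::-1] if sep is not None else None
    chunks = s[::-1].split(rsep, maxsplit)
    # Undo the reversal of each chunk and of the chunk order.
    return [chunk[::-1] for chunk in reversed(chunks)]


print(rsplit_by_reversal("foobarbaz", "bar"))
print(rsplit_by_reversal("foo, bar, baz", None, 1))
```

Without the separator reversal, the first call would return the input unsplit, which is the discrepancy pointed out above.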
msg54004 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-22 18:17
I'll review your patch when I get a chance.
msg54005 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-29 05:38
This seems to have generated nothing but positive comment
from the folks on python-dev.  Thoughts?
msg54006 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2003-11-24 22:45
I'd have to say me too on this one, wake up please :)
msg54007 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-11-25 02:25
Get Guido to approve the API change and I will be happy to
complete the implementation, documentation, testing, etc.  

Advice:  He will want *compelling* non-toy use cases and
reasons why a person wouldn't just implement it in pure
python (saying that they aren't smart enough is not a
reason).  He is rarely ever persuaded by
symmetry/completeness arguments along the lines of "we have
an l-this so we have to have an r-that".  If that were the
case, tuples would have gotten index() and count() long ago.

Language growth rate is one issue, the total number of
string methods is another, and more importantly he seeks to
minimize the number of small API incompatibilities between
versions which make it difficult to write code for Py2.4
that runs on Py2.2.  

Also, there are a number of strong competitors vying to be
added as string methods.  If we only get one new method, why
is this one to be preferred over str.cook() for supporting
Barry's simplified string substitutions?

Given only one new str API change, I would personally prefer
to add an optional fillchar argument to str.center().
msg54008 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-11-25 16:34
Raymond, you asked Guido about it on September 11, and he
(apparently) explicitly stayed out of the discussion.  I
assumed that you had let him know you wanted his judgement
on this and that his response was that he didn't want to be
involved, leaving it up to the library "elite guard" instead.

Did you actually copy Guido on your earlier request?

Personally, I don't see the logic in "if we get only one
string method".  Python isn't for the Python core
developers, it's for the users.  If the users have several
things that they want added, why the artificial limit on how
many to accept?
msg54009 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-11-25 19:16
It's very easy: find somebody with commit privileges to
approve and commit the change. Failing to do so, write a
library PEP, and ask for pronouncement.
msg54010 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-11-25 19:35
If you are reading this and are interested in having this
functionality in the standard Python library, please step
forward and champion the effort.  Obviously, I believe this
is useful, or I wouldn't have spent the better part of a day
building and testing it.  However, I simply don't have the
time to go through the politics of it.

What needs to be done is to build a stronger case to present
to the Python development team.  See Raymond's earlier
message for a good list of what's needed
there.  Also see the thread on the python developers mailing
list that I started in relation to this back in September. 
I will be happy to help out on this, but I just don't have
the time to champion the adoption process.

Thanks,
Sean
msg54011 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-11-25 20:47
OK, I'm in a generous mood today. I approve the idea. (I'm
not going to review the code, that's up to Raymond and others).

And Raymond can have a fillchar option to center() as well.

I don't know what cook() was supposed to do, but if it's $
substitution, I recommend keeping that in a separate module
for now.
msg54012 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-11-25 21:03
Okay, I've got it from here!
msg54013 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-12-01 14:18
Alex said he would take the patch from here.
msg54014 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2003-12-13 19:45
In my review, I found a few bugs.

>>> u'x\x00y\x00z'.rsplit(u'\x00', 1)
zsh: bus error (core dumped)  ./python

>>> u'abcd'.rsplit(u'abcd')
[u'abcd']

>>> 'a,b,c,d'.rsplit(u',', 2)
[u'a', u'b', u'c,d']

Also, the unit tests in Lib/test/test_strop.py should be moved to 
string_tests.py.

My revision for jafo's patch is available at 
http://people.freebsd.org/~perky/rsplit-perkyrev.diff
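
For comparison, the three inputs above can be checked against the rsplit() behaviour in present-day Python 3. This is a sketch under one assumption: the literals are rewritten as Python 3 str values, since the original report used Python 2 unicode and mixed str/unicode arguments.

```python
# Each entry: (string, separator, maxsplit, result of str.rsplit in Python 3).
cases = [
    ("x\x00y\x00z", "\x00", 1, ["x\x00y", "z"]),   # NUL separator, no crash
    ("abcd", "abcd", -1, ["", ""]),                # separator equals the string
    ("a,b,c,d", ",", 2, ["a,b", "c", "d"]),        # splits counted from the right
]

for text, sep, maxsplit, expected in cases:
    result = text.rsplit(sep, maxsplit)
    assert result == expected, (text, sep, maxsplit, result)
    print(repr(text), "->", result)
```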
msg54015 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-15 02:45
Perky, feel free to check it in!
msg54016 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2003-12-15 18:58
Yo. Just checked in.
I don't have permission to close this entry. Can anybody help?  
:-)
msg54017 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-15 19:38
I could just close it for you, but I'll make it an exercise
for you & Raymond to figure out how to set your developer
perms properly. :-)
History
Date User Action Args
2022-04-10 16:11:03 admin set github: 39195
2003-09-07 00:52:55 jafo create