This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Add 'before' and 'after' methods to Strings
Type: enhancement
Stage:
Components: Library (Lib)
Versions:
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: cxdunn, rhettinger
Priority: low
Keywords:

Created on 2005-04-27 01:35 by cxdunn, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (7)
msg54487 - (view) Author: Christopher Dunn (cxdunn) Date: 2005-04-27 01:35
GNU String used to have two very useful methods,
'before' and 'after'. These are so useful I keep them
defined in an __init__.py file. (Unfortunately, I do
not know how to make them methods, instead of global
functions.)

Usage:

>>> "root.sub".before(".")
'root'
>>> "root.sub1.sub2".after("root.sub1")
'.sub2'

They work like s.split(word)[0], and s.split(word)[-1],
but they are so intuitive they ought to be part of the
interface.

I'm not sure whether they should raise exceptions on
failure, or simply return the whole string.

-cxdunn
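
The open question in this message (raise on failure, or return the whole string) can be made concrete with a small sketch. It is not part of the original proposal: it assumes str.partition() is available (it was added to Python after this issue was filed), and the function names and the "strict" parameter are hypothetical, chosen only for illustration.

```python
def before(s, sep, strict=False):
    """Return the part of s before the first occurrence of sep.

    If sep is absent, return s unchanged, or raise ValueError when
    strict is true. (The 'strict' flag is a hypothetical name.)
    """
    head, found, _tail = s.partition(sep)
    if not found:
        if strict:
            raise ValueError("separator not found: %r" % (sep,))
        return s
    return head


def after(s, sep, strict=False):
    """Return the part of s after the first occurrence of sep.

    Same missing-separator policy as before().
    """
    _head, found, tail = s.partition(sep)
    if not found:
        if strict:
            raise ValueError("separator not found: %r" % (sep,))
        return s
    return tail
```

With strict left at its default, both helpers fall back to the whole string, which is one of the two behaviors the message considers; passing strict=True gives the other.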
msg54488 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-04-28 05:15
I'm -1 on expanding the string API for something so easily
coded with existing primitives:

>>> s = "root.sub"
>>> t = "."
>>> s[:s.find(t)]
'root'

>>> s = "root.sub1.sub2"
>>> t = "root.sub1"
>>> s[s.find(t)+len(t):]
'.sub2'
msg54489 - (view) Author: Christopher Dunn (cxdunn) Date: 2005-04-28 06:40
Your examples prove my point:

>>> s = "root.sub"
>>> t = "fubar"
>>> s[:s.find(t)]
'root.su'

>>> s = "root.sub1.sub2"
>>> t = "fubar"
>>> s[s.find(t)+len(t):]
'.sub1.sub2'

string.find() is the wrong way.
I can live with string.split():
>>> s = "root.sub1.sub2"
>>> t = '.'
>>> s.split(t)[0]
'root'
>>> s.split(t)[-1]
'sub2'
>>> t = "fubar"
>>> s.split(t)[0]
'root.sub1.sub2'
>>> s.split(t)[-1]
'root.sub1.sub2'

This is not terrible, but the desired behavior is really
more like strip/rstrip::

def before(s, first):
    """Find first inside string s and return everything before that.

    >>> before('xyz.pdq.abc', '.')
    'xyz'
    >>> before('xyz.pdq.abc', 'fubar')
    'xyz.pdq.abc'
    """
    return s.split(first)[0]

def after(s, first):
    """Find first inside string s and return everything after that.

    >>> after('xyz.pdq.abc', '.')
    'pdq.abc'
    >>> after('xyz.pdq', 'xyz.')
    'pdq'
    >>> after('xyz.pdq.abc', 'fubar')
    ''
    """
    return first.join(s.split(first)[1:])

def rbefore(s, last):
    """Find last inside string s, from the right,
    and return everything before that.

    >>> rbefore('xyz.pdq.abc', '.')
    'xyz.pdq'
    >>> rbefore('xyz.pdq.abc', 'fubar')
    ''
    """
    return last.join(s.split(last)[:-1])

def rafter(s, last):
    """Find last inside string s, from the right,
    and return everything after that.

    >>> rafter('xyz.pdq.abc', '.')
    'abc'
    >>> rafter('xyz.pdq.abc', 'fubar')
    'xyz.pdq.abc'
    """
    return s.split(last)[-1]

It's a question of elegance. These are very useful,
intuitive functions, and I cannot add them to string myself.
And as you've seen, it's easy to create bugs when you try to
do this on the fly.

Reconsider? If not, I'll just post it in the Cookbook, to
point out the dangers of relying on string.find.
msg54490 - (view) Author: Christopher Dunn (cxdunn) Date: 2005-04-28 06:50
Your examples prove my point::

>>> s = "Monty.Python"
>>> t = "fubar"
>>> s[:s.find(t)]
'Monty.Pytho'
>>> s[s.find(t)+len(t):]
'y.Python'

Of course, this would work:
>>> s.split(t)[0]
'Monty.Python'
>>> s.split(t)[-1]
'Monty.Python'

That is not terrible, but the behavior I want is actually
more like strip()/rstrip()::

def before(s, first):
    """Find first inside string s and return everything before that.

    >>> before('xyz.pdq.abc', '.')
    'xyz'
    >>> before('xyz.pdq.abc', 'fubar')
    'xyz.pdq.abc'
    """
    return s.split(first)[0]

def after(s, first):
    """Find first inside string s and return everything after that.

    >>> after('xyz.pdq.abc', '.')
    'pdq.abc'
    >>> after('xyz.pdq', 'xyz.')
    'pdq'
    >>> after('xyz.pdq.abc', 'fubar')
    ''
    """
    return first.join(s.split(first)[1:])

def rbefore(s, last):
    """Find last inside string s, from the right,
    and return everything before that.

    >>> rbefore('xyz.pdq.abc', '.')
    'xyz.pdq'
    >>> rbefore('xyz.pdq.abc', 'fubar')
    ''
    """
    return last.join(s.split(last)[:-1])

def rafter(s, last):
    """Find last inside string s, from the right,
    and return everything after that.

    >>> rafter('xyz.pdq.abc', '.')
    'abc'
    >>> rafter('xyz.pdq.abc', 'fubar')
    'xyz.pdq.abc'
    """
    return s.split(last)[-1]

Besides, it's a question of elegance. These are very useful little
functions, which look wonderful as methods of string, and the
on-the-fly solutions are prone to error. Reconsider?

If not, I'll just post it to the Cookbook (without your name -- I'm
not trying to embarrass anyone) to point out the danger of relying on
string.find().

-cxdunn
msg54491 - (view) Author: Christopher Dunn (cxdunn) Date: 2005-04-28 06:54
Sorry for the double-post. I thought I'd lost it and
re-typed the whole thing. Read the top one only -- less buggy. 
msg54492 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-04-28 06:58
You read too much into a simplified example.  Test the
find() result for -1 and be done with it.  Go ahead with a
cookbook recipe if you think that no one else is bright
enough to write their own.
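
For readers following the thread, the "test the find() result for -1" advice can be sketched as below. The helper names are hypothetical, and the choice of what to return when the substring is missing (the whole string for the "before" case, an empty string for the "after" case) is an assumption copied from the doctests posted earlier in this issue, not something this message specifies.

```python
def before_find(s, t):
    """Everything before the first occurrence of t, or s if t is absent."""
    i = s.find(t)
    if i == -1:
        # Without this check, s[:-1] would silently drop the last character.
        return s
    return s[:i]


def after_find(s, t):
    """Everything after the first occurrence of t, or '' if t is absent."""
    i = s.find(t)
    if i == -1:
        # Without this check, s[-1 + len(t):] would return an arbitrary tail.
        return ""
    return s[i + len(t):]
```

The comments mark the failure mode shown in the earlier replies: slicing directly with the result of find() treats -1 as a valid index rather than as "not found".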
msg54493 - (view) Author: Christopher Dunn (cxdunn) Date: 2005-04-28 08:31
I guess you're right.

My goal here is to move my company from Tcl to Python, and
there are surely more important inducements than an expanded
string class.

I will try to find out what "I'm -1" means.

For what it's worth, here is a reference to the old libg++
GNU String library:

http://www.math.utah.edu/docs/info/libg++_19.html

z = x.before("o")
    sets z to the part of x to the left of the first occurrence of
    "o", or "Hell" in this case. The argument may also be a String,
    SubString, or Regex. (If there is no match, z is set to "".)
x.before("ll") = "Bri";
    sets the part of x to the left of "ll" to "Bri", setting x to
    "Brillo".
z = x.before(2)
    sets z to the part of x to the left of x[2], or "He" in this case.
z = x.after("Hel")
    sets z to the part of x to the right of "Hel", or "lo" in this
    case.
z = x.through("el")
    sets z to the part of x up and including "el", or "Hel" in this
    case.
z = x.from("el")
    sets z to the part of x from "el" to the end, or "ello" in this
    case.
x.after("Hel") = "p";
    sets x to "Help";
z = x.after(3)
    sets z to the part of x to the right of x[3] or "o" in this case.
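
The read-only forms in the libg++ excerpt map onto Python slicing roughly as sketched below. This is an illustration only: the function names (including from_, since "from" is a Python keyword) are hypothetical, the value x = "Hello" is inferred from the results quoted in the excerpt, and the empty-string result on no match follows the excerpt's note for before() and is assumed for the other three. The assignment forms (x.before("ll") = "Bri") have no direct counterpart because Python strings are immutable.

```python
def before(s, t):
    """Part of s to the left of the first occurrence of t ('' if absent)."""
    i = s.find(t)
    return "" if i == -1 else s[:i]


def after(s, t):
    """Part of s to the right of the first occurrence of t ('' if absent)."""
    i = s.find(t)
    return "" if i == -1 else s[i + len(t):]


def through(s, t):
    """Part of s up to and including the first occurrence of t."""
    i = s.find(t)
    return "" if i == -1 else s[:i + len(t)]


def from_(s, t):
    """Part of s from the first occurrence of t to the end."""
    i = s.find(t)
    return "" if i == -1 else s[i:]


x = "Hello"  # assumed value, inferred from the quoted results
results = [
    before(x, "o"),     # libg++: x.before("o")
    after(x, "Hel"),    # libg++: x.after("Hel")
    through(x, "el"),   # libg++: x.through("el")
    from_(x, "el"),     # libg++: x.from("el")
    x[:2],              # libg++: x.before(2)
    x[3 + 1:],          # libg++: x.after(3)
]
```

The integer-index forms need no helper at all, since they are plain slices.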
History
Date                 User    Action  Args
2022-04-11 14:56:11  admin   set     github: 41911
2005-04-27 01:35:57  cxdunn  create