How can I split and parse a string in Python?
"2.7.0_bf4fda703454".split("_")
gives a list of strings:
In [1]: "2.7.0_bf4fda703454".split("_")
Out[1]: ['2.7.0', 'bf4fda703454']
This splits the string at every underscore. If you want it to stop after the first split, use "2.7.0_bf4fda703454".split("_", 1).
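To see what the second argument (maxsplit) actually changes, here is a minimal Python 3 sketch. The sample string with two underscores is made up for illustration and is not from the question.

```python
# Hypothetical sample value containing two underscores.
tag = "2.7.0_bf4fda703454_dirty"

# No limit: split at every underscore.
all_parts = tag.split("_")

# maxsplit=1: split only at the first underscore; the rest stays intact.
first_only = tag.split("_", 1)

print(all_parts)   # ['2.7.0', 'bf4fda703454', 'dirty']
print(first_only)  # ['2.7.0', 'bf4fda703454_dirty']
```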
If you know for a fact that the string contains an underscore, you can even unpack the LHS and RHS into separate variables:
In [8]: lhs, rhs = "2.7.0_bf4fda703454".split("_", 1)
In [9]: lhs
Out[9]: '2.7.0'
In [10]: rhs
Out[10]: 'bf4fda703454'
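If the underscore might be missing, the unpacking raises ValueError because split() returns a one-element list. A rough Python 3 sketch of guarding against that follows; the helper name split_once is hypothetical, not a built-in.

```python
def split_once(text, sep="_"):
    """Return (lhs, rhs), or None if sep does not occur in text."""
    try:
        lhs, rhs = text.split(sep, 1)
    except ValueError:
        # split() gave back a single-element list, so unpacking
        # into two names failed.
        return None
    return lhs, rhs


print(split_once("2.7.0_bf4fda703454"))  # ('2.7.0', 'bf4fda703454')
print(split_once("shazam"))              # None
```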
An alternative is to use partition(). The usage is similar to the last example, except that it returns three components instead of two. The principal advantage is that this method doesn't fail if the string doesn't contain the separator.
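As an illustration of why that matters, here is a small Python 3 sketch that uses partition() to parse a string that may or may not contain the underscore. The function name parse_build and the labels "version" and "commit" are my own assumptions about what the two halves mean.

```python
def parse_build(tag):
    """Split 'version_commit' into (version, commit); commit is None if absent."""
    version, sep, commit = tag.partition("_")
    # sep is '' when no underscore was found, so it doubles as a "found" flag.
    return version, (commit if sep else None)


print(parse_build("2.7.0_bf4fda703454"))  # ('2.7.0', 'bf4fda703454')
print(parse_build("2.7.0"))               # ('2.7.0', None)
```

Checking the separator rather than the right-hand side keeps a trailing underscore ("2.7.0_") distinguishable from no underscore at all.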
Python string parsing walkthrough
Split a string on space, get a list, show its type, print it out:
$ python
>>> mystring = "What does the fox say?"
>>> mylist = mystring.split(" ")
>>> print type(mylist)
<type 'list'>
>>> print mylist
['What', 'does', 'the', 'fox', 'say?']
If you have two delimiters next to each other, an empty string is assumed between them:
$ python
>>> mystring = "its  so   fluffy   im gonna    DIE!!!"
>>> print mystring.split(" ")
['its', '', 'so', '', '', 'fluffy', '', '', 'im', 'gonna', '', '', '', 'DIE!!!']
Split a string on underscore and grab the 5th item in the list:
$ python
>>> mystring = "Time_to_fire_up_Kowalskis_Nuclear_reactor."
>>> mystring.split("_")[4]
'Kowalskis'
Collapse multiple spaces into one:
$ python
>>> mystring = "collapse    these       spaces"
>>> mycollapsedstring = " ".join(mystring.split())
>>> print mycollapsedstring.split(" ")
['collapse', 'these', 'spaces']
When you pass no parameter to Python's split method, the documentation states: "runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace."
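The difference between the two calling styles is easy to see side by side. This is a minimal Python 3 sketch; the sample string (with leading and trailing spaces) is made up for the comparison.

```python
# Hypothetical sample: two leading spaces, runs of spaces, one trailing space.
text = "  its  so   fluffy "

# Explicit separator: every single space is a boundary, so empty strings appear.
explicit = text.split(" ")

# No separator: runs of whitespace count as one, and the ends are trimmed.
default = text.split()

print(explicit)  # ['', '', 'its', '', 'so', '', '', 'fluffy', '']
print(default)   # ['its', 'so', 'fluffy']
```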
Hold onto your hats boys, parse on a regular expression:
$ python
>>> mystring = "zzzzzzabczzzzzzdefzzzzzzzzzghizzzzzzzzzzzz"
>>> import re
>>> mylist = re.split("[a-m]+", mystring)
>>> print mylist
['zzzzzz', 'zzzzzz', 'zzzzzzzzz', 'zzzzzzzzzzzz']
The regular expression "[a-m]+" means that one or more consecutive lowercase letters from a through m are matched as a delimiter. re is a standard-library module that has to be imported.
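One thing to watch for: when the delimiter pattern matches at the very start or end of the string, re.split() returns empty strings in those positions. A short Python 3 sketch, using a made-up input that begins and ends with a delimiter run:

```python
import re

# Hypothetical input that starts and ends with letters in the a-m range.
mystring = "abczzzzzzdefzzzghi"

# The delimiter matches at both ends, so empty strings show up there.
raw = re.split(r"[a-m]+", mystring)

# Drop the empty pieces if only the text between delimiters matters.
pieces = [p for p in raw if p]

print(raw)     # ['', 'zzzzzz', 'zzz', '']
print(pieces)  # ['zzzzzz', 'zzz']
```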
Or if you want to chomp the items one at a time:
$ python
>>> mystring = "theres coffee in that nebula"
>>> mytuple = mystring.partition(" ")
>>> print type(mytuple)
<type 'tuple'>
>>> print mytuple
('theres', ' ', 'coffee in that nebula')
>>> print mytuple[0]
theres
>>> print mytuple[2]
coffee in that nebula
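To keep chomping until nothing is left, partition() can be called in a loop on the remainder. This is only a sketch in Python 3; the generator name chomp_words is hypothetical.

```python
def chomp_words(text, sep=" "):
    """Yield the pieces of text one at a time, using repeated partition()."""
    rest = text
    while rest:
        # head is everything before the first sep; rest is everything after it.
        head, _, rest = rest.partition(sep)
        yield head


words = list(chomp_words("theres coffee in that nebula"))
print(words)  # ['theres', 'coffee', 'in', 'that', 'nebula']
```

Because it is a generator, you can stop early (for example after the first two words) without splitting the whole string.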
If it's always going to be an even LHS/RHS split, you can also use the partition method that's built into strings. It returns a 3-tuple as (LHS, separator, RHS) if the separator is found, and (original_string, '', '') if the separator wasn't present:
>>> "2.7.0_bf4fda703454".partition("_")
('2.7.0', '_', 'bf4fda703454')
>>> "shazam".partition("_")
('shazam', '', '')