parsing – sscanf in Python
There is also the parse module. parse() is designed to be the opposite of format() (the newer string formatting function in Python 2.6 and higher).
>>> from parse import parse
>>> parse('{} fish', '1')
>>> parse('{} fish', '1 fish')
<Result ('1',) {}>
>>> parse('{} fish', '2 fish')
<Result ('2',) {}>
>>> parse('{} fish', 'red fish')
<Result ('red',) {}>
>>> parse('{} fish', 'blue fish')
<Result ('blue',) {}>
When I'm in a C mood, I usually use zip and list comprehensions for scanf-like behavior. Like this:
# strtobool is a small converter defined at the end of this answer
input = '1 3.0 false hello'
(a, b, c, d) = [t(s) for t, s in zip((int, float, strtobool, str), input.split())]
print(a, b, c, d)
Note that for more complex format strings, you do need to use regular expressions:
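import re
input = '1:3.0 false,hello'
# One capture group per field; the separators ':', ' ' and ',' are matched literally
match = re.search(r'^(\d+):([\d.]+) (\w+),(\w+)$', input)
(a, b, c, d) = [t(s) for t, s in zip((int, float, strtobool, str), match.groups())]
print(a, b, c, d)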
Note also that you need conversion functions for all types you want to convert. For example, above I used something like:
strtobool = lambda s: {'true': True, 'false': False}[s]
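If you do this a lot, the two snippets above can be folded into one small helper. This is only a sketch: scan_fields and its parameters are names I made up for illustration, not part of any library, and strtobool here only understands the lowercase words true and false.

```python
import re

# Hypothetical converter: only the lowercase words 'true' and 'false' are accepted.
def strtobool(s):
    return {'true': True, 'false': False}[s]

def scan_fields(text, converters, pattern=None):
    """Split text into fields and convert each one with the matching callable.

    Without a pattern the text is split on whitespace; with a pattern,
    the regex capture groups are used as the fields.
    """
    if pattern is None:
        fields = text.split()
    else:
        match = re.search(pattern, text)
        if match is None:
            raise ValueError('input does not match pattern: %r' % text)
        fields = match.groups()
    if len(fields) != len(converters):
        raise ValueError('expected %d fields, got %d'
                         % (len(converters), len(fields)))
    return [convert(field) for convert, field in zip(converters, fields)]

types = (int, float, strtobool, str)
simple = scan_fields('1 3.0 false hello', types)
complex_ = scan_fields('1:3.0 false,hello', types,
                       r'^(\d+):([\d.]+) (\w+),(\w+)$')
print(simple)
print(complex_)
```

Both calls give the same list of converted values. Unlike the bare list comprehension, a field-count mismatch raises an error instead of being silently truncated by zip.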
Python doesn't have an sscanf equivalent built-in, and most of the time it actually makes a lot more sense to parse the input by working with the string directly, using regexps, or using a parsing tool.
People have implemented sscanf for Python, which is probably most useful when translating C code. One example is this module: http://hkn.eecs.berkeley.edu/~dyoo/python/scanf/
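To give an idea of how such an implementation can work, here is a minimal sketch that turns a format string into a regular expression. It is not the code of the module linked above; the function name sscanf, the _SPECS table, and the choice to support only %d, %f and %s are my own assumptions, and real sscanf handles far more (widths, %c, %x, whitespace skipping before numbers, and so on).

```python
import re

# Map a few C conversion specifiers to (regex fragment, Python converter).
# Only %d, %f and %s are handled; this is a small subset of real sscanf.
_SPECS = {
    'd': (r'([-+]?\d+)', int),
    'f': (r'([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)', float),
    's': (r'(\S+)', str),
}

def sscanf(text, fmt):
    """Return a tuple of converted values, or None if text does not match."""
    parts = []
    converters = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == '%' and i + 1 < len(fmt) and fmt[i + 1] in _SPECS:
            regex, convert = _SPECS[fmt[i + 1]]
            parts.append(regex)
            converters.append(convert)
            i += 2
        elif ch.isspace():
            # As in C, whitespace in the format matches any amount of whitespace.
            parts.append(r'\s*')
            i += 1
        else:
            # Every other character must appear literally in the input.
            parts.append(re.escape(ch))
            i += 1
    match = re.match(''.join(parts), text)
    if match is None:
        return None
    return tuple(c(g) for c, g in zip(converters, match.groups()))

result = sscanf('12 apples at 0.5', '%d %s at %f')
print(result)
```

As in C, %s stops at whitespace only, so punctuation such as a comma ends up inside the string field unless the format is written around it.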
In this particular case, if you just want to split the data based on multiple split characters, re.split is really the right tool.
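A quick sketch of what that looks like. The input line and the set of separators (colon, comma, space) are assumed here for illustration:

```python
import re

# Hypothetical input that mixes ':', ',' and spaces as separators.
line = '1:3.0 false,hello'

# A character class lists every separator; '+' collapses runs of them.
fields = re.split(r'[:, ]+', line)
print(fields)  # ['1', '3.0', 'false', 'hello']
```

The fields come back as strings, so you still need to convert them (int, float, and so on) afterwards.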