parsing – sscanf in Python

There is also the parse module.

parse() is designed to be the opposite of format() (the newer string formatting function in Python 2.6 and higher).

>>> from parse import parse
>>> parse('{} fish', '1')
>>> parse('{} fish', '1 fish')
<Result ('1',) {}>
>>> parse('{} fish', '2 fish')
<Result ('2',) {}>
>>> parse('{} fish', 'red fish')
<Result ('red',) {}>
>>> parse('{} fish', 'blue fish')
<Result ('blue',) {}>

When I'm in a C mood, I usually use zip and list comprehensions for scanf-like behavior. Like this:

# strtobool is the small conversion helper defined further down
input = '1 3.0 false hello'
(a, b, c, d) = [t(s) for t, s in zip((int, float, strtobool, str), input.split())]
print(a, b, c, d)

Note that for more complex format strings, you do need to use regular expressions:

import re
# strtobool is the small conversion helper defined further down
input = '1:3.0 false,hello'
(a, b, c, d) = [t(s) for t, s in zip((int, float, strtobool, str), re.search(r'^(\d+):([\d.]+) (\w+),(\w+)$', input).groups())]
print(a, b, c, d)

Note also that you need conversion functions for all types you want to convert. For example, above I used something like:

strtobool = lambda s: {'true': True, 'false': False}[s]
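The pieces above (a regular expression, a tuple of converters, and zip) can be folded into one small helper. This is only a sketch: the name scan and its argument order are made up for illustration and are not part of any library, and returning None on a failed match is one possible choice.

```python
import re

# Hypothetical converter for the words "true" and "false";
# any other word raises KeyError.
def strtobool(s):
    return {'true': True, 'false': False}[s]

# Hypothetical helper: match `text` against `pattern`, then run each
# captured group through the converter in the same position.
def scan(pattern, converters, text):
    match = re.search(pattern, text)
    if match is None:
        return None  # no match, so there is nothing to convert
    return tuple(conv(group) for conv, group in zip(converters, match.groups()))

a, b, c, d = scan(r'^(\d+):([\d.]+) (\w+),(\w+)$',
                  (int, float, strtobool, str),
                  '1:3.0 false,hello')
print(a, b, c, d)  # 1 3.0 False hello
```

Checking for None before unpacking avoids the AttributeError that a bare re.search(...).groups() raises when the input does not match.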

Python doesn't have a built-in sscanf equivalent, and most of the time it makes more sense to parse the input by working with the string directly, using regexps, or using a parsing tool.

People have also implemented sscanf for Python, which is probably most useful when translating C code. One such module: http://hkn.eecs.berkeley.edu/~dyoo/python/scanf/

In this particular case if you just want to split the data based on multiple split characters, re.split is really the right tool.
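As a rough illustration of splitting on several delimiters at once, here is a minimal sketch. The data from the original question is not shown here, so the sample string and the choice of separators (colon, comma, space) are assumptions.

```python
import re

# Sample input, assumed for illustration.
line = '1:3.0 false,hello'

# A character class splits on any one of the listed separators.
fields = re.split(r'[:, ]', line)
print(fields)  # ['1', '3.0', 'false', 'hello']
```

The fields come back as strings, so any type conversion still has to be applied afterwards.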
