As I mentioned in a comment at Some more tweaks to my Python script, there are a lot of ways you can use the re module. If you need to match multiple expressions against each line, you can build up a single regular expression that includes all the patterns, and use named groups to tell them apart.
import re

# If you were matching many of these, it would be a good idea
# to make a function that simply fills in '%s>(?P<%s>[^<]+)<'
cpattern = 'total_credit>(?P<credit>[^<]+)<'
opattern = 'os_name>(?P<os>[^<]+)<'
pattern = '(%s)|(%s)' % (cpattern, opattern)
search = re.compile(pattern).search

lines = [
    'blah blah blah total_credit>10< blah blah',
    'hkfhsd klfjhs dfkljsdfsl fds',
    'hkashflksd os_name>win< hhkjhdflksj d',
    'hkfhsd klfjhs dfkljsdfsl fds',
    'blah blah blah total_credit>20< blah blah',
]

for line in lines:
    r = search(line)
    if r:
        # Groups from the alternative that did not match come back as None
        print(r.groupdict())
Running this gives
{'credit': '10', 'os': None}
{'credit': None, 'os': 'win'}
{'credit': '20', 'os': None}
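The comment in the script suggests a function that fills in the '%s>(?P<%s>[^<]+)<' template. Here is a minimal sketch of what that might look like. The names tag_pattern and build_search are made up for illustration, and the use of re.escape is my own addition in case a tag name ever contains regex metacharacters.

```python
import re

def tag_pattern(tag, group):
    # Hypothetical helper: fill in the template tag>(?P<group>[^<]+)<
    return '%s>(?P<%s>[^<]+)<' % (re.escape(tag), group)

def build_search(tags):
    # tags is a list of (tag, group_name) pairs; each becomes one
    # alternative in a single combined regular expression.
    pattern = '|'.join('(?:%s)' % tag_pattern(t, g) for t, g in tags)
    return re.compile(pattern).search

search = build_search([('total_credit', 'credit'), ('os_name', 'os')])

r = search('blah blah blah total_credit>10< blah blah')
if r:
    print(r.groupdict())
```

Adding another tag is then just one more pair in the list, as long as each group name is unique within the combined pattern.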
In this case you could even generalize the regular expression further, like so:
pattern = r'\s(?P<key>[^\s>]+)>(?P<value>[^<]+)<'
Running that (probably less than optimal) regular expression over the input gives
{'key': 'total_credit', 'value': '10'}
{'key': 'os_name', 'value': 'win'}
{'key': 'total_credit', 'value': '20'}
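For completeness, here is one way the generalized pattern could be run over the same sample lines. This is a sketch rather than the only approach: it uses finditer instead of search, on the assumption that a line might contain more than one key>value< pair, and the records list is just an illustrative name.

```python
import re

# Raw string so that \s reaches the regex engine untouched
pattern = re.compile(r'\s(?P<key>[^\s>]+)>(?P<value>[^<]+)<')

lines = [
    'blah blah blah total_credit>10< blah blah',
    'hkfhsd klfjhs dfkljsdfsl fds',
    'hkashflksd os_name>win< hhkjhdflksj d',
    'hkfhsd klfjhs dfkljsdfsl fds',
    'blah blah blah total_credit>20< blah blah',
]

records = []
for line in lines:
    # finditer yields every non-overlapping match in the line
    for m in pattern.finditer(line):
        records.append((m.group('key'), m.group('value')))

print(records)
```

Note that the pattern requires whitespace before the key, so a key at the very start of a line would not be matched.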