Regular expressions
Revision as of 22:28, 5 December 2013 by Michael Murtaugh
Load some text from a file
Imagine you have some text, say from a text file:
text = open("pg105.txt").read()
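The one-liner above leaves closing the file to the interpreter. A slightly more careful sketch follows, assuming the file is UTF-8 encoded (the page does not say) and using a hypothetical helper named load_text. It writes a small temporary file first so that it runs without pg105.txt being present.

```python
import io
import os
import tempfile


def load_text(path, encoding="utf-8"):
    """Return the whole file at `path` as one string (hypothetical helper)."""
    # The with-block closes the file even if reading fails.
    with io.open(path, encoding=encoding) as f:
        return f.read()


# Demonstration on a small temporary file (stand-in for pg105.txt).
sample = u"This eBook is for the use of anyone anywhere."
fd, tmp_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(tmp_path, "w", encoding="utf-8") as f:
    f.write(sample)

text_demo = load_text(tmp_path)
os.remove(tmp_path)
print(text_demo)
```

With the real file, the call would simply be load_text("pg105.txt").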
Finding a pattern in a loop
import re

for match in re.findall(r"the \w+", text):
    print(match)
the use
the terms
the Project
the Baronetage
the limited
the earliest
the almost
the last
the page
the favourite
for match in re.findall(r"the (\w+)", text):
    print(match)
use
terms
Project
Baronetage
limited
earliest
almost
last
page
favourite
for match in re.findall(r"(\w+) the (\w+)", text):
    print(match)
('for', 'use')
('under', 'terms')
('of', 'Project')
('but', 'Baronetage')
('contemplating', 'limited')
('of', 'earliest')
('over', 'almost')
('of', 'last')
('was', 'page')
('which', 'favourite')
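The three loops above differ only in the number of parenthesised groups, and that is what changes the shape of what re.findall returns. A minimal sketch on a short made-up sentence (the sample string is an assumption, not taken from pg105.txt):

```python
import re

sample = "for the use of anyone, under the terms of the Project"

# No groups: each item is the whole matched string.
whole = re.findall(r"the \w+", sample)

# One group: each item is just the text captured by that group.
one_group = re.findall(r"the (\w+)", sample)

# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall(r"(\w+) the (\w+)", sample)

print(whole)
print(one_group)
print(two_groups)
```

Matches never overlap: after one match ends, the search resumes from that point, which is why a word consumed as the second group of one match cannot be the first group of the next.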
Search & Replace with .sub
print(re.sub(r"the (\w+)", r"the ONLY \1", text))
The Project Gutenberg EBook of Persuasion, by Jane Austen
This eBook is for the ONLY use of anyone anywhere at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the ONLY terms of the ONLY Project Gutenberg License included with this eBook or online at www.gutenberg.net
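In the output above, the capitalised "The" on the first line is left alone because the pattern is case-sensitive. The same pattern can also match the letters "the" at the end of a longer word. One possible refinement, shown here on a made-up sentence (an assumption, not text from the book), is to add a word boundary \b and the re.IGNORECASE flag:

```python
import re

sample = "The reader may breathe easily: the book is free."

# Pattern from the page: case-sensitive, and "the" may be the tail of a longer word.
plain = re.sub(r"the (\w+)", r"the ONLY \1", sample)

# \b anchors "the" at a word boundary; re.IGNORECASE also matches "The".
# Group 1 keeps the article exactly as it was written in the text.
pattern = re.compile(r"\b(the) (\w+)", re.IGNORECASE)
bounded = pattern.sub(r"\1 ONLY \2", sample)

print(plain)    # The reader may breathe ONLY easily: the ONLY book is free.
print(bounded)  # The ONLY reader may breathe easily: the ONLY book is free.
```

Whether the stricter pattern is preferable depends on the effect wanted; the original pattern may be perfectly adequate for playful text substitution.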