Regular expressions

From XPUB & Lens-Based wiki

Revision as of 22:28, 5 December 2013

Load some text from a file

Imagine you have some text, say from a text file:

text = open("pg105.txt").read()
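The one-liner above leaves the file open until Python gets around to closing it. A minimal sketch of a slightly more careful version follows; the helper name `load_text` and the UTF-8 default are assumptions for illustration, not something the original page specifies, and the actual encoding of pg105.txt may differ.

```python
def load_text(path, encoding="utf-8"):
    """Read a whole text file and return its contents as one string.

    `load_text` is a hypothetical helper; the encoding default is an
    assumption and may need changing for a particular file.
    """
    # The with-block closes the file even if reading fails.
    with open(path, encoding=encoding) as handle:
        return handle.read()

# Usage (assumes pg105.txt sits next to the script):
# text = load_text("pg105.txt")
```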

Finding a pattern in a loop

import re
# Print every occurrence of "the" followed by one word
for match in re.findall(r"the \w+", text):
    print(match)
   the use
   the terms
   the Project
   the Baronetage
   the limited
   the earliest
   the almost
   the last
   the page
   the favourite
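One thing to watch with the pattern above: it also matches "the" at the end of a longer word such as "bathe". A small sketch of the difference, using a made-up sample sentence (not taken from pg105.txt) and the word-boundary anchor `\b`:

```python
import re

# Made-up sample sentence, used here only for illustration.
sample = "We bathe in the river and breathe in the air."

# Without a word boundary, the tail of "bathe" and "breathe" also matches.
loose = re.findall(r"the \w+", sample)
print(loose)   # ['the in', 'the river', 'the in', 'the air']

# \b requires "the" to start at a word boundary.
strict = re.findall(r"\bthe \w+", sample)
print(strict)  # ['the river', 'the air']
```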
# With one group in the pattern, findall returns only the captured word
for match in re.findall(r"the (\w+)", text):
    print(match)
   use
   terms
   Project
   Baronetage
   limited
   earliest
   almost
   last
   page
   favourite
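Because a single group makes `re.findall` return plain strings, the result can be handed straight to other tools. As a rough sketch, here is one way to count which words follow "the", using `collections.Counter` and a made-up sample string in place of the book text:

```python
import re
from collections import Counter

# Made-up sample string standing in for the book text.
sample = "the use of the terms and the use of the page"

# One group in the pattern: findall returns a list of the captured words.
words = re.findall(r"\bthe (\w+)", sample)
print(words)                  # ['use', 'terms', 'use', 'page']

# Count how often each word follows "the".
counts = Counter(words)
print(counts.most_common(1))  # [('use', 2)]
```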


# With two groups, each match is a tuple: (word before, word after)
for match in re.findall(r"(\w+) the (\w+)", text):
    print(match)
   ('for', 'use')
   ('under', 'terms')
   ('of', 'Project')
   ('but', 'Baronetage')
   ('contemplating', 'limited')
   ('of', 'earliest')
   ('over', 'almost')
   ('of', 'last')
   ('was', 'page')
   ('which', 'favourite')
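Since each match is a tuple, it can be unpacked directly in the loop header. A short sketch with a made-up sample string; the variable names `before` and `after` are chosen here for illustration:

```python
import re

# Made-up sample string standing in for the book text.
sample = "for the use of anyone under the terms of the Project"

pairs = re.findall(r"(\w+) the (\w+)", sample)

# Unpack each (word before, word after) tuple into two names.
for before, after in pairs:
    print(before, "->", after)
# for -> use
# under -> terms
# of -> Project
```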

Search & Replace with .sub

print(re.sub(r"the (\w+)", r"the ONLY \1", text))
   The Project Gutenberg EBook of Persuasion, by Jane Austen
   This eBook is for the ONLY use of anyone anywhere at no cost and with
   almost no restrictions whatsoever.  You may copy it, give it away or
   re-use it under the ONLY terms of the ONLY Project Gutenberg License included
   with this eBook or online at www.gutenberg.net
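Note that "The Project Gutenberg EBook" in the first line of the output is left alone, because the pattern is case-sensitive. A possible variation, sketched on a made-up sample sentence, captures the article itself and passes `re.IGNORECASE` so that the original capitalisation is kept in the replacement:

```python
import re

# Made-up sample sentence, used here only for illustration.
sample = "The Project is for the use of anyone."

# Group 1 keeps the article as written ("The" or "the"),
# group 2 is the word that follows it.
result = re.sub(r"\b(the) (\w+)", r"\1 ONLY \2", sample, flags=re.IGNORECASE)
print(result)  # The ONLY Project is for the ONLY use of anyone.
```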