Vielleicht Vielleicht Vielleicht

From XPUB & Lens-Based wiki
Latest revision as of 22:00, 18 January 2012

A little script that uses regular expressions to extract all sentences in a file that begin with a certain word (in this case, "vielleicht").

<source lang="python">
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import re

# Read the whole file as Unicode text.
with open("file.txt", encoding="utf-8") as text_file:
    text = text_file.read()

# Print every phrase starting with "ich weiß nicht", up to the next period or comma.
matches = re.findall(r"ich weiß nicht .+?[.,]", text, flags=re.I)
for t in matches:
    print(t, "\n")

# Print only the text that follows "ich weiß", without the closing period or comma.
matches = re.finditer(r"ich weiß (.+?)[.,]", text, flags=re.I)
for m in matches:
    print(m.group(1))
</source>
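
The listing above matches phrases up to the next period or comma rather than whole sentences. As a minimal sketch of the sentence extraction described in the introduction, assuming Python 3 and a naive rule that a sentence ends at ".", "!" or "?" followed by whitespace, something like the following could work. The function name `sentences_starting_with` and the sample text are made up for illustration and are not part of the original script.

```python
import re


def sentences_starting_with(text, word):
    """Return the sentences in `text` whose first word is `word` (case-insensitive)."""
    # Assumption: a sentence ends at ".", "!" or "?" followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # re.escape keeps the word literal; \b avoids matching longer words.
    starts = re.compile(r"^" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    return [s for s in sentences if starts.match(s)]


# Hypothetical sample text, used here instead of reading file.txt.
sample = (
    "Vielleicht morgen. Ich weiß nicht, ob es geht. "
    "vielleicht auch nicht! Das ist vielleicht wahr."
)
found = sentences_starting_with(sample, "vielleicht")
for sentence in found:
    print(sentence)
```

The splitting rule is deliberately simple and will cut sentences at abbreviations that end in a period, so real texts may need a more careful sentence splitter.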