Vielleicht Vielleicht Vielleicht
Latest revision as of 22:00, 18 January 2012, by Marie Wocher
A small script that uses regular expressions to extract all sentences in a file that begin with a certain word (in this case, "vielleicht").
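As a rough illustration of the idea described above, here is a minimal sketch of matching whole sentences by their first word. The function name `sentences_starting_with`, the sample text, and the sentence rule are assumptions made for this example and are not part of the original script. A sentence is taken to start at the beginning of the text or after `.`, `!` or `?` followed by whitespace, and to end at the next `.`, `!` or `?`.

```python
import re


def sentences_starting_with(text, word):
    """Return the sentences in `text` whose first word is `word` (case-insensitive)."""
    # Hypothetical sentence rule: the match must sit at the very start of the
    # text or directly after sentence-ending punctuation plus whitespace.
    pattern = re.compile(
        r"(?:^|(?<=[.!?])\s+)(" + re.escape(word) + r"\b[^.!?]*[.!?])",
        flags=re.IGNORECASE,
    )
    return [m.group(1) for m in pattern.finditer(text)]


# Made-up sample text, used only to show the call.
sample = (
    "Vielleicht komme ich morgen. Ich weiß nicht, ob es regnet. "
    "vielleicht bleibt er zu Hause! Das ist vielleicht schön."
)
print(sentences_starting_with(sample, "vielleicht"))
```

The last sample sentence is not returned, because "vielleicht" appears in the middle of it rather than at the start. Abbreviations such as "z. B." would split sentences too early under this simple rule.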
<source lang="python">
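#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re

# Read the whole input file as Unicode text.
with open("file.txt", encoding="utf-8") as text_file:
    text = text_file.read()

# Print every phrase that starts with "ich weiß nicht" and runs to the next full stop or comma.
matches = re.findall(r"ich weiß nicht .+?[.,]", text, flags=re.I)
for t in matches:
    print(t, "\n")

# Print only what follows "ich weiß", without the closing punctuation.
matches = re.finditer(r"ich weiß (.+?)[.,]", text, flags=re.I)
for m in matches:
    print(m.group(1))
</source>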