User:Lbattich/Replace ME (python text replace)
Latest revision as of 16:21, 22 April 2015
Scripts for replacing famous names in a text with my name, for instance the names in Wikipedia entry pages.
guide in steps
step 1
Compile a list of names into a plain text file, one name per line, like this:
Vito Acconci
Bas Jan Ader
Vikky Alexander
Roy Ascott
Marina Abramović
Billy Apple
Shusaku Arakawa
Christopher D'Arcangelo
Michael Asher
Mireille Astore
I took the names from Wikipedia lists like this one and this one.
Important! Make sure the list is clean and contains no special characters such as parentheses: ().
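One way to tidy a raw list before step 2 is sketched below. It is not part of the original scripts: it assumes the unwanted material is parenthetical remarks and stray whitespace, and the function name clean_names and the sample entries are made up for illustration.

```python
import re

def clean_names(lines):
    """Return the names with parenthetical remarks and extra whitespace removed."""
    cleaned = []
    for line in lines:
        # drop anything in parentheses, e.g. "(some note)"
        line = re.sub(r"\([^)]*\)", "", line)
        # collapse runs of whitespace and trim both ends
        line = " ".join(line.split())
        if line:  # skip lines that end up empty
            cleaned.append(line)
    return cleaned

# hypothetical raw entries, as they might look after copying from a web page
raw = ["Vito Acconci (some note)", "  Bas Jan  Ader ", "", "Roy Ascott"]
for name in clean_names(raw):
    print(name)
```

Writing the printed names to list.txt would give a file in the layout that step 2 expects.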
step 2
Run this Python script in a terminal:
cat list.txt | python break.py > sublist.txt
where list.txt is your source list file from step 1
and break.py contains this script:
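import re, sys

text = sys.stdin.readlines()
for line in text:
    line = line.strip()
    if not line:
        continue  # skip blank lines so no empty search term is written
    # every space ends a first or middle name: pair that name with Lucas
    line = re.sub(r" ", r"\tLucas_", line)
    # the last word is the surname: pair it with Battich
    line = re.sub(r"$", r"\tBattich\n", line)
    # turn the temporary "_" markers into line breaks, one pair per line
    line = re.sub(r"_", r"\n", line)
    sys.stdout.write(line)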
Your end product list file, sublist.txt, now holds one original word and its replacement per line, separated by a tab. Notice that when a name has more than two parts, like Bas Jan Ader, every part before the surname becomes Lucas:
Vito	Lucas
Acconci	Battich
Bas	Lucas
Jan	Lucas
Ader	Battich
Vikky	Lucas
Alexander	Battich
Roy	Lucas
Ascott	Battich
Marina	Lucas
Abramović	Battich
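The same mapping can be tried on a few names without touching any files. The sketch below is not break.py itself: it splits each name into words instead of using regular expressions, and the function name name_to_pairs and its parameters are hypothetical.

```python
def name_to_pairs(name, first="Lucas", last="Battich"):
    """Pair every word of a name with its replacement.

    All words except the final one map to `first`;
    the final word (the surname) maps to `last`.
    """
    words = name.split()
    if not words:
        return []
    pairs = [(word, first) for word in words[:-1]]
    pairs.append((words[-1], last))
    return pairs

for name in ["Vito Acconci", "Bas Jan Ader"]:
    for original, replacement in name_to_pairs(name):
        # same tab-separated layout as sublist.txt
        print(original + "\t" + replacement)
```

Splitting on whitespace also copes with doubled or trailing spaces in the list.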
step 3
Run this:
cat original-wiki.html | python replace.py > new-wiki.html
where original-wiki.html is your source text, in this case an HTML file that is an exact copy of a Wikipedia entry page,
new-wiki.html is your end product,
and replace.py looks like this:
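import re, sys

# Load the substitution list once: each line is "name<TAB>replacement".
pairs = []
for subline in open("sublist.txt"):
    subline = re.split(r"\t", subline.strip())
    if len(subline) != 2 or not subline[0]:
        continue  # skip blank or malformed lines
    # \b matches whole words only; re.escape keeps characters like ( ) literal
    search = re.compile(r"\b{0}\b".format(re.escape(subline[0])))
    pairs.append((search, subline[1]))

for line in sys.stdin:
    for search, replace in pairs:
        line = search.sub(replace, line)
    # each line keeps its own newline, so the HTML layout is preserved
    sys.stdout.write(line)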
Note! The Python script REQUIRES a file called sublist.txt, containing the substitution list, in the directory you run it from. The file cannot have any other name.
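If the fixed file name is a nuisance, the lookup can be wrapped in functions that take the path as an argument. This is only a sketch under assumptions, not the script above: the names load_pairs and replace_names are hypothetical, the file is assumed to be UTF-8, and the sample sentence is invented for the demonstration.

```python
import re

def load_pairs(path):
    """Read tab-separated "original<TAB>replacement" pairs from the file at path."""
    pairs = []
    with open(path, encoding="utf-8") as handle:
        for row in handle:
            parts = row.rstrip("\n").split("\t")
            if len(parts) == 2 and parts[0]:  # skip blank or malformed rows
                pairs.append((parts[0], parts[1]))
    return pairs

def replace_names(text, pairs):
    """Replace each whole-word occurrence of an original with its replacement."""
    for original, replacement in pairs:
        pattern = r"\b" + re.escape(original) + r"\b"
        # passing a function keeps any backslash in the replacement literal
        text = re.sub(pattern, lambda match: replacement, text)
    return text

# in-memory demonstration with a few pairs from sublist.txt
pairs = [("Vito", "Lucas"), ("Acconci", "Battich"),
         ("Bas", "Lucas"), ("Jan", "Lucas"), ("Ader", "Battich")]
sample = "Vito Acconci and Bas Jan Ader were artists."
print(replace_names(sample, pairs))
```

With a real file, the pairs would come from something like load_pairs("my-list.txt"), where the path is whatever you choose to pass in.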
what happened
I took this and it became THIS
I also took this (https://en.wikipedia.org/wiki/Modern_art) and it became THIS (http://lucasbattich.com/tests/modern-lucas.html)
My original list of names is not very comprehensive: it doesn't include theorists, writers, pre-modern artists or, basically, anyone who didn't appear in the small lists I used. So the result is not entirely polished.