XML data manipulation

From XPUB & Lens-Based wiki
 
Currently I am just doing a simple manipulation: a regular expression removes the street-type suffixes (such as straat, weg, plein, laan, etc.) from the end of each street name, leaving only the part that precedes them. The modified data is then rendered into an SVG.


<source lang="python">
#! /usr/bin/python
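import re
import lxml.etree
# path to a locally saved OpenStreetMap export
f = "/home/andre/osm/data-original.osm"
# alternative: fetch the data live from the OSM API (Python 2 only, needs "import urllib2")
#f = urllib2.urlopen("http://api.openstreetmap.org/api/0.6/map?bbox=4.4552603823670465,51.91739525301985,4.46384345121436,51.920373035178464")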
# parse the data
doc = lxml.etree.parse(f)
# look for every <tag k="name" .../> element
ways = doc.xpath("//tag[@k='name']")
size = len(ways)
streets = []
# raw string so that \b is a word boundary (not a backspace character);
# the outer group makes \b apply to every suffix, not only to "spoor"
suffix_re = re.compile(r"(straat|weg|plein|laan|singel|steeg|boulevard|kanaal|hof|kade|dijk|haven|markt|dreef|pad|werf|erf|wal|burg|burgh|burcht|spoor)\b")
for t in ways:
    straat = t.get('v')
    straat_re = suffix_re.sub("", straat)
    # not handled yet: +tje
    # write the shortened name back into the XML document
    t.set('v', straat_re)
    streets.append(straat_re)
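# serialise the modified document; with an encoding given, tostring() returns bytes
text = lxml.etree.tostring(doc, encoding="utf-8", xml_declaration=True)
#print text
# write in binary mode and close the file when done
with open("/home/andre/osm/data.osm", "wb") as n:
    n.write(text)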
</source>
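
To check the regular expression without loading an OSM file, the suffix removal can be tried on a few names by hand. This is only a minimal sketch: the helper name `strip_suffix`, the shortened suffix list and the sample names are assumptions made for illustration, not part of the script above. It assumes a suffix should be removed only when it ends a word, which is what the `\b` is there for.

```python
import re

# Hypothetical, shortened suffix list (the script above uses a longer one)
SUFFIXES = ["straat", "weg", "plein", "laan", "singel", "kade"]

# Raw string: \b is a regex word boundary; the non-capturing group
# makes the boundary apply to every suffix in the list
SUFFIX_RE = re.compile(r"(?:%s)\b" % "|".join(SUFFIXES))


def strip_suffix(name):
    """Return name with a word-final street-type suffix removed."""
    return SUFFIX_RE.sub("", name)


# Illustrative sample names, not taken from the OSM export
samples = ["Witte de Withstraat", "Coolsingel", "Boerenwegje"]
stripped = [strip_suffix(s) for s in samples]
```

In this sketch a name ending in "wegje" is left untouched, because "weg" is followed by more letters and so is not at a word boundary; handling such diminutives would need an extra rule.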

Latest revision as of 19:19, 23 November 2011