XML data manipulation

From XPUB & Lens-Based wiki
Revision as of 18:01, 23 November 2011 by Andrecastro

Currently I am just doing a simple manipulation: a regular expression removes the suffixes (such as straat, weg, plein, laan, etc.) from the street names, leaving only the part that precedes them. The data is then rendered into an SVG.
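A minimal sketch of the suffix removal on its own, assuming a shortened suffix list and a helper name, `strip_suffix`, that is only for illustration and not part of the script:

```python
import re

# Shortened suffix list for illustration only (assumption, not the full list).
SUFFIXES = ["straat", "weg", "plein", "laan", "singel", "kade"]

# The group makes \b (word boundary) apply to every suffix, and the raw
# string keeps \b from being read as a backspace character.
SUFFIX_RE = re.compile(r"(?:" + "|".join(SUFFIXES) + r")\b")

def strip_suffix(name):
    """Return the name with a suffix at the end of a word removed (hypothetical helper)."""
    return SUFFIX_RE.sub("", name)

for name in ["Witte de Withstraat", "Coolsingel", "Hofpleinlijn"]:
    print(repr(strip_suffix(name)))
```

The last name is left unchanged because "plein" sits in the middle of the word, not at its end. The match is case-sensitive, so a capitalised "Plein" at the start of a name is not touched either.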


#!/usr/bin/python

import re
import lxml.etree

# local copy of the OpenStreetMap export
f = "/home/andre/osm/data-original.osm"

# alternatively, fetch the same area from the OSM API (requires "import urllib2")
#f = urllib2.urlopen("http://api.openstreetmap.org/api/0.6/map?bbox=4.4552603823670465,51.91739525301985,4.46384345121436,51.920373035178464")

# parse the data
doc = lxml.etree.parse(f)
# look for every <tag k="name" .../> element; its v attribute holds the name
ways = doc.xpath("//tag[@k='name']")
# collects the shortened names
streets = []


for t in ways:
    straat = t.get('v')
    # raw string so that \b means "word boundary" (not backspace); the group
    # makes \b apply to every suffix, so only suffixes at the end of a word match
    straat_re = re.sub(r"(?:straat|weg|plein|laan|singel|steeg|boulevard|kanaal|hof|kade|dijk|haven|markt|dreef|pad|werf|erf|wal|burg|burgh|burcht|spoor)\b", "", straat)
    # TODO: diminutives (+tje) are not handled yet

    # set the new words into the xml document
    t.set('v', straat_re)
    streets.append(straat_re)
	
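# serialize the modified document; with an encoding given, tostring returns bytes
text = lxml.etree.tostring(doc, encoding="utf-8", xml_declaration=True)
#print text

# write in binary mode because text is already encoded, and close the file when done
with open("/home/andre/osm/data.osm", "wb") as n:
    n.write(text)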
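
To try the whole round trip without an OSM file, here is a rough sketch that runs the same steps on a small in-memory fragment. It assumes the standard library's `xml.etree.ElementTree` in place of lxml (so it runs without extra packages), a made-up sample document, a shortened suffix list, and a hypothetical function name `shorten_names`:

```python
import re
import xml.etree.ElementTree as ET

# Made-up OSM fragment for illustration; real exports contain many more elements.
SAMPLE = """<osm version="0.6">
  <way id="1">
    <tag k="highway" v="residential"/>
    <tag k="name" v="Oostzeedijk"/>
  </way>
  <way id="2">
    <tag k="name" v="Goudsesingel"/>
  </way>
</osm>"""

# Shortened suffix list (assumption); \b restricts matches to the end of a word.
SUFFIX_RE = re.compile(r"(?:straat|weg|plein|laan|singel|dijk)\b")

def shorten_names(root):
    """Strip the suffix from every name tag in place and return the new names."""
    streets = []
    for tag in root.findall(".//tag[@k='name']"):
        shortened = SUFFIX_RE.sub("", tag.get("v"))
        tag.set("v", shortened)
        streets.append(shortened)
    return streets

root = ET.fromstring(SAMPLE)
streets = shorten_names(root)
print(streets)
print(ET.tostring(root, encoding="unicode"))
```

Only the `name` tags are rewritten; other tags, such as `highway`, pass through unchanged.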