XML data manipulation: Difference between revisions

From XPUB & Lens-Based wiki
Revision as of 18:03, 23 November 2011

Currently I am just doing a simple manipulation: a regular expression removes the street-type suffixes (such as straat, weg, plein, laan, etc.) from the name that precedes them. The data is then rendered into an SVG.
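To see what the suffix removal does to individual names, here is a minimal sketch on a handful of sample values. The suffix list is shortened, and `strip_suffix` and the sample names are only there for illustration; they are not part of the script on this page.

```python
import re

# Shortened suffix list, for illustration only.
SUFFIXES = ["straat", "weg", "plein", "laan", "singel", "kade"]

# One group holds every suffix, so the trailing \b (word boundary) applies
# to all of them. The raw string stops \b from being read as a backspace.
SUFFIX_RE = re.compile(r"(" + "|".join(SUFFIXES) + r")\b")

def strip_suffix(name):
    """Return the name with any listed suffix removed from the end of a word."""
    return SUFFIX_RE.sub("", name)

samples = ["Coolsingel", "Witte de Withstraat", "Schiedamseweg", "Nieuwegein"]
for name in samples:
    print(name, "->", strip_suffix(name))
```

"Nieuwegein" stays untouched: it contains "weg", but not at the end of a word, so the word boundary rejects it.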

  • Ideas:
    • Change the dimensions of the areas, creating a distorted city
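One possible reading of the distortion idea is to stretch the node coordinates away from the centre of the map. The sketch below is only that, under stated assumptions: `stretch`, its two factors and the three sample nodes are hypothetical, and it uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra packages.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for an OSM export (hypothetical data).
SAMPLE = """<osm>
  <node id="1" lat="51.9170" lon="4.4550"/>
  <node id="2" lat="51.9180" lon="4.4570"/>
  <node id="3" lat="51.9190" lon="4.4590"/>
</osm>"""

def stretch(root, lat_factor, lon_factor):
    """Scale every node's offset from the centre of the data."""
    nodes = root.findall("node")
    lats = [float(n.get("lat")) for n in nodes]
    lons = [float(n.get("lon")) for n in nodes]
    centre_lat = sum(lats) / len(lats)
    centre_lon = sum(lons) / len(lons)
    for n, lat, lon in zip(nodes, lats, lons):
        # A factor of 1.0 leaves that axis as it is; larger values stretch it.
        n.set("lat", "%.7f" % (centre_lat + (lat - centre_lat) * lat_factor))
        n.set("lon", "%.7f" % (centre_lon + (lon - centre_lon) * lon_factor))

root = ET.fromstring(SAMPLE)
stretch(root, lat_factor=1.0, lon_factor=3.0)
for n in root.findall("node"):
    print(n.get("id"), n.get("lat"), n.get("lon"))
```

Using a different factor per neighbourhood, or a random one per node, would give a less uniform distortion.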
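<source lang="python">
#!/usr/bin/env python3

import re
import lxml.etree

# Local copy of an OpenStreetMap export
f = "/home/andre/osm/data-original.osm"

# Alternative: fetch the same area straight from the OSM API
# (needs: import urllib.request)
#f = urllib.request.urlopen("http://api.openstreetmap.org/api/0.6/map?bbox=4.4552603823670465,51.91739525301985,4.46384345121436,51.920373035178464")

# Street-type suffixes to strip. The raw string makes \b a word boundary
# (in a plain string it is a backspace), and the single group applies it
# to every suffix instead of only the last one.
suffix_re = re.compile(r"(straat|weg|plein|laan|singel|steeg|boulevard|kanaal|hof|kade|dijk|haven|markt|dreef|pad|werf|erf|wal|burg|burgh|burcht|spoor)\b")

# parse the data
doc = lxml.etree.parse(f)
# look for every <tag k="name" .../> element
name_tags = doc.xpath("//tag[@k='name']")
streets = []

for t in name_tags:
    straat = t.get('v')
    straat_re = suffix_re.sub("", straat)
    # TODO: also handle diminutives (+tje)

    # write the shortened name back into the XML document
    t.set('v', straat_re)
    streets.append(straat_re)

# tostring() with an encoding returns bytes, so the file is opened in binary mode
text = lxml.etree.tostring(doc, encoding="utf-8", xml_declaration=True)
#print(text)

with open("/home/andre/osm/data.osm", "wb") as n:
    n.write(text)
</source>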
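The rendering step is not shown on this page. As a rough sketch of how the modified data could end up in an SVG, the example below draws each way as a polyline and labels it with its name. Everything in it is an assumption: `osm_to_svg`, the canvas size and the sample data are made up for illustration, the projection is plain linear scaling of longitude and latitude, and it uses the standard library's `xml.etree.ElementTree` so that it runs without lxml.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for the processed OSM file (hypothetical data).
SAMPLE = """<osm>
  <node id="1" lat="51.9170" lon="4.4550"/>
  <node id="2" lat="51.9190" lon="4.4590"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="name" v="Cool"/>
  </way>
</osm>"""

def osm_to_svg(root, width=400, height=400):
    """Draw every way as a polyline and put its name at the first point."""
    nodes = {n.get("id"): (float(n.get("lon")), float(n.get("lat")))
             for n in root.findall("node")}
    lons = [p[0] for p in nodes.values()]
    lats = [p[1] for p in nodes.values()]
    min_lon, max_lon = min(lons), max(lons)
    min_lat, max_lat = min(lats), max(lats)

    def project(lon, lat):
        # Linear scaling into the canvas; y is flipped because the SVG
        # y axis points down. "or 1" avoids dividing by zero.
        x = (lon - min_lon) / ((max_lon - min_lon) or 1) * width
        y = (max_lat - lat) / ((max_lat - min_lat) or 1) * height
        return x, y

    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width=str(width), height=str(height))
    for way in root.findall("way"):
        pts = [project(*nodes[nd.get("ref")])
               for nd in way.findall("nd") if nd.get("ref") in nodes]
        if len(pts) < 2:
            continue
        ET.SubElement(svg, "polyline", fill="none", stroke="black",
                      points=" ".join("%.1f,%.1f" % p for p in pts))
        name = way.find("tag[@k='name']")
        if name is not None:
            label = ET.SubElement(svg, "text",
                                  x="%.1f" % pts[0][0], y="%.1f" % pts[0][1])
            label.text = name.get("v")
    return ET.tostring(svg, encoding="unicode")

svg_text = osm_to_svg(ET.fromstring(SAMPLE))
print(svg_text)
```

The linear scaling ignores map projection, which is tolerable for an area of a few streets and arguably fits a map that is meant to be distorted anyway.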