User:Eleanorg/1.2/Forbidden Pixels/Receiving URL where data is hosted: Difference between revisions
Revision as of 23:00, 28 March 2012
The site will need to ask participants for the URL of the pixel data they're hosting. This is my first time doing form processing in Python: the HTML form asks for the URL and passes it to a simple script. The next step is for that script to add the URL to the list of URLs, and to check whether the page there contains a string matching the regex.
Go here to test the input form: http://pzwart3.wdka.hro.nl/~egreenhalgh/inputForm.html
URL input form
<!DOCTYPE html>
<html>
<head>
<title>A form talking to a python script</title>
<style type="text/css">
</style>
</head>
<body>
<form action="form.cgi" name="inputForm"> <!-- pushes form input to the form.cgi script -->
Paste url:
<input name="url">
<input type="submit">
</form>
</body>
</html>
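The form above has no method attribute, so the browser submits it with GET and the pasted value arrives in the query string of the request to form.cgi. As a rough sketch of what travels over the wire, the snippet below builds and reads such a query string with the standard library. The helper names build_query and read_url_field are made up for illustration and are not part of the scripts on this page.

```python
try:
    from urllib.parse import urlencode, parse_qs  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2
    from urlparse import parse_qs


def build_query(url):
    """Encode the value the way the browser does for <input name="url">."""
    return urlencode({"url": url})


def read_url_field(query, default=None):
    """Return the first 'url' value in a query string, or default if absent."""
    values = parse_qs(query).get("url")
    return values[0] if values else default


if __name__ == "__main__":
    query = build_query("http://ox4.org/~nor/trials/hostedString.html")
    print("form.cgi?" + query)
    print(read_url_field(query))
```

Percent-encoding is why a pasted URL that itself contains & or ? still reaches the script as a single value.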
Basic Python script grabbing URL submitted
Saved as .cgi within cgi-bin on server.
#!/usr/bin/python
import cgi
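import cgitb; cgitb.enable() # cgi parses the submitted form data; cgitb.enable() shows Python tracebacks in the browser instead of a bare server error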
htmlHeader = """<!DOCTYPE html>
<html>
<head>
<title>A form talking to a python script</title>
<style type="text/css">
</style>
</head>
<body>"""
print "Content-Type: text/html" # important - won't print in browser without these 2 lines
print
print htmlHeader
form = cgi.FieldStorage() # Grabs whatever input comes from form
url = form['url'].value # assigns form's url field input to var 'url'
print url
print """
</body>
</html>"""
Python script grabbing inputted URL then scraping it
#!/usr/bin/python
#-*- coding:utf-8 -*-
import cgi, re, urllib2
import cgitb; cgitb.enable()
# scrapes pixel data string from a URL submitted by user in an html form; assigns the result to appropriate variables.
#------------- get URL from input form -------------------#
form = cgi.FieldStorage() # Grabs whatever input comes from form
url = form['url'].value # assigns form's url field input to var 'url'
#------------- scrape webpage----------------------------#
text = urllib2.urlopen(url).read() # reads page at the specified URL
#-------------- extract the string with regex------------#
# string is in format:
# Pixel position:500.001; Color:rgba(222,221,217,1)
xPos = yPos = color = None # defaults, so the check further down still works if the regex matches nothing
for m in re.finditer(r"Pixel position:(\d{3})\.(\d{3}); Color:(rgba\([^)]*\))", text):
    match = m.group(0) # group(0) is the whole matched string; groups 1-3 are the captured parts
    xPos = m.group(1)
    yPos = m.group(2)
    color = m.group(3)
#-------------- print match -----------------------------#
htmlHeader = """<!DOCTYPE html>
<html>
<head>
<title>A form talking to a python script</title>
<style type="text/css">
</style>
</head>
<body>"""
print "Content-Type: text/html"
print
print htmlHeader
if xPos is not None:
    print
    print "x position is: " + xPos + "<br />"
    print "y position is: " + yPos + "<br />"
    print "color is: " + color
else: # the regex matched nothing at the submitted URL
    print "this pixel is not currently hosted"
print """
</body>
</html>"""
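The extraction step can also be pulled out into a small function that is easy to test without a web server. This is only a sketch: the name parse_pixel and the dict it returns are assumptions for illustration, the pattern is a slightly stricter variant of the one in the script (literal dot, exactly three digits, colour up to the first closing bracket), and it is written to run under Python 3 rather than the Python 2 used on this page. It returns the first match, whereas a loop over all matches keeps the last one.

```python
import re

# Line format used on this page:
# Pixel position:500.001; Color:rgba(222,221,217,1)
PIXEL_RE = re.compile(r"Pixel position:(\d{3})\.(\d{3}); Color:(rgba\([^)]*\))")


def parse_pixel(text):
    """Return the first pixel string found in text as a dict, or None."""
    m = PIXEL_RE.search(text)
    if m is None:
        return None  # nothing hosted at this page
    return {
        "match": m.group(0),  # the whole matched string
        "xPos": m.group(1),
        "yPos": m.group(2),
        "color": m.group(3),
    }


if __name__ == "__main__":
    page = "<p>Pixel position:500.001; Color:rgba(222,221,217,1)</p>"
    print(parse_pixel(page))
    print(parse_pixel("<p>nothing hosted here</p>"))
```

Returning None when nothing matches gives the calling code one value to test before it prints the position and colour.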
Python script writing submitted URL to a text file, for later use
#!/usr/bin/python
#-*- coding:utf-8 -*-
import cgi
import cgitb; cgitb.enable()
# saves the URL submitted by the user in an html form to a text file, for later use.
#------------- get URL from input form -------------------#
form = cgi.FieldStorage() # Grabs whatever input comes from form
url = form.getvalue("url", "http://ox4.org/~nor/trials/hostedString.html") # assigns form's input to var 'url'. url on the right is a default value that will be used for testing if nothing is received from the form
#------------- save url to a text file --------------------#
f = open("data/urls.txt", 'a') # opens text file in append mode - 'a'
f.write(url + '\n')
f.close()
#------------- print acknowledgement ---------------------#
htmlHeader = """<!DOCTYPE html>
<html>
<head>
<title>A form talking to a python script</title>
<style type="text/css">
</style>
</head>
<body>"""
print "Content-Type: text/html"
print
print htmlHeader
print "Thanks, url submitted. The pixel hosted there will be added to the image soon."
print """
</body>
</html>"""
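For the "later use" part, the saved file has to be read back at some point. The sketch below is one possible way to do that, assuming the one-URL-per-line layout that the script above writes. The function names append_url and load_urls are hypothetical, and skipping duplicate lines is an extra assumption, not something the script on this page does. It is written to run under Python 3.

```python
import os
import tempfile


def append_url(path, url):
    """Append one URL per line, as the CGI script does with data/urls.txt."""
    with open(path, "a") as f:
        f.write(url.strip() + "\n")


def load_urls(path):
    """Return the saved URLs in order, skipping blank lines and repeats."""
    urls = []
    seen = set()
    with open(path) as f:
        for line in f:
            url = line.strip()
            if url and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


if __name__ == "__main__":
    # A temporary file stands in for data/urls.txt here.
    path = os.path.join(tempfile.mkdtemp(), "urls.txt")
    append_url(path, "http://ox4.org/~nor/trials/hostedString.html")
    append_url(path, "http://ox4.org/~nor/trials/hostedString.html")
    print(load_urls(path))
```

Each URL returned this way could then be fetched and checked for the pixel string.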