PythonEscaping

From XPUB & Lens-Based wiki
Revision as of 11:36, 21 May 2008 by Michael Murtaugh (talk | contribs)

Python provides a number of useful utilities to help "escape" data when placing it in HTML (attributes / forms), URLs, or MySQL queries, so you don't have to write your own. These implementations are also likely to be better tested and more secure than the usual quick and dirty solution ;).

cgi

escape(s[, quote])::
:: Convert the characters "&", "<" and ">" in string s to HTML-safe sequences. Use this if you need to display text that might contain such characters in HTML. If the optional flag quote is true, the quotation mark character (") is also translated; this helps for inclusion in an HTML attribute value, as in <A HREF="...">. If the value to be quoted might include single- or double-quote characters, or both, consider using the quoteattr() function in the xml.sax.saxutils module instead.

source: python docs cgi functions
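
The description above is for the Python 2 era cgi.escape, which was removed in Python 3.8; on Python 3 the same job is done by html.escape. The following is a minimal sketch assuming Python 3, with a made-up sample string. Note that html.escape defaults to quote=True, and in that mode it also converts single quotes.

```python
from html import escape

raw = '<a href="x">Tom & Jerry</a>'  # made-up sample input

# quote=False mirrors the old cgi.escape default: only &, < and > are converted.
text_safe = escape(raw, quote=False)

# quote=True (the default) also converts " and ' so the result can sit
# inside an attribute value.
attr_safe = escape(raw)

print(text_safe)
print(attr_safe)
```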

xml.sax.saxutils

quoteattr(data[, entities])::
:: Similar to escape(), but also prepares data to be used as an attribute value. The return value is a quoted version of data with any additional required replacements. quoteattr() will select a quote character based on the content of data, attempting to avoid encoding any quote characters in the string. If both single- and double-quote characters are already in data, the double-quote characters will be encoded and data will be wrapped in double-quotes. The resulting string can be used directly as an attribute value:
:: >>> print "<element attr=%s>" % quoteattr("ab ' cd \" ef")
:: <element attr="ab ' cd &quot; ef">
:: This function is useful when generating attribute values for HTML or any SGML using the reference concrete syntax. New in version 2.2.

<!> nb: do not put your %s in quotes, the quoteattr adds them for you.

<!> quoteattr expects data to be a string, and raises an error if you give it, for instance, an integer; if you have data that might be numeric, wrap it in str() before passing it to quoteattr, as in quoteattr(str(foo))
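
Both notes above can be seen together in a small sketch. It is written for Python 3, and the helper name input_tag and the sample values are hypothetical, chosen only for illustration.

```python
from xml.sax.saxutils import quoteattr

def input_tag(name, value):
    """Build an <input> tag (hypothetical helper, for illustration only)."""
    # No quotes around %s: quoteattr adds them itself.
    # str() guards against numeric values, which quoteattr cannot handle.
    return "<input name=%s value=%s>" % (quoteattr(name), quoteattr(str(value)))

numeric = input_tag("width", 640)
double_quotes = input_tag("title", 'say "hi"')
both_quotes = input_tag("title", "ab ' cd \" ef")

print(numeric)
print(double_quotes)
print(both_quotes)
```

When the value contains only double quotes, quoteattr switches to single quotes as the wrapper; when it contains both kinds, the double quotes are encoded instead.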

escape(data[, entities])::
:: Escape "&", "<", and ">" in a string of data.
:: You can escape other strings of data by passing a dictionary as the optional entities parameter. The keys and values must all be strings; each key will be replaced with its corresponding value.
unescape(data[, entities])::
:: Unescape "&amp;", "&lt;", and "&gt;" in a string of data.
:: You can unescape other strings of data by passing a dictionary as the optional entities parameter. The keys and values must all be strings; each key will be replaced with its corresponding value.
:: New in version 2.3.

source: python docs xml.sax.saxutils
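
The optional entities dictionary is easiest to understand with a round trip. This is a small sketch for Python 3 with a made-up sample string; the choice to also handle the double quote is just an example.

```python
from xml.sax.saxutils import escape, unescape

raw = 'a < b & "c"'  # made-up sample input

# Escape the three default characters plus the double quote.
escaped = escape(raw, {'"': "&quot;"})

# Reverse it: the dictionary for unescape maps the other way round.
restored = unescape(escaped, {"&quot;": '"'})

print(escaped)
print(restored)
```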

urllib

quote(string[, safe])::
:: Replace special characters in string using the "%xx" escape. Letters, digits, and the characters "_.-" are never quoted. The optional safe parameter specifies additional characters that should not be quoted -- its default value is '/'.
:: Example: quote('/~connolly/') yields '/%7Econnolly/'.
quote_plus(string[, safe])::
:: Like quote(), but also replaces spaces by plus signs, as required for quoting HTML form values. Plus signs in the original string are escaped unless they are included in safe. Unlike quote(), safe does not default to '/' here, so slashes are quoted as well.
unquote(string)::
:: Replace "%xx" escapes by their single-character equivalent.
:: Example: unquote('/%7Econnolly/') yields '/~connolly/'.
unquote_plus(string)::
:: Like unquote(), but also replaces plus signs by spaces, as required for unquoting HTML form values.

<!> To make sure slashes get quoted, you need to override the default "safe" value, as in:

nexturl = "goto.cgi?path=%s" % urllib.quote(mypath, '')

source: python docs urllib
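
In Python 3 these functions moved to urllib.parse, with the same names and the same default for safe. The following sketch assumes Python 3; the value of mypath is hypothetical. It compares the default quoting with the safe='' override described above.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

mypath = "my files/a&b.txt"  # hypothetical path value

default_quoted = quote(mypath)         # "/" is left alone by default
fully_quoted = quote(mypath, safe="")  # slashes are escaped too
form_quoted = quote_plus(mypath)       # spaces become "+", "/" is escaped

# The path is now safe to embed as a single query parameter.
nexturl = "goto.cgi?path=%s" % fully_quoted

print(default_quoted)
print(fully_quoted)
print(form_quoted)
print(nexturl)
```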

MySQLdb

Python's MySQLdb library can escape values in queries for you. You should definitely make use of this, as it also protects against injection attacks when building queries from, say, form- or URL-supplied data.

A "real world" example:

q = "UPDATE "+MEDIA_TABLE+" SET mediatype='video', filesize=%s, filelastmod=%s, width=%s, height=%s, duration=%s, fps=%s, videoinfo=%s, audioinfo=%s WHERE filename=%s"

cursor.execute(q, (filesize, filelastmod, ffmpeg.get('width'), ffmpeg.get('height'), ffmpeg.get('duration'), ffmpeg.get('fps'), ffmpeg.get('video'), ffmpeg.get('audio'), path))

<!> You use %s for all parameters regardless of their type (so also for numbers). The library handles the proper type conversion.

<!> You do NOT use actual quotes (' or ") in the query. The library adds these as necessary.
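
To try the same idea without a MySQL server, here is a sketch that swaps in the standard library's sqlite3 module. This is a substitution, not MySQLdb itself: sqlite3 uses ? as its placeholder where MySQLdb uses %s, but the principle of passing values separately from the query text is the same. The table layout and the values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cursor = conn.cursor()
cursor.execute("CREATE TABLE media (filename TEXT, filesize INTEGER, width INTEGER)")

# A filename containing a quote, and one that looks like an injection attempt.
tricky = "clip's.avi"
hostile = "x'; DROP TABLE media; --"
cursor.execute("INSERT INTO media (filename) VALUES (?)", (tricky,))
cursor.execute("INSERT INTO media (filename) VALUES (?)", (hostile,))

# No quotes around the placeholders, and the same placeholder for every type.
q = "UPDATE media SET filesize=?, width=? WHERE filename=?"
cursor.execute(q, (1024, 640, tricky))
conn.commit()

rows = cursor.execute(
    "SELECT filename, filesize, width FROM media ORDER BY rowid"
).fetchall()
print(rows)
```

The hostile string ends up stored as ordinary data, and the table is still there afterwards.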