User:Andre Castro/2.1/spam/spamwritingtechniques
Spam writing (and other) techniques
Here I try to compile and briefly explain, with examples where possible, the writing techniques spammers use in their messages.
Spam Lit
Fragments of prose or verse from literary works are introduced into the email body.
Religious texts, such as the Bible, are for some reason very common.
Literary texts are inserted with the intent to fool spam filters. Not only do they provide a larger bulk of text than the typical spam email has, they also introduce keywords that are not common in spam. Sometimes the reason for their inclusion is unclear, since nothing apart from the literary text is present: the email contains no "buy this" or "contact me if you want to be part of this business".
From Wikipedia: As Bayesian filtering has become popular as a spam-filtering technique, spammers have started using methods to weaken it. To a rough approximation, Bayesian filters rely on word probabilities. If a message contains many words that are used only in spam, and few that are never used in spam, it is likely to be spam. To weaken Bayesian filters, some spammers, alongside the sales pitch, now include lines of irrelevant, random words, in a technique known as Bayesian poisoning. A variant on this tactic may be borrowed from the Usenet abuser known as "Hipcrime": to include passages from books taken from Project Gutenberg, or nonsense sentences generated with "dissociated press" algorithms. Randomly generated phrases can create spoetry (spam poetry) or spam art.
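To see why padding a message with literary text can work, here is a minimal sketch of a naive Bayes style score. The per-word probabilities in `WORD_SPAM_PROB` are made up for illustration, and `spam_probability` is a hypothetical helper, not the formula of any particular filter. It assumes equal priors for spam and non-spam and treats unknown words as neutral.

```python
import math

# Made-up probabilities that a message containing the word is spam.
WORD_SPAM_PROB = {
    "viagra": 0.99, "buy": 0.90, "cheap": 0.90,      # typical spam words
    "thou": 0.05, "shalt": 0.05, "begat": 0.05, "unto": 0.10,  # "literary" words
}

def spam_probability(words, word_spam_prob=WORD_SPAM_PROB):
    """Combine per-word probabilities, naive Bayes style, with equal priors."""
    log_spam = 0.0
    log_ham = 0.0
    for word in words:
        p = word_spam_prob.get(word, 0.5)  # unknown words count as neutral
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P(spam | words) = prod(p) / (prod(p) + prod(1 - p)), computed in log space
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

pitch = ["buy", "cheap", "viagra"]
padded = pitch + ["thou", "shalt", "begat", "unto", "thou", "shalt"]

print(spam_probability(pitch))   # close to 1: only spam words
print(spam_probability(padded))  # close to 0: the literary words outweigh the pitch
```

With these invented numbers, six words that rarely appear in spam are enough to pull the score of the same sales pitch from near 1 to near 0, which is the effect the poisoning technique relies on.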
Auto generated headers
From Wikipedia: Another method used to masquerade spam as legitimate messages is the use of autogenerated sender names in the From: field, ranging from realistic ones such as "Jackie F. Bird" to (either by mistake or intentionally) bizarre attention-grabbing names such as "Sloppiest U. Epiglottis" or "Attentively E. Behavioral". Return addresses are also routinely auto-generated, often using unsuspecting domain owners' legitimate domain names, leading some users to blame the innocent domain owners.
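The names quoted above all follow a "word, initial, word" pattern, so they could plausibly come from something as simple as the sketch below. This is only my guess at the mechanism: the word lists are taken from the quoted examples, and `fake_sender` is a hypothetical function, not code recovered from any spam tool.

```python
import random
import string

# Word lists built from the examples quoted above; a real generator
# would presumably draw on a much larger dictionary.
FIRST_WORDS = ["jackie", "sloppiest", "attentively"]
LAST_WORDS = ["bird", "epiglottis", "behavioral"]

def fake_sender(rng, first_words=FIRST_WORDS, last_words=LAST_WORDS):
    """Build a 'First I. Last' display name from randomly chosen words."""
    first = rng.choice(first_words).capitalize()
    initial = rng.choice(string.ascii_uppercase)
    last = rng.choice(last_words).capitalize()
    return f"{first} {initial}. {last}"

rng = random.Random()  # pass random.Random(seed) for repeatable output
for _ in range(3):
    print(fake_sender(rng))
```

Picking dictionary words without checking their part of speech would explain why some names come out realistic and others bizarre.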
Image spam
A strategy to obfuscate the message of the email. The text is embedded in an image, which makes it hard for spam filters to detect common spam terms (filters have to use OCR tools to do that).
Blank spam
Spam lacking a payload advertisement. Often there is no message body.
These emails can originate in a Directory Harvest Attack, where possible email addresses for a given provider are generated and emailed in order to test their existence; if the email bounces back, no such address exists. They can also contain HTML code (JS?) that opens the browser at a given URL when the email is read.
Word obfuscation
As spam filters work by searching for patterns of blacklisted terms, words are spelt in unusual ways to defeat them, such as V1agra, Via'gra, Vi@graa, vi*gra, \/iagra, so that the text remains legible to humans but is not caught by the filter.
HTML-based email opens more possibilities for obfuscation. Inserting HTML comments between letters can foil some filters, as can including text made invisible by setting the font color to white on a white background, or shrinking the font size to the smallest fine print.
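As a rough illustration of what a filter has to do to see through these spellings, the sketch below strips HTML comments, undoes the character substitutions that appear in the examples above, and then compares the result to a blacklisted term with fuzzy matching. The substitution table, the 0.8 threshold and the function names are my own assumptions, not the behaviour of a real spam filter.

```python
import re
from difflib import SequenceMatcher

# Substitutions seen in the examples above; a real table would be longer.
SUBSTITUTIONS = {"\\/": "v", "1": "i", "@": "a"}

def normalize(token):
    """Undo simple obfuscation and return only lowercase ASCII letters."""
    token = re.sub(r"<!--.*?-->", "", token, flags=re.S)  # drop HTML comments
    token = token.lower()
    for fake, real in SUBSTITUTIONS.items():
        token = token.replace(fake, real)
    return re.sub(r"[^a-z]", "", token)  # drop punctuation such as ' and *

def looks_like(token, term, threshold=0.8):
    """True if the normalized token is close enough to the blacklisted term."""
    return SequenceMatcher(None, normalize(token), term).ratio() >= threshold

variants = ["V1agra", "Via'gra", "Vi@graa", "vi*gra", "\\/iagra", "V<!-- x -->iagra"]
for variant in variants:
    print(variant, looks_like(variant, "viagra"))
```

The fuzzy comparison is there because normalization alone does not recover every variant: "vi*gra" becomes "vigra" and "Vi@graa" becomes "viagraa", neither of which equals the blacklisted word exactly. The cost of the looser match is a higher risk of flagging innocent words.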
Patching
Although I have not seen it mentioned, looking at some of the examples there seems to be heavy use of copy and paste. Some messages clearly repeat themselves, with only slight changes. I have been able to trace sources and confirm that different sources are put together, in a work of patching, into a single message.
There seem to be some parallels between the techniques of spam writing and those used by Oulipo's members, such as appropriation and word games (word obfuscation). I wonder if this parallel can be further explored.
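One way to check for this kind of patching is to compare a message against candidate sources using overlapping runs of words (often called shingles). The sketch below is only an illustration of the idea, not the method I actually used to trace sources; the function names and the run length of four words are arbitrary choices.

```python
def shingles(text, size=4):
    """Return the set of all runs of `size` consecutive words in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def overlap(message, source, size=4):
    """Fraction of the message's word runs that also occur in the source."""
    message_runs = shingles(message, size)
    if not message_runs:
        return 0.0  # message shorter than one run
    return len(message_runs & shingles(source, size)) / len(message_runs)

source = "in the beginning god created the heaven and the earth"
message = "hello friend in the beginning god created the heaven call now"

print(overlap(message, source))                  # 0.5
print(overlap(message, "an unrelated sentence about something else entirely"))  # 0.0
```

Running the same comparison against several sources would show which parts of a message come from where, and comparing two spam messages with each other would reveal the near-duplicates with slight changes.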