User:Cristinac/TPNotes

From XPUB & Lens-Based wiki

Character recognition: conversion of text characters into machine-readable code. Applications: postal code reading, automatic data entry into large administrative systems, recognition of print and script, automatic cartography, banking, reading devices for the blind.
OCR is a process that maps a page image I representing a source string S into a recognized string R. Typically, both S and R are defined over the same fixed alphabet Σ (for English, Σ is usually ASCII). While in the most abstract sense OCR is a "black box" recognizer accepting bitmaps and generating text, in truth the process consists of multiple levels of segmentation followed by a final recognition (i.e. pattern matching) step. Page images are segmented into lines and lines into characters, and then finally characters are matched against predefined prototypes to determine their translations.
-
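Note to self: the segmentation-then-matching pipeline described above can be sketched in a few lines of Python. This is only a toy under stated assumptions, not how a real OCR engine works: the page is a list of strings with '#' for ink and '.' for background, the three 3x3 prototypes are invented for illustration, lines are split on blank rows, characters on blank columns, and matching is a plain pixel-difference count.

```python
from typing import Dict, List

Bitmap = List[str]  # rows of '#' (ink) and '.' (background)

# Hypothetical 3x3 prototypes for a toy three-letter alphabet (assumption).
PROTOTYPES: Dict[str, Bitmap] = {
    "I": ["###", ".#.", "###"],
    "L": ["#..", "#..", "###"],
    "T": ["###", ".#.", ".#."],
}


def split_runs(flags: List[bool]) -> List[range]:
    """Return the index ranges of consecutive True values."""
    runs, start = [], None
    for i, flag in enumerate(flags + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append(range(start, i))
            start = None
    return runs


def segment_lines(page: Bitmap) -> List[Bitmap]:
    """Split the page into text lines wherever a row contains no ink."""
    rows_with_ink = ["#" in row for row in page]
    return [[page[r] for r in run] for run in split_runs(rows_with_ink)]


def segment_chars(line: Bitmap) -> List[Bitmap]:
    """Split a text line into glyphs wherever a column contains no ink."""
    cols_with_ink = [any(row[c] == "#" for row in line) for c in range(len(line[0]))]
    return [[row[run.start:run.stop] for row in line] for run in split_runs(cols_with_ink)]


def distance(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels; bitmaps of different size never match."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        return 10**9
    return sum(p != q for x, y in zip(a, b) for p, q in zip(x, y))


def recognize(page: Bitmap) -> List[str]:
    """Map a page image to recognized strings, one per text line."""
    result = []
    for line in segment_lines(page):
        glyphs = segment_chars(line)
        result.append("".join(
            min(PROTOTYPES, key=lambda k: distance(g, PROTOTYPES[k])) for g in glyphs
        ))
    return result


PAGE = [
    ".......",
    "###.#..",
    ".#..#..",
    ".#..###",
    ".......",
    "###....",
    ".##....",  # one noisy pixel in the "I"
    "###....",
]
print(recognize(PAGE))
```

The noisy glyph on the second line still resolves to the nearest prototype, which is the "pattern matching" step in miniature; real systems have to cope with touching characters, skew and varying sizes, none of which this sketch handles.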


harnessing human brainpower to solve problems that computers cannot
-


In contrast to text recognition in documents, which is satisfactorily addressed by state-of-the-art OCR systems, scene text localization and recognition is still an open problem. Factors contributing to the complexity of the problem include: non-uniform background, the need for compensation of perspective effects (for documents, rotation or rotation and scaling is sufficient); real-world texts are often short snippets written in different fonts and languages; text alignment does not follow strict rules of printed documents; many words are proper names which prevents an effective use of a dictionary.
-
Consider the impact of computers on publishing. Book production has been transformed by computers. Book distribution, even more, has been revolutionized by computerized systems of inventory control that have been instrumental in making the chain bookstores and superstores a rousing success and a significant influence on publishers' decisions about what and how to publish. But the individual experience of reading a book is very much what it used to be. The e-book is a novelty at best. The digital revolution, so far, has influenced the basic experience of reading a book far less than the paperback revolution half a century ago.
-Click here for Democracy: A History and Critique of an Information-Based Model of Citizenship, Michael Schudson


“Print-on-demand liberates artists from the oppressively expensive and laborious demands of traditional photobook publishing. Print-on-demand is fast, cheap, and light. It exists outside the power structures of publishers and distributors. Few people take it seriously and we are one of the few. We’re not interested in what the books smell like, how they’re bound, whether they’re embossed or printed on the finest papers on Earth. Those are luxuries we can live without. We’re interested in raw ideas and there is no better transporter for a great idea than a book. A single book if needs be. And with the internet, the ideas in that single book can go viral and reach millions in a split second. No need for proposals, book dummies, meetings, bank loans, trucks, boats, trains and planes to ship hundreds of kilos of heavy books across the world into warehouses and bookshops. A powerful idea expressed in a collection of pictures bound together for the price of a meal and placed online can bypass all of that.”
-“Paleolithic cave paintings: ABC (Artists’ Books Cooperative) on the photobook”, interview with ABC (2014)


If the user passes, that’s all they will need to do. If some doubt as to whether they are human or not remains, they will then be asked to fill in a Captcha form, which could involve clicking or tapping on cats or dogs instead of entering text into a box.
-http://www.theguardian.com/technology/2014/dec/04/no-more-infuriating-captcha-google-simply-asks-are-you-a-robot


But owing to the way such documents are rendered, pdfs often give up machine readability in favour of human readability. The basic format doesn't include any requirement that text be selectable or searchable, while data presented as charts and tables is often impossible to export in any useable way. That then makes it impossible to mine the documents for the data they contain and so create databases of new information pulling together disparate sources. Despite efforts to create "pdf to html" converters, they still need human oversight to check for errors of interpretation.
-http://www.theguardian.com/technology/2014/may/09/is-the-pdf-hurting-democracy


What are the political possibilities of making information available? A thing that is scanned was already downloaded, in a sense. It circulated on paper, as widely as newspapers or as little as classified documents. And interfering with its further circulation is a time-honored method of keeping a population in check. Documents are kept private; printing presses shut down. Scanning printed material for internet circulation has the potential to circumvent some of these issues. Scanning means turning the document into an image, one that is marked by glitches and bearing the traces of editorial choices on the part of the scanner. Although certain services remain centralized and vulnerable to political manipulation, such as the DNS addressing system, and government monitoring of online behavior is commonplace, there is still political possibility in the aggregate, geographically dispersed nature of the internet. If the same document is scanned, uploaded, and then shared across a number of different hosts, it becomes much more difficult to suppress. And it gains traction by circulation.
Scans are raw material, not journalism. They offer support to a story and give the impression of truthfulness. Wikileaks, for example, benefits enormously from the expanse of the internet, allowing it to dump all of the information it makes available on its website, thus shifting the role of newspapers to no longer publish information, but rather, to organize it. As Julian Assange said, "It's too much; it's impossible to read it all, or get the full overview of all the revelations."
The romanticized image of the scanner is based on the assumption that by scanning and uploading we make information available, and that that is somehow an invariably democratic act. Scanning has become synonymous with transparency and access. But does the document dump generate meaningful analysis, or make it seem insignificant? Does the internet enable widespread distribution, or does it more commonly facilitate centralized access?
-http://rhizome.org/editorial/2014/oct/9/unbound-politics-scanning/


Image Recognition CAPTCHAs
1. The naming images CAPTCHA. The naming CAPTCHA presents six images to the user. If the user correctly types the common term associated with the images, the user passes the round.
2. The distinguishing images CAPTCHA. The distinguishing CAPTCHA presents two sets of images to the user. Each set contains three images of the same subject. With equal probability, both sets either have the same subject or not. The user must determine whether or not the sets have the same subject in order to pass the round.
3. The identifying anomalies CAPTCHA. The anomaly CAPTCHA presents six images to the user: five images are of the same subject, and one image (the anomalous image) shows a different subject. The user must identify the anomalous image to pass the test.
-http://www.cs.berkeley.edu/~tygar/papers/Image_Recognition_CAPTCHAs/imagecaptcha.pdf
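Note to self: a rough Python sketch of how rounds for the distinguishing and anomaly CAPTCHAs described above might be generated. The labelled image pool, the file names and the function names are all hypothetical, and the paper's actual implementation may differ; the sketch only follows the rules as quoted (two sets of three images with a fair coin deciding whether the subjects match, and six images of which exactly one is anomalous).

```python
import random
from typing import Dict, List, Tuple

# Hypothetical labelled pool: subject -> image file names (assumption).
IMAGE_POOL: Dict[str, List[str]] = {
    subject: [f"{subject}_{i}.jpg" for i in range(8)]
    for subject in ("cat", "dog", "tree", "car")
}


def make_distinguishing_round(
    pool: Dict[str, List[str]], rng: random.Random
) -> Tuple[List[str], List[str], bool]:
    """Return two sets of three images and whether they share a subject."""
    same = rng.random() < 0.5  # equal probability of same or different subject
    if same:
        subject = rng.choice(sorted(pool))
        images = rng.sample(pool[subject], 6)
        return images[:3], images[3:], True
    first, second = rng.sample(sorted(pool), 2)
    return rng.sample(pool[first], 3), rng.sample(pool[second], 3), False


def make_anomaly_round(
    pool: Dict[str, List[str]], rng: random.Random
) -> Tuple[List[str], int]:
    """Return six images and the position of the single anomalous one."""
    common, odd = rng.sample(sorted(pool), 2)
    images = rng.sample(pool[common], 5)
    anomaly_index = rng.randrange(6)
    images.insert(anomaly_index, rng.choice(pool[odd]))
    return images, anomaly_index


def check_answer(anomaly_index: int, clicked_index: int) -> bool:
    """The user passes the anomaly round only by picking the anomalous image."""
    return clicked_index == anomaly_index


if __name__ == "__main__":
    rng = random.Random()
    images, answer = make_anomaly_round(IMAGE_POOL, rng)
    print(images, "anomaly at position", answer)
```

A bot guessing blindly would pass the distinguishing round half the time and the anomaly round one time in six, so a scheme like this would presumably need several rounds in a row to be useful.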


Computer vision* is the transformation of data from a still or video camera into either a decision or a new representation. All such transformations are done for achieving some particular goal. The input data may include some contextual information such as “the camera is mounted in a car” or “laser range finder indicates an object is 1 meter away”. The decision might be “there is a person in this scene” or “there are 14 tumor cells on this slide”. A new representation might mean turning a color image into a grayscale image or removing camera motion from an image sequence.
-
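Note to self: the two kinds of output mentioned above (a new representation, a decision) can be illustrated with a tiny pure-Python sketch. The image is assumed to be a list of rows of (r, g, b) tuples in the 0-255 range; the luma weights 0.299 / 0.587 / 0.114 are the common ITU-R BT.601 convention, not something the quoted text specifies, and the brightness threshold is an arbitrary illustrative value.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_grayscale(image: RGBImage) -> GrayImage:
    """New representation: turn a color image into a grayscale image."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def is_mostly_dark(gray: GrayImage, threshold: int = 128) -> bool:
    """Decision: is the mean brightness below the (assumed) threshold?"""
    pixels = [value for row in gray for value in row]
    return sum(pixels) / len(pixels) < threshold


image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 0)],
]
gray = to_grayscale(image)
print(gray, is_mostly_dark(gray))
```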


INC:Let’s talk about something that has always been seen as the main issue in graphic design and editorial design, the relationship between text and images. What happens to the image in the digital media?
D: Images and texts are only different in form, but actually they are quite close. Images were the first form people used to try to easily pass information on to other people. Later, somehow, those images became symbols, and then symbols became characters and, combined with one another, they became sentences.
-http://arstechnica.com/gadgets/2012/10/drm-be-damned-how-to-protect-your-amazon-e-books-from-being-deleted/


Hyperness. What happens if and when the very concept of “text” becomes, as Andy Miah has argued, “increasingly uninteresting or useless”?—as writers use software, such as Macromedia’s Flash, which has the effect of rendering the text more as “image than text, ungraspable and flat, layered with a virtual and invisible hyperness . . . [and] the sub-level of hyperness, which is really what is of interest when discussing hypertext, derives from the nature of the browser, rather than some new characteristic of text” (2003).
-


After Google Will Eat Itself, ubermorgen.com, Alessandro Ludovico and Paolo Cirio propose a subversive online work that questions the inconsistencies in the enforcement of copyright law. Using the aesthetics of Film Noir, a corresponding plot, and protagonist, the project allows users to ‘legally’ steal and redistribute copyright books from amazon.com. A programmed software-bot is going to outwit Amazon’s “search-inside-the-book”-system and will be capable of using the search results to compile entire books. The project points out the hypocrisy of the digital copyright lobby. Past work of the group has infiltrated mainstream media outlets, bringing questions about civil rights, patent, copyright and democracy in the digital age to wide audiences. The proposed project will enter into direct conversation with media, legal and business entities and consumer behaviours.
-


Have you ever wondered how Google Maps knows the exact location of your neighborhood coffee shop? Or of the hotel you’re staying at next month? Translating a street address to an exact location on a map is harder than it seems. To take on this challenge and make Google Maps even more useful, we’ve been working on a new system to help locate addresses even more accurately, using some of the technology from the Street View and reCAPTCHA teams.
-http://googleonlinesecurity.blogspot.co.uk/2014/04/street-view-and-recaptcha-technology.html


As Hall has argued in Digitize this book!, where he gives a very detailed and comprehensive overview of the differing but often also overlapping motivations that exist concerning open access and openness, there is nothing intrinsically political or democratic about open access. A constant wide flow of information can give rise to secrecy through the difficulty of gaining visibility.
Where initially the open access and open source movements were heralded by progressive thinkers as part of a critique of the commodification of knowledge (Berry 2008: 39), openness is seen increasingly as a concept and practice that connects well with neoliberal needs and rhetoric, and that can be related to ideas of transparency and efficiency promoted by business and government.
From an initially subversive idea,[2] one can argue that open access, partly related to its growing accessibility and wider general uptake, is increasingly co-opted by capitalist ideology (of which the Finch Report, which we will be discussing later, is ample evidence) and as a result is turning in some respects at least into yet another business model for commercial publishers to reap a profit from.
The open access movement[5] can be seen as a direct reaction against the ongoing commercialisation of research and of the publishing industry, coupled to a felt need to make research more widely accessible in a faster and more efficient way. Open access literature has been defined as ‘digital, online, free of charge, and free of most copyright and licensing restrictions’.
In his critique of openness, Tkacz thus focuses mainly on Popper and on how the binary open-closed cannot be upheld, since closure is inherent in Popper’s notion of openness. Tkacz states that, based on the philosophy of Popper, the open as a concept is reactionary (where it merely states what it is not, i.e. not closed), it has no (true, positive) meaning—which would close it off—and cannot ‘build a lasting affirmative dimension’ (2012: 400).
‘openness refers to the relative degree of freedom given to the dissemination of information or knowledge and involves assumptions concerning the nature and extent of the audience’
‘openness and secrecy are often interlocked, impossible to take apart, and they might even reinforce each other. They should be understood as positive (instead of privative) categories that do not necessarily stand in opposition to each other’
something can be open but at the same time undiscoverable in a sea of information overload, which can make for new forms of secrecy.[16] Openness and secrecy also don’t always exclude each other, Vermeir states—in the publication of a coded text, for example. Finally, whether we see something as open or secret also depends on the perceiver’s viewpoint.
-https://openreflections.wordpress.com/2015/02/25/new-models-of-knowledge-production-open-access-publishing-and-experimental-research-practices-part-i/


The new reCAPTCHA is here. A significant number of your users can now attest they are human without having to solve a CAPTCHA. Instead with just a single click they’ll confirm they are not a robot. We’re calling it the No CAPTCHA reCAPTCHA experience.
-http://www.google.com/recaptcha/intro/index.html


Instead of depending upon the traditional distorted word test, Google’s “reCaptcha” examines cues every user unwittingly provides: IP addresses and cookies provide evidence that the user is the same friendly human Google remembers from elsewhere on the Web. And Shet says even the tiny movements a user’s mouse makes as it hovers and approaches a checkbox can help reveal an automated bot.
He adds that Google also will use other variables that it is keeping secret—revealing them, he says, would help botmasters improve their software and undermine Google’s filters.
-http://www.wired.com/2014/12/google-one-click-recaptcha/


Captcha-comic.jpg