User:Francg/expub/thesis/bibliography

Annotated Bibliography

19.10.17

(synopsis cutouts from JSTOR and other sources)

Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More 2nd Edition, Matthew A. Russell
We live in a world of big data: the amount of information collected on human behavior each day is staggering, and exponentially greater than at any time in the past. The authors explain how data mining differs substantially from the conventional statistical modeling familiar to most social scientists. They also empower social scientists to tap into these new resources and incorporate data mining methodologies in their analytical toolkits. Data Mining for the Social Sciences demystifies the process by describing the diverse set of techniques available, discussing the strengths and weaknesses of various approaches, and giving practical demonstrations of how to carry out analyses using tools in various statistical software packages.

The Tao of Open Source Cyber Intelligence, Stewart K
The Internet has become the defining medium for information exchange in the modern world, and the unprecedented success of new web publishing platforms such as those associated with social media has confirmed its dominance as the main information exchange platform for the foreseeable future. But how do you conduct an online investigation when so much of the Internet isn't even indexed by search engines? Accessing and using the information that's freely available online is about more than just relying on the first page of Google results. Open source intelligence (OSINT) is intelligence gathered from publicly available sources, and is the key to unlocking this domain for the purposes of investigation. The book catalogues and explains the tools and investigative approaches that are required when conducting research within the surface, deep and dark webs. It explains how to scrutinize criminal activity without compromising your anonymity - and your investigation. It examines the relevance of cyber geography and how to get round its limitations. It describes useful add-ons for common search engines, as well as considering metasearch engines (including Dogpile, Zuula, PolyMeta, iSeek, Cluuz, and Carrot2) that collate search data from single-source intelligence platforms such as Google. It considers deep web social media platforms and platform-specific search tools, detailing such concepts as concept mapping, entity extraction tools, and specialist search syntax (Google Kung-Fu). It gives comprehensive guidance on Internet security for the smart investigator, and how to strike a balance between security, ease of use and functionality, giving tips on counterintelligence, safe practices, and debunking myths about online privacy.

Reality Mining: Using Big Data to Engineer a Better World
Big Data is made up of lots of little data: numbers entered into cell phones, addresses entered into GPS devices, visits to websites, online purchases, ATM transactions, and any other activity that leaves a digital trail. Although the abuse of Big Data -- surveillance, spying, hacking -- has made headlines, it shouldn't overshadow the abundant positive applications of Big Data. In Reality Mining, Nathan Eagle and Kate Greene cut through the hype and the headlines to explore the positive potential of Big Data, showing the ways in which the analysis of Big Data ("Reality Mining") can be used to improve human systems as varied as political polling and disease tracking, while considering user privacy. They examine Reality Mining at five different levels: the individual, the neighborhood and organization, the city, the nation, and the world. For each level, they explain the data collection methods and describe applications and systems that have been or could be built. These include a workplace "knowledge system"; the use of GPS, Wi-Fi, and mobile phone data to manage and predict traffic flows; and the analysis of social media to track the spread of disease. Eagle and Greene argue that Big Data, used respectfully and responsibly, can help people live better, healthier, and happier lives.

Digital Methods
How can we study social media to learn something about society rather than about social media use? How can hyperlinks reveal not just the value of a Web site but the politics of association? Rogers proposes repurposing Web-native techniques for research into cultural change and societal conditions. We can learn to reapply such "methods of the medium" as crawling and crowd sourcing, PageRank and similar algorithms, tag clouds and other visualizations; we can learn how they handle hits, likes, tags, date stamps, and other Web-native objects. By "thinking along" with devices and the objects they handle, digital research methods can follow the evolving methods of the medium. Rogers uses this new methodological outlook to examine the findings of inquiries into 9/11 search results, the recognition of climate change skeptics by climate-change-related Web sites, the events surrounding the Srebrenica massacre according to Dutch, Serbian, Bosnian, and Croatian Wikipedias, presidential candidates' social media "friends," and the censorship of the Iranian Web. With Digital Methods, Rogers introduces a new vision and method for Internet research and at the same time applies them to the Web's objects of study, from tiny particles (hyperlinks) to large masses (social media).

Web as History: Using Web Archives to Understand the Past and the Present (chapter 2 - Live versus archive: Comparing a web archive to a population of web pages)
The World Wide Web has now been in use for more than 20 years. From early browsers to today’s principal source of information, entertainment and much else, the Web is an integral part of our daily lives, to the extent that some people believe ‘if it’s not online, it doesn’t exist.’ While this statement is not entirely true, it is becoming increasingly accurate, and reflects the Web’s role as an indispensable treasure trove. It is curious, therefore, that historians and social scientists have thus far made little use of the Web to investigate historical patterns of culture and society, despite making good use of letters, novels, newspapers, radio and television programmes, and other pre-digital artefacts. This volume argues that now is the time to question what we have learned from the Web so far. The 12 chapters explore this topic from a number of interdisciplinary angles – through histories of national web spaces and case studies of different government and media domains – as well as an introduction that provides an overview of this exciting new area of research.

- - -
Future Shock
International Journal of Information Management, 2000
Structuring Computer-Mediated Communication Systems to Avoid Information Overload - Article, 1985
Information Overload and the Message Dynamics of Online Interaction Spaces: A Theoretical Model and Empirical Exploration, 2004
Web Scraping with Python: Collecting Data from the Modern Web 1st Edition