User:Manetta/i-could-have-written-that/mining-software: Difference between revisions

From XPUB & Lens-Based wiki
=mining software=


==CLiPS (Computational Linguistics & Psycholinguistics Research Center)==


the research center of the University of Antwerp where Pattern comes from
{| class="wikitable" style="padding:20px;width:750px;"
|-
! text mining project !! application !! date !! link
|-
| automatic detection of crucial information || in clinical reports || 01/01/2016 - 31/12/2019 || [http://www.clips.ua.ac.be/projects/accumulate-acquiring-crucial-medical-information-using-language-technology link]
|-
| computational creativity || TheRiddlerBot, a Twitter bot generating riddles about well-known characters || 2015 || [http://www.clips.ua.ac.be/sites/default/files/the_riddler_bot_a_next_step_on_the_ladder_towards_creative_twitter_bots.pdf link]
|-
| automatic opinion detection (using commercial web-services) || analyse media coverage on political issues; reporting on the 'sentiment' tone of news reports about the new Belgian government of 2011 (established after 541 days of negotiation) || 01/02/2014 - 31/01/2015 || [http://www.clips.ua.ac.be/projects/text-analytics-web-services-for-profiling-and-opinion-mining link]
|-
| stylometry || authorship attribution, personality prediction, gender prediction || 01/10/2014 - 30/09/2018 || [http://www.clips.ua.ac.be/projects/deep-linguistic-features-for-computational-stylometry link]
|-
| connecting knowledge from different domains || investigating the learning process of language by very young children || 01/01/2014 - 31/12/2017 || [http://www.clips.ua.ac.be/projects/bootstrapping-operations-in-language-acquisition-a-computational-psycholinguistic-approach link]
|-
| Automatic Monitoring for Cyberspace Applications (AMiCA) || mine blogs, chat rooms, social networking sites, tracing harmful content, contact, or conduct (cyber-bullying, pedophilia); detecting risks, and sending alerts to moderators; collecting accurate data to support providers, science and governments in decision-making processes with respect to child safety online || 01/01/2013 - 31/12/2016 || [http://www.amicaproject.be/ link]
|-
| language technology development for African languages || improving language software for minority languages, like translation engines and corpus development || 2006 - 2011 || [http://www.clips.ua.ac.be/projects/data-driven-techniques-in-african-language-technology link]
|-
| '''improving text-mining techniques''' || sentiment analysis on suicide notes, to distinguish between fifteen emotion labels, from guilt, sorrow, and hopelessness to hopefulness and happiness (emotion detection, suicide prevention) || 2012 || [http://www.clips.ua.ac.be/sites/default/files/f_bii-fine-grained-emotion-detection-in-suicide-notes-a-thresholding-approach_4099.pdf link]
|}
==Weka 3==
http://www.cs.waikato.ac.nz/ml/Title-Bird-Header.gif
Weka is a data mining application written in Java, developed at the University of Waikato, New Zealand. [http://www.cs.waikato.ac.nz/ml/weka/ This is a link to Weka's project page.]
description of text mining (2003): ''It most commonly targets text whose function is the '''communication of factual information or opinions''', (...) “Text mining” (sometimes called “text data mining”; [4]) defies tight definition but encompasses a wide range of activities: text summarization; document retrieval; document clustering; text categorization; language identification; authorship ascription; identifying phrases, phrase structures, and key phrases; extracting “entities” such as names, dates, and abbreviations; locating acronyms and their definitions; filling predefined templates with extracted information; and even learning rules from such templates [8].'' [http://researchcommons.waikato.ac.nz/bitstream/handle/10289/1298/text%20mining%20in%20a%20digital%20library.pdf?sequence=1&isAllowed=y (Witten ed., 2003)]
{| class="wikitable" style="padding:20px;width:750px;"
|-
! (text) mining project !! application !! date !! link
|-
| Document Copy Detector || plagiarism detection || 2016 || [http://www.cs.waikato.ac.nz/~fjb11/publications/inffus15.pdf link]
|-
| large-scale continuous global optimisation || increase efficiency in search engines || 2015 || [http://www.ncbi.nlm.nih.gov/pubmed/25950391 link]
|-
| opinion mining || tourism product reviews || 2014 || [http://www.cs.waikato.ac.nz/~fjb11/publications/ESWA2014.pdf link]
|-
| mining on the web || ranking order of webpages in search engines (p.21); search query patterns for advertisement profiling; customer recommendations to increase sales; user recommendations for films to make sure they come back to the website; ''And then there are social networks and other personal data;'' decision procedures at loan companies through questionnaires, which motivates such companies when they see their results improve: it 'works' (p.22); ''detect intrusion by recognizing unusual patterns of operation'' (p.28) || 2011 || [http://www.cs.waikato.ac.nz/ml/weka/book.html book]
|-
| marketing and sales || (''In these applications, predictions themselves are the chief interest: The structure of how decisions are made is often completely irrelevant.'') to 'woo' customers back by offering special treatments; product positioning in supermarkets after 'Market basket analysis', ''customers who buy beer also buy chips''; personal discounts: ''Supermarkets want you to feel that although [prices are rising], they don’t increase so much for you because the bargains offered by personalized coupons make it attractive for you to stock up on things that you wouldn’t normally have bought.''; direct marketing, focused promotions; demographic information is correlated to product demands; (p.26-) || 2011 || [http://www.cs.waikato.ac.nz/ml/weka/book.html book]
|-
| (healthcare) || increasing the success rate of artificial insemination || ? || ?
|-
| (image recognition) || detect oil slicks from satellite images to give early warning of ecological disasters and deter illegal dumping (p.23)  || 2011 || [http://www.cs.waikato.ac.nz/ml/weka/book.html book]
|-
| (electricity industry) || determine future demand for power as far in advance as possible (p.24) || 2011 || [http://www.cs.waikato.ac.nz/ml/weka/book.html book]
|-
| (technological diagnoses) || forestall failures that disrupt industrial processes (p.25) || 2011 || [http://www.cs.waikato.ac.nz/ml/weka/book.html book]
|-
| text mining in a digital library || enrich the library reader’s experience; ''a carefully chosen set of authoritative documents in a particular topic area is far more useful to those working in the area than a huge, unfocused collection (like the Web)'' || 2003 - 2004 || [http://researchcommons.waikato.ac.nz/bitstream/handle/10289/1298/text%20mining%20in%20a%20digital%20library.pdf?sequence=1&isAllowed=y link], [http://researchcommons.waikato.ac.nz/handle/10289/1298 link]
|-
| '''improving text mining techniques''' || (un)supervised creation of a Twitter opinion/sentiment (POS/NEUT/NEG) corpus; towards text mining solutions that are general, effective, and scalable || 2015 || [http://ijcai.org/papers15/Papers/IJCAI15-177.pdf link], [http://www.cs.waikato.ac.nz/~ml/publications.html link]
|}
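The 'Market basket analysis' mentioned in the marketing and sales row can be sketched in a few lines. This is only a minimal illustration in plain Python, not Weka's own implementation: the transactions are invented, and <code>pair_rules</code> is a hypothetical helper name. It computes the usual support, confidence and lift for rules of the form "customers who buy A also buy B".

```python
from collections import Counter
from itertools import combinations

# Toy shopping baskets, invented for illustration (not taken from the book).
transactions = [
    {"beer", "chips", "bread"},
    {"beer", "chips"},
    {"beer", "milk"},
    {"milk", "bread"},
    {"beer", "chips", "milk"},
]


def pair_rules(baskets):
    """Return {(a, b): (support, confidence, lift)} for every rule a -> b
    between two items that occur together in at least one basket."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), both in pair_counts.items():
        for x, y in ((a, b), (b, a)):
            support = both / n                        # share of baskets with x and y
            confidence = both / item_counts[x]        # share of x-baskets that also hold y
            lift = confidence / (item_counts[y] / n)  # > 1: y is more likely given x
            rules[(x, y)] = (support, confidence, lift)
    return rules


rules = pair_rules(transactions)
support, confidence, lift = rules[("beer", "chips")]
print(f"beer -> chips: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the two products are bought together more often than their individual popularity would predict, which is the kind of pattern a supermarket would use for product positioning.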
==notes==
''Automation is especially welcome in situations involving continuous monitoring,''
''a job that is time consuming and exceptionally tedious for humans.'' (Witten ed., 2011)
''Statistical tests are used to validate machine learning models ''
''and to evaluate machine learning algorithms.'' (Witten ed., 2011)
''If you do come up with conclusions (e.g., red car owners being greater credit risks),''
''you need to attach caveats to them and back them up with arguments other than''
''purely statistical ones. The point is that data mining is just a tool in the whole''
''process. It is people who take the results, along with other knowledge, and decide''
''what action to apply.'' (Witten ed., 2011)
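
To make the quote about statistical tests a bit more concrete: below is a minimal sketch of one possible test, an exact paired permutation (sign-flip) test comparing two classifiers on the same cross-validation folds. It is not necessarily the test used in Weka or in the book; the accuracy numbers are made up and <code>paired_permutation_test</code> is a hypothetical name.

```python
from itertools import product

# Hypothetical per-fold accuracies of two classifiers on the same 8 folds
# (made-up numbers, for illustration only).
acc_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85]
acc_b = [0.78, 0.77, 0.80, 0.81, 0.79, 0.76, 0.80, 0.81]


def paired_permutation_test(xs, ys):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis that both classifiers perform equally well,
    each per-fold difference is as likely to be positive as negative, so
    every pattern of sign flips is equally probable.
    """
    diffs = [x - y for x, y in zip(xs, ys)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point noise does not drop exact ties.
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_value = paired_permutation_test(acc_a, acc_b)
mean_diff = sum(a - b for a, b in zip(acc_a, acc_b)) / len(acc_a)
print(f"mean difference = {mean_diff:.4f}")
print(f"p-value = {p_value:.4f}")
```

A small p-value only says that the difference is unlikely to be chance on these folds; as the last quote above stresses, the decision about what to do with that result still lies with people.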
==gallery==
==references==
* Witten, Frank, Hall 2011 - [http://www.cs.waikato.ac.nz/ml/weka/book.html Data Mining - Practical Machine Learning Tools and Techniques, 3rd Edition]


</div>
</div>

Latest revision as of 17:56, 28 January 2016
