User:Joca/word-embeddings

From XPUB & Lens-Based wiki
*[https://gitlab.constantvzw.org/algolit/algolit/tree/master/algologs Scripts used during the session]
*[https://pad.constantvzw.org/p/180317_algolit_word2vec Pad of the day]

Latest revision as of 09:21, 28 March 2018

Word embeddings in my reader for Special Issue 5

Algolit @ Varia

I participated in the Algolit session of March 17th and learnt about word embeddings. This is a form of unsupervised machine learning in which an algorithm turns each word of a text into a list of numbers (a vector) and places it as a point in a multidimensional space. The relative distance between two words reflects how often they appear close to each other in the original text.
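To make the link between nearby words and distance concrete, here is a minimal sketch. It is an assumption-laden illustration, not the script used in the session: word2vec learns its vectors by training a small neural network, whereas this sketch simply counts which words occur within a fixed window of each other and compares the counts with cosine similarity. The toy text, the function names and the window size are all made up for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(tokens, window=2):
    """Count, for every word, which words appear within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (1.0 = same direction)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up toy text, only for illustration (not taken from the reader).
text = "the cat drinks milk . the dog drinks water . the cat chases the dog"
tokens = text.split()
vectors = cooccurrence_vectors(tokens, window=2)

# Words that share many neighbours end up with a higher similarity.
print(cosine_similarity(vectors["cat"], vectors["dog"]))
print(cosine_similarity(vectors["cat"], vectors["water"]))
```

In this toy text "cat" and "dog" share more neighbouring words than "cat" and "water", so the first score comes out higher than the second.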

Using scripts from the Algolit Git repository, I made a representation of the word embeddings in my reader for SI5. The script was based on TensorFlow's word2vec example. Using dimensionality reduction, it projected a 21-dimensional space of words onto a 2D graphical representation.
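The reduction step can be sketched as follows. This page does not say which reduction method the script used, so the sketch swaps in PCA (principal component analysis), a simple linear method, purely as a stand-in. The vectors are random placeholders rather than the embeddings of the reader, and the vocabulary and the name `pca_2d` are hypothetical. It assumes NumPy is installed.

```python
import numpy as np


def pca_2d(vectors):
    """Project row vectors onto their two directions of greatest variance."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)
words = ["word%d" % i for i in range(50)]        # hypothetical vocabulary
embeddings = rng.normal(size=(len(words), 21))   # placeholder 21-dimensional vectors

points = pca_2d(embeddings)                      # one (x, y) pair per word
for word, (x, y) in zip(words[:5], points[:5]):
    print(word, round(float(x), 3), round(float(y), 3))
```

Each word ends up with an x and a y coordinate that can be drawn on a scatter plot. A projection like this keeps only the two directions in which the points vary most, so distances in the 2D picture are an approximation of the distances in the original 21 dimensions.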