User:Joca/word-embeddings
*[https://gitlab.constantvzw.org/algolit/algolit/tree/master/algologs Scripts used during the session] | |||
*[https://pad.constantvzw.org/p/180317_algolit_word2vec Pad of the day]
Latest revision as of 09:21, 28 March 2018
Algolit @ Varia
I participated in the Algolit session of March 17th and learnt about word embeddings. This is a form of unsupervised machine learning in which an algorithm turns the words of a text into numbers (vectors) and places them in a multidimensional space. The relative distance between two words reflects how often they appear close to each other in the original text.
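To make the idea of "words placed close to each other" concrete, here is a minimal sketch that simply counts how often two words occur within a small window of each other. This is a simplification for illustration: word2vec does not store such counts directly but learns vectors by predicting neighbouring words. The function name `cooccurrence_counts`, the window size and the sample sentence are my own assumptions, not part of the Algolit scripts.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count how often each unordered word pair appears within `window` positions."""
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look only ahead, so every pair of positions is counted once.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            pair = tuple(sorted((word, tokens[j])))
            counts[pair] += 1
    return counts


# Hypothetical sample text, only for illustration.
text = "the cat sat on the mat the cat slept"
counts = cooccurrence_counts(text.split(), window=2)

# Show the pairs that occur together most often.
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs with a high count would end up near each other in the embedding space, while pairs that never share a window would tend to be further apart.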
Using scripts from the Algolit Git, I made a representation of the word embeddings in my reader for SI5. The script was based on the word2vec example of TensorFlow. Using dimensionality reduction, it projected a 21-dimensional space of words onto a 2D graphical representation.
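The reduction from 21 dimensions to 2 can be sketched as follows. I am not sure which reduction method the session script used, so this example swaps in plain PCA (via NumPy's SVD) as a stand-in. The word list and the random vectors are hypothetical placeholders for real embeddings, and `project_to_2d` is a name I made up.

```python
import numpy as np


def project_to_2d(vectors):
    """Project row vectors onto their first two principal components (PCA)."""
    # Centre the data so the components describe variance around the mean.
    centered = vectors - vectors.mean(axis=0)
    # The rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Hypothetical stand-in for trained embeddings: 5 words, 21 dimensions each.
rng = np.random.default_rng(0)
words = ["algorithm", "text", "space", "distance", "reader"]
embeddings = rng.normal(size=(len(words), 21))

points = project_to_2d(embeddings)
for word, (x, y) in zip(words, points):
    print(f"{word}: ({x:.2f}, {y:.2f})")
```

Each word ends up with an x and y coordinate that can be drawn on a flat plot. Some of the distances from the original 21 dimensions are inevitably distorted in the process.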