PythonLabZalan: Difference between revisions
Revision as of 15:41, 24 March 2018
Optical character recognition + Tesseract
First, I experimented in the Terminal with converting PDF or JPG files to .txt files using Tesseract and ImageMagick (the convert command).
Tesseract (with languages you will be using)
- Mac
brew install tesseract --all-languages
ImageMagick
- Mac
brew install imagemagick
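With both tools installed, the Terminal workflow described above could be scripted roughly as follows. This is only a minimal sketch, not the exact commands used on this page: the helper names build_commands and ocr_to_text, the 300 dpi density, the intermediate .tiff file, and the default language "eng" are all assumptions chosen for illustration. It also assumes that ImageMagick can read PDFs on your machine (this normally requires Ghostscript).

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(source, lang="eng", dpi=300):
    """Return the command lines that turn a PDF or JPG into a .txt file.

    Hypothetical helper: the name, the default language and the 300 dpi
    density are illustrative assumptions, not taken from this page.
    """
    source = Path(source)
    commands = []
    image = source
    if source.suffix.lower() == ".pdf":
        # Tesseract reads images, so rasterise the PDF with ImageMagick first.
        image = source.with_suffix(".tiff")
        commands.append(["convert", "-density", str(dpi), str(source), str(image)])
    # Tesseract takes an output base name and appends ".txt" itself.
    output_base = source.with_suffix("")
    commands.append(["tesseract", str(image), str(output_base), "-l", lang])
    return commands


def ocr_to_text(source, lang="eng"):
    """Run the commands and return the path of the resulting .txt file."""
    for command in build_commands(source, lang):
        if shutil.which(command[0]) is None:
            raise RuntimeError(f"{command[0]} is not installed or not on PATH")
        subprocess.run(command, check=True)
    return Path(source).with_suffix(".txt")


if __name__ == "__main__":
    # Only print the commands here; call ocr_to_text() to actually run them.
    for command in build_commands("scan.pdf"):
        print(" ".join(command))
```

Keeping the command construction separate from the execution makes it possible to inspect the exact convert and tesseract calls before running them on real files.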