PythonLabZalan: Difference between revisions
Revision as of 16:06, 24 March 2018
Terminal
First, I looked into basic command-line functions (File:Commands terminal.pdf) and how they operate, to build a solid base for Python3.
Optical character recognition + Tesseract
Second, I experimented in Terminal with converting PDF or JPG files to .txt files using tesseract and imagemagick (convert).
Tesseract (with languages you will be using)
- Mac
brew install tesseract --all-languages
imagemagick
- Mac
brew install imagemagick
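Usage: tesseract, then the PNG file, then the name of the txt file. Tesseract appends .txt to the output name itself, so the command below writes text2.txt.txt; passing text2 instead gives text2.txt.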
tesseracttest SZAKACS$ tesseract namefile.png text2.txt
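As a bridge towards Python3, the same Terminal steps could be driven from a script. The following is only a minimal sketch, assuming that the tesseract and convert binaries are on the PATH and that Ghostscript is available so ImageMagick can read PDFs. The function names (tesseract_command, convert_command, ocr_image) and the 300 dpi density are hypothetical choices for illustration, not part of the original notes.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, output_base, lang=None):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE [-l LANG]."""
    cmd = ["tesseract", str(image), str(output_base)]
    if lang:
        # Optional language code such as "eng"; the language data must be installed.
        cmd += ["-l", lang]
    return cmd


def convert_command(pdf, png, density=300):
    """Build the ImageMagick call that rasterises a PDF into a PNG.

    The density of 300 dpi is an assumed value, not taken from the notes.
    """
    return ["convert", "-density", str(density), str(pdf), str(png)]


def ocr_image(image, output_base, lang=None):
    """Run Tesseract on an image and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(tesseract_command(image, output_base, lang), check=True)
    # Tesseract adds the .txt extension to the output base by itself.
    return Path(f"{output_base}.txt")
```

With the file from the example above, ocr_image("namefile.png", "text2") would run the same command as in Terminal and return the path text2.txt. For a multi-page PDF, convert typically writes one numbered PNG per page, so each page would need its own ocr_image call.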