path: root/utils/tessdata
Commit message (Author, Age)
* tessdata: uncompress tarball only once to speed up builds (Baptiste Jonglez, 2021-06-30)
  The previous approach was to uncompress N times a big tarball (638 MB) where N=130 is the number of supported languages. Each iteration would only extract a single file, but it still needs to uncompress the whole tarball. This is of course completely inefficient.

  Now, we uncompress the tarball only once to extract all relevant files, and then iterate N times to copy the file needed for each language.

  This massively speeds up builds, at the expense of temporarily requiring more build space (about 1 GB more).

  Signed-off-by: Baptiste Jonglez <git@bitsofnetworks.org>
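The change described in this commit message can be illustrated with a minimal sketch. It is not the package's actual build recipe (which is not shown on this page); the function name `extract_once_then_copy`, the `<lang>.traineddata` file naming, and the use of a temporary staging directory are assumptions made for illustration. The point is only that the compressed stream is read once, and the per-language step becomes a plain file copy.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def extract_once_then_copy(tarball, languages, dest_dir):
    """Decompress the tarball a single time, then copy one file per language.

    Hypothetical helper: assumes each language ships as '<lang>.traineddata'
    somewhere inside the archive.
    """
    wanted = {f"{lang}.traineddata" for lang in languages}
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # The staging directory is the "temporary extra build space" trade-off.
    with tempfile.TemporaryDirectory() as scratch:
        # Single pass over the compressed stream: pull out every wanted file.
        with tarfile.open(tarball, "r:*") as archive:
            for member in archive:
                name = Path(member.name).name
                if member.isfile() and name in wanted:
                    target = Path(scratch) / name
                    with archive.extractfile(member) as src, open(target, "wb") as out:
                        shutil.copyfileobj(src, out)
        # Cheap per-language step: a file copy, with no decompression involved.
        for lang in languages:
            staged = Path(scratch) / f"{lang}.traineddata"
            if staged.exists():
                shutil.copy2(staged, dest / staged.name)
                copied.append(lang)
    return copied
```

With N languages, the earlier approach would correspond to calling `tarfile.open` N times on the same archive; here it is opened once regardless of N.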
* tessdata: update to 2.1.0 (Rosen Penev, 2021-03-25)
  Switch to AUTORELEASE for simplicity.

  Signed-off-by: Rosen Penev <rosenp@gmail.com>
* tessdata: reorganize menu (Eneas U de Queiroz, 2019-07-24)
  Move language data menu under the package itself, and shorten the titles so that all of them show up in the menu.

  Signed-off-by: Eneas U de Queiroz <cotequeiroz@gmail.com>
* tesseract: add package (Valentín Kivachuk, 2019-07-18)
  Tesseract is an open source text recognizer (OCR) Engine, available under the Apache 2.0 license. It can be used directly, or (for programmers) using an API to extract printed text from images. It supports a wide variety of languages.

  Signed-off-by: Valentín Kivachuk <vk18496@gmail.com>