nixpkgs-suyu/pkgs/top-level
aszlig 7b5263e1a6
tesseract: Package version 4.x from Git master
Tesseract 4 has a new OCR engine based on long short-term memory (LSTM)
neural networks, which really helps a lot in terms of accuracy and our VM
tests.

I ran the new version across a bunch of different screenshots and
compared the results to the 3.x branch, and it really makes a big
difference, especially with various font rendering settings.

The only downside of this is that version 4 hasn't been released yet and
is in alpha state right now, but it will eventually get there, and the
only solutions that came to my mind for sticking to version 3 were really
sub-par:

 * Use several passes with different color negation on the screenshots.
 * Train Tesseract 3 specifically for screenshots. This is sub-par
   because we'd need to do it for Tesseract 4 from scratch again.
 * Change the test systems so that they specifically use *only* an OCR
   font when displaying. I've actually tried this but this also isn't
   accurate enough with our default font rendering setup.
 * Turn off special font rendering settings for our tests. In
   conjunction with changing to an OCR font this might work but it won't
   catch all the cases, because applications might use their own font
   rendering.
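
To illustrate the first workaround (several passes with different color
negation), here is a minimal Python sketch. It is not what nixpkgs does:
the `ocr` callable and the grayscale pixel-grid image type are
hypothetical stand-ins for a real Tesseract invocation on a screenshot.

```python
from typing import Callable, List

# Hypothetical image representation: rows of grayscale pixels, 0..255.
Image = List[List[int]]


def negate(image: Image) -> Image:
    """Return the colour-negated copy of a grayscale image."""
    return [[255 - pixel for pixel in row] for row in image]


def ocr_multi_pass(image: Image, ocr: Callable[[Image], str]) -> List[str]:
    """Run OCR on the image as-is and on its negative.

    Returns the distinct, non-empty results in pass order. `ocr` is an
    assumed callable wrapping whatever OCR engine is in use.
    """
    results: List[str] = []
    for variant in (image, negate(image)):
        text = ocr(variant).strip()
        if text and text not in results:
            results.append(text)
    return results
```

Every extra pass costs one more OCR run per screenshot, which is part of
why this approach is sub-par compared to a more accurate engine.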

Given that version 4 is faster[1] when it comes to OCR detection, and
given the points just mentioned, I think even using the alpha version just
for tests isn't going to hurt anybody.

[1]: https://github.com/tesseract-ocr/tesseract/wiki/4.0-Accuracy-and-Performance

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
2017-04-11 03:21:46 +02:00
aliases.nix surf: 0.7 -> 2.0 2017-04-02 20:11:44 +02:00
all-packages.nix tesseract: Package version 4.x from Git master 2017-04-11 03:21:46 +02:00
default.nix top-level: Allow nixpkgs to take localSystem directly 2017-02-08 22:06:57 -05:00
dotnet-packages.nix dafny: fix meta attribute 2017-02-07 11:35:10 +01:00
emacs-packages.nix melpa-packages: init w3m at 20170203.647 2017-02-10 13:11:45 -06:00
emscripten-packages.nix libxml2: supportPython -> pythonSupport 2016-11-08 17:10:05 +01:00
haskell-packages.nix Disable integer-simple variant of GHC 7.6.3 since it does not compile. 2017-03-29 20:30:27 +02:00
impure.nix Allow directories with a default.nix to be imported as an overlay. Closes #23016. 2017-02-25 02:32:04 +01:00
java-packages.nix Complete hello world with test 2016-11-15 14:18:19 -05:00
lua-packages.nix luaPackages.vicious 2.1.3 -> 2.2.0 2017-03-30 20:27:57 +02:00
make-tarball.nix make-tarball.nix: Fix running as root 2016-12-15 13:08:21 +01:00
metrics.nix
node-packages-generated.nix nodePackages.bower2nix: 3.0.1 -> 3.1.1 2016-09-15 01:28:37 +01:00
node-packages.json yarn: init at 0.17.8 (#20635) 2016-12-14 15:46:45 +01:00
node-packages.nix
ocaml-packages.nix bap: init at 1.2.0 2017-04-04 13:11:01 -04:00
perl-packages.nix Merge pull request #24008 from phile314/slimserver 2017-04-08 17:43:41 +02:00
php-packages.nix phpPackages.composer: 1.3.2 -> 1.4.1 2017-03-24 22:16:10 +01:00
platforms.nix top-level/platforms.nix: Reformat and clean up whitespace 2017-04-10 15:39:47 -04:00
pure-packages.nix purePackages.octave: Use octaveHg 2017-04-09 21:54:39 +10:00
python-packages.nix pyocr: Add patch to support Tesseract 3.05.00 2017-04-11 03:21:39 +02:00
release-cross.nix top-level: no more need to expose splicedPackages 2017-01-25 09:24:55 -05:00
release-lib.nix nixpkgs: add aarch64-linux to release-lib 2017-03-08 17:13:34 +01:00
release-python.nix
release-small.nix release-small: use unar instead of unrar 2017-04-03 09:09:37 +02:00
release.nix Add aggregate job for a forthcoming nixpkgs-darwin-unstable channel 2017-04-10 12:35:32 -04:00
rust-packages.nix rustRegistry: 2017-04-03 -> 2017-04-08 2017-04-08 17:43:51 +02:00
splice.nix top-level: no more need to expose splicedPackages 2017-01-25 09:24:55 -05:00
stage.nix top-level: Only splice as needed for performance 2017-01-24 11:37:56 -05:00