A long-time issue and one of the reasons I've never used that function
before. So let's remove that todo-comment and escape the contents
properly.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Cc: @edolstra
Regression introduced by d84741a4bf.
The mentioned commit actually is a good thing, because we now get the
output from the X session.
Unfortunately, for the i3wm test, the i3-config-wizard prints out the
raw keyboard symbols directly coming from xcb, so the output isn't
necessarily proper UTF-8.
As the XML::Writer already expects valid UTF-8 input, we assume that
everything that comes into sanitise() will be UTF-8 from the start. So
we just decode() it using FB_DEFAULT as the check argument so that
every invalid character is replaced by the unicode replacement
character:
https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character
We simply re-oncode it again afterwards and return it, so we should
always get out valid UTF-8 in the log XML.
For more information about FB_DEFAULT and FB_CROAK, have a look at:
http://search.cpan.org/~dankogai/Encode-2.84/Encode.pm#Handling_Malformed_Data
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
By default this is now enabled, and it has to be explicitely enabled
using "enableOCR = true". If it is set to false, any usage of
getScreenText or waitForText will fail with an error suggesting to pass
enableOCR.
This should get rid of the rather large dependency on tesseract which
we don't need for most tests.
Note, that I'm using system("type -P") here to check whether tesseract
is in PATH. I know it's a bashism but we already have other bashisms
within the test scripts and we also run it with bash, so IMHO it's not a
problem here.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
As promised in the previous commit, this can be used similarly to
$machine->waitForWindow, where you supply a regular expression and it's
retrying OCR until the regexp matches.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
Basically, this creates a screenshot and throws tesseract at it to
recognize the characters from the screenshot. In order to produce a
result that is well enough, we're using lanczos scaling and scale the
image up to 400% of its original size.
This provides the base functionality for a new Machine method which will
be called waitForText. I originally had that idea long ago when writing
the VM tests for VirtualBox and Chromium, but thought it would be
disproportionate to the case.
The downside however is that VM tests now depend on tesseract, but given
the average runtime of our tests it really shouldn't have a too big
impact and it's only a runtime dependency after all.
Another issue is that the OCR process takes quite some time to finish,
but IMHO it's better (as in more deterministic) than to rely on sleep().
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
The current way test reports get jquery,
src="//ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"
only works when getting reports over http:// or https://, not file://.
Change it so that it works for all protocols by using a local copy of
jquery.
This fixes the issue where locally created and browsed test reports
cannot be navigated properly; clicking the '+' symbol to expand
sub-sections doesn't work.
This reverts commit 4e6eae45ee. It
breaks running the test driver interactively (in that it causes all
VMs to be started immediately, which is not always what you wnat).