Bug 924402 - dev-python/pytesseract-0.3.12 fails tests
Summary: dev-python/pytesseract-0.3.12 fails tests
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Tupone Alfredo
URL:
Whiteboard:
Keywords: TESTFAILURE
Depends on:
Blocks:
 
Reported: 2024-02-13 07:48 UTC by Agostino Sarubbo
Modified: 2024-08-10 18:12 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments
build.log (build.log,192.89 KB, text/plain)
2024-02-13 07:48 UTC, Agostino Sarubbo

Description Agostino Sarubbo gentoo-dev 2024-02-13 07:48:10 UTC
https://blogs.gentoo.org/ago/2020/07/04/gentoo-tinderbox/

Issue: dev-python/pytesseract-0.3.12 fails tests.
Discovered on: amd64 (internal ref: clang-lld_tinderbox)
System: CLANG-LLD (https://wiki.gentoo.org/wiki/Project:Tinderbox/Common_Issues_Helper#CLANG-LLD)

Info about the issue:
https://wiki.gentoo.org/wiki/Project:Tinderbox/Common_Issues_Helper#CF0015
Comment 1 Agostino Sarubbo gentoo-dev 2024-02-13 07:48:13 UTC
Created attachment 884829
build.log

build log and emerge --info
Comment 2 Agostino Sarubbo gentoo-dev 2024-02-13 07:48:14 UTC
Error(s) that match a known pattern:


E               pytesseract.pytesseract.TesseractError: (127, "read_params_file: Can't open tessedit_create_boxfile=1 read_params_file: Can't open tessedit_create_hocr=1 tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num")
E               pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304 tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
E               pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 333 tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
E               pytesseract.pytesseract.TesseractError: (127, 'Page 0 : ./tests/data/test.jpg tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
E               pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
E           pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
FAILED tests/pytesseract_test.py::test_image_to_alto_xml - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_boxes - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_data_common_output[bytes] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_data_common_output[dict] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_data_common_output[string] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_pdf_or_hocr[hocr] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_pdf_or_hocr[pdf] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_batch - pytesseract.pytesseract.TesseractError: (127, 'Page 0 : ./tests/data/test.j...
FAILED tests/pytesseract_test.py::test_image_to_string_european - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_multiprocessing - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_with_args_type[image_object] - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304...
FAILED tests/pytesseract_test.py::test_image_to_string_with_args_type[path_str] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[gif] - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[jpeg2000] - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[jpg] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[pgm] - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[png] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[ppm] - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[tiff] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
FAILED tests/pytesseract_test.py::test_image_to_string_with_image_type[webp] - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 304...
FAILED tests/pytesseract_test.py::test_la_image_to_string - pytesseract.pytesseract.TesseractError: (127, 'Estimating resolution as 333...
FAILED tests/pytesseract_test.py::test_run_and_get_multiple_output[extensions0] - pytesseract.pytesseract.TesseractError: (127, "read_params_file: Can't open...
FAILED tests/pytesseract_test.py::test_run_and_get_multiple_output[extensions1] - pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup err...
pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
E               pytesseract.pytesseract.TesseractError: (127, 'tesseract: symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num')
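Every failure above carries the same dynamic loader message, so the long list reduces to a single root cause. A minimal sketch of how that can be checked mechanically is below. It assumes the log lines have the shape shown in this comment; the function name `summarize_symbol_errors` is hypothetical and not part of pytesseract or the tinderbox tooling.

```python
import re
from collections import Counter

# Matches the dynamic loader's message as it appears in the log above, e.g.
# "symbol lookup error: /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num"
SYMBOL_ERROR = re.compile(
    r"symbol lookup error: (?P<library>\S+): undefined symbol: (?P<symbol>\w+)"
)


def summarize_symbol_errors(log_lines):
    """Count (library, symbol) pairs found in loader errors (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        match = SYMBOL_ERROR.search(line)
        if match:
            counts[(match["library"], match["symbol"])] += 1
    return counts
```

Feeding the attached build.log through this line by line should show whether every loader error names the same library and symbol. The truncated `FAILED ...` summary lines do not match, because pytest cuts the message off before the symbol name.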
Comment 3 Tupone Alfredo gentoo-dev 2024-02-18 07:34:39 UTC
(In reply to Agostino Sarubbo from comment #2)
> Error(s) that match a know pattern:
> 
> 
> E               pytesseract.pytesseract.TesseractError: (127,
> "read_params_file: Can't open tessedit_create_boxfile=1 read_params_file:
> Can't open tessedit_create_hocr=1 tesseract: symbol lookup error:
> /usr/lib64/libtesseract.so.5: undefined symbol: __kmpc_global_thread_num")

The errors seem related to app-text/tesseract. I can't reproduce them. Maybe pytesseract needs to add some library?
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-02-18 07:35:17 UTC
It looks like the usual clang/lld weirdness with OpenMP. Not tesseract-specific.
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-02-18 07:35:40 UTC
(In reply to Sam James from comment #4)
> It looks like the standard weird clang/lld weirdness with openmp. Not
> tesseract specific.

or, well, it might be, but if it is, it's app-text/tesseract which needs fixing.
Comment 6 Bernard Cafarelli gentoo-dev 2024-02-19 08:18:30 UTC
I rebuilt tesseract:
# CC="clang" CXX="clang++" LDFLAGS="${LDFLAGS} -fuse-ld=lld" emerge -av1 tesseract
[ebuild   R    ] app-text/tesseract-5.3.4:0/5::gentoo  USE="float32 jpeg openmp png tiff webp -doc -opencl -static-libs -training" ABI_X86="32 (64) (-x32)"
# strings /usr/lib64/libtesseract.so.5|grep __kmpc_global_thread_num
__kmpc_global_thread_num

But I don't get the OpenMP undefined symbol error, and the pytesseract tests pass.
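The `strings` check only shows that the symbol name occurs in the library. It does not show whether the library declares a dependency on a runtime that defines it. The sketch below looks at the `NEEDED` entries of the dynamic section instead. It assumes that `__kmpc_global_thread_num` is expected to come from LLVM's OpenMP runtime (`libomp`), and that `readelf` from binutils is installed (on a pure LLVM system, `llvm-readelf` may be the available equivalent). The helper names are hypothetical.

```python
import re
import subprocess

# One dynamic-section line from `readelf -d` looks like:
#  0x0000000000000001 (NEEDED)             Shared library: [libomp.so]
NEEDED = re.compile(r"\(NEEDED\)\s+Shared library: \[(?P<name>[^\]]+)\]")


def needed_libraries(readelf_output):
    """Return the NEEDED library names listed in `readelf -d` output."""
    return [m["name"] for m in NEEDED.finditer(readelf_output)]


def links_llvm_openmp(readelf_output):
    """True if any NEEDED entry is libomp (assumed provider of __kmpc_* symbols)."""
    return any(name.startswith("libomp.") for name in needed_libraries(readelf_output))


def read_dynamic_section(path):
    """Run `readelf -d` on a shared object and return its text output."""
    result = subprocess.run(
        ["readelf", "-d", path], check=True, capture_output=True, text=True
    )
    return result.stdout
```

For example, `links_llvm_openmp(read_dynamic_section("/usr/lib64/libtesseract.so.5"))` could be compared between a system where the tests pass and the tinderbox image where they fail. A `False` result on the failing system would be consistent with the symbol lookup error, though it would not by itself explain why the link step omitted the runtime.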
Comment 7 Paul Gover 2024-08-10 18:12:46 UTC
I'm getting the same problem from tesseract-5.0.4 compiled with clang; it has nothing to do with Python in my case.
I see that this very issue puts tesseract on the system-wide clang tracker, bug 408963. Compiling tesseract with gcc cures the problem.