Bug 940138 - app-text/tesseract-5.3.4[opencl] crash when running
Summary: app-text/tesseract-5.3.4[opencl] crash when running
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Bernard Cafarelli
URL:
Whiteboard:
Keywords:
Depends on: 941651
Blocks:
Reported: 2024-09-23 09:39 UTC by Erik Quaeghebeur
Modified: 2024-11-05 19:44 UTC

See Also:
Package list:
Runtime testing required: ---


Description Erik Quaeghebeur 2024-09-23 09:39:14 UTC
I had app-text/tesseract-5.3.4 installed with the opencl USE flag. It crashes on every run. Disabling the opencl USE flag and reinstalling yields an executable that does not crash.

Stack trace from log:
---
[🡕] Process 629586 (tesseract) of user 1000 dumped core.

Module libtinfo.so.6 without build-id.
Module libffi.so.8 without build-id.
Module libSPIRV-Tools-link.so without build-id.
Module libSPIRV-Tools.so without build-id.
Module libSPIRV-Tools-opt.so without build-id.
Module libLLVMSPIRVLib.so.18.1 without build-id.
Module libclang-cpp.so.18.1 without build-id.
Module libdrm_amdgpu.so.1 without build-id.
Module libdrm_radeon.so.1 without build-id.
Module libexpat.so.1 without build-id.
Module libdrm.so.2 without build-id.
Module libLLVM.so.18.1 without build-id.
Module libicudata.so.74 without build-id.
Module libunistring.so.5 without build-id.
Module libidn2.so.0 without build-id.
Module libicuuc.so.74 without build-id.
Module ld-linux-x86-64.so.2 without build-id.
Module libssl.so.3 without build-id.
Module libpsl.so.5 without build-id.
Module libnghttp2.so.14 without build-id.
Module libcares.so.2 without build-id.
Module libxml2.so.2 without build-id.
Module libbz2.so.1 without build-id.
Module libzstd.so.1 without build-id.
Module liblzma.so.5 without build-id.
Module libacl.so.1 without build-id.
Module libcrypto.so.3 without build-id.
Module libz.so.1 without build-id.
Module libtiff.so.6 without build-id.
Module libgif.so.7 without build-id.
Module libjpeg.so.62 without build-id.
Module libpng16.so.16 without build-id.
Module libgomp.so.1 without build-id.
Module libc.so.6 without build-id.
Module libgcc_s.so.1 without build-id.
Module libm.so.6 without build-id.
Module libstdc++.so.6 without build-id.
Module libcurl.so.4 without build-id.
Module libarchive.so.13 without build-id.
Module libleptonica.so.6 without build-id.
Module libOpenCL.so.1 without build-id.
Module libtesseract.so.5 without build-id.
Module tesseract without build-id.
Stack trace of thread 629586:
#0  0x00007e2771f7eb89 __memset_avx2_unaligned_erms (libc.so.6 + 0x147b89)
#1  0x00007e2772a4e02b _ZN9tesseract12OpenclDevice16HistogramRectOCLEPviiiiiiiPi (libtesseract.so.5 + 0x24e02b)
#2  0x00007e2772a5670f n/a (libtesseract.so.5 + 0x25670f)
#3  0x00007e2772a570e4 _ZN9tesseract12OpenclDevice18getDeviceSelectionEv (libtesseract.so.5 + 0x2570e4)
#4  0x00007e2772a4f988 _ZN9tesseract12OpenclDevice32InitOpenclRunEnv_DeviceSelectionEi (libtesseract.so.5 + 0x24f988)
#5  0x00007e2772a4f9db _ZN9tesseract12OpenclDevice7InitEnvEv (libtesseract.so.5 + 0x24f9db)
#6  0x00007e277288ce7f _ZN9tesseract11TessBaseAPI4InitEPKciS2_NS_13OcrEngineModeEPPciPKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaISC_EESG_bPFbS2_PS6_IcSB_EE (libtesseract.so.5 + 0x8ce7f)
#7  0x00007e277288d436 _ZN9tesseract11TessBaseAPI4InitEPKcS2_NS_13OcrEngineModeEPPciPKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaISC_EESG_b (libtesseract.so.5 + 0x8d436)
#8  0x00005ecd53ec09dd n/a (tesseract + 0x39dd)
#9  0x00007e2771e5ceec __libc_start_call_main (libc.so.6 + 0x25eec)
#10 0x00007e2771e5cfa5 __libc_start_main_impl (libc.so.6 + 0x25fa5)
#11 0x00005ecd53ec2d41 n/a (tesseract + 0x5d41)
ELF object binary architecture: AMD x86-64
---

Command and output:
---
pdfsandwich <filename>.pdf 
pdfsandwich version 0.1.7
Warning: tesseract option --list-langs not implemented. Cannot check languages. Make sure you have all necessary tesseract language packages installed.
Input file: "<filename>.pdf"
Output file: "<filename>_ocr.pdf"
Number of pages in inputfile: 2
More threads than pages. Using 2 threads instead.

Parallel processing with 2 threads started.
Processing page order may differ from original page order.

Processing page 1.
Processing page 2.
identify -format "%w\n%h\n"  "/tmp/pdfsandwich_tmpfbc425/pdfsandwich_inputfileb1b5d0.pdf[1]" 
identify -format "%w\n%h\n"  "/tmp/pdfsandwich_tmpfbc425/pdfsandwich_inputfileb1b5d0.pdf[0]" 
ERROR: Command "OMP_THREAD_LIMIT=1 tesseract /tmp/pdfsandwich_tmpfbc425/pdfsandwich829289.tif>/dev/null 2>&1 /tmp/pdfsandwich_tmpfbc425/pdfsandwichbe7e64  -l eng pdf " failed. 
Terminating pdfsandwich. All temporary files are kept.
---

Reproducible: Always
Comment 1 Maxim P. Dementiev 2024-10-16 12:43:22 UTC
I also get a crash with tesseract-5.3.4[opencl]:

(gdb) run p1.png -
Starting program: /usr/bin/tesseract p1.png -
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after vfork from child process 2818]
[DS] Profile file not available (tesseract_opencl_profile_devices.dat); performing profiling.

[DS] Device: "h���" (OpenCL) evaluation...
OpenCL error code is -33 at   when populateGPUEnv::getDeviceInfo(TYPE) .
OpenCL error code is -33 at   when populateGPUEnv::getDeviceInfo(PLATFORM) .
OpenCL error code is -33 at   when populateGPUEnv::createContext .
OpenCL error code is -34 at   when populateGPUEnv::createCommandQueue .
OpenCL error code is -33 at   when clGetDeviceInfo .
OpenCL error code is -34 at   when clCreateProgramWithSource .
OpenCL error code is -44 at   when clCreateKernel composeRGBPixel .
OpenCL error code is -48 at   when clSetKernelArg .
OpenCL error code is -48 at   when clSetKernelArg .
OpenCL error code is -48 at   when clSetKernelArg .
OpenCL error code is -48 at   when clSetKernelArg .
OpenCL error code is -48 at   when clSetKernelArg .
OpenCL error code is -36 at   when clEnqueueNDRangeKernel .
OpenCL error code is -36 at   when clEnqueueMapBuffer outputCl .
OpenCL error code is -34 at   when clCreateBuffer imageBuffer .
OpenCL error code is -33 at   when clCreateBuffer imageBuffer .
OpenCL error code is -34 at   when clCreateBuffer histogramBuffer .
OpenCL error code is -34 at   when clCreateBuffer tmpHistogramBuffer .
OpenCL error code is -34 at   when clCreateBuffer atomicSyncBuffer .
OpenCL error code is -44 at   when clCreateKernel kernel_HistogramRectAllChannels .
OpenCL error code is -44 at   when clCreateKernel kernel_HistogramRectAllChannelsReduction .
OpenCL error code is -36 at   when clEnqueueMapBuffer tmpHistogramBuffer .

Program received signal SIGSEGV, Segmentation fault.
__memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:330
warning: 330	../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory
(gdb) bt
#0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:330
#1  0x00007ffff7ecc4ef in tesseract::OpenclDevice::HistogramRectOCL(void*, int, int, int, int, int, int, int, int*) () from /usr/lib64/libtesseract.so.5
#2  0x00007ffff7ece694 in tesseract::evaluateScoreForDevice(tesseract::ds_device*, void*) () from /usr/lib64/libtesseract.so.5
#3  0x00007ffff7eceffb in tesseract::OpenclDevice::getDeviceSelection() () from /usr/lib64/libtesseract.so.5
#4  0x00007ffff7ecfc98 in tesseract::OpenclDevice::InitOpenclRunEnv_DeviceSelection(int) () from /usr/lib64/libtesseract.so.5
#5  0x00007ffff7ecfcef in tesseract::OpenclDevice::InitEnv() () from /usr/lib64/libtesseract.so.5
#6  0x00007ffff7cb4ffa in tesseract::TessBaseAPI::Init(char const*, int, char const*, tesseract::OcrEngineMode, char**, int, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, bool, bool (*)(char const*, std::vector<char, std::allocator<char> >*)) () from /usr/lib64/libtesseract.so.5
#7  0x00007ffff7cb574a in tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const*, bool) ()
   from /usr/lib64/libtesseract.so.5
#8  0x00005555555584f8 in main ()
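
For readers unfamiliar with the numeric codes in the output above, here is a minimal sketch that maps them to the symbolic names defined in the Khronos `CL/cl.h` header. The helper name `decode` and the regular expression are illustrative assumptions, not part of tesseract. Read this way, the first failure is an invalid device handle, which would be consistent with the garbled device name, and the later errors appear to cascade from it; that reading is an interpretation, not something confirmed in this bug.

```python
import re

# Subset of OpenCL error codes as defined in the Khronos CL/cl.h header;
# only the codes that appear in the log above are listed.
CL_ERRORS = {
    -33: "CL_INVALID_DEVICE",
    -34: "CL_INVALID_CONTEXT",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -44: "CL_INVALID_PROGRAM",
    -48: "CL_INVALID_KERNEL",
}

# Matches lines such as:
#   OpenCL error code is -33 at   when clGetDeviceInfo .
LINE_RE = re.compile(r"OpenCL error code is (-?\d+) at\s+when (.+?) \.$")


def decode(line):
    """Return (code, symbolic name, failing call), or None if the line does not match."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    code = int(match.group(1))
    return code, CL_ERRORS.get(code, "UNKNOWN"), match.group(2)


if __name__ == "__main__":
    sample = [
        "OpenCL error code is -33 at   when populateGPUEnv::getDeviceInfo(TYPE) .",
        "OpenCL error code is -34 at   when clCreateProgramWithSource .",
        "OpenCL error code is -48 at   when clSetKernelArg .",
    ]
    for entry in sample:
        print(decode(entry))
```

Lines that are not OpenCL error reports (for example the `[DS]` profiling messages) return `None`, so a saved log can be passed through line by line.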


After installing tesseract 5.4.1, it works fine:

tesseract --version 
tesseract 5.4.1
 leptonica-1.83.1
  libgif 5.2.2 : libjpeg 6b (libjpeg-turbo 3.0.3) : libpng 1.6.43+apng : libtiff 4.6.0 : zlib 1.3.1 : libwebp 1.4.0 : libopenjp2 2.5.2
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.7.6 zlib/1.3.1 liblzma/5.6.2 bz2lib/1.0.8 liblz4/1.10.0 libzstd/1.5.6
 Found libcurl/8.9.1 GnuTLS/3.8.7 zlib/1.3.1 brotli/1.1.0 zstd/1.5.6 libidn2/2.3.7 libpsl/0.21.5 libssh2/1.11.0 nghttp2/1.62.1 ngtcp2/1.7.0 nghttp3/1.6.0 librtmp/2.3 OpenLDAP/2.6.4

So, it's better to stabilize 5.4.1 as a solution.
Comment 2 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-10-17 05:43:47 UTC
(In reply to Maxim P. Dementiev from comment #1)
> [...]
> So, it's better to stabilize 5.4.1 as a solution.

Thanks, filed bug 941651.
Comment 3 Bernard Cafarelli gentoo-dev 2024-11-05 19:44:39 UTC
5.4.1 is now stable and I just dropped the previous versions, so I think we can consider this one fixed.