Created attachment 542286
beignet-1.3.2-self-test_fail.patch

OpenCL is currently broken on Skylake and, judging by related bug reports, on other platforms such as Haswell as well. Reverting an upstream patch is required for proper OpenCL support.

Without the patch:

$ clinfo
Number of platforms                               1
  Platform Name                                   Intel Gen OCL Driver
  Platform Vendor                                 Intel
  Platform Version                                OpenCL 2.0 beignet 1.3
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing
  Platform Extensions function suffix             Intel
Beignet: self-test failed: (3, 7, 5) + (5, 7, 3) returned (6, 7, 5)
See README.md or http://www.freedesktop.org/wiki/Software/Beignet/
Beignet: disabling non-working device

  Platform Name                                   Intel Gen OCL Driver
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
Beignet: disabling non-working device
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)
  clCreateContext(NULL, ...) [default]            No devices found in platform
Beignet: disabling non-working device
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
Beignet: disabling non-working device
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
Beignet: disabling non-working device
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform

With the patch it is back and working:

$ clinfo
Number of platforms                               1
  Platform Name                                   Intel Gen OCL Driver
  Platform Vendor                                 Intel
  Platform Version                                OpenCL 2.0 beignet 1.3
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing
  Platform Extensions function suffix             Intel

  Platform Name                                   Intel Gen OCL Driver
Number of devices                                 1
  Device Name                                     Intel(R) HD Graphics Skylake Desktop GT2
  Device Vendor                                   Intel
  Device Vendor ID                                0x8086
  Device Version                                  OpenCL 2.0 beignet 1.3
  Driver Version                                  1.3
  Device OpenCL C Version                         OpenCL C 2.0 beignet 1.3
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               24
  Max clock frequency                             1000MHz
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None, None, None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             512x512x512
  Max work group size                             512
  Preferred work group size multiple              16
  Preferred / native vector sizes
    char                                          16 / 8
    short                                         8 / 8
    int                                           4 / 4
    long                                          2 / 2
    half                                          0 / 8 (cl_khr_fp16)
    float                                         4 / 4
    double                                        0 / 2 (n/a)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (n/a)
  Address bits                                    32, Little-Endian
  Global memory size                              4294967296 (4GiB)
  Error Correction support                        No
  Max memory allocation                           3221225472 (3GiB)
  Unified memory for Host and Device              Yes
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    65536 (64KiB)
  Preferred total size of global vars             65536 (64KiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        8192 (8KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Base address alignment for 2D image buffers   4096 bytes
    Pitch alignment for 2D image buffers          1 pixels
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             8192x8192x2048 pixels
    Max number of read image args                 128
    Max number of write image args                8
    Max number of read/write image args           8
  Max number of pipe args                         16
  Max active pipe reservations                    1
  Max pipe packet size                            1024
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Max number of constant args                     8
  Max constant buffer size                        134217728 (128MiB)
  Max size of kernel argument                     1024
  Queue properties (on host)
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                16384 (16KiB)
    Max size                                      262144 (256KiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      80ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
  SPIR versions                                   1.2
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                __cl_copy_region_align4;__cl_copy_region_align16;__cl_cpy_region_unalign_same_offset;__cl_copy_region_unalign_dst_offset;__cl_copy_region_unalign_src_offset;__cl_copy_buffer_rect;__cl_copy_image_1d_to_1d;__cl_copy_image_2d_to_2d;__cl_copy_image_3d_to_2d;__cl_copy_image_2d_to_3d;__cl_copy_image_3d_to_3d;__cl_copy_image_2d_to_buffer;__cl_copy_image_3d_to_buffer;__cl_copy_buffer_to_image_2d;__cl_copy_buffer_to_image_3d;__cl_fill_region_unalign;__cl_fill_region_align2;__cl_fill_region_align4;__cl_fill_region_align8_2;__cl_fill_region_align8_4;__cl_fill_region_align8_8;__cl_fill_region_align8_16;__cl_fill_region_align128;__cl_fill_image_1d;__cl_fill_image_1d_array;__cl_fill_image_2d;__cl_fill_image_2d_array;__cl_fill_image_3d;
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing cl_khr_fp16

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [Intel]
  clCreateContext(NULL, ...) [default]            Success [Intel]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake Desktop GT2
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake Desktop GT2
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Intel Gen OCL Driver
    Device Name                                   Intel(R) HD Graphics Skylake Desktop GT2
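The practical difference between the two clinfo runs above is the Beignet self-test message and the device count (0 versus 1). As a rough sketch for checking saved clinfo output for this failure, assuming the output has been captured as text; the function name `summarize_clinfo` and the shortened sample strings are made up for illustration and are not part of clinfo or Beignet:

```python
import re


def summarize_clinfo(text: str) -> dict:
    """Report the total device count and whether Beignet's self-test failed.

    Hypothetical helper: it only looks for the two strings quoted in this
    report and does not parse the rest of the clinfo output.
    """
    counts = [int(n) for n in re.findall(r"Number of devices\s+(\d+)", text)]
    return {
        "devices": sum(counts),
        "self_test_failed": "Beignet: self-test failed" in text,
    }


# Shortened samples taken from the two runs in this report.
broken = (
    "  Platform Name    Intel Gen OCL Driver\n"
    "Beignet: self-test failed: (3, 7, 5) + (5, 7, 3) returned (6, 7, 5)\n"
    "Beignet: disabling non-working device\n"
    "Number of devices  0\n"
)
working = (
    "  Platform Name    Intel Gen OCL Driver\n"
    "Number of devices  1\n"
)

print(summarize_clinfo(broken))
print(summarize_clinfo(working))
```

In practice the text would come from something like `clinfo > clinfo.log 2>&1`; capturing stderr as well is a precaution, since this report does not show which stream the Beignet diagnostics are written to.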
Could you please provide a link to an upstream bug report, creating one if necessary? I would rather not merge such changes unless they have been blessed by upstream and it is only a matter of time before they merge them themselves; I haven't got the hardware to test the effect of backend-related patches on all possible platforms.
Related upstream bug report: https://bugs.freedesktop.org/show_bug.cgi?id=102137

I am not sure which is the preferred upstream source:
- https://github.com/intel/beignet/commits/master - no changes since 1.3.2
- https://cgit.freedesktop.org/beignet/ - 3 other changes in git, but none of them fixed the problem for me

It all comes down to Intel devices not being properly detected for OpenCL, with the message "Beignet: self-test failed: (3, 7, 5) + (5, 7, 3) returned (6, 7, 5)".

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=885423 - this patch fixed the issue on Haswell.

I may be wrong here, but IMHO if it is good enough for Debian, it should be good enough for Gentoo.

PS: This patch actually made ImageMagick work with OpenCL for me.
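To make the quoted self-test message easier to read: assuming it reports an element-wise addition of the two triples, as its wording suggests, the correct result would be (8, 14, 8) rather than (6, 7, 5). A small sketch that extracts the operands and compares; `check_self_test` is a hypothetical helper, not anything shipped with Beignet:

```python
import re

# Matches the message format quoted in this report.
PATTERN = re.compile(
    r"self-test failed: \((\d+), (\d+), (\d+)\) \+ \((\d+), (\d+), (\d+)\)"
    r" returned \((-?\d+), (-?\d+), (-?\d+)\)"
)


def check_self_test(message):
    """Return (expected, returned) lists, or None if the message does not match.

    Assumption: "+" in the message means element-wise addition.
    """
    match = PATTERN.search(message)
    if match is None:
        return None
    nums = [int(g) for g in match.groups()]
    a, b, returned = nums[0:3], nums[3:6], nums[6:9]
    expected = [x + y for x, y in zip(a, b)]
    return expected, returned


msg = "Beignet: self-test failed: (3, 7, 5) + (5, 7, 3) returned (6, 7, 5)"
expected, returned = check_self_test(msg)
print("expected:", expected)  # [8, 14, 8]
print("returned:", returned)  # [6, 7, 5]
```

All three returned values differ from the expected sums, so under that assumption the whole result is wrong, not just one element.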
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=2517438cae9831332178378078fc273cb8ffb466

commit 2517438cae9831332178378078fc273cb8ffb466
Author:     Marek Szuba <marecki@gentoo.org>
AuthorDate: 2018-08-31 13:27:16 +0000
Commit:     Marek Szuba <marecki@gentoo.org>
CommitDate: 2018-08-31 13:27:16 +0000

    dev-libs/beignet: disable optimisations broken on some platforms

    Certain optimisation introduced in 1.3.2 is now known not to work
    correctly on Skylake and Haswell systems. Upstream has been notified
    but has yet to respond.

    Upstream-Bug: https://bugs.freedesktop.org/show_bug.cgi?id=102137
    Closes: https://bugs.gentoo.org/662760
    Package-Manager: Portage-2.3.40, Repoman-2.3.9

 dev-libs/beignet/beignet-1.3.2-r2.ebuild           | 107 +++++++++++++++++++++
 ...eignet-1.3.2_disable-doNegAddOptimization.patch |  66 +++++++++++++
 2 files changed, 173 insertions(+)