Intel Ivy Bridge HD 4000 GPU test – Index
3 – Ivy Bridge – OpenCL Side
This is the interesting part. On Sandy Bridge processors, OpenCL support is limited to the CPU. With Ivy Bridge processors, Intel has extended OpenCL support to the GPU too!
Here is what GPU Caps Viewer tells us about Ivy Bridge OpenCL support:
Okay, the OpenCL 1.1 support is cool, but do the demos work? Yep, the demos work quite fine.
Only the mesh deformer demo has a small problem with the normals in the OpenCL GPU version; the OpenCL CPU version works properly:
Mesh deformer, OpenCL GPU, incorrect rendering
Mesh deformer, OpenCL CPU, correct rendering
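Since the GPU version renders incorrectly without reporting any error, one way to catch this kind of silent problem is to run the same kernel on both devices and compare the outputs numerically. The snippet below is only a minimal sketch of that idea: the function names, the tolerance and the sample normal values are all hypothetical and are not taken from the demo.

```python
import math

def max_abs_error(reference, candidate):
    """Largest absolute difference between two equally sized result lists."""
    if len(reference) != len(candidate):
        raise ValueError("result buffers have different lengths")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)

def results_match(reference, candidate, tolerance=1e-4):
    """True if the candidate (e.g. GPU) output agrees with the reference (e.g. CPU)."""
    # NaN or infinity in the output is treated as a failure outright,
    # because comparisons against NaN would otherwise hide the problem.
    if not all(math.isfinite(c) for c in candidate):
        return False
    return max_abs_error(reference, candidate) <= tolerance

# Made-up values for illustration only: one normal as computed on the CPU path,
# and the same normal flipped, as a faulty GPU path might return it.
cpu_normal = [0.0, 1.0, 0.0]
gpu_normal = [0.0, -1.0, 0.0]

if __name__ == "__main__":
    print("outputs agree:", results_match(cpu_normal, gpu_normal))
    print("max abs error:", max_abs_error(cpu_normal, gpu_normal))
```

The tolerance would need to be tuned per kernel, since CPU and GPU floating-point results rarely match bit for bit even when both are correct.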
Here are some scores:
QJulia 4D demo, 600×600 windowed:
– Ivy Bridge HD 4000 OpenCL GPU test: around 35 FPS
– Ivy Bridge OpenCL CPU test: around 15 FPS
– GeForce GTX 680: around 200 FPS
PostFX demo, 600×600 windowed:
– Ivy Bridge HD 4000 OpenCL GPU test: around 20 FPS
– Ivy Bridge OpenCL CPU test: around 5 FPS
– GeForce GTX 680: around 120 FPS
As you can see, OpenCL GPU support brings serious performance gains over the CPU path: roughly 2.3x in QJulia 4D and 4x in PostFX. The dual OpenCL support (CPU + GPU) available with Ivy Bridge processors will make users of the upcoming Photoshop CS6 happy.
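For readers who prefer frame times to FPS, the conversion is simply 1000 / FPS. Here is a small sketch that applies it to the approximate scores listed above and also computes each device's ratio against the CPU path. The dictionary layout and function names are my own choice for illustration.

```python
# Approximate scores from the tests above (FPS, 600x600 windowed).
SCORES = {
    "QJulia 4D": {"HD 4000 GPU": 35, "Ivy Bridge CPU": 15, "GeForce GTX 680": 200},
    "PostFX": {"HD 4000 GPU": 20, "Ivy Bridge CPU": 5, "GeForce GTX 680": 120},
}

def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

def speedup(fps, baseline_fps):
    """How many times faster 'fps' is than 'baseline_fps'."""
    return fps / baseline_fps

def summarize(scores, baseline="Ivy Bridge CPU"):
    """Return (demo, device, fps, ms per frame, ratio vs baseline) rows."""
    rows = []
    for demo, results in scores.items():
        base = results[baseline]
        for device, fps in results.items():
            rows.append((demo, device, fps, frame_time_ms(fps), speedup(fps, base)))
    return rows

if __name__ == "__main__":
    for demo, device, fps, ms, ratio in summarize(SCORES):
        print(f"{demo:10s} {device:16s} {fps:4d} FPS {ms:7.1f} ms/frame {ratio:5.1f}x vs CPU")
```

In frame-time terms, the HD 4000 takes about 28.6 ms per frame in QJulia 4D against about 66.7 ms for the CPU path, and 50 ms against 200 ms in PostFX.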
One more OpenCL test before looking at the OpenCL report. Today I posted an article about CLBenchmark, a new OpenCL benchmark, so it’s a perfect opportunity to test it with the Ivy Bridge processor.
CLBenchmark, Ivy Bridge OpenCL CPU results
CLBenchmark, Ivy Bridge OpenCL GPU results
As you can see, OpenCL GPU is not always the fastest path for processing data. Depending on the type of algorithm, OpenCL CPU can be more efficient than OpenCL GPU. But in pure graphics processing like the raytracing test (which is highly parallelizable), the OpenCL GPU path is much faster than the CPU path.
Here is the complete OpenCL report provided by GPU Caps Viewer:
- Num OpenCL platforms: 1
- CL_PLATFORM_NAME: Intel(R) OpenCL
- CL_PLATFORM_VENDOR: Intel(R) Corporation
- CL_PLATFORM_VERSION: OpenCL 1.1
- CL_PLATFORM_PROFILE: FULL_PROFILE
- Num devices: 2

- CL_DEVICE_NAME: Genuine Intel(R) CPU @ 2.20GHz
- CL_DEVICE_VENDOR: Intel(R) Corporation
- CL_DRIVER_VERSION: 1.1
- CL_DEVICE_PROFILE: FULL_PROFILE
- CL_DEVICE_VERSION: OpenCL 1.1 (Build 30316.30328)
- CL_DEVICE_TYPE: CPU
- CL_DEVICE_VENDOR_ID: 0x8086
- CL_DEVICE_MAX_COMPUTE_UNITS: 8
- CL_DEVICE_MAX_CLOCK_FREQUENCY: 2200MHz
- CL_DEVICE_ADDRESS_BITS: 32
- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 524256KB
- CL_DEVICE_GLOBAL_MEM_SIZE: 2047MB
- CL_DEVICE_MAX_PARAMETER_SIZE: 3840
- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64 Bytes
- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 256KB
- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
- CL_DEVICE_LOCAL_MEM_TYPE: Global
- CL_DEVICE_LOCAL_MEM_SIZE: 32KB
- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 128KB
- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
- CL_DEVICE_MAX_WORK_ITEM_SIZES: [1024 ; 1024 ; 1024]
- CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
- CL_EXEC_NATIVE_KERNEL: 19527224
- CL_DEVICE_IMAGE_SUPPORT: YES
- CL_DEVICE_MAX_READ_IMAGE_ARGS: 480
- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 480
- CL_DEVICE_IMAGE2D_MAX_WIDTH: 8192
- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 8192
- CL_DEVICE_IMAGE3D_MAX_WIDTH: 2048
- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 2048
- CL_DEVICE_IMAGE3D_MAX_DEPTH: 2048
- CL_DEVICE_MAX_SAMPLERS: 480
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 16
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 8
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 4
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 2
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 4
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 2
- CL_DEVICE_EXTENSIONS: 11
- Extensions:
  - cl_khr_fp64
  - cl_khr_icd
  - cl_khr_global_int32_base_atomics
  - cl_khr_global_int32_extended_atomics
  - cl_khr_local_int32_base_atomics
  - cl_khr_local_int32_extended_atomics
  - cl_khr_byte_addressable_store
  - cl_intel_printf
  - cl_ext_device_fission
  - cl_intel_exec_by_local_thread
  - cl_khr_gl_sharing

- CL_DEVICE_NAME: Intel(R) HD Graphics 4000
- CL_DEVICE_VENDOR: Intel(R) Corporation
- CL_DRIVER_VERSION: 8.15.10.2696
- CL_DEVICE_PROFILE: FULL_PROFILE
- CL_DEVICE_VERSION: OpenCL 1.1
- CL_DEVICE_TYPE: GPU
- CL_DEVICE_VENDOR_ID: 0x8086
- CL_DEVICE_MAX_COMPUTE_UNITS: 16
- CL_DEVICE_MAX_CLOCK_FREQUENCY: 400MHz
- CL_DEVICE_ADDRESS_BITS: 64
- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 415744KB
- CL_DEVICE_GLOBAL_MEM_SIZE: 1624MB
- CL_DEVICE_MAX_PARAMETER_SIZE: 1024
- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64 Bytes
- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 2048KB
- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
- CL_DEVICE_LOCAL_MEM_TYPE: Local (scratchpad)
- CL_DEVICE_LOCAL_MEM_SIZE: 64KB
- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64KB
- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
- CL_DEVICE_MAX_WORK_ITEM_SIZES: [512 ; 512 ; 512]
- CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
- CL_EXEC_NATIVE_KERNEL: 19527220
- CL_DEVICE_IMAGE_SUPPORT: YES
- CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
- CL_DEVICE_IMAGE2D_MAX_WIDTH: 16384
- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 16384
- CL_DEVICE_IMAGE3D_MAX_WIDTH: 2048
- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 2048
- CL_DEVICE_IMAGE3D_MAX_DEPTH: 2048
- CL_DEVICE_MAX_SAMPLERS: 16
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 1
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 1
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 1
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 1
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 1
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
- CL_DEVICE_EXTENSIONS: 10
- Extensions:
  - cl_khr_icd
  - cl_khr_global_int32_base_atomics
  - cl_khr_global_int32_extended_atomics
  - cl_khr_local_int32_base_atomics
  - cl_khr_local_int32_extended_atomics
  - cl_khr_gl_sharing
  - cl_khr_d3d10_sharing
  - cl_intel_dx9_media_sharing
  - cl_khr_3d_image_writes
  - cl_khr_byte_addressable_store
I wonder how well Rage runs on this.
My AMD APU can get 40-50 FPS in Rage.
Intel said they have yet to deliver a properly optimized driver for IVB’s graphics unit; it will probably come later this year.
In addition to Rage, I wonder if you can run these OpenCL tests:
* GPCBenchmark 1.1 (hard to find; it’s at http://forum.beyond3d.com/showpost.php?p=1466478&postcount=91)
* Bullet OpenCL rigid body pipeline (https://github.com/downloads/erwincoumans/experiments/gpu_rigidbody_2012_feb11.zip), possibly with a lower number of bodies.
Can you try to run some of these apps on the Intel HD 4000?
1) http://scalibq.wordpress.com/2010/11/25/running-nvidias-endless-city-tessellation-demo-on-radeons/
2) http://developer.amd.com/samples/demos/pages/RadeonHD6900SeriesRealTimeDemos.aspx
3) Ladybug Demo http://developer.amd.com/samples/demos/pages/ATIRadeonHD5800SeriesRealTimeDemos.aspx
The results could be very interesting, so I would be glad if you tried that.
Please give millisecond results rather than FPS – FPS is non-linear, skewing the results, and I’m not smart enough to convert in my head 🙂
@oscarbg the Bullet OpenCL GPU rigid body pipeline doesn’t work on Ivy Bridge yet; it only runs fine on the latest Radeon, Fermi and Kepler GPUs. I’m trying to make it compatible, I just got an Ivy Bridge myself.
@erwincoumans good to know you have IVB now and are fixing it to work.
I am wondering how one would detect that something bad has happened with OpenCL. One of the tests above showed an incorrect result without an error message. When running a computational task, how would one know that it ran fine?
Any idea what motherboard was used in these tests?