Intel Ivy Bridge HD Graphics 4000 GPU: OpenGL and OpenCL Tests

Intel Ivy Bridge HD 4000 GPU test – Index

3 – Ivy Bridge – OpenCL Side


This is the interesting part. On Sandy Bridge processors, OpenCL support is limited to the CPU. With Ivy Bridge processors, Intel has extended OpenCL support to the GPU too!

Here is what GPU Caps Viewer tells us about Ivy Bridge OpenCL support:

GPU Caps Viewer, OpenCL API support for Intel Ivy Bridge

Okay, the OpenCL 1.1 support is cool, but do the demos work? Yep, the demos work quite well.

GPU Caps Viewer, OpenCL demos with Intel Ivy Bridge

Only the mesh deformer demo has a small problem with the normals in the OpenCL GPU version; the OpenCL CPU version works properly:

GPU Caps Viewer, OpenCL mesh deformer demo, GPU
Mesh deformer, OpenCL GPU, incorrect rendering

GPU Caps Viewer, OpenCL mesh deformer demo, CPU
Mesh deformer, OpenCL CPU, correct rendering
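
Since the GPU path renders incorrectly without reporting any error, one way to catch this kind of silent discrepancy is to run the same computation on both devices and compare the outputs within a tolerance. Here is a minimal sketch in Python, assuming the normals have been read back from each device as lists of (x, y, z) tuples. The sample values, the function name and the 1 degree threshold are hypothetical and are not taken from the demo itself.

```python
import math

def max_normal_deviation_deg(reference, candidate):
    """Return the largest angle (in degrees) between corresponding normals."""
    worst = 0.0
    for a, b in zip(reference, candidate):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        if norm == 0.0:
            return 180.0  # treat a zero-length normal as a failure
        # Clamp to [-1, 1] to guard against rounding before acos.
        cos_angle = max(-1.0, min(1.0, dot / norm))
        worst = max(worst, math.degrees(math.acos(cos_angle)))
    return worst

# Hypothetical readbacks: the CPU result is used as the reference.
cpu_normals = [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
gpu_normals = [(0.0, 1.0, 0.0), (0.0, 0.0, -1.0)]  # second normal is flipped

TOLERANCE_DEG = 1.0  # assumed threshold, to be tuned per application
deviation = max_normal_deviation_deg(cpu_normals, gpu_normals)
if deviation <= TOLERANCE_DEG:
    print("GPU output matches the CPU reference")
else:
    print(f"Mismatch: normals differ by up to {deviation:.1f} degrees")
```

Comparing angles rather than raw components keeps the check independent of small differences in vector length between the two implementations.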

Here are some scores:

QJulia 4D demo, 600×600 windowed:
– Ivy Bridge HD 4000 OpenCL GPU test: around 35 FPS
– Ivy Bridge OpenCL CPU test: around 15 FPS
– GeForce GTX 680: around 200 FPS

PostFX demo, 600×600 windowed:
– Ivy Bridge HD 4000 OpenCL GPU test: around 20 FPS
– Ivy Bridge OpenCL CPU test: around 5 FPS
– GeForce GTX 680: around 120 FPS

As you can see, the OpenCL GPU support brings some serious performance gains compared to the CPU support: roughly 2.3x in QJulia 4D and 4x in PostFX. The dual OpenCL support (CPU + GPU) available with Ivy Bridge processors should please users of the upcoming Photoshop CS6.
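
To put numbers on these gains, the FPS figures can be converted to frame times and speedup ratios. Here is a minimal sketch in Python that uses only the approximate scores listed above. The function names are illustrative.

```python
# Approximate scores from the tests above, in frames per second.
SCORES = {
    "QJulia 4D": {"HD 4000 GPU": 35, "Ivy Bridge CPU": 15, "GeForce GTX 680": 200},
    "PostFX": {"HD 4000 GPU": 20, "Ivy Bridge CPU": 5, "GeForce GTX 680": 120},
}

def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

def speedup(fps_a, fps_b):
    """Return how many times faster device A is than device B."""
    return fps_a / fps_b

for demo, results in SCORES.items():
    print(demo)
    for device, fps in results.items():
        print(f"  {device}: ~{fps} FPS = ~{frame_time_ms(fps):.1f} ms/frame")
    ratio = speedup(results["HD 4000 GPU"], results["Ivy Bridge CPU"])
    print(f"  HD 4000 GPU vs CPU: ~{ratio:.1f}x")
```

Frame times are linear in the amount of work per frame, which makes them easier to compare across devices than FPS values.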

One more OpenCL test before looking at the OpenCL report. Today I posted an article about CLBenchmark, a new OpenCL benchmark, so this is a perfect opportunity to try it on the Ivy Bridge processor.

CLBenchmark, Ivy Bridge OpenCL CPU results
CLBenchmark, Ivy Bridge OpenCL CPU test

CLBenchmark, Ivy Bridge OpenCL GPU results
CLBenchmark, Ivy Bridge OpenCL GPU test



As you can see, OpenCL GPU is not always the fastest path for processing data. Depending on the type of algorithm, OpenCL CPU can be more efficient than OpenCL GPU. But in pure graphics processing like the raytracing test (which is highly parallelizable), the OpenCL GPU path is much faster than the CPU path.

Here is the complete OpenCL report provided by GPU Caps Viewer:

- Num OpenCL platforms: 1
- CL_PLATFORM_NAME: Intel(R) OpenCL
- CL_PLATFORM_VENDOR: Intel(R) Corporation
- CL_PLATFORM_VERSION: OpenCL 1.1 
- CL_PLATFORM_PROFILE: FULL_PROFILE
- Num devices: 2

	- CL_DEVICE_NAME: Genuine Intel(R) CPU  @ 2.20GHz
	- CL_DEVICE_VENDOR: Intel(R) Corporation
	- CL_DRIVER_VERSION: 1.1
	- CL_DEVICE_PROFILE: FULL_PROFILE
	- CL_DEVICE_VERSION: OpenCL 1.1 (Build 30316.30328)
	- CL_DEVICE_TYPE: CPU
	- CL_DEVICE_VENDOR_ID: 0x8086
	- CL_DEVICE_MAX_COMPUTE_UNITS: 8
	- CL_DEVICE_MAX_CLOCK_FREQUENCY: 2200MHz
	- CL_DEVICE_ADDRESS_BITS: 32
	- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 524256KB
	- CL_DEVICE_GLOBAL_MEM_SIZE: 2047MB
	- CL_DEVICE_MAX_PARAMETER_SIZE: 3840
	- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64 Bytes
	- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 256KB
	- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
	- CL_DEVICE_LOCAL_MEM_TYPE: Global
	- CL_DEVICE_LOCAL_MEM_SIZE: 32KB
	- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 128KB
	- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
	- CL_DEVICE_MAX_WORK_ITEM_SIZES: [1024 ; 1024 ; 1024]
	- CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
	- CL_EXEC_NATIVE_KERNEL: 19527224
	- CL_DEVICE_IMAGE_SUPPORT: YES
	- CL_DEVICE_MAX_READ_IMAGE_ARGS: 480
	- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 480
	- CL_DEVICE_IMAGE2D_MAX_WIDTH: 8192
	- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 8192
	- CL_DEVICE_IMAGE3D_MAX_WIDTH: 2048
	- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 2048
	- CL_DEVICE_IMAGE3D_MAX_DEPTH: 2048
	- CL_DEVICE_MAX_SAMPLERS: 480
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 16
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 8
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 4
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 2
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 4
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 2
	- CL_DEVICE_EXTENSIONS: 11
	- Extensions:
		- cl_khr_fp64
		- cl_khr_icd
		- cl_khr_global_int32_base_atomics
		- cl_khr_global_int32_extended_atomics
		- cl_khr_local_int32_base_atomics
		- cl_khr_local_int32_extended_atomics
		- cl_khr_byte_addressable_store
		- cl_intel_printf
		- cl_ext_device_fission
		- cl_intel_exec_by_local_thread
		- cl_khr_gl_sharing

	- CL_DEVICE_NAME: Intel(R) HD Graphics 4000
	- CL_DEVICE_VENDOR: Intel(R) Corporation
	- CL_DRIVER_VERSION: 8.15.10.2696
	- CL_DEVICE_PROFILE: FULL_PROFILE
	- CL_DEVICE_VERSION: OpenCL 1.1 
	- CL_DEVICE_TYPE: GPU
	- CL_DEVICE_VENDOR_ID: 0x8086
	- CL_DEVICE_MAX_COMPUTE_UNITS: 16
	- CL_DEVICE_MAX_CLOCK_FREQUENCY: 400MHz
	- CL_DEVICE_ADDRESS_BITS: 64
	- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 415744KB
	- CL_DEVICE_GLOBAL_MEM_SIZE: 1624MB
	- CL_DEVICE_MAX_PARAMETER_SIZE: 1024
	- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64 Bytes
	- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 2048KB
	- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
	- CL_DEVICE_LOCAL_MEM_TYPE: Local (scratchpad)
	- CL_DEVICE_LOCAL_MEM_SIZE: 64KB
	- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64KB
	- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
	- CL_DEVICE_MAX_WORK_ITEM_SIZES: [512 ; 512 ; 512]
	- CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
	- CL_EXEC_NATIVE_KERNEL: 19527220
	- CL_DEVICE_IMAGE_SUPPORT: YES
	- CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
	- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
	- CL_DEVICE_IMAGE2D_MAX_WIDTH: 16384
	- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 16384
	- CL_DEVICE_IMAGE3D_MAX_WIDTH: 2048
	- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 2048
	- CL_DEVICE_IMAGE3D_MAX_DEPTH: 2048
	- CL_DEVICE_MAX_SAMPLERS: 16
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
	- CL_DEVICE_EXTENSIONS: 10
	- Extensions:
		- cl_khr_icd
		- cl_khr_global_int32_base_atomics
		- cl_khr_global_int32_extended_atomics
		- cl_khr_local_int32_base_atomics
		- cl_khr_local_int32_extended_atomics
		- cl_khr_gl_sharing
		- cl_khr_d3d10_sharing
		- cl_intel_dx9_media_sharing
		- cl_khr_3d_image_writes
		- cl_khr_byte_addressable_store
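
One detail worth pulling out of this report: the CPU device lists cl_khr_fp64 with a preferred double vector width of 2, while the HD 4000 GPU does not list cl_khr_fp64 and reports a width of 0. In OpenCL 1.1, a width of 0 for doubles means the extension is not supported, so with this driver double precision kernels would be limited to the CPU device. Here is a minimal sketch in Python that checks this from the report text. The excerpt is abbreviated from the report above, and the parser and function names are illustrative, not part of GPU Caps Viewer.

```python
# Abbreviated excerpt of the report above (two devices, a few fields each).
REPORT = """\
- CL_DEVICE_NAME: Genuine Intel(R) CPU  @ 2.20GHz
- CL_DEVICE_TYPE: CPU
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 2
- Extensions:
    - cl_khr_fp64
    - cl_khr_icd

- CL_DEVICE_NAME: Intel(R) HD Graphics 4000
- CL_DEVICE_TYPE: GPU
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
- Extensions:
    - cl_khr_icd
    - cl_khr_gl_sharing
"""

def parse_devices(report):
    """Split the report into one dict per device, keyed by field name."""
    devices = []
    for raw in report.splitlines():
        line = raw.strip().lstrip("-").strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if key == "CL_DEVICE_NAME":
            devices.append({"extensions": []})  # a new device starts here
        if not devices:
            continue  # skip platform-level lines before the first device
        if sep and value.strip():
            devices[-1][key.strip()] = value.strip()
        elif not sep:
            devices[-1]["extensions"].append(line)  # bare extension name
    return devices

def supports_fp64(device):
    """True if the device advertises the cl_khr_fp64 extension."""
    return "cl_khr_fp64" in device["extensions"]

for dev in parse_devices(REPORT):
    status = "yes" if supports_fp64(dev) else "no"
    print(f"{dev['CL_DEVICE_TYPE']}: double precision = {status}")
```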


9 thoughts on “Intel Ivy Bridge HD Graphics 4000 GPU: OpenGL and OpenCL Tests”

  1. Leith Bade

    I wonder how well Rage runs on this.

    My AMD APU can get 40-50 FPS in Rage.

  2. fellix

    Intel said they have yet to deliver a properly optimized driver for IVB’s graphics unit, probably later this year.

  3. mg

    Please give millisecond results rather than FPS – FPS is non-linear, skewing the results, and I’m not smart enough to convert in my head 🙂

  4. erwincoumans

    @oscarbg the Bullet OpenCL GPU rigid body pipeline doesn’t work on Ivy Bridge yet; it only runs fine on the latest Radeon, Fermi and Kepler GPUs. I’m trying to make it compatible, I just got the Ivy Bridge myself.

  5. oscarbg

    @erwincoumans good to know you have IVB now and fixing it to work..

  6. mincho

    I am wondering how one would detect that something bad has happened with OpenCL. One of the tests above showed an incorrect result without an error message. When running a computational task, how would one know that it ran fine?
