September 10, 2014, Intel Developer Forum, San Francisco—Graphics specialists from Intel presented some possible optimizations for HD and Iris graphics systems. The growing need for higher performance in graphics units requires better use of the available resources.
The need for better performance spans the whole spectrum of content creation, consumption, and game playing. Almost one in five Steam users are on some type of Intel graphics platform, and the attach rate for notebook discrete graphics is down to 24 percent from 36 percent two years ago. One reason for both trends is the incorporation of GPU functions into the base CPU to form an APU.
The integrated graphics subsystems can support multiple displays, 4K displays, and various video formats for creation, game playing, and video viewing. The APUs are comparable in performance to mid- and low-range discrete graphics cards in many applications. The integrated graphics also offer greater power efficiency than their discrete counterparts and allow memory to be shared between the CPU and GPU functions.
Getting maximum performance from the graphics subsystem requires optimization of memory configuration, power delivery, thermal design, and turbo parameters, as well as tweaks to the graphics drivers, BIOS RC, and other software and firmware. As always, optimization requires tradeoffs. Form factor and motherboard design can greatly affect graphics performance.
Dual-channel, multi-rank memories have the highest data bandwidth and allow the most open pages at a time. Tests show that a 1Rx8 4Gb x1 memory configuration delivers only 0.68 times the performance of a 2Rx8 4Gb x2 configuration on a 3DM11 graphics test. Similarly, power delivery should be designed to handle both sustained and peak power requirements. The current monitor needs to be calibrated so that performance and thermal behavior can be managed accurately.
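To put the memory configurations in perspective, here is a rough sketch of the peak theoretical bandwidth arithmetic. It assumes, purely for illustration, a DDR3-1600 interface with a 64-bit channel, and that the x1 and x2 configurations populate one and two channels respectively; neither detail was stated in the presentation. The function name is hypothetical.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_width_bits=64):
    """Peak theoretical DRAM bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return channels * mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumed DDR3-1600 speed grade (not given in the source).
single = peak_bandwidth_gbs(channels=1, mega_transfers_per_s=1600)
dual = peak_bandwidth_gbs(channels=2, mega_transfers_per_s=1600)

# Score ratio reported in the presentation for the 3DM11 test.
reported_score_ratio = 0.68

print(f"single channel: {single:.1f} GB/s, dual channel: {dual:.1f} GB/s")
print(f"bandwidth ratio: {single / dual:.2f}, "
      f"reported score ratio: {reported_score_ratio:.2f}")
```

Under these assumptions the peak bandwidth is halved while the reported score falls by about a third, which would be consistent with a benchmark that is strongly, but not entirely, limited by memory bandwidth.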
At the higher power levels needed for higher performance, the thermal design must work towards the lowest processor temperature possible. The turbo parameters interact with the thermal and power capabilities, so they must also be optimized. Graphics test scores drop with higher temperatures and lower delivered energy, while correctly chosen PL1/PL2/Tau turbo settings can improve performance. See figure.
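The interaction of PL1, PL2, and Tau can be illustrated with a simplified model: the package may draw up to the short-term limit PL2 as long as a running average of power, taken over a time constant Tau, stays at or below the long-term limit PL1. The sketch below is a toy simulation under that assumption, not Intel's actual control algorithm, and the wattages and time constant are hypothetical values chosen for illustration.

```python
def simulate_turbo(demand_w, pl1_w, pl2_w, tau_s, dt_s=1.0):
    """Return the power granted at each time step under a simple EWMA budget."""
    alpha = dt_s / tau_s          # weight of the newest sample in the average
    avg_w = 0.0                   # running average power, starts from idle
    granted = []
    for demand in demand_w:
        # Largest draw this step that keeps the updated average <= PL1.
        headroom = (pl1_w - (1.0 - alpha) * avg_w) / alpha
        power = max(min(demand, pl2_w, headroom), 0.0)
        avg_w += alpha * (power - avg_w)
        granted.append(power)
    return granted


# Hypothetical limits: 15 W sustained, 25 W burst, 28 s time constant.
trace = simulate_turbo([30.0] * 120, pl1_w=15.0, pl2_w=25.0, tau_s=28.0)
burst_steps = sum(1 for p in trace if p == 25.0)
print(f"steps at PL2: {burst_steps}, final power: {trace[-1]:.2f} W")
```

In this model a larger Tau or a higher PL2 lengthens or raises the initial burst, but only if the power delivery and thermal design can actually sustain it, which is why the parameters have to be tuned together.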
Graphics Performance
Intel's graphics performance analyzers allow quick and simple tuning of system configuration and software. How hardware and software interact to determine graphics performance depends on model complexity and level of detail. Some power optimizations are performance neutral yet can extend battery life by 30 percent. The analyzers can also help balance CPU and GPU workloads to trade off power against performance. At a constant power budget, graphics frame rates increase when more power is directed to the GPU. At a constant GPU workload, CPU power can be lowered, which reduces total power.
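As a quick worked example of what the battery figure implies, the sketch below converts a gain in battery life into the corresponding cut in average power. It assumes a fixed battery capacity and battery life inversely proportional to average power draw; the function name is hypothetical.

```python
def power_reduction_for_life_gain(life_gain):
    """Fractional cut in average power needed for a fractional gain in battery life.

    Assumes battery life = capacity / average power, with capacity fixed.
    """
    return 1.0 - 1.0 / (1.0 + life_gain)


reduction = power_reduction_for_life_gain(0.30)
print(f"{reduction:.1%} lower average power")
```

Under that assumption, a 30 percent longer battery life corresponds to roughly 23 percent lower average platform power.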
A media SDK provides a cross-platform API for media application development and various video functions. The SDK supports all the IA devices, including hardware acceleration, and permits a single development effort to target many devices. Hardware acceleration speeds up transcoding and video effects processing to faster than real time. Plug-ins enable RAW and 4K RAW processing in the media SDK.
Another feature of the APUs is shared virtual memory (SVM). SVM lets the CPU and GPU exchange pointers rather than copy full data, and the shared memory can be accessed within OpenCL kernels. Complex data structures such as linked lists can be shared without copies, and buffer management is simplified. The overall programming model becomes simpler, and the reduction in data transfers and their processing overhead improves performance and reduces operating power.
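The difference can be sketched conceptually in plain Python. This is an analogy only, not OpenCL code: without shared pointers, a linked list has to be marshalled into flat arrays with each pointer rewritten as an index before another device can walk it, whereas with a shared address space only the head reference is handed over. All names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def build_list(values):
    """Build a singly linked list and return its head."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def flatten_for_copy(head):
    """Copy path: marshal the list into flat arrays, pointers become indices."""
    values, next_index = [], []
    node = head
    while node is not None:
        values.append(node.value)
        # The successor, if any, will be stored at the next array slot.
        next_index.append(len(values) if node.next is not None else -1)
        node = node.next
    return values, next_index


def sum_flat(values, next_index):
    """Consumer for the copied form: follow indices instead of pointers."""
    total = 0
    i = 0 if values else -1
    while i != -1:
        total += values[i]
        i = next_index[i]
    return total


def sum_shared(head):
    """Consumer for the shared form: follow the original references directly."""
    total, node = 0, head
    while node is not None:
        total += node.value
        node = node.next
    return total


head = build_list([3, 1, 4])
flat_values, flat_next = flatten_for_copy(head)
print(sum_flat(flat_values, flat_next), sum_shared(head))
```

Both consumers compute the same result, but the copy path needs an extra marshalling pass and a second representation of the data, which is the overhead that sharing pointers avoids.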
Figure: The measured performance on a graphics benchmark varies with memory configuration, processor temperature, and Turbo settings.