GDC Manual
The Arm Game Developer Tune-Up Manual provides information on four free tools that game developers can use straight away to build better games, faster.
The Arm Game Developer Tune-Up Manual
Four free tools to build better games, faster on Arm.
Introducing the Manual
Mobile gaming is a $100-billion industry, with developers delivering outstanding gaming experiences across billions of Arm-based mobile devices worldwide. Due to Arm’s heritage in mobile gaming, we understand the pressures you - the game developers - face in terms of cost and time during the development process, and your need to constantly push the boundaries of graphics performance and efficiency. To help you manage these challenges, we are committed to ensuring the best-possible game development process on Arm-based technology and devices.
Alongside our industry-leading mobile GPUs that have been shipped in more than 8 billion units to date, Arm offers a range of optimization tools and guides to improve graphics, increase performance, and optimize power efficiency. This allows you to speed up the game development process and fulfill the true power and potential of your games.
As part of Arm's commitment to support the game developer community, we have produced this manual, which provides information on four free tools that you can use straight away to tune-up performance and build better games, faster on Arm. Each chapter on the free tools also includes detailed video tutorials from Arm's industry-leading technical experts.
The four free tools cover the following areas:
Read on to to learn now how to tune-up performance with our four free tools.
Tool 1: Supercharge Your Game with Vibrant Graphics
Tool 1: Supercharge Your Game with Vibrant Graphics
Why: As more AAA games enter the mobile market, it’s crucial to have supporting technologies and tools that help create more realistic and immersive gaming experiences. Ray tracing is a computer-graphics technique that achieves this aim through generating realistic lighting and shadows. The challenge is that ray tracing techniques can use significant power and energy as shown in profiling via Arm Mobile Studio. However, using these techniques carefully through a series of initial tips and recommendations can mitigate any power or energy losses, while tuning-up the performance of your game.
Video tutorial: Delivering ray tracing on Arm GPUs
How: The flagship Immortalis-G715 is Arm's first GPU to introduce hardware-based ray tracing support on mobile. The GPU is now shipping on silicon that is targeting flagship smartphone devices being launched in 2023.
To help solve the challenge of power, energy, and area, ray tracing on Immortalis-G715 only uses 4% of the shader core area and delivers more than 300% performance improvements through hardware acceleration.
Beyond Immortalis-G715, there are five practical tips and recommendations to tune up ray tracing performance on mobile devices that either offer software or hardware-based ray tracing support:
Minimize Rays Per Pixel
Every ray cast has a performance cost, so, if possible, minimize the number of rays used and only cast rays that are essential.
To achieve this aim, consider the following techniques:
Render at a lower resolution and then upscale appropriately.
Checkerboard rendering that allows only half of the rays to be traced in each frame.
Use temporal super-sampling techniques to distribute rays across multiple frames.
Use hybrid rendering methods, such as combining screen-space reflections (SSR) or even reflection probes with ray traced reflections, and shadow-maps with ray traced shadows.
Optimize Shadow Rays
Try to avoid casting unnecessary rays, with some examples below on how to achieve this:
Avoid tracing a ray if the surface is facing away from the light and therefore automatically in shadow.
Detect if a given pixel is in range of a light or on the far plane, so you can avoid casting the ray.
Use a simple shadow map to detect shadow edges, where shadow rays are more valuable.
Minimize divergence between rays
Try to maximize ray coherency, which means hard shadows are faster than soft shadows and mirror reflections are faster than glossy reflections.
Also avoid shader divergence across neighboring threads to maximize shader throughput and exploit caching strategies, as spatially adjacent shader threads can result in different ray traversals and intersection logic.
Avoid Transparent Geometry
Transparency requires more complex traversals and shader logic, which are both expensive. So, only use transparent geometry if it’s absolutely necessary for the game.
Negative Consequences of Too Many Rays Cast
Ultimately, ray casting is expensive, so casting too many rays at once results in the frame taking too long. Therefore, make sure that every ray cast is required and provides useful information to the player.
Video demo: Arm Bonza demo showing ray tracing benefits
Further reading:
Tool 2: Gain X-Ray Vision with Adaptive Performance
Tool 2: Gain X-Ray Vision with Adaptive Performance
Why: For any game developer, especially those creating games with complex workloads, you need to extract as much performance possible from the game. Any performance issues not only affect gameplay, but can also drain the battery life of mobile devices. Adaptive Performance, with instructions on how to install here, is a package from Unity that allows you to tune-up your game to improve the overall performance.
Video tutorial: Unity tutorial on Adaptive Performance
How: Adaptive Performance allows you to identify any issues with your game quickly and then tweak your applications to improve performance in a controlled way. This has a minimal impact on game development, while also saving time and costs. The tool uses four key metrics to tune-up gaming performance:
Desired frame rate based on previous frames
Device temperature level
Device proximity to thermal event
Game application bound by CPU or GPU.
Video demo: Performance difference between Adaptive Performance on & off
Adaptive Performance can be integrated with a device simulator, which means you can test various scenarios rather than waiting for the device to heat up before benchmarking tests. With the thermal settings in device simulator, you can set the device to throttle or to send out a warning when throttling is imminent or when the device overheating.
The performance settings in Adaptive Performance allow you to set any bottlenecks to the CPU, GPU, or a target frame rate. You can also set CPU and GPU levels to simulate the frequency of your game’s performance to help to establish any performance issues.
Further reading:
Tool 3: Reinforce Your Agents with Machine Learning
Tool 3: Reinforce Your Agents with Machine Learning
Why: Using machine learning (ML) features during the game development process means you no longer need to “code” characters, saving time and costs. ML Agents from Unity, which can be downloaded here, allows you to train intelligent AI agents within games and simulations, with the game characters “learning” through reinforcement learning (RL).
Video tutorial: Using Unity's ML Agents on Arm
How: There are many different elements when using ML Agents on Arm-based devices, but below are two broad considerations when getting started.
Consider how the ML Agents in the game are designed and what the generated neural network (NN) models look like through the following items:
Input to the agent and what information it needs.
Outputs from the agent and which actions it needs to take to accomplish a target task in the game.
The NN model structure and how the brain should process information.
The reward function and how the agent will be trained.
Consider the training strategy for the ML Agents and how the game runs on Arm-based devices:
Two key aspects of the training strategy are:
When establishing how the game runs with the Agents, think about:
Applying different difficulty levels
Measuring the performance of the agents by using the Unity profiler
Utilizing the Decision Requester component and timeline that shows the NN model execution with specific intervals.
Video GIFs: Examples from Arm's own work with ML Agents
Further reading:
Tool 4: Turbo Tune-Up with Arm Mobile Studio
Tool 4: Turbo Tune-Up with Arm Mobile Studio
Why: Before a game is ready for market, you need to make sure it’s fully optimized in terms of performance and efficiency, with testing a vital part of this process. Arm Mobile Studio, which can be downloaded here, is a suite of optimization tools for mobile games, providing developers with early and frequent testing around gaming and graphics performance. The suite of tools quickly identifies areas that can be tuned-up to reduce profiling time, which helps you predict and improve the performance and efficiency of your games.
Video: Introduction to Arm Mobile Studio
Video tutorial: Optimizing mobile games using Arm Mobile Studio
How: Arm Mobile Studio optimization tools each target a different stage in the profiling workflow for gaming applications:
Performance Advisor: This generates an easy-to-read performance report from an annotated streamline capture and gives actionable suggestions about how to optimize the gaming application.
Streamline: This captures a performance profile for deep-dive analysis, using all of the CPU, GPU, and memory system performance data in the system. All of these metrics allow you to target optimizations to the areas that matter most.
Graphics Analyzer: This investigates the OpenGL ES and Vulkan API calls made by your game and how the GPU reacted to them to identify rendering defects and performance inefficiencies. The tool uses diagnostic tooling, such as overdraw and shader usage maps, to explore the frames of the application draw-by-draw and find opportunities for optimization.
Mali Offline Compiler : This compiles your game’s shader programs and checks how they will perform across any Arm GPU. Performance reports from this tool give you information on shader register usage and thread occupancy, an estimated cycle cost breakdown for the target GPU, and other stage-specific performance feedback.
Further reading:
BONUS LEVEL: Other Technologies
BONUS LEVEL: Other Technologies
Alongside the four free tools highlighted previously, Arm provides other technologies designed to improve the game development process and tune-up performance. These include:
Memory Tagging Extension
Why: Memory safety has been a major source of security vulnerabilities for decades, with Google’s Chromium Project team stating that 70 percent of all serious security bugs are memory safety issues. Memory Tagging Extension (MTE) allows you to find memory-related bugs quickly, speeding up the application debugging and development process to save time and costs. MTE is a security feature built into the Armv9 architecture to detect and prevent memory safety vulnerabilities across the mobile ecosystem before and after deployment.
How: Arm implements MTE as a two-phase system, known as the ‘lock’ and the ‘key’. If the key matches, then the lock memory access is permitted; otherwise access can be recorded or faulted. In this way, hard-to-catch memory safety errors can be detected more easily, which also aids general debugging. Any silicon vendor that adopts Armv9 CPUs will have the MTE feature built into their chipsets. Device manufacturers can then make the decision whether to switch on the MTE feature.
The first encounter that you have with MTE may reveal far more vulnerabilities than you want to fix initially. However, you can fix the serious vulnerabilities before a launch and then track down the less severe ones during updates.
Video: Explainer of Memory Tagging Extension
Further reading:
Variable Rate Shading
Why: Variable Rate Shading (VRS) provides energy savings and a performance boost through optimized rendering where it matters on graphics and visuals. This reduces the computational cost with a minimal impact on the visual quality of your game. When enabling VRS on gaming content through Arm's own internal tests, we have seen improvements of up to 15% on FPS.
Video tutorial: Improving performance with adaptive Variable Rate Shading
How: VRS takes your gaming scene and focuses the rendering on the parts that need it at a fine pixel granularity. This is usually where the game action takes place. Areas that require less focus, such as the background scenery, are rendered with a more coarse pixel granularity.
VRS is built into the Arm Immortalis-G715, Mali-G715 and Mali-G615 GPUs, supporting the following:
Shading rate up to 4x4
Shading rate attachment texel size 8x8 up to 32x32
Pipeline shading rate, primitive shading rate and shading rate attachment
Non-trivial combiners.
VRS adds the ability for you to control the fragment shading rate, giving you granular control of the performance over quality of fragment shading. Here are 5 initial steps to get you started:
Check VkPhysicalDeviceFragmentShadingRateFeaturesKHR for support.
Look at the shading rate per drawcall, such as a lower rate for out-of-focus or moving objects.
Look at the shading rate per primitive.
Look at the shading rate per tile of the frame buffer driven by a screen-space shading rate attachment, such as adaptively generating the attachment each frame by analyzing frame contents.
Use VRS on fragment bound passes, with three key passes likely to benefit being: indirect diffuse; lighting; and indirect specular.
Further reading:
Game Engine Resources
Arm works closely with leading game engines Unity and Unreal to help provide optimal gaming performance across mobile platforms. Both game engines are highly popular development environments that are used to create a variety of games and applications across multiple platforms. You can use Arm Mobile Studio to profile, debug and optimize games using Unity and Unreal.
For the latest Unity and Unreal best practice guides to help fine-tune gaming performance, you can visit the following pages on Arm.com
Join the Arm Developer Program
Join the Arm Developer Program
This manual of free tools aims to provide useful insights and information to help all game developers - regardless of where you are in the development journey - tune-up performance and build better games, faster on Arm. While Arm offers a wide variety of optimization technologies, tools and insights, the free tools highlighted in this manual provide games with vital boosts in performance and efficiency, while also saving time and costs during the game development process.
Once you've gone through all the content in this manual, we recommend joining the Arm Developer Program to learn more about tuning up your game, share ideas and knowledge with the Arm developer community, gain access to the Arm Discord channel, and talk with Arm's industry-leading technical experts.
We would also love to hear from you about what Arm can do - or do more of - to improve the overall game development experience, as well as topics that you want to hear more about.
Email developer@arm.com and let us know!