The Profile Window

The Profile Window displays all the counter data collected during the profile. This data can be saved from the File->Save As menu and reloaded at a later time. The data does not have to be reloaded on the same system, so you can send the profiled data to another developer for their own analysis.

Data

ProfileWindow-data.png

State Bucket Options

Across the top of the Data tab are options for grouping draw calls by state bucket. By default, none of the options are selected, so the top draw call in the list is the most expensive draw call. Use the Select All or Clear All buttons to quickly set or clear the selected options. The state bucket options that appear depend on which API your application uses and which features or API calls it makes.

Clear calls do not have shaders as part of their state. Calls that clear render targets are given a state based on the render target being cleared and those that clear depth / stencil are given state based on the associated buffer.

Here is a list of the available options for each API:

DirectX 11

OpenGL

Profile Results Table

The per-draw profile results are grouped into state buckets based on the currently selected state bucket options. State Bucket 0 is reserved for draw calls which do not fall into one of the selected state bucket options (or for all draw calls if no options are selected). When the results are initially displayed, the data is sorted by the first counter that was profiled, with the highest value at the top. If GPUTime is used, this means the top draw call in the list is the most expensive call. When state bucket options are selected, the table is sorted hierarchically: state buckets are sorted first, then the draw calls within each state bucket are sorted. Since draw calls which fall into the same state bucket share similar properties, it is often best to improve the performance of an expensive state bucket instead of a single expensive draw call, as this will yield the best results over the entire frame.
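The grouping and hierarchical sort described above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the record layout, the `pixel_shader` option name, and the choice to order buckets by the summed counter value are all assumptions made for illustration.

```python
from collections import defaultdict


def group_and_sort(draw_calls, selected_options, sort_counter="GPUTime"):
    """Group draw calls into state buckets, then sort hierarchically.

    draw_calls: list of dicts with a "state" dict and counter values
    (hypothetical layout). selected_options: state keys to group by.
    """
    buckets = defaultdict(list)
    for call in draw_calls:
        # Bucket key: the call's values for the selected state options.
        # With no options selected, every call shares the empty key,
        # so all calls land in a single bucket.
        key = tuple(call["state"].get(opt) for opt in selected_options)
        buckets[key].append(call)

    # Sort draw calls inside each bucket, highest counter value first.
    for calls in buckets.values():
        calls.sort(key=lambda c: c[sort_counter], reverse=True)

    # Assumption: buckets are ordered by their summed counter value.
    return sorted(
        buckets.items(),
        key=lambda item: sum(c[sort_counter] for c in item[1]),
        reverse=True,
    )


draw_calls = [
    {"id": 1, "state": {"pixel_shader": "a"}, "GPUTime": 5.0},
    {"id": 2, "state": {"pixel_shader": "b"}, "GPUTime": 3.0},
    {"id": 3, "state": {"pixel_shader": "b"}, "GPUTime": 4.0},
]
for key, calls in group_and_sort(draw_calls, ["pixel_shader"]):
    print(key, [c["id"] for c in calls])
```

In this sketch the bucket for shader "b" is listed first because its two draw calls together cost more than the single most expensive draw call, which is the situation where optimizing the bucket pays off more than optimizing one call.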

The rows in the table are color-coded such that state buckets are in a shade of orange, and draw calls are in a shade of blue.

State buckets can also be collapsed by clicking on the +/- sign representing the node in the tree. Alternatively, all the state buckets can be expanded or collapsed via the context menu on the state bucket cells.

Some columns always appear in the results table, while the other columns are based on the selected counters.

The data in the state bucket rows is aggregated from all the draw calls within that group, with the exception that counters representing percentages are displayed as an average. This average depends on whether the "GPUTime" counter has been chosen. If "GPUTime" has not been selected, the average is non-weighted: it is the total of all percentages in that column divided by the number of draw calls, and it does not take into account the amount of time taken by each draw call. If the "GPUTime" counter is selected, the average is weighted by the amount of time spent in each draw call. As an example, suppose there are 3 draw calls in an application. Two of them take 10ms each and keep the GPU busy for 20% of the time, and the third draw call takes 1000ms and keeps the GPU busy for 80% of the time. The non-weighted GPU busy percentage would be (20 + 20 + 80) / 3 = 40%. The weighted calculation takes GPU time into account, leading to an average of ((20 * 10) + (20 * 10) + (80 * 1000)) / (10 + 10 + 1000) = 78.8%.

A screenshot demonstrating this more fully is shown below:

ProfileAverage.png

These values allow you to easily identify which stages are bottlenecking the entire state bucket.
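The weighted and non-weighted averages used for percentage counters in state bucket rows can be reproduced with a short Python sketch. The function name and argument layout are hypothetical; only the arithmetic follows the description above.

```python
def bucket_percentage(percentages, gpu_times=None):
    """Average a percentage counter over the draw calls in a bucket.

    Without GPU times the result is a plain (non-weighted) mean.
    With GPU times each percentage is weighted by its draw call's time.
    """
    if gpu_times is None:
        return sum(percentages) / len(percentages)
    weighted_total = sum(p * t for p, t in zip(percentages, gpu_times))
    return weighted_total / sum(gpu_times)


# The three draw calls from the example: GPU busy % and GPU time in ms.
busy = [20.0, 20.0, 80.0]
times_ms = [10.0, 10.0, 1000.0]

print(bucket_percentage(busy))                      # 40.0
print(round(bucket_percentage(busy, times_ms), 1))  # 78.8
```

The weighted figure is much closer to 80% because the 1000ms draw call dominates the time spent in the bucket.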

If the API Trace or Frame Debugger is opened, selecting a draw call in the results table will also select that draw call in the other windows. Likewise, the results table will also highlight the selected draw call if it was changed by either of the other windows.

Profile Delta

ProfileWindow-delta.png

At the top right of the Data tab is a drop-down list for comparing two profiles. The first profile serves as the base profile for comparison. Clicking New Profile generates a new profile, and the change between the base profile and the new profile is displayed as a percentage. Green indicates a performance improvement and red indicates a performance deterioration. The first number is from the base profile and the second number is from the new profile.
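As a rough illustration of how such a delta could be derived, the following Python sketch computes the percentage change between a base value and a new value and picks a color. The exact formula the tool uses is not documented here, so the relative-change formula, the function names, and the assumption that a lower value is better (as with GPUTime) are all hypothetical.

```python
def profile_delta(base, new):
    """Percentage change from the base value to the new value.

    Returns None when the base value is zero, since the relative
    change is undefined in that case.
    """
    if base == 0:
        return None
    return 100.0 * (new - base) / base


def delta_color(change, lower_is_better=True):
    """Pick a display color for a percentage change (assumed rule)."""
    if change is None or change == 0:
        return "none"
    improved = change < 0 if lower_is_better else change > 0
    return "green" if improved else "red"


# Hypothetical GPUTime values in ms: base profile, then new profile.
change = profile_delta(10.0, 8.0)
print(change, delta_color(change))
```

For counters where a higher value is better, the same sketch can be called with `lower_is_better=False`.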

Options

ProfileWindow-options.png

Data Visibility

Allows you to select which counter groups are displayed in the tables in the Data view. Only those groups from which a counter was enabled are shown in the list. This provides an easy way to filter out data that is not of interest to what you are currently investigating.

Analysis

ProfileWindow-analysis.png

Timing

Displays a brief analysis of whether the application is CPU or GPU bound and where most of the time is spent. This analysis is only performed if the Frame Profiler option is enabled in the Settings Dialog.

Info

ProfileWindow-info.png

a) System Information

Shows the type of graphics card and associated DeviceID of the card that the profile was taken on.

b) Frame Image

Shows a capture of the backbuffer of the frame that was profiled.