OpenMP Optimization Remarks
The OpenMP-aware optimization pass can generate compiler remarks for
performed and missed optimizations. To emit them, pass these options to the
Clang invocation: -Rpass=openmp-opt -Rpass-analysis=openmp-opt
-Rpass-missed=openmp-opt. For more information on the features of the remark
system, consult the Clang documentation:
- Clang options to emit optimization reports
- Clang diagnostic and remark flags
- The -foptimization-record-file flag and the -fsave-optimization-record flag
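
As a hedged illustration of the invocation described above, the sketch below shows a small offloaded loop that could be compiled with the three remark flags. The file name saxpy.c, the function name saxpy, and the -fopenmp-targets value are assumptions made for this example; the offloading flags in particular depend on the installed toolchain and target device. Which remarks are printed, if any, depends on the compiler version and optimization level.

```c
/* Hypothetical file name: saxpy.c
 *
 * One possible invocation (the offload target is an assumption):
 *
 *   clang -O2 -fopenmp -fopenmp-targets=nvptx64
 *         -Rpass=openmp-opt
 *         -Rpass-analysis=openmp-opt
 *         -Rpass-missed=openmp-opt
 *         -c saxpy.c
 */
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * With OpenMP offloading enabled the loop is a target region; without
 * -fopenmp the pragma is ignored and the loop runs sequentially on the host. */
void saxpy(float a, const float *x, float *y, size_t n) {
#pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
  for (size_t i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```

Any remarks produced for this file would be reported against the source location of the pragma or of the calls inside the region, using the OMP numbers listed in the table below.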
OpenMP Remarks
Diagnostics Number | Diagnostics Kind | Diagnostics Description |
---|---|---|
OMP100 | Analysis | Potentially unknown OpenMP target region caller. |
OMP101 | Analysis | Parallel region is used in unknown / unexpected ways. Will not attempt to rewrite the state machine. |
OMP102 | Analysis | Parallel region is not called from a unique kernel. Will not attempt to rewrite the state machine. |
OMP110 | Optimization | Moving globalized variable to the stack. |
OMP111 | Optimization | Replaced globalized variable with X bytes of shared memory. |
OMP112 | Missed | Found thread data sharing on the GPU. Expect degraded performance due to data globalization. |
OMP113 | Missed | Could not move globalized variable to the stack. Variable is potentially captured in call. Mark parameter as __attribute__((noescape)) to override. |
OMP120 | Optimization | Transformed generic-mode kernel to SPMD-mode. |
OMP121 | Analysis | Value has potential side effects preventing SPMD-mode execution. Add __attribute__((assume("ompx_spmd_amenable"))) to the called function to override. |
OMP130 | Optimization | Removing unused state machine from generic-mode kernel. |
OMP131 | Optimization | Rewriting generic-mode kernel with a customized state machine. |
OMP132 | Analysis | Generic-mode kernel is executed with a customized state machine that requires a fallback. |
OMP133 | Analysis | Call may contain unknown parallel regions. Use __attribute__((assume("omp_no_parallelism"))) to override. |
OMP140 | Analysis | Could not internalize function. Some optimizations may not be possible. |
OMP150 | Optimization | Parallel region merged with parallel region at <location>. |
OMP160 | Optimization | Removing parallel region with no side-effects. |
OMP170 | Optimization | OpenMP runtime call <call> deduplicated. |
OMP180 | Optimization | Replacing OpenMP runtime call <call> with <value>. |
OMP190 | Optimization | Redundant barrier eliminated. (device only) |
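
The override suggested by OMP113 can be sketched as follows. This is a minimal example under stated assumptions, not a definitive recipe: the names NOESCAPE, sum3, and offload_sum are hypothetical, and the helper is defined in the same file only to keep the sketch self-contained. In practice it would more likely live in another translation unit, where the compiler cannot see that the pointer does not escape. Whether OMP110, OMP112, or OMP113 is actually emitted for this code depends on the compiler version and flags.

```c
/* NOESCAPE expands to Clang's noescape attribute when it is available and
 * to nothing otherwise, so the file still builds with other compilers. */
#if defined(__has_attribute)
#  if __has_attribute(noescape)
#    define NOESCAPE __attribute__((noescape))
#  endif
#endif
#ifndef NOESCAPE
#  define NOESCAPE
#endif

/* Hypothetical helper: reads through p but never stores the pointer.
 * The attribute states that p does not escape the call. */
int sum3(const int *p NOESCAPE) {
  return p[0] + p[1] + p[2];
}

/* Hypothetical entry point: the address of a region-local array is passed
 * to a call, which is the situation OMP112 and OMP113 describe. */
int offload_sum(void) {
  int result = 0;
#pragma omp target map(tofrom: result)
  {
    int local[3] = {1, 2, 3};
    result = sum3(local);
  }
  return result;
}
```

The macro guard is a design choice for portability: compilers that do not know the attribute see an empty NOESCAPE and compile the same code without the annotation.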