OpenMP Command-Line Argument Reference
Welcome to the OpenMP in LLVM command-line argument reference. This is not a complete list of arguments, but it covers the essential command-line arguments you may need when compiling and linking OpenMP programs. The section OpenMP Command-Line Arguments lists options for multicore programming, while Offloading Specific Command-Line Arguments lists options relevant to OpenMP target offloading.
OpenMP Command-Line Arguments
-fopenmp
Enable the OpenMP compilation toolchain. The compiler will parse OpenMP compiler directives and generate parallel code.
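As a minimal sketch of a host-only build, the snippet below assembles a compile command that enables OpenMP. The file names hello_omp.c and hello_omp are hypothetical, and the command is printed rather than executed so that the snippet also works on a machine without Clang.

```shell
# Hypothetical file names; replace them with your own source and output.
SRC=hello_omp.c
OUT=hello_omp

# -fopenmp makes Clang parse OpenMP directives and generate parallel code.
CMD="clang -O2 -fopenmp $SRC -o $OUT"

# Print the command instead of running it; use: eval "$CMD" to execute it.
echo "$CMD"
```

Passing the same flag at both the compile and the link step is the usual choice when the two steps are separate.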
-fopenmp-extensions
Enable all Clang extensions for OpenMP directives and clauses. A list of current extensions and their implementation status can be found on the support page.
-fopenmp-simd
This option enables OpenMP only for single instruction, multiple data (SIMD) constructs.
-static-openmp
Use the static OpenMP host runtime while linking.
-fopenmp-version=<arg>
Set the OpenMP version to a specific version <arg> of the OpenMP standard. For example, you may use -fopenmp-version=45 to select version 4.5 of the OpenMP standard. The default value is -fopenmp-version=51 for Clang.
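The version selection described above might be combined with -fopenmp as in the following sketch. The source file name legacy.c is hypothetical, and the command is only printed, not run.

```shell
# Hypothetical source file; replace it with your own.
SRC=legacy.c

# Select OpenMP 4.5 instead of the Clang default mentioned above.
OMP_FLAGS="-fopenmp -fopenmp-version=45"

CMD="clang $OMP_FLAGS $SRC -o legacy"

# Print the command instead of running it; use: eval "$CMD" to execute it.
echo "$CMD"
```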
Offloading Specific Command-Line Arguments
-fopenmp-targets
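Specify which OpenMP offloading targets should be supported, for example -fopenmp-targets=amdgcn-amd-amdhsa,nvptx64. This option is often optional when --offload-arch is provided. It is also possible to offload to CPU architectures, for instance with -fopenmp-targets=x86_64-pc-linux-gnu.
--offload-arch
Specify the device architecture for OpenMP offloading, for instance --offload-arch=sm_80 to target an Nvidia Tesla A100, --offload-arch=gfx90a to target an AMD Instinct MI250X, or --offload-arch=sm_80,gfx90a to target both. It is also possible to specify -fopenmp-targets without specifying --offload-arch. In that case, the executables amdgpu-arch or nvptx-arch will be executed as part of the compiler driver to detect the device architecture automatically. The device architecture is also inferred automatically with --offload-arch=native.
--offload-device-only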
Compile only the code that goes on the device. This option is mainly for debugging purposes. It is primarily used for inspecting the intermediate representation (IR) output when compiling for the device. It may also be used if device-only runtimes are created.
--offload-host-only
Compile only the code that goes on the host. With this option enabled, the .llvm.offloading section with embedded device code will not be included in the intermediate representation.
--offload-host-device
Compile the target regions for both the host and the device. This is the default.
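To illustrate the IR-inspection use case mentioned for --offload-device-only, the sketch below builds a command that would emit textual LLVM IR for the device side only. The file names and the sm_80 architecture are placeholders, and pairing the option with -S -emit-llvm is an assumption about a typical workflow rather than something this reference prescribes. The command is printed, not run.

```shell
# Hypothetical file names and an example architecture; adjust to your setup.
SRC=app.c
ARCH=sm_80

# -S -emit-llvm asks Clang for textual LLVM IR; --offload-device-only
# restricts compilation to the device code.
CMD="clang -fopenmp --offload-arch=$ARCH --offload-device-only -S -emit-llvm $SRC -o app-device.ll"

# Print the command instead of running it; use: eval "$CMD" to execute it.
echo "$CMD"
```

Swapping --offload-device-only for --offload-host-only in the same command would restrict compilation to the host code instead.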
-Xopenmp-target <arg>
Pass an argument <arg> to the offloading toolchain, for instance -Xopenmp-target -march=sm_80.
-Xopenmp-target=<triple> <arg>
Pass an argument <arg> to the offloading toolchain for the target <triple>. That is especially useful when an argument must differ for each triple. For instance, use -Xopenmp-target=nvptx64 --offload-arch=sm_80 -Xopenmp-target=amdgcn --offload-arch=gfx90a to specify the device architecture. Alternatively, -Xarch_host <arg> and -Xarch_device <arg> can pass an argument to the host and device compilation toolchain.
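The per-triple form can be sketched as a full command line. The target list and triple spellings below are copied from the examples in this reference, the file name app.c is hypothetical, and the combination has not been verified against a particular Clang release. The command is printed, not run.

```shell
# Hypothetical source file; replace it with your own.
SRC=app.c

# Offloading targets, spelled as in the -fopenmp-targets example.
TARGETS="amdgcn-amd-amdhsa,nvptx64"

# One architecture argument per triple, as in the example above.
NVIDIA_ARGS="-Xopenmp-target=nvptx64 --offload-arch=sm_80"
AMD_ARGS="-Xopenmp-target=amdgcn --offload-arch=gfx90a"

CMD="clang -fopenmp -fopenmp-targets=$TARGETS $NVIDIA_ARGS $AMD_ARGS $SRC -o app"

# Print the command instead of running it; use: eval "$CMD" to execute it.
echo "$CMD"
```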
-Xoffload-linker<triple> <arg>
Pass an argument <arg> to the offloading linker for the target specified in <triple>.
-Xarch_device <arg>
Pass an argument <arg> to the device compilation toolchain.
-Xarch_host <arg>
Pass an argument <arg> to the host compilation toolchain.
-foffload-lto[=<arg>]
Enable device link time optimization (LTO) and select the LTO mode <arg>. Select either -foffload-lto=thin or -foffload-lto=full. Thin LTO takes less time while still achieving some performance gains. If no argument is set, this option defaults to -foffload-lto=full.
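A small sketch of choosing the device LTO mode follows. The file name app.c and the gfx90a architecture are placeholders, and the command is printed rather than executed.

```shell
# Hypothetical source file and example architecture; adjust to your setup.
SRC=app.c
ARCH=gfx90a

# Pick "thin" for shorter link times or "full" for the default behavior.
LTO_MODE=thin

CMD="clang -O2 -fopenmp --offload-arch=$ARCH -foffload-lto=$LTO_MODE $SRC -o app"

# Print the command instead of running it; use: eval "$CMD" to execute it.
echo "$CMD"
```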
-fopenmp-offload-mandatory
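Do not generate the host fallback code that is executed when offloading to the device fails. This option should not be used to verify that the code is being offloaded. Instead, set the environment variable OMP_TARGET_OFFLOAD='MANDATORY' to confirm that the code is being offloaded to the device.
-fopenmp-target-debug[=<arg>]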
Enable debugging in the device runtime library (RTL). Note that it is necessary both to configure debugging in the device runtime at compile time with -fopenmp-target-debug=<arg> and to enable debugging at runtime with the environment variable LIBOMPTARGET_DEVICE_RTL_DEBUG=<arg>. Further, it is currently only supported for Nvidia targets as of July 2023. Alternatively, the environment variable LIBOMPTARGET_DEBUG can be set to debug both Nvidia and AMD GPU targets. For more information, including the supported debugging arguments, see the debugging instructions.
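The two-step setup described above (a compile-time flag plus a runtime environment variable) can be sketched as follows. The value 1 for <arg> is only a placeholder, since the supported values are listed in the debugging instructions, and the file names and sm_80 architecture are hypothetical. Both commands are printed, not run.

```shell
# Placeholder debug value; consult the debugging instructions for real ones.
DEBUG_LEVEL=1

# Hypothetical source file; sm_80 is used because the text above notes
# Nvidia-only support for device RTL debugging.
SRC=app.c

# Step 1: configure debugging in the device runtime at compile time.
COMPILE_CMD="clang -fopenmp --offload-arch=sm_80 -fopenmp-target-debug=$DEBUG_LEVEL $SRC -o app"

# Step 2: enable debugging at runtime through the environment variable.
RUN_CMD="LIBOMPTARGET_DEVICE_RTL_DEBUG=$DEBUG_LEVEL ./app"

# Print both commands instead of running them.
echo "$COMPILE_CMD"
echo "$RUN_CMD"
```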
-fopenmp-target-jit
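Emit code that is just-in-time (JIT) compiled for OpenMP offloading. The optimization level can be set at runtime with the environment variable LIBOMPTARGET_JIT_OPT_LEVEL, for instance LIBOMPTARGET_JIT_OPT_LEVEL=3, corresponding to optimization level -O3. See the OpenMP JIT details for instructions on extracting the embedded device code before or after the JIT and more.
--offload-new-driver
In upstream LLVM, OpenMP only uses the new driver. However, this option must be enabled for experimental linking with CUDA or HIP files.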
In upstream LLVM, OpenMP only uses the new driver. However, enabling this option for experimental linking with CUDA or HIP files is necessary.
--offload-link
Use the new offloading linker clang-linker-wrapper to perform the link job. clang-linker-wrapper is the default offloading linker for OpenMP. This option enables the new offloading linker in toolchains that do not use it automatically. It must be enabled when linking with CUDA or HIP files.
-nogpulib
Do not link the device library for CUDA or HIP device compilation.
-nogpuinc
Do not include the default CUDA or HIP headers, and do not add CUDA or HIP include paths.