tilelang.contrib.nvcc
Utilities for invoking the NVCC compiler installed on the system.
Functions
| compile_cuda | Compile CUDA code with the NVCC found in the environment. |
| default_compile_options | Build a set of default NVCC compile options for TileLang generated sources. |
| get_ptx_from_source | Compile CUDA C++ source to PTX using NVCC and return it as text. |
| get_sass_from_source | Compile CUDA C++ source to CUBIN and disassemble it to SASS. |
| find_cuda_path | Find the CUDA installation root. |
| get_cuda_version | Get the CUDA version. |
| find_libdevice_path | Find the path to libdevice. |
| callback_libdevice_path | |
| get_target_compute_version | Get the compute capability of the compilation target. |
| parse_compute_version | Parse a compute capability string into its major and minor version numbers. |
| get_target_arch | |
| have_fp16 | Whether fp16 support is provided by the given compute capability. |
| have_int8 | Whether int8 support is provided by the given compute capability. |
| have_tensorcore | Whether TensorCore support is provided by the given compute capability. |
| have_cudagraph | Whether CUDA Graph support is provided. |
| have_bf16 | Whether bf16 support is provided by the given compute capability. |
| have_fp8 | Whether fp8 support is provided by the given compute capability. |
| have_tma | Whether TMA support is provided by the given compilation target. |
| is_hopper | |
| get_nvcc_compiler | Get the path to the nvcc compiler. |
Module Contents
- tilelang.contrib.nvcc.compile_cuda(code, target_format='ptx', arch=None, options=None, path_target=None, verbose=False)
Compile CUDA code with the NVCC found in the environment.
- Parameters:
code (str) – The CUDA source code.
target_format (str) – The output format requested from NVCC; defaults to 'ptx'.
arch (str) – The CUDA architecture.
options (str or list of str) – Additional options passed to NVCC.
path_target (str, optional) – Path of the output file.
- Returns:
cubin – The compiled output as a bytearray, in the format selected by target_format.
- Return type:
bytearray
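As a usage illustration, the following is a minimal sketch that calls compile_cuda with the signature documented above. The kernel, the KERNEL constant, and the try_compile_to_ptx helper are hypothetical; the helper returns None when TileLang or nvcc cannot be found, so the snippet stays runnable on machines without a CUDA toolchain.

```python
import shutil

# Hypothetical kernel used only for illustration.
KERNEL = r"""
extern "C" __global__ void add_one(float *x) {
    x[threadIdx.x] += 1.0f;
}
"""


def try_compile_to_ptx(code):
    """Return the compiled PTX, or None if TileLang or nvcc is unavailable."""
    if shutil.which("nvcc") is None:
        return None
    try:
        from tilelang.contrib import nvcc
    except (ImportError, OSError):
        return None
    # arch is left at its default (None); only documented arguments are used.
    return nvcc.compile_cuda(code, target_format="ptx")


ptx = try_compile_to_ptx(KERNEL)
if ptx is not None:
    # The documented return type is bytearray, so decode before printing.
    print(bytes(ptx).decode("utf-8", errors="replace")[:200])
```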
- tilelang.contrib.nvcc.default_compile_options(compile_flags=None)
Build a set of default NVCC compile options for TileLang generated sources.
Includes C++ standard and common include paths (TileLang templates, CUTLASS, CUDA include). Merges user-provided compile flags if given.
- Parameters:
compile_flags (Optional[List[str]]) – Additional flags to include. Items are split on whitespace.
- Returns:
A list of flags suitable for NVCC’s command line.
- Return type:
List[str]
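The whitespace-splitting merge described above can be sketched as follows. This is not the library's implementation: merge_flags is a hypothetical helper, the default flags shown are placeholders, and appending user flags after the defaults is an assumption.

```python
from typing import List, Optional


def merge_flags(defaults: List[str], extra: Optional[List[str]] = None) -> List[str]:
    """Append user flags to the defaults, splitting each item on whitespace."""
    merged = list(defaults)
    for item in extra or []:
        # "-O3 -lineinfo" becomes two separate command-line arguments.
        merged.extend(item.split())
    return merged


# Placeholder defaults, not the real TileLang option set.
flags = merge_flags(["-std=c++17"], ["-O3 -lineinfo", "-g"])
print(flags)
```

Splitting each item lets callers pass either one flag per list entry or several flags in a single string.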
- tilelang.contrib.nvcc.get_ptx_from_source(code, compile_flags=None, verbose=False)
Compile CUDA C++ source to PTX using NVCC and return as text.
- Parameters:
code (str) – CUDA C++ kernel source code.
compile_flags (Optional[List[str]]) – Additional flags merged with defaults.
verbose (bool) – Print NVCC output when True.
- Returns:
PTX text.
- Return type:
str
- tilelang.contrib.nvcc.get_sass_from_source(code, compile_flags=None, verbose=False)
Compile CUDA C++ source to CUBIN and disassemble to SASS.
Uses nvdisasm if available; otherwise falls back to cuobjdump.
- Parameters:
code (str) – CUDA C++ kernel source code.
compile_flags (Optional[List[str]]) – Additional flags merged with defaults.
verbose (bool) – Print tool outputs when True.
- Returns:
SASS text.
- Return type:
str
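The "nvdisasm if available, otherwise cuobjdump" fallback can be sketched with the standard library alone. pick_disassembler is a hypothetical helper, and raising RuntimeError when neither tool is found is an assumption; the library's actual behavior in that case is not documented here.

```python
import shutil


def pick_disassembler(which=shutil.which):
    """Return the path of the first available SASS disassembler."""
    # Preference order follows the description: nvdisasm first, then cuobjdump.
    for tool in ("nvdisasm", "cuobjdump"):
        path = which(tool)
        if path:
            return path
    raise RuntimeError("neither nvdisasm nor cuobjdump was found on PATH")
```

The which parameter defaults to shutil.which and exists only so the lookup can be replaced in tests.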
- tilelang.contrib.nvcc.find_cuda_path()
Find the CUDA installation root.
- Returns:
path – Path to the CUDA root.
- Return type:
str
- tilelang.contrib.nvcc.get_cuda_version(cuda_path=None)
Get the CUDA version.
- Parameters:
cuda_path (Optional[str]) – Path to the CUDA root. If None, find_cuda_path() is used.
- Returns:
version – The CUDA version.
- Return type:
float
- tilelang.contrib.nvcc.find_libdevice_path(arch)
Find the path to libdevice.
- Parameters:
arch (int) – The compute architecture, as an integer.
- Returns:
path – Path to libdevice.
- Return type:
str
- tilelang.contrib.nvcc.callback_libdevice_path(arch)
- tilelang.contrib.nvcc.get_target_compute_version(target=None)
Get the compute capability of the compilation target.
Looks for the target arch in three places, in order: the target argument, the Target.current() scope, and finally the GPU device (if one exists).
- Parameters:
target (tvm.target.Target, optional) – The compilation target
- Returns:
compute_version – compute capability of a GPU (e.g. “8.6” or “9.0”)
- Return type:
str
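The three-step lookup order can be sketched as a generic fallback chain. Everything here is hypothetical: resolve_compute_version and its callable arguments stand in for the real target, scope, and device queries, and raising ValueError when all three fail is an assumption.

```python
from typing import Callable, Optional


def resolve_compute_version(
    explicit: Optional[str],
    from_scope: Callable[[], Optional[str]],
    from_device: Callable[[], Optional[str]],
) -> str:
    """Return the first compute version found, checking sources in order."""
    # Order mirrors the description: explicit target, current scope, GPU device.
    for source in (lambda: explicit, from_scope, from_device):
        value = source()
        if value is not None:
            return value
    raise ValueError("no compute version could be determined")
```

Passing the scope and device lookups as callables means the later, possibly slower queries run only when the earlier sources yield nothing.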
- tilelang.contrib.nvcc.parse_compute_version(compute_version)
Parse a compute capability string into its major and minor version numbers.
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.0”)
- Returns:
major (int) – major version number
minor (int) – minor version number
- Return type:
tuple[int, int]
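To make the input and output shapes concrete, here is a minimal sketch of this kind of parsing. parse_cc is a hypothetical stand-in, not the library function, and it handles only plain "major.minor" strings.

```python
from typing import Tuple


def parse_cc(compute_version: str) -> Tuple[int, int]:
    """Split a 'major.minor' compute capability string into two ints."""
    parts = compute_version.split(".")
    if len(parts) != 2:
        raise ValueError(f"expected 'major.minor', got {compute_version!r}")
    return int(parts[0]), int(parts[1])


major, minor = parse_cc("8.6")
print(major, minor)
```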
- tilelang.contrib.nvcc.get_target_arch(compute_version)
- Return type:
str
- tilelang.contrib.nvcc.have_fp16(compute_version)
Whether fp16 support is provided by the given compute capability.
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.0”)
- tilelang.contrib.nvcc.have_int8(compute_version)
Whether int8 support is provided by the given compute capability.
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “6.1”)
- tilelang.contrib.nvcc.have_tensorcore(compute_version=None, target=None)
Whether TensorCore support is provided by the given compute capability.
- Parameters:
compute_version (str, optional) – compute capability of a GPU (e.g. “7.0”).
target (tvm.target.Target, optional) – The compilation target, will be used to determine arch if compute_version isn’t specified.
- tilelang.contrib.nvcc.have_cudagraph()
Whether CUDA Graph support is provided.
- tilelang.contrib.nvcc.have_bf16(compute_version)
Whether bf16 support is provided by the given compute capability.
- Parameters:
compute_version (str) – compute capability of a GPU (e.g. “8.0”)
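One plausible way to express capability checks such as have_fp16, have_int8, have_tensorcore, and have_bf16 is a comparison of (major, minor) tuples against a minimum. In the sketch below, supports and MIN_CC are hypothetical, and the thresholds are simply the example values from the parameter descriptions above; they are assumptions, not confirmed cutoffs used by the library.

```python
# Assumed minimums, copied from the docs' example values (not confirmed cutoffs).
MIN_CC = {
    "fp16": (6, 0),
    "int8": (6, 1),
    "tensorcore": (7, 0),
    "bf16": (8, 0),
}


def supports(feature: str, compute_version: str) -> bool:
    """Return True if compute_version meets the assumed minimum for feature."""
    major, minor = (int(part) for part in compute_version.split("."))
    # Tuples compare element by element, so (8, 6) >= (8, 0) and (6, 0) < (6, 1).
    return (major, minor) >= MIN_CC[feature]


print(supports("bf16", "8.6"), supports("int8", "6.0"))
```

Comparing tuples rather than floats avoids misordering versions such as "8.10" and "8.9".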
- tilelang.contrib.nvcc.have_fp8(compute_version)
Whether fp8 support is provided by the given compute capability.
- Parameters:
compute_version (str) – compute capability of a GPU
- tilelang.contrib.nvcc.have_tma(target)
Whether TMA support is provided by the given compilation target.
- Parameters:
target (tvm.target.Target) – The compilation target
- tilelang.contrib.nvcc.is_hopper(target)
- tilelang.contrib.nvcc.get_nvcc_compiler()
Get the path to the nvcc compiler.
- Return type:
str