tilelang.contrib.nvcc ===================== .. py:module:: tilelang.contrib.nvcc .. autoapi-nested-parse:: Utility to invoke nvcc compiler in the system Functions --------- .. autoapisummary:: tilelang.contrib.nvcc.compile_cuda tilelang.contrib.nvcc.find_cuda_path tilelang.contrib.nvcc.get_cuda_version tilelang.contrib.nvcc.tilelang_callback_cuda_compile tilelang.contrib.nvcc.find_libdevice_path tilelang.contrib.nvcc.callback_libdevice_path tilelang.contrib.nvcc.get_target_compute_version tilelang.contrib.nvcc.parse_compute_version tilelang.contrib.nvcc.have_fp16 tilelang.contrib.nvcc.have_int8 tilelang.contrib.nvcc.have_tensorcore tilelang.contrib.nvcc.have_cudagraph tilelang.contrib.nvcc.have_bf16 tilelang.contrib.nvcc.have_fp8 tilelang.contrib.nvcc.have_tma tilelang.contrib.nvcc.get_nvcc_compiler Module Contents --------------- .. py:function:: compile_cuda(code, target_format='ptx', arch=None, options=None, path_target=None, verbose=False) Compile cuda code with NVCC from env. :param code: The cuda code. :type code: str :param target_format: The target format of nvcc compiler. :type target_format: str :param arch: The cuda architecture. :type arch: str :param options: The additional options. :type options: str or list of str :param path_target: Output file. :type path_target: str, optional :returns: **cubin** -- The bytearray of the cubin :rtype: bytearray .. py:function:: find_cuda_path() Utility function to find cuda path :returns: **path** -- Path to cuda root. :rtype: str .. py:function:: get_cuda_version(cuda_path=None) Utility function to get cuda version :param cuda_path: Path to cuda root. If None is passed, will use `find_cuda_path()` as default. :type cuda_path: Optional[str] :returns: **version** -- The cuda version :rtype: float .. py:function:: tilelang_callback_cuda_compile(code, target) use nvcc to generate fatbin code for better optimization .. py:function:: find_libdevice_path(arch) Utility function to find libdevice :param arch: The compute architecture in int :type arch: int :returns: **path** -- Path to libdevice. :rtype: str .. py:function:: callback_libdevice_path(arch) .. py:function:: get_target_compute_version(target=None) Utility function to get compute capability of compilation target. Looks for the target arch in three different places, first in the target input, then the Target.current() scope, and finally the GPU device (if it exists). :param target: The compilation target :type target: tvm.target.Target, optional :returns: **compute_version** -- compute capability of a GPU (e.g. "8.6" or "9.0") :rtype: str .. py:function:: parse_compute_version(compute_version) Parse compute capability string to divide major and minor version :param compute_version: compute capability of a GPU (e.g. "6.0") :type compute_version: str :returns: * **major** (*int*) -- major version number * **minor** (*int*) -- minor version number .. py:function:: have_fp16(compute_version) Either fp16 support is provided in the compute capability or not :param compute_version: compute capability of a GPU (e.g. "6.0") :type compute_version: str .. py:function:: have_int8(compute_version) Either int8 support is provided in the compute capability or not :param compute_version: compute capability of a GPU (e.g. "6.1") :type compute_version: str .. py:function:: have_tensorcore(compute_version=None, target=None) Either TensorCore support is provided in the compute capability or not :param compute_version: compute capability of a GPU (e.g. "7.0"). :type compute_version: str, optional :param target: The compilation target, will be used to determine arch if compute_version isn't specified. :type target: tvm.target.Target, optional .. py:function:: have_cudagraph() Either CUDA Graph support is provided .. py:function:: have_bf16(compute_version) Either bf16 support is provided in the compute capability or not :param compute_version: compute capability of a GPU (e.g. "8.0") :type compute_version: str .. py:function:: have_fp8(compute_version) Whether fp8 support is provided in the specified compute capability or not :param compute_version: GPU capability :type compute_version: str .. py:function:: have_tma(target) Whether TMA support is provided in the specified compute capability or not :param target: The compilation target :type target: tvm.target.Target .. py:function:: get_nvcc_compiler() Get the path to the nvcc compiler