tilelang.autotuner.param¶

Parameter and result types for the auto-tuner, together with the file names it uses for on-disk artifacts.

Attributes¶

`BEST_CONFIG_PATH`
`FUNCTION_PATH`
`LATENCY_PATH`
`KERNEL_PATH`
`WRAPPED_KERNEL_PATH`
`KERNEL_LIB_PATH`
`PARAMS_PATH`

Classes¶

`CompileArgs`	Compile arguments for the auto-tuner. Detailed description can be found in tilelang.jit.compile.
`ProfileArgs`	Profile arguments for the auto-tuner.
`AutotuneResult`	Results from the auto-tuning process.

Module Contents¶

tilelang.autotuner.param.BEST_CONFIG_PATH = 'best_config.json'¶

tilelang.autotuner.param.FUNCTION_PATH = 'function.pkl'¶

tilelang.autotuner.param.LATENCY_PATH = 'latency.json'¶

tilelang.autotuner.param.KERNEL_PATH = 'kernel.cu'¶

tilelang.autotuner.param.WRAPPED_KERNEL_PATH = 'wrapped_kernel.cu'¶

tilelang.autotuner.param.KERNEL_LIB_PATH = 'kernel_lib.so'¶

tilelang.autotuner.param.PARAMS_PATH = 'params.pkl'¶
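The constants above are bare file names, so they are presumably joined onto a directory when results are saved or loaded. The following is a minimal sketch of that layout. It copies the values into a local dict instead of importing tilelang, and `artifact_paths` and the `autotune_cache` directory name are hypothetical, introduced only for illustration.

```python
from pathlib import Path

# Values copied from the module attributes documented above.
ARTIFACT_NAMES = {
    "BEST_CONFIG_PATH": "best_config.json",
    "FUNCTION_PATH": "function.pkl",
    "LATENCY_PATH": "latency.json",
    "KERNEL_PATH": "kernel.cu",
    "WRAPPED_KERNEL_PATH": "wrapped_kernel.cu",
    "KERNEL_LIB_PATH": "kernel_lib.so",
    "PARAMS_PATH": "params.pkl",
}


def artifact_paths(cache_dir):
    """Hypothetical helper: map each constant name to a path inside cache_dir."""
    root = Path(cache_dir)
    return {name: root / filename for name, filename in ARTIFACT_NAMES.items()}


# "autotune_cache" is an illustrative directory name, not a tilelang default.
paths = artifact_paths("autotune_cache")
for name, path in paths.items():
    print(f"{name:20s} -> {path}")
```

Keeping the names in one place means a save routine and a load routine cannot drift apart on what each file is called.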

class tilelang.autotuner.param.CompileArgs¶

Compile arguments for the auto-tuner. Detailed description can be found in tilelang.jit.compile.

out_idx¶: List of output tensor indices.

execution_backend¶: Execution backend to use for kernel execution (default: “cython”).

target¶: Compilation target, either as a string or a TVM Target object (default: “auto”).

target_host¶: Target host for cross-compilation (default: None).

verbose¶: Whether to enable verbose output (default: False).

pass_configs¶: Additional keyword arguments to pass to the Compiler PassContext.

Available options:

“tir.disable_vectorize”: bool, default: False
“tl.disable_tma_lower”: bool, default: False
“tl.disable_warp_specialized”: bool, default: False
“tl.config_index_bitwidth”: int, default: None
“tl.disable_dynamic_tail_split”: bool, default: False
“tl.dynamic_vectorize_size_bits”: int, default: 128
“tl.disable_safe_memory_legalize”: bool, default: False
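Because pass_configs is a plain dictionary, a misspelled key or a value of the wrong type is easy to pass by accident. The sketch below checks a dictionary against the option names, types and defaults listed above. `check_pass_configs` is a hypothetical helper, not part of tilelang, and the values in the example call are illustrative only.

```python
# Option names, types and defaults copied from the list above.
KNOWN_PASS_CONFIGS = {
    "tir.disable_vectorize": (bool, False),
    "tl.disable_tma_lower": (bool, False),
    "tl.disable_warp_specialized": (bool, False),
    "tl.config_index_bitwidth": (int, None),
    "tl.disable_dynamic_tail_split": (bool, False),
    "tl.dynamic_vectorize_size_bits": (int, 128),
    "tl.disable_safe_memory_legalize": (bool, False),
}


def check_pass_configs(pass_configs):
    """Hypothetical validator: reject unknown keys and wrong value types."""
    for key, value in pass_configs.items():
        if key not in KNOWN_PASS_CONFIGS:
            raise KeyError(f"unknown pass config: {key}")
        expected_type, _default = KNOWN_PASS_CONFIGS[key]
        # bool is a subclass of int in Python, so rule it out explicitly.
        if expected_type is int and isinstance(value, bool):
            raise TypeError(f"{key} expects int, got bool")
        if not isinstance(value, expected_type):
            raise TypeError(f"{key} expects {expected_type.__name__}")
    return pass_configs


# Illustrative values; whether they suit a given kernel is not implied.
pass_configs = check_pass_configs({
    "tl.disable_warp_specialized": True,
    "tl.config_index_bitwidth": 64,
})
print(pass_configs)
```

The list above may not be exhaustive for every tilelang version, so rejecting unknown keys is a deliberately strict choice in this sketch.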

out_idx: List[int] | int | None = None¶

execution_backend: Literal['dlpack', 'ctypes', 'cython'] = 'cython'¶

target: Literal['auto', 'cuda', 'hip'] = 'auto'¶

target_host: str | tvm.target.Target = None¶

verbose: bool = False¶

pass_configs: Dict[str, Any] | None = None¶

compile_program(program)¶

Parameters:

program (tvm.tir.PrimFunc)

__hash__()¶
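The page does not say how `__hash__` is computed. One plausible reason to define it is so that compile arguments can serve as a cache key even though pass_configs is a dictionary, which Python cannot hash directly. The sketch below shows one way to do that with a content hash. `CompileArgsSketch` is a stand-in that mirrors the documented fields with simplified types, and the JSON plus SHA-256 scheme is an assumption, not the library's confirmed implementation.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class CompileArgsSketch:
    """Stand-in mirroring the documented CompileArgs fields (types simplified)."""
    out_idx: Union[List[int], int, None] = None
    execution_backend: str = "cython"
    target: str = "auto"
    target_host: Optional[str] = None
    verbose: bool = False
    pass_configs: Optional[Dict[str, Any]] = None

    def __hash__(self):
        # Assumed scheme: serialize with sorted keys so dict ordering
        # does not change the result, then hash the bytes.
        payload = json.dumps(asdict(self), sort_keys=True)
        digest = hashlib.sha256(payload.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")


a = CompileArgsSketch(pass_configs={"tl.disable_tma_lower": True,
                                    "tir.disable_vectorize": False})
b = CompileArgsSketch(pass_configs={"tir.disable_vectorize": False,
                                    "tl.disable_tma_lower": True})

# Equal contents give equal hashes, so either object finds the same entry.
cache = {a: "compiled-kernel-placeholder"}
print(hash(a) == hash(b), cache[b])
```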

class tilelang.autotuner.param.ProfileArgs¶

Profile arguments for the auto-tuner.

warmup¶: Number of warmup iterations.

rep¶: Number of repetitions for timing.

timeout¶: Maximum time allowed per configuration, presumably in seconds.

supply_type¶: Type of tensor supply mechanism.

ref_prog¶: Reference program for correctness validation.

supply_prog¶: Supply program for input tensors.

out_idx¶: Union[List[int], int] = -1

supply_type¶: tilelang.TensorSupplyType = tilelang.TensorSupplyType.Auto

ref_prog¶: Callable = None

supply_prog¶: Callable = None

rtol¶: float = 1e-2

atol¶: float = 1e-2

max_mismatched_ratio¶: float = 0.01

skip_check¶: bool = False

manual_check_prog¶: Callable = None

cache_input_tensors¶: bool = True

warmup: int = 25¶

rep: int = 100¶

timeout: int = 30¶

supply_type: tilelang.TensorSupplyType¶

ref_prog: Callable = None¶

supply_prog: Callable = None¶

rtol: float = 0.01¶

atol: float = 0.01¶

max_mismatched_ratio: float = 0.01¶

skip_check: bool = False¶

manual_check_prog: Callable = None¶

cache_input_tensors: bool = True¶
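The page lists rtol, atol and max_mismatched_ratio without saying how they combine. A common convention, assumed here and not confirmed for tilelang, is to count an element as mismatched when |actual - expected| > atol + rtol * |expected|, and to accept the result when the mismatched fraction does not exceed max_mismatched_ratio. The sketch uses plain Python lists in place of tensors, and both function names are hypothetical.

```python
def mismatch_ratio(actual, expected, rtol=0.01, atol=0.01):
    """Fraction of elements with |a - e| > atol + rtol * |e| (assumed criterion)."""
    if len(actual) != len(expected):
        raise ValueError("inputs must have the same length")
    if not actual:
        return 0.0
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(actual)


def passes_check(actual, expected, rtol=0.01, atol=0.01,
                 max_mismatched_ratio=0.01):
    """Accept when the mismatched fraction stays within the allowed ratio."""
    return mismatch_ratio(actual, expected, rtol, atol) <= max_mismatched_ratio


expected = [1.0] * 200
# 1.015 is inside the 0.02 tolerance band around 1.0; 1.5 is outside it.
actual = [1.0] * 198 + [1.015, 1.5]

print(mismatch_ratio(actual, expected))
print(passes_check(actual, expected))
print(passes_check(actual, expected, max_mismatched_ratio=0.001))
```

With the documented defaults, one bad element out of 200 is tolerated in this sketch, which is the point of a ratio-based threshold: a few outliers from reduced-precision arithmetic need not fail an otherwise correct configuration.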

__hash__()¶

class tilelang.autotuner.param.AutotuneResult¶

Results from the auto-tuning process.

latency¶: Best achieved execution latency.

config¶: Configuration that produced the best result.

ref_latency¶: Reference implementation latency.

libcode¶: Generated library code.

func¶: Optimized function.

kernel¶: Compiled kernel function.

latency: float | None = None¶

config: dict | None = None¶

ref_latency: float | None = None¶

libcode: str | None = None¶

func: Callable | None = None¶

kernel: Callable | None = None¶

save_to_disk(path, verbose=False)¶

Parameters:

path (pathlib.Path)
verbose (bool)

classmethod load_from_disk(path, compile_args)¶

Parameters:

path (pathlib.Path)
compile_args (CompileArgs)

Return type:

AutotuneResult
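To illustrate the save and load round trip without a GPU or a tilelang install, the sketch below persists only the JSON-friendly fields (latency, config, ref_latency) using the documented file names `best_config.json` and `latency.json`. `TuningRecord` is a hypothetical stand-in, the JSON schema is an assumption, and the config keys are made up for the example. The real `load_from_disk` also takes a `CompileArgs`; that argument is omitted here because this stand-in stores no kernel.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# File names as documented for this module.
BEST_CONFIG_PATH = "best_config.json"
LATENCY_PATH = "latency.json"


@dataclass
class TuningRecord:
    """Hypothetical stand-in for the JSON-friendly part of AutotuneResult."""
    latency: Optional[float] = None
    config: Optional[dict] = None
    ref_latency: Optional[float] = None

    def save_to_disk(self, path: Path) -> None:
        path.mkdir(parents=True, exist_ok=True)
        (path / BEST_CONFIG_PATH).write_text(json.dumps(self.config))
        # Assumed schema: both latencies stored in one JSON object.
        timings = {"latency": self.latency, "ref_latency": self.ref_latency}
        (path / LATENCY_PATH).write_text(json.dumps(timings))

    @classmethod
    def load_from_disk(cls, path: Path) -> "TuningRecord":
        config = json.loads((path / BEST_CONFIG_PATH).read_text())
        timings = json.loads((path / LATENCY_PATH).read_text())
        return cls(latency=timings["latency"], config=config,
                   ref_latency=timings["ref_latency"])


with tempfile.TemporaryDirectory() as tmp:
    # Config keys are illustrative, not taken from a real kernel.
    record = TuningRecord(latency=0.42,
                          config={"block_M": 128, "block_N": 64},
                          ref_latency=0.97)
    record.save_to_disk(Path(tmp) / "result")
    restored = TuningRecord.load_from_disk(Path(tmp) / "result")

print(restored == record)
```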