tilelang.profiler package
Submodules
Module contents
Profiler and convert-to-torch utilities.
- class tilelang.profiler.Profiler(params: List[KernelParam], result_idx: List[int], supply_type: TensorSupplyType, adapter: Optional[BaseKernelAdapter] = None)
Bases: object
A profiler class for benchmarking and validating kernel implementations.
- params
List of kernel parameters defining the input/output specifications
- Type:
List[KernelParam]
- result_idx
Indices indicating which parameters are output tensors
- Type:
List[int]
- supply_type
Type of tensor supply to use (e.g., random or zeros)
- adapter
Optional kernel adapter for interfacing with different backends
- Type:
Optional[BaseKernelAdapter]
- adapter: Optional[BaseKernelAdapter] = None
- assert_allclose(reference_program: Callable, input_tensors: Optional[List[torch.Tensor]] = None, atol: float = 0.01, rtol: float = 0.01, max_mismatched_ratio=0.01)
Validates kernel output against a reference implementation.
- Parameters:
reference_program – Reference implementation to compare against
input_tensors – Optional pre-generated input tensors
atol – Absolute tolerance for comparison
rtol – Relative tolerance for comparison
max_mismatched_ratio – Maximum allowed ratio of mismatched elements
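To make the interaction between `atol`, `rtol`, and `max_mismatched_ratio` concrete, here is a minimal pure-Python sketch. It is not the tilelang implementation: the helper names `mismatched_ratio` and `check_allclose` are hypothetical, and both the per-element rule `|actual - expected| <= atol + rtol * |expected|` and the inclusive `<=` comparison against the ratio are assumptions.

```python
from typing import Sequence


def mismatched_ratio(actual: Sequence[float], expected: Sequence[float],
                     atol: float = 0.01, rtol: float = 0.01) -> float:
    """Return the fraction of elements outside the combined tolerance.

    Assumed rule (not confirmed by the tilelang docs): an element matches
    when |actual - expected| <= atol + rtol * |expected|.
    """
    if len(actual) != len(expected):
        raise ValueError("inputs must have the same length")
    if not actual:
        return 0.0
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(actual)


def check_allclose(actual: Sequence[float], expected: Sequence[float],
                   atol: float = 0.01, rtol: float = 0.01,
                   max_mismatched_ratio: float = 0.01) -> bool:
    """Accept the result if few enough elements are out of tolerance."""
    ratio = mismatched_ratio(actual, expected, atol, rtol)
    return ratio <= max_mismatched_ratio  # inclusive bound is an assumption
```

Under these assumptions, a 100-element output with a single bad element would still be accepted at the default `max_mismatched_ratio=0.01`, while two bad elements would be rejected.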
- assert_consistent(repeat=10)
Checks for kernel consistency across multiple runs.
- Parameters:
repeat – Number of times to repeat the consistency check
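The idea of a consistency check can be sketched without a GPU: run the same callable several times on identical inputs and compare each result with the first one. The helper `is_consistent` below is hypothetical and only illustrates the concept. How tilelang compares outputs internally (exact equality or a tolerance) is not stated here, so exact equality is an assumption.

```python
from typing import Callable, List


def is_consistent(kernel: Callable[[List[float]], List[float]],
                  inputs: List[float], repeat: int = 10) -> bool:
    """Return True if `kernel` yields identical output on every run."""
    # Pass a fresh copy each time so a kernel that mutates its
    # input cannot influence later runs.
    baseline = kernel(list(inputs))
    for _ in range(repeat):
        if kernel(list(inputs)) != baseline:
            return False  # non-deterministic or stateful behaviour
    return True
```

A check like this catches race conditions and reliance on uninitialised memory, which tend to produce run-to-run differences even when a single run happens to match the reference.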
- determine_profiler(func: Optional[Callable] = None)
Determines which profiler backend to use based on function type.
- Parameters:
func – Function to be profiled
profiler – Explicitly specified profiler type or "auto" for automatic detection
- Returns:
The determined profiler type ("torch" or "tvm")
- Return type:
str
- do_bench(func: Optional[Callable] = None, warmup: int = 25, rep: int = 100, n_warmup: int = 1, n_repeat: int = 1, input_tensors: Optional[List[torch.Tensor]] = None) -> float
Benchmarks the execution time of a given function.
- Parameters:
func – Function to benchmark (uses adapter if None)
warmup – Warmup time in milliseconds
rep – Number of repetitions for timing
n_warmup – Number of warmup iterations
n_repeat – Number of timing iterations
profiler – Which profiling backend to use
input_tensors – Optional pre-generated input tensors
- Returns:
Average execution time in milliseconds
- Return type:
float
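The warmup-then-measure pattern behind a benchmark of this kind can be sketched on the CPU with the standard library. The helper `bench_ms` is hypothetical and is not how `do_bench` is implemented. It only mirrors the `n_warmup` and `n_repeat` iteration counts and the millisecond return value. A real GPU benchmark would also need device synchronisation (for example CUDA events), which is omitted here.

```python
import time
from typing import Callable


def bench_ms(func: Callable[[], object], n_warmup: int = 1,
             n_repeat: int = 1) -> float:
    """Return the average wall-clock time of `func` in milliseconds."""
    if n_repeat < 1:
        raise ValueError("n_repeat must be at least 1")
    # Warmup runs are executed but not timed, so one-off costs
    # (caches, lazy initialisation) do not skew the average.
    for _ in range(n_warmup):
        func()
    start = time.perf_counter()
    for _ in range(n_repeat):
        func()
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1e3 / n_repeat


if __name__ == "__main__":
    avg = bench_ms(lambda: sum(range(10_000)), n_warmup=3, n_repeat=20)
    print(f"average: {avg:.4f} ms")
```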
- property func
- params: List[KernelParam]
- result_idx: List[int]
- run_once(func: Optional[Callable] = None)
- supply_type: TensorSupplyType
- with_default_adapter(adapter: BaseKernelAdapter) -> Profiler