tilelang.language.kernel ======================== .. py:module:: tilelang.language.kernel .. autoapi-nested-parse:: The language interface for tl programs. Classes ------- .. autoapisummary:: tilelang.language.kernel.FrameStack tilelang.language.kernel.KernelLaunchFrame Functions --------- .. autoapisummary:: tilelang.language.kernel.Kernel tilelang.language.kernel.get_thread_binding tilelang.language.kernel.get_thread_bindings tilelang.language.kernel.get_block_binding tilelang.language.kernel.get_block_bindings tilelang.language.kernel.get_thread_extent tilelang.language.kernel.get_thread_extents tilelang.language.kernel.get_block_extent tilelang.language.kernel.get_block_extents Module Contents --------------- .. py:class:: FrameStack A simple stack-like wrapper around a deque that provides push, pop, and top methods for convenience. .. py:method:: push(item) Pushes an item onto the top of the stack. .. py:method:: pop() Pops and returns the top of the stack, or returns None if the stack is empty. .. py:method:: top() Returns the item on the top of the stack without removing it, or None if the stack is empty. .. py:method:: size() Returns the number of items in the stack. .. py:method:: __len__() Returns the number of items in the stack. .. py:method:: __bool__() Allows truthy checks on the stack object itself, e.g., 'if stack: ...' .. py:class:: KernelLaunchFrame Bases: :py:obj:`tvm.script.ir_builder.tir.frame.TIRFrame` KernelLaunchFrame is a custom TIRFrame that manages block/thread indices and handles the entry and exit of the kernel launch scope. .. py:method:: __enter__() Enters the KernelLaunchFrame scope and pushes this frame onto the stack. Returns one Var if we detect exactly 5 frames (meaning there is a single block dimension), or a list of Vars otherwise. .. py:method:: __exit__(ptype, value, trace) Exits the KernelLaunchFrame scope and pops this frame from the stack, but only if it's indeed the topmost frame. .. py:method:: Current() :classmethod: Returns the topmost (current) KernelLaunchFrame from the stack if it exists, or None if the stack is empty. .. py:method:: get_block_extent(dim) Returns the block extent for the given dimension. dim=0 corresponds to blockIdx.x, dim=1 to blockIdx.y, and dim=2 to blockIdx.z. .. py:method:: get_block_extents() Returns the block extents for all three dimensions. .. py:method:: get_thread_extent(dim) Returns the thread extent for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z. .. py:method:: get_thread_extents() Returns the thread extents for all three dimensions. .. py:method:: get_thread_binding(dim = 0) Returns the thread binding for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z. .. py:method:: get_thread_bindings() Returns the thread binding for the given dimension. dim=0 corresponds to threadIdx.x, dim=1 to threadIdx.y, and dim=2 to threadIdx.z. .. py:method:: get_num_threads() Returns the thread indices from the topmost frame. .. py:method:: get_block_binding(dim = 0) Returns the block binding for the given dimension. dim=0 corresponds to blockIdx.x, dim=1 to blockIdx.y, and dim=2 to blockIdx.z. .. py:method:: get_block_bindings() Returns all three block bindings. .. py:property:: blocks :type: List[tvm.tir.Var] Returns the block indices from the topmost frame. .. py:property:: threads :type: List[tvm.tir.Var] Returns the thread indices from the topmost frame. .. py:property:: num_threads :type: int Returns the total number of threads. .. py:function:: Kernel(*blocks, threads = None, is_cpu = False, prelude = None) Tools to quickly construct a GPU kernel launch frame. :param blocks: A list of extent, can be 1-3 dimension, representing gridDim.(x|y|z) :type blocks: List[int] :param threads: A integer representing blockDim.x Or a list of integers representing blockDim.(x|y|z) if the value is -1, we skip the threadIdx.x binding. :type threads: int :param is_cpu: Whether the kernel is running on CPU. Thus we will not bind threadIdx.x, threadIdx.y, threadIdx.z. and blockIdx.x, blockIdx.y, blockIdx.z. :type is_cpu: bool :param prelude: The import c code of the kernel, will be injected before the generated kernel code. :type prelude: str :returns: **res** -- The result LaunchThreadFrame. :rtype: Tuple[frame.LaunchThreadFrame] .. py:function:: get_thread_binding(dim = 0) Returns the thread binding for the given dimension. .. py:function:: get_thread_bindings() Returns all three thread bindings. .. py:function:: get_block_binding(dim = 0) Returns the block binding for the given dimension. .. py:function:: get_block_bindings() Returns all three block bindings. .. py:function:: get_thread_extent(dim = 0) Returns the thread extent for the given dimension. .. py:function:: get_thread_extents() Returns all three thread extents. .. py:function:: get_block_extent(dim = 0) Returns the block extent for the given dimension. .. py:function:: get_block_extents() Returns all three block extents.