LetStmt Inlining in TileLang
This document explains how LetStmt inlining works in TileLang’s simplification pipeline, an important optimization that affects code generation and performance.
Overview
A LetStmt (Let Statement) is a temporary variable binding in the IR (Intermediate Representation). During compilation, TileLang’s simplifier may inline these temporary variables to simplify the code. TileLang also provides a standalone LetInline pass that performs eager substitution before the main legalization pipeline. However, not all LetStmt nodes can be safely inlined.
When Does LetStmt Get Inlined?
The inlining logic is implemented in src/transform/simplify.cc. A LetStmt will be inlined if both of the following conditions are met:
1. The value satisfies CanInlineLetStmt
The CanInlineLetStmt helper returns true when any of the following holds:
The value is a constant (is_const_number(op->value) returns true)
The value is a variable (op->value.as<VarNode>() returns a node)
The value is an integer expression without side effects:
    The value has int dtype
    The side effect level is kPure or lower (no observable side effects)
bool CanInlineLetStmt(const LetStmtNode *op) {
  if (is_const_number(op->value))
    return true;
  if (op->value.as<VarNode>())
    return true;
  // Won't face the deep expression explosion problem as in Let expression.
  // attempt to inline as much as possible if the value integer type(can be
  // index).
  if (!op->value.dtype().is_int())
    return false;
  return SideEffect(op->value) <= CallEffectKind::kPure;
}
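The decision order of the helper above can be mirrored on a toy expression model. The following is a minimal Python sketch, not TVM or TileLang code: the classes Const, Var, and Expr, the function can_inline_let_value, and the numeric effect levels are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Hypothetical effect levels for this sketch; only their ordering matters.
K_PURE = 1
K_UPDATE_STATE = 3


@dataclass
class Const:
    value: int


@dataclass
class Var:
    name: str
    dtype: str = "int32"


@dataclass
class Expr:
    text: str    # human-readable form, unused by the predicate
    dtype: str   # e.g. "int32" or "float32"
    effect: int  # one of the effect levels above


def can_inline_let_value(value) -> bool:
    """Toy counterpart of CanInlineLetStmt, applied to the bound value."""
    if isinstance(value, Const):   # case 1: constants
        return True
    if isinstance(value, Var):     # case 2: plain variables
        return True
    if not value.dtype.startswith("int"):  # case 3 requires an int dtype
        return False
    return value.effect <= K_PURE  # ... and no observable side effects


print(can_inline_let_value(Expr("M * 16", "int32", K_PURE)))
print(can_inline_let_value(Expr("x * 0.5", "float32", K_PURE)))
```

Under these assumptions, the first call reports that a pure integer expression may be inlined, while the second rejects a pure floating-point expression because of its dtype.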
2. The variable is NOT used in buffer definitions
Even if CanInlineLetStmt returns true, the variable will not be inlined if it’s used in a buffer’s definition (shape, strides, elem_offset, or data fields).
This protection exists because:
Buffer definitions are not updated during the simplification pass
If a variable used in a buffer definition is inlined, later references to that buffer would fail to find the variable definition
This would cause compilation errors or incorrect behavior
The mutator checks this before dropping the binding:
bool used_in_buffer_def = used_in_buffer_def_.count(op->var.get());
if (can_inline && !used_in_buffer_def) {
  return body; // Inline: remove LetStmt and return body directly
}
Example: Why Buffer Definition Variables Are Protected
Consider this code:
let stride = M * 16
let buffer_a = Buffer(data, shape=[M, N], strides=[stride, 1])
buffer_a[i, j] = ...
stride satisfies CanInlineLetStmt (it’s an int expression with no side effects)
However, stride is used in buffer_a’s strides field
Inlining it would require the buffer definition to become strides=[M*16, 1]
But the Buffer object’s fields are not updated during simplification, so they would still refer to stride
Later code accessing buffer_a would then fail to find the stride variable
Therefore, stride is added to used_in_buffer_def_ and will not be inlined.
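The interaction of the two conditions in the stride example can be sketched on a tiny tuple-based IR. This is an illustrative Python model, not the actual mutator: the tuple encoding and the names substitute, simplify, and can_inline are assumptions made for this sketch.

```python
def substitute(node, name, value):
    """Replace variable `name` with `value` inside a toy IR node."""
    if isinstance(node, str):
        return value if node == name else node
    if isinstance(node, tuple):
        # node[0] is an operator tag such as "let" or "store"; keep it as is.
        return (node[0],) + tuple(substitute(c, name, value) for c in node[1:])
    return node  # integer literal


def simplify(stmt, used_in_buffer_def, can_inline):
    """Drop a let binding only if its value is inlinable AND the variable
    does not appear in any buffer definition."""
    if stmt[0] != "let":
        return stmt
    _, var, value, body = stmt
    body = simplify(body, used_in_buffer_def, can_inline)
    if can_inline(value) and var not in used_in_buffer_def:
        return substitute(body, var, value)  # inline: the binding disappears
    return ("let", var, value, body)         # protected: the binding stays


program = (
    "let", "stride", ("mul", "M", 16),
    ("let", "idx", ("add", "i", 1),
     ("store", "buffer_a", "idx", 0)),
)

# Assume every value passes the first condition; `stride` is protected
# because buffer_a's definition refers to it.
result = simplify(program, {"stride"}, lambda value: True)
print(result)
```

In this model the idx binding is substituted into the store, while the stride binding survives because it is listed as used in a buffer definition.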
How Variables Are Collected
The CollectVarsUsedInBufferDefinition helper traverses all BufferLoad and BufferStore nodes and collects the variables used in their buffer definitions:
void VisitBuffer(const Buffer &buf) {
  // Collect variables that should remain defined
  VarUseDefAnalyzer usage(Array<Var>{});
  usage(buf->data);
  for (const auto &dim : buf->shape) {
    usage(dim);
  }
  for (const auto &dim : buf->strides) {
    usage(dim);
  }
  usage(buf->elem_offset);
  // Track for use in LetStmtNode mutator
  for (const auto &var : usage.undefined_) {
    used_in_buffer_def_.insert(var.get());
  }
}
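The same collection step can be approximated in a few lines of Python. This sketch assumes a simplified buffer record and a tuple-based expression encoding; ToyBuffer, free_vars, and collect_vars_used_in_buffer_def are hypothetical names, not TileLang APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBuffer:
    data: str
    shape: list
    strides: list = field(default_factory=list)
    elem_offset: object = 0


def free_vars(expr):
    """Return the variable names appearing in a toy expression.
    Strings are variables, tuples are (op, *operands), ints are literals."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        names = set()
        for child in expr[1:]:
            names |= free_vars(child)
        return names
    return set()


def collect_vars_used_in_buffer_def(buffers):
    """Gather every variable referenced by data, shape, strides, elem_offset."""
    used = set()
    for buf in buffers:
        used |= free_vars(buf.data)
        for dim in buf.shape:
            used |= free_vars(dim)
        for stride in buf.strides:
            used |= free_vars(stride)
        used |= free_vars(buf.elem_offset)
    return used


buffer_a = ToyBuffer(data="data", shape=["M", "N"], strides=["stride", 1])
used = collect_vars_used_in_buffer_def([buffer_a])
print(sorted(used))
```

Every name in the resulting set would be excluded from inlining, which is why stride in the earlier example keeps its LetStmt.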
Practical Example: Temporary Variable Issue
Consider this TileLang code:
for i in T.Parallel(block_N):
    idx = bx * block_N + i
    tmp = T.max(A[idx], 1)
    B[idx] = tmp / 2
    A[idx] = tmp * 2
In this case:
tmp is an integer-like temporary variable
It satisfies CanInlineLetStmt (pure int expression)
It’s not used in any buffer definition
Therefore, tmp will be inlined
This means the IR becomes:
for i in T.Parallel(block_N):
    idx = bx * block_N + i
    B[idx] = T.max(A[idx], 1) / 2
    A[idx] = T.max(A[idx], 1) * 2
If this causes issues (e.g., A[idx] being read twice and yielding different values because of an intervening write), it indicates a potential problem with either the inlining heuristic or the code pattern.
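One way the double read can change results is when the intervening store touches the same memory. The plain-Python sketch below models the two forms of the loop body with lists. It assumes, purely for illustration, that A and B may alias; the original example does not state this. It also uses integer division (//) in place of /.

```python
def with_tmp(A, B, idx):
    # Original form: A[idx] is read once and the value is reused.
    tmp = max(A[idx], 1)
    B[idx] = tmp // 2
    A[idx] = tmp * 2


def inlined(A, B, idx):
    # Inlined form: A[idx] is read again after the store to B[idx].
    B[idx] = max(A[idx], 1) // 2
    A[idx] = max(A[idx], 1) * 2


def run(kernel, alias):
    A = [3]
    B = A if alias else [0]  # alias=True makes B the same list as A
    kernel(A, B, 0)
    return A, B


print(run(with_tmp, alias=False), run(inlined, alias=False))
print(run(with_tmp, alias=True), run(inlined, alias=True))
```

With distinct buffers both forms agree. When B aliases A, the inlined form re-reads the value just written through B and produces a different result, which is the kind of discrepancy to look for when debugging.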
Controlling Let Inlining via Pass Config
TileLang exposes an explicit pass configuration key, tilelang.PassConfigKey.TL_FORCE_LET_INLINE ("tl.force_let_inline"), that allows users to force the eager LetInline pass to run before the legalization pipeline begins. When enabled, the pipeline invokes tilelang.transform.LetInline() at the start of LowerAndLegalize (see tilelang/engine/phase.py). This knob is useful when debugging LetStmt-related issues or when deterministic inlining behavior is desired across different environments.
from tilelang import transform
from tilelang.engine.phase import LowerAndLegalize

with transform.PassContext(
    config={transform.PassConfigKey.TL_FORCE_LET_INLINE: True}
):
    lowered_mod = LowerAndLegalize(input_mod, target)
If the flag is left unset (the default), the eager pass is only applied when downstream transforms opt in (for example, by calling _Simplify(..., inline_let=True) inside Tile operators). The guard in tilelang/engine/phase.py ensures that the pipeline-level eager pass runs only when the user explicitly requests it.
Summary
The LetStmt inlining mechanism is a conservative optimization that:
Aggressively inlines simple, pure integer expressions to simplify the IR
Protects variables used in buffer definitions to avoid breaking buffer access
Helps reduce IR complexity and improve code generation
Can be forced through TL_FORCE_LET_INLINE when deterministic eager inlining is required
Understanding when inlining happens is crucial for:
Debugging compilation issues
Understanding generated code
Writing efficient TileLang programs
Identifying potential optimization opportunities or bugs