AscendC passes

-ascendc-allocate-tensor

Allocate tensors for LocalTensorAuto

-ascendc-compute-memory-consumption

Compute memory consumption for kernel functions

-ascendc-declare-py-struct

Insert emitasc.declare_py_struct

-ascendc-define-cube-only

Insert emitc.define if CUBE-ONLY

-ascendc-detect-enable-debug

Check whether the kernel is using debug utils

-ascendc-detect-kernel-type

Check whether the kernel is vector-only or mixed

-ascendc-erase-sync

Erase intra-core synchronization operations

-ascendc-fill-asc-operands

Fill AscendC operands

-ascendc-fuse-bufid-sync

Erase unnecessary get_buf and rls_buf

-ascendc-generate-boilerplate

Insert emitc.include and additional boilerplate code

-ascendc-hoist-que-bind

Hoist TQueBind, TQue, TBuf initialization operations

-ascendc-hoist-ub-allocation

Hoist tensor allocations to the function root

Options

-exclude-in-out : Prevent input/output tensors from hoisting

-ascendc-input-output-tensor

Set input and output operands for local_tensor_auto

-ascendc-insert-bufid-sync

Insert get_buf and rls_buf synchronization

-ascendc-insert-sync

Insert intra-core synchronization operations

-ascendc-legalize-kernel-args

Attach emitasc.kernel_arg attributes and insert operations

Options

-set-ffts-addr : Append ffts_addr to kernel arguments and call set_ffts_addr

-ascendc-lower-to-l0

Lower Ascend C operations from L2 to L0

-ascendc-materialize-tensor

Insert ascendc.tbuf, ascendc.queue and ascendc.alloca for local_tensor_auto

Options

-always-buf : Always use TBuf to materialize tensors

-ascendc-noop

This pass does nothing

-ascendc-privatize-func

Mark functions without ascendc.global attribute as private

-ascendc-reuse-ub-allocation

Perform analysis and reuse temporary UB allocations

Options

-reuse-in-out : Allow input/output tensors to be reusable
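
The summary above does not say how the reuse analysis works. As a rough illustration only, the sketch below shows one common approach: a greedy, liveness-based assignment in which a temporary buffer takes over a slot once the previous occupant is no longer used. The `Alloc` record, the `assign_slots` function, and the way `reuse_in_out` is modelled are all hypothetical and are not taken from the pass implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alloc:
    """Hypothetical record for one UB allocation (not the pass's real data model)."""
    name: str
    size: int          # bytes requested
    first_use: int     # index of the first operation that touches the buffer
    last_use: int      # index of the last operation that touches the buffer
    is_in_out: bool = False  # True for kernel input/output tensors


def assign_slots(allocs, reuse_in_out=False):
    """Greedily map allocations to slots, sharing a slot when live ranges do not overlap.

    Assumption: input/output tensors keep a dedicated slot unless
    reuse_in_out is set, mirroring the documented option name only.
    """
    slots = []        # each entry is [slot_size, free_after_index]
    assignment = {}
    for a in sorted(allocs, key=lambda x: x.first_use):
        shareable = reuse_in_out or not a.is_in_out
        chosen = None
        if shareable:
            for idx, (size, free_after) in enumerate(slots):
                # Reuse only if the previous owner is dead and the slot is big enough.
                if free_after < a.first_use and size >= a.size:
                    chosen = idx
                    break
        if chosen is None:
            slots.append([a.size, 0])
            chosen = len(slots) - 1
        # A dedicated slot is never released, so nothing else can take it over.
        slots[chosen][1] = a.last_use if shareable else float("inf")
        assignment[a.name] = chosen
    return assignment, [s[0] for s in slots]


if __name__ == "__main__":
    temps = [
        Alloc("t0", 256, first_use=0, last_use=2),
        Alloc("t1", 256, first_use=1, last_use=4),
        Alloc("t2", 128, first_use=3, last_use=5),
    ]
    print(assign_slots(temps))
```

In this sketch `t2` lands in the slot that `t0` vacated, so three temporaries need only two slots. A real pass would also have to account for alignment, memory position, and synchronization, none of which is modelled here.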

-ascendc-unify-pipe

Unify pipe operations

-ascendc-verify-sync

Verify TQue synchronization
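
The options listed above are pass options. In MLIR's textual pass-pipeline syntax, such options are written in braces after the pass name, as `pass-name{option=value}`. The helper below is a small sketch that assembles a pipeline string in that form. The function name `pass_pipeline` is hypothetical, and the choice of `builtin.module` as the anchor operation and the order of the passes are assumptions made for illustration, not something this reference states.

```python
def pass_pipeline(anchor, passes):
    """Build an MLIR textual pass-pipeline string.

    anchor: name of the operation the pipeline is anchored on (assumed here).
    passes: list of (pass_name, options_dict) pairs; pass names are given
            without the leading dash used on the command line.
    """
    parts = []
    for name, options in passes:
        if options:
            # Options inside the braces are space-separated key=value pairs.
            opts = " ".join(f"{key}={value}" for key, value in options.items())
            parts.append(f"{name}{{{opts}}}")
        else:
            parts.append(name)
    return f"{anchor}({','.join(parts)})"


if __name__ == "__main__":
    pipeline = pass_pipeline(
        "builtin.module",  # assumption: the passes run on the module
        [
            ("ascendc-hoist-ub-allocation", {"exclude-in-out": "true"}),
            ("ascendc-reuse-ub-allocation", {}),
        ],
    )
    print(pipeline)
```

The resulting string is the kind of value an MLIR-based optimizer driver accepts through its pass-pipeline option; the name of the driver that registers these passes is not given in this reference.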