AscendC passes
-ascendc-allocate-tensor
Allocate tensors for LocalTensorAuto
-ascendc-compute-memory-consumption
Compute memory consumption for kernel functions
-ascendc-declare-py-struct
Insert emitasc.declare_py_struct
-ascendc-define-cube-only
Insert emitc.define if CUBE-ONLY
-ascendc-detect-enable-debug
Check whether the kernel is using debug utils
-ascendc-detect-kernel-type
Check whether the kernel is vector-only or mixed
-ascendc-erase-sync
Erase intra-core synchronization operations
-ascendc-fill-asc-operands
Fill AscendC operands
-ascendc-fuse-bufid-sync
Erase unnecessary get_buf and rls_buf
-ascendc-generate-boilerplate
Insert emitc.include and additional boilerplate code
-ascendc-hoist-que-bind
Hoist TQueBind, TQue, TBuf initialization operations
-ascendc-hoist-ub-allocation
Hoist tensor allocations to the function root
Options
-exclude-in-out : Prevent input/output tensors from being hoisted
-ascendc-input-output-tensor
Set input and output operands for local_tensor_auto
-ascendc-insert-bufid-sync
Insert get_buf and rls_buf synchronization
-ascendc-insert-sync
Insert intra-core synchronization operations
-ascendc-legalize-kernel-args
Attach emitasc.kernel_arg attributes and insert operations
Options
-set-ffts-addr : Append ffts_addr to kernel arguments and call set_ffts_addr
-ascendc-lower-to-l0
Lower L2 to L0 Ascend C operations
-ascendc-materialize-tensor
Insert ascendc.tbuf, ascendc.queue and ascendc.alloca for local_tensor_auto
Options
-always-buf : Always use TBuf to materialize tensors
-ascendc-noop
This pass does nothing
-ascendc-privatize-func
Mark functions without ascendc.global attribute as private
-ascendc-reuse-ub-allocation
Perform analysis and reuse temporary UB allocations
Options
-reuse-in-out : Allow input/output tensors to be reused
-ascendc-unify-pipe
Unify pipe operations
-ascendc-verify-sync
Verify TQue synchronization
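
The flags above can also be written in MLIR's textual pass-pipeline form, where each pass drops its leading dash and its options go in braces as space-separated `key=value` pairs. The following is a minimal sketch of composing such a string, not a recommended pipeline. It assumes that the listed options are boolean, that the passes can be anchored on `builtin.module`, and that the chosen passes and their order are meaningful together; none of this is stated in the listing. The helpers `pass_entry` and `build_pipeline` are hypothetical names introduced only for illustration.

```python
def format_value(value):
    """Render an option value; booleans become lowercase true/false."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def pass_entry(name, **options):
    """Render one pass as `name` or `name{key=value key=value}`.

    Python keyword names use underscores, so they are mapped back to the
    dashed option names used in the listing (always_buf -> always-buf).
    """
    if not options:
        return name
    rendered = " ".join(
        key.replace("_", "-") + "=" + format_value(value)
        for key, value in options.items()
    )
    return name + "{" + rendered + "}"


def build_pipeline(entries, anchor="builtin.module"):
    """Join pass entries into `anchor(pass1,pass2,...)`.

    The anchor op is an assumption; some passes may need a different one.
    """
    return anchor + "(" + ",".join(entries) + ")"


# Illustrative selection only: the order shown is an assumption.
pipeline = build_pipeline([
    pass_entry("ascendc-materialize-tensor", always_buf=True),
    pass_entry("ascendc-hoist-ub-allocation", exclude_in_out=True),
    pass_entry("ascendc-reuse-ub-allocation", reuse_in_out=False),
    pass_entry("ascendc-verify-sync"),
])
print(pipeline)
```

The resulting string is the kind of value an MLIR `opt`-style driver accepts through its `--pass-pipeline` option; which driver binary registers these passes is not given here.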