C++ API Reference

SDK qb Runtime Library v1.0

The C++ API provides core functionality for the NPU.

Classes
class	mobilint::Accelerator
	Represents an accelerator, i.e., an NPU, used for executing models.
class	mobilint::FutureImpl< T >
class	mobilint::Future< T >
	Represents a future for retrieving the result of asynchronous inference.
class	mobilint::Model
	Represents an AI model loaded from an MXQ file.
class	mobilint::ModelVariantHandle
	Handle to a specific variant of a loaded model.
class	mobilint::NDArray< T >
	A class representing an N-dimensional array (NDArray).
struct	mobilint::Scale
	Struct for scale values.
struct	mobilint::CoreId
	Represents a unique identifier for an NPU core.
struct	mobilint::Buffer
	A simple byte-sized buffer.
struct	mobilint::BufferInfo
	Struct representing input/output buffer information.
class	mobilint::ModelConfig
	Configures a core mode and core allocation of a model for NPU inference.
struct	mobilint::CacheInfo
	Struct representing KV-cache information.

Enumerations
enum class	mobilint::StatusCode { StatusCode::OK = 0 , StatusCode::InternalError = 23 , StatusCode::NotImplemented = 18 , StatusCode::BadAlloc = 39 , StatusCode::Acc_CoreAlreadyInUse = 11 , StatusCode::Acc_NPUTimeout = 42 , StatusCode::Acc_NoIMemInitFound = 43 , StatusCode::Acc_NoSuchModel = 12 , StatusCode::Acc_TaskQueueNotFound = 28 , StatusCode::Driver_FailedToAllocateHostMemory = 13 , StatusCode::Driver_FailedToAllocateModelMemory = 44 , StatusCode::Driver_FailedToBuildCmaIoReq = 45 , StatusCode::Driver_FailedToClaimCores = 24 , StatusCode::Driver_FailedToFreeModelMemory = 46 , StatusCode::Driver_FailedToLockCore = 49 , StatusCode::Driver_FailedToPostInfer = 33 , StatusCode::Driver_FailedToReadMemoryBuffer = 15 , StatusCode::Driver_FailedToUnclaimCores = 25 , StatusCode::Driver_FailedToUnlockCore = 50 , StatusCode::Driver_FailedToWaitDone = 34 , StatusCode::Driver_FailedToWriteMemoryBuffer = 14 , StatusCode::Driver_NotInitialized = 1 , StatusCode::Driver_WaitDoneTimeout = 35 , StatusCode::Driver_WrongBaseAddress = 47 , StatusCode::MemoryPool_AllocatorNotSet = 30 , StatusCode::MemoryPool_BufNotFound = 29 , StatusCode::Model_AlreadyLaunched = 65 , StatusCode::Model_AsyncPipelineCheckFailed = 55 , StatusCode::Model_AsyncPipelineNotAlive = 56 , StatusCode::Model_AsyncPipelineTimeout = 57 , StatusCode::Model_BrokenMXQ = 20 , StatusCode::Model_BufferSizeMismatched = 63 , StatusCode::Model_CacheOverflow = 53 , StatusCode::Model_DtypeMismatched = 22 , StatusCode::Model_FailedToAllocMemory = 32 , StatusCode::Model_FailedToFindDirectory = 61 , StatusCode::Model_FailedToLoadMXQ = 3 , StatusCode::Model_FailedToOpenCacheFile = 62 , StatusCode::Model_FailedToOpenModelDescFile = 19 , StatusCode::Model_FailedToOpenOutputScale = 6 , StatusCode::Model_FailedToOpenScaleFile = 4 , StatusCode::Model_FailedToOpenSectionBinary = 8 , StatusCode::Model_FailedToSaveTensor = 37 , StatusCode::Model_NumCacheMismatched = 60 , StatusCode::Model_InvalidNPUDtype = 51 , 
StatusCode::Model_InvalidOutputScaleValue = 7 , StatusCode::Model_InvalidRmemType = 52 , StatusCode::Model_InvalidScaleValue = 5 , StatusCode::Model_InvalidSupplementary = 59 , StatusCode::Model_InvalidVariantIdx = 64 , StatusCode::Model_IsNotPacked = 27 , StatusCode::Model_IsNotSupportedHardware = 48 , StatusCode::Model_IsPacked = 26 , StatusCode::Model_MXQAndModelConfigNotMatch = 16 , StatusCode::Model_NoCache = 54 , StatusCode::Model_NoGlobalCoreWithGlobalMultiMode = 17 , StatusCode::Model_NoTargetCores = 2 , StatusCode::Model_NotAlive = 10 , StatusCode::Model_NotLaunched = 9 , StatusCode::Model_PredictError = 36 , StatusCode::Model_ShapeMismatched = 21 , StatusCode::Model_TaskQueueClosed = 40 , StatusCode::Model_TaskQueueTimeout = 41 , StatusCode::Model_UnexpectedMemoryFormat = 38 , StatusCode::Future_NotValid = 58 }
	Enumerates status codes for the qbruntime.
enum class	mobilint::Cluster : int32_t { Cluster::Cluster0 = 1 << 16 , Cluster::Cluster1 = 2 << 16 , Cluster::Error = 0x7FFF'0000 }
	Enumerates clusters in the ARIES NPU.
enum class	mobilint::Core : int32_t { Core::Core0 = 1 , Core::Core1 = 2 , Core::Core2 = 3 , Core::Core3 = 4 , Core::All = 0x0000'FFFC , Core::GlobalCore = 0x0000'FFFE , Core::Error = 0x0000'FFFF }
	Enumerates cores within a cluster in the ARIES NPU.
enum class	mobilint::CoreAllocationPolicy { CoreAllocationPolicy::Auto , CoreAllocationPolicy::Manual }
	Core allocation policy.
enum class	mobilint::CoreMode : uint8_t { CoreMode::Single = 0 , CoreMode::Multi = 1 , CoreMode::Global = 2 , CoreMode::Global4 = 3 , CoreMode::Global8 = 4 , CoreMode::Error = 0xF }
	Defines the core mode for NPU execution.
enum class	mobilint::LogLevel : char { DEBUG = 1 , INFO = 2 , WARN = 3 , ERR = 4 , FATAL = 5 , OFF = 6 }
	Log level.
enum class	mobilint::CacheType : uint8_t { Default = 0 , Batch , Error = 0x0F }
	Cache type.

Functions
	mobilint::ModelVariantHandle::ModelVariantHandle (const ModelVariantHandle &other)=delete
	mobilint::ModelVariantHandle::ModelVariantHandle (ModelVariantHandle &&other)=delete
ModelVariantHandle &	mobilint::ModelVariantHandle::operator= (const ModelVariantHandle &rhs)=delete
ModelVariantHandle &	mobilint::ModelVariantHandle::operator= (ModelVariantHandle &&rhs) noexcept=delete
int	mobilint::ModelVariantHandle::getVariantIdx () const
	Returns the index of this model variant.
const std::vector< std::vector< int64_t > > &	mobilint::ModelVariantHandle::getModelInputShape () const
	Returns the input shape for this model variant.
const std::vector< std::vector< int64_t > > &	mobilint::ModelVariantHandle::getModelOutputShape () const
	Returns the output shape for this model variant.
const std::vector< BufferInfo > &	mobilint::ModelVariantHandle::getInputBufferInfo () const
	Returns the input buffer information for this variant.
const std::vector< BufferInfo > &	mobilint::ModelVariantHandle::getOutputBufferInfo () const
	Returns the output buffer information for this variant.
std::vector< Scale >	mobilint::ModelVariantHandle::getInputScale () const
	Returns the input quantization scale(s) for this variant.
std::vector< Scale >	mobilint::ModelVariantHandle::getOutputScale () const
	Returns the output quantization scale(s) for this variant.
const std::string	mobilint::statusCodeToString (const StatusCode status_code)
	Converts a StatusCode into a string.
bool	mobilint::operator! (StatusCode sc)
	Checks whether the given StatusCode represents an error.
QBRUNTIME_EXPORT std::string	mobilint::getQbRuntimeVersion ()
	Retrieves the version of the qbruntime.
QBRUNTIME_EXPORT std::string	mobilint::getQbRuntimeGitVersion ()
	Retrieves the Git commit hash of the qbruntime.
QBRUNTIME_EXPORT std::string	mobilint::getQbRuntimeVendor ()
	Retrieves the vendor name of the qbruntime.
QBRUNTIME_EXPORT std::string	mobilint::getQbRuntimeProduct ()
	Retrieves product information of the qbruntime.
QBRUNTIME_EXPORT void	mobilint::setLogLevel (LogLevel level)
QBRUNTIME_EXPORT bool	mobilint::startTracingEvents (const char *path)
	Starts event tracing and prepares to save the trace log to a specified file.
QBRUNTIME_EXPORT void	mobilint::stopTracingEvents ()
	Stops event tracing and writes the recorded trace log.
QBRUNTIME_EXPORT std::string	mobilint::getModelSummary (const std::string &mxq_path)
	Generates a structured summary of the specified MXQ model.

Friends
class	mobilint::ModelVariantHandle::ModelImpl

Buffer Management APIs
These APIs are used when performing inference with Model::inferBuffer or Model::inferBufferToFloat, using this variant’s input and output shapes.

Buffers are acquired using:
- acquireInputBuffer
- acquireOutputBuffer

Any acquired buffer must be released using:
- releaseBuffer
- releaseBuffers

Repositioning is handled by:
- repositionInputs
- repositionOutputs

Note: These APIs are intended for advanced use and follow the same buffer management interface as the Model class.

std::vector< Buffer >	mobilint::ModelVariantHandle::acquireInputBuffer (const std::vector< std::vector< int > > &seqlens={}) const
std::vector< Buffer >	mobilint::ModelVariantHandle::acquireOutputBuffer (const std::vector< std::vector< int > > &seqlens={}) const
std::vector< std::vector< Buffer > >	mobilint::ModelVariantHandle::acquireInputBuffers (int batch_size, const std::vector< std::vector< int > > &seqlens={}) const
std::vector< std::vector< Buffer > >	mobilint::ModelVariantHandle::acquireOutputBuffers (int batch_size, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::releaseBuffer (std::vector< Buffer > &buffer) const
StatusCode	mobilint::ModelVariantHandle::releaseBuffers (std::vector< std::vector< Buffer > > &buffers) const
StatusCode	mobilint::ModelVariantHandle::repositionInputs (const std::vector< float * > &input, std::vector< Buffer > &input_buf, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionOutputs (const std::vector< Buffer > &output_buf, std::vector< float * > &output, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionOutputs (const std::vector< Buffer > &output_buf, std::vector< std::vector< float > > &output, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionInputs (const std::vector< uint8_t * > &input, std::vector< Buffer > &input_buf, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionInputs (const std::vector< float * > &input, std::vector< std::vector< Buffer > > &input_buf, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionOutputs (const std::vector< std::vector< Buffer > > &output_buf, std::vector< float * > &output, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionOutputs (const std::vector< std::vector< Buffer > > &output_buf, std::vector< std::vector< float > > &output, const std::vector< std::vector< int > > &seqlens={}) const
StatusCode	mobilint::ModelVariantHandle::repositionInputs (const std::vector< uint8_t * > &input, std::vector< std::vector< Buffer > > &input_buf, const std::vector< std::vector< int > > &seqlens={}) const
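
Because every acquired buffer must be released, one option is to tie the release to scope exit. The following is only a sketch under stated assumptions: it does not include the SDK headers, `sketch::FakeVariant` is a stand-in that copies just the `acquireInputBuffer` and `releaseBuffer` signatures listed above, `sketch::Buffer` is an empty placeholder, and `InputBufferGuard` is a hypothetical helper, not part of the qbruntime.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

enum class StatusCode { OK = 0 };  // stand-in, only the success value
struct Buffer {};                  // placeholder; the real fields are not shown here

// Stand-in handle that only counts outstanding acquisitions.
class FakeVariant {
public:
    std::vector<Buffer> acquireInputBuffer(
        const std::vector<std::vector<int>>& seqlens = {}) const {
        (void)seqlens;
        ++outstanding_;
        return std::vector<Buffer>(1);
    }
    StatusCode releaseBuffer(std::vector<Buffer>& buffer) const {
        buffer.clear();
        --outstanding_;
        return StatusCode::OK;
    }
    int outstanding() const { return outstanding_; }

private:
    mutable int outstanding_ = 0;
};

// Hypothetical guard: acquires on construction, releases on destruction.
template <class Handle>
class InputBufferGuard {
public:
    explicit InputBufferGuard(const Handle& h)
        : handle_(h), bufs_(h.acquireInputBuffer()) {}
    ~InputBufferGuard() { handle_.releaseBuffer(bufs_); }
    InputBufferGuard(const InputBufferGuard&) = delete;
    InputBufferGuard& operator=(const InputBufferGuard&) = delete;
    std::vector<Buffer>& buffers() { return bufs_; }

private:
    const Handle& handle_;
    std::vector<Buffer> bufs_;
};

// Reports how many buffers are outstanding while the guard is alive.
inline int outstandingInsideScope(const FakeVariant& v) {
    InputBufferGuard<FakeVariant> guard(v);
    return v.outstanding();
}

}  // namespace sketch
```

With the real handle, the same pattern would presumably wrap the repositioning and inference calls between acquisition and release; whether the SDK tolerates release from a destructor is an assumption to verify.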

Detailed Description

The C++ API provides core functionality for the NPU.

Enumeration Type Documentation

◆ StatusCode

enum class mobilint::StatusCode

Enumerates status codes for the qbruntime.

This enumeration defines return codes used in the qbruntime to indicate success or specific error conditions. A value of StatusCode::OK (0) represents success, while any other value indicates a particular error type.

Enumerator
OK	0	OK
InternalError	23	Internal error: a code path that should never be reached was reached
NotImplemented	18	Not implemented
BadAlloc	39	Bad allocation
Acc_CoreAlreadyInUse	11	Core already in use.
Acc_NPUTimeout	42	NPU timeout
Acc_NoIMemInitFound	43	No imem initialization found
Acc_NoSuchModel	12	No such model
Acc_TaskQueueNotFound	28	Task queue not found
Driver_FailedToAllocateHostMemory	13	Failed to allocate host memory
Driver_FailedToAllocateModelMemory	44	Failed to allocate model memory
Driver_FailedToBuildCmaIoReq	45	Failed to build CMA IO request
Driver_FailedToClaimCores	24	Failed to claim cores
Driver_FailedToFreeModelMemory	46	Failed to free model memory
Driver_FailedToLockCore	49	Failed to lock core
Driver_FailedToPostInfer	33	Failed to post infer
Driver_FailedToReadMemoryBuffer	15	Failed to read memory buffer
Driver_FailedToUnclaimCores	25	Failed to unclaim cores
Driver_FailedToUnlockCore	50	Failed to unlock core
Driver_FailedToWaitDone	34	Failed to wait done
Driver_FailedToWriteMemoryBuffer	14	Failed to write memory buffer
Driver_NotInitialized	1	Not initialized
Driver_WaitDoneTimeout	35	Wait done timeout
Driver_WrongBaseAddress	47	Wrong base address
MemoryPool_AllocatorNotSet	30	Allocator not set
MemoryPool_BufNotFound	29	Buffer not found
Model_AlreadyLaunched	65	Model has been launched already
Model_AsyncPipelineCheckFailed	55	Asynchronous pipeline check failed
Model_AsyncPipelineNotAlive	56	Asynchronous pipeline not alive
Model_AsyncPipelineTimeout	57	Asynchronous pipeline timeout
Model_BrokenMXQ	20	Broken mxq
Model_BufferSizeMismatched	63	Size of buffer is mismatched
Model_CacheOverflow	53	KV-cache overflow
Model_DtypeMismatched	22	dtype mismatched
Model_FailedToAllocMemory	32	Failed to allocate memory
Model_FailedToFindDirectory	61	Failed to find specified directory
Model_FailedToLoadMXQ	3	Failed to load mxq
Model_FailedToOpenCacheFile	62	Failed to open cache file
Model_FailedToOpenModelDescFile	19	Failed to open model description file
Model_FailedToOpenOutputScale	6	Deprecated. Scale files no longer exist.
Model_FailedToOpenScaleFile	4	Deprecated. Scale files no longer exist.
Model_FailedToOpenSectionBinary	8	Failed to open section binary
Model_FailedToSaveTensor	37	Failed to save tensor
Model_NumCacheMismatched	60	Size of KV-cache mismatched
Model_InvalidNPUDtype	51	Invalid NPU dtype
Model_InvalidOutputScaleValue	7	Deprecated. Scale files no longer exist.
Model_InvalidRmemType	52	Invalid RMEM type
Model_InvalidScaleValue	5	Invalid scale value
Model_InvalidSupplementary	59	Invalid supplementary inference
Model_InvalidVariantIdx	64	Invalid model variant idx
Model_IsNotPacked	27	Deprecated. Packed logic no longer exists.
Model_IsNotSupportedHardware	48	Hardware is not supported
Model_IsPacked	26	Deprecated. Packed logic no longer exists.
Model_MXQAndModelConfigNotMatch	16	MXQ and model config do not match
Model_NoCache	54	Model does not support KV-cache
Model_NoGlobalCoreWithGlobalMultiMode	17	No global core with global multi mode
Model_NoTargetCores	2	No target cores
Model_NotAlive	10	Not alive
Model_NotLaunched	9	Not launched
Model_PredictError	36	Predict error
Model_ShapeMismatched	21	Shape mismatched
Model_TaskQueueClosed	40	Task queue closed
Model_TaskQueueTimeout	41	Task queue timeout
Model_UnexpectedMemoryFormat	38	Deprecated
Future_NotValid	58	Future is not valid

Definition at line 26 of file status_code.h.

◆ Cluster

enum class mobilint::Cluster : int32_t

Enumerates clusters in the ARIES NPU.

Note: The ARIES NPU consists of two clusters, each containing one global core and four local cores, totaling eight local cores. REGULUS has only a single cluster (Cluster0) with one local core (Core0).

Enumerator
Cluster0	1 << 16	Cluster 0
Cluster1	2 << 16	Cluster 1
Error	0x7FFF'0000	Represents an invalid or uninitialized state.

Definition at line 64 of file type.h.

◆ Core

enum class mobilint::Core : int32_t

Enumerates cores within a cluster in the ARIES NPU.

Note: The ARIES NPU consists of two clusters, each containing one global core and four local cores, totaling eight local cores. REGULUS has only a single cluster (Cluster0) with one local core (Core0).

Enumerator
Core0	1	Local core 0
Core1	2	Local core 1
Core2	3	Local core 2
Core3	4	Local core 3
All	0x0000'FFFC	Deprecated
GlobalCore	0x0000'FFFE	Global core
Error	0x0000'FFFF	Represents an invalid or uninitialized state.

Definition at line 77 of file type.h.
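
The enumerator values above place Cluster values in the upper 16 bits and Core values in the lower 16 bits of an int32_t. The sketch below illustrates that layout with local copies of the documented values (the deprecated Core::All is omitted). The `pack`, `clusterOf`, and `coreOf` helpers are hypothetical and exist only for illustration; this page does not state that the SDK combines the two values this way.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Local copies of the documented enumerator values.
enum class Cluster : std::int32_t {
    Cluster0 = 1 << 16,
    Cluster1 = 2 << 16,
    Error = 0x7FFF'0000
};
enum class Core : std::int32_t {
    Core0 = 1,
    Core1 = 2,
    Core2 = 3,
    Core3 = 4,
    GlobalCore = 0x0000'FFFE,
    Error = 0x0000'FFFF
};

// Hypothetical helper: combine a cluster and a core into one 32-bit key.
constexpr std::int32_t pack(Cluster cl, Core co) {
    return static_cast<std::int32_t>(cl) | static_cast<std::int32_t>(co);
}

// Hypothetical helpers: recover each half with a mask.
constexpr Cluster clusterOf(std::int32_t key) {
    return static_cast<Cluster>(key & 0x7FFF'0000);
}
constexpr Core coreOf(std::int32_t key) {
    return static_cast<Core>(key & 0x0000'FFFF);
}

// The two value ranges do not overlap, so packing is lossless.
static_assert(static_cast<std::int32_t>(Cluster::Cluster1) == 131072, "2 << 16");
static_assert(pack(Cluster::Cluster1, Core::Core3) == 0x0002'0004, "disjoint bits");

}  // namespace sketch
```

The digit separators match the notation used in the enumerator tables and require C++14 or later.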

◆ CoreAllocationPolicy

enum class mobilint::CoreAllocationPolicy

Core allocation policy.

Enumerator
Auto	Auto
Manual	Manual

Definition at line 90 of file type.h.

◆ CoreMode

enum class mobilint::CoreMode : uint8_t

Defines the core mode for NPU execution.

Supported core modes include single-core, multi-core, global4-core, and global8-core. Each mode is selected through mobilint::ModelConfig, which configures the core mode and core allocation of a model.

Enumerator
Single	0	Single-core mode
Multi	1	Multi-core mode
Global	2	Deprecated
Global4	3	Global4-core mode
Global8	4	Global8-core mode
Error	0xF	Represents an invalid or uninitialized state.

Definition at line 169 of file type.h.

◆ LogLevel

enum class mobilint::LogLevel : char

Log level, passed to setLogLevel().

Enumerator
DEBUG	1
INFO	2
WARN	3
ERR	4
FATAL	5
OFF	6

Definition at line 464 of file type.h.

◆ CacheType

enum class mobilint::CacheType : uint8_t

Cache type.

Enumerator
Default	0
Batch	1 (implicit)
Error	0x0F

Definition at line 476 of file type.h.

Function Documentation

◆ getVariantIdx()

int mobilint::ModelVariantHandle::getVariantIdx ( ) const

Returns the index of this model variant.

Returns: Index of the model variant.

◆ getModelInputShape()

const std::vector< std::vector< int64_t > > & mobilint::ModelVariantHandle::getModelInputShape ( ) const

Returns the input shape for this model variant.

Returns: Reference to the input shape.

◆ getModelOutputShape()

const std::vector< std::vector< int64_t > > & mobilint::ModelVariantHandle::getModelOutputShape ( ) const

Returns the output shape for this model variant.

Returns: Reference to the output shape.

◆ getInputBufferInfo()

const std::vector< BufferInfo > & mobilint::ModelVariantHandle::getInputBufferInfo ( ) const

Returns the input buffer information for this variant.

Returns: Reference to a vector of input buffer information.

◆ getOutputBufferInfo()

const std::vector< BufferInfo > & mobilint::ModelVariantHandle::getOutputBufferInfo ( ) const

Returns the output buffer information for this variant.

Returns: Reference to a vector of output buffer information.

◆ getInputScale()

std::vector< Scale > mobilint::ModelVariantHandle::getInputScale ( ) const

Returns the input quantization scale(s) for this variant.

Returns: Vector of input scales.

◆ getOutputScale()

std::vector< Scale > mobilint::ModelVariantHandle::getOutputScale ( ) const

Returns the output quantization scale(s) for this variant.

Returns: Vector of output scales.

◆ statusCodeToString()

const std::string mobilint::statusCodeToString ( const StatusCode status_code )

Converts a StatusCode into a string.

Parameters

[in] status_code The StatusCode to convert.

Returns: String of each StatusCode if valid, else nullptr.

◆ operator!()

inline bool mobilint::operator! ( StatusCode sc )

Checks whether the given StatusCode represents an error.

This operator enables StatusCode to be used in conditional statements for error checking. It returns false if sc is StatusCode::OK (0) and true otherwise.

StatusCode sc = someFunction();
if (!sc) { // `!sc` is true when `sc` is not `StatusCode::OK`.
    // Handle error
}

Parameters

[in] sc The StatusCode to evaluate.

Returns: True if sc is an error (nonzero), false if it is StatusCode::OK.

Definition at line 125 of file status_code.h.
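
Since `!sc` reads as "sc is an error" rather than the usual "sc is false", a compact illustration may help. The sketch below does not include the SDK headers: `sketch::StatusCode` is a local stand-in carrying only three of the documented values, its `operator!` mirrors the documented behavior, and `step` and `runStep` are hypothetical functions used purely for illustration.

```cpp
#include <cassert>

namespace sketch {

// Local stand-in for mobilint::StatusCode (subset of the documented values).
enum class StatusCode { OK = 0, Model_NotLaunched = 9, Model_BrokenMXQ = 20 };

// Mirrors the documented behavior: true for any code other than OK.
// A scoped enum has no implicit conversion to bool, so the overload is
// what makes `!sc` compile at all.
inline bool operator!(StatusCode sc) { return sc != StatusCode::OK; }

// Hypothetical operation that either succeeds or reports an error code.
inline StatusCode step(bool succeed) {
    return succeed ? StatusCode::OK : StatusCode::Model_NotLaunched;
}

// Returns true on success. Note the inverted reading of `!sc`.
inline bool runStep(bool succeed) {
    StatusCode sc = step(succeed);
    if (!sc) {  // taken when sc is an error
        return false;
    }
    return true;
}

}  // namespace sketch
```

Writing `sc != StatusCode::OK` explicitly is equivalent and may be easier to read for callers unfamiliar with the overload.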

◆ getQbRuntimeVersion()

QBRUNTIME_EXPORT std::string mobilint::getQbRuntimeVersion ( )

Retrieves the version of the qbruntime.

Returns: A string representing the runtime version.

◆ getQbRuntimeGitVersion()

QBRUNTIME_EXPORT std::string mobilint::getQbRuntimeGitVersion ( )

Retrieves the Git commit hash of the qbruntime.

Returns: A string containing the Git hash.

◆ getQbRuntimeVendor()

QBRUNTIME_EXPORT std::string mobilint::getQbRuntimeVendor ( )

Retrieves the vendor name of the qbruntime.

Typically, this function returns "mobilint".

Returns: A string containing the vendor name.

◆ getQbRuntimeProduct()

QBRUNTIME_EXPORT std::string mobilint::getQbRuntimeProduct ( )

Retrieves product information of the qbruntime.

This function indicates the product for which the qbruntime is built. For example, it may return values such as "aries2-v4" or "regulus-v4".

Returns: A string containing product details.

◆ startTracingEvents()

QBRUNTIME_EXPORT bool mobilint::startTracingEvents ( const char * path )

Starts event tracing and prepares to save the trace log to a specified file.

The trace log is recorded in Chrome Tracing JSON format, which can be viewed at https://ui.perfetto.dev/.

The trace log is not written immediately; it is saved only when stopTracingEvents() is called.

Parameters

[in] path The file path where the trace log should be stored.

Returns: True if tracing starts successfully, false otherwise.

◆ stopTracingEvents()

QBRUNTIME_EXPORT void mobilint::stopTracingEvents ( )

Stops event tracing and writes the recorded trace log.

This function finalizes tracing and saves the collected trace data to the file specified when startTracingEvents() was called.
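
Because the trace is written only when stopTracingEvents() is called, an early return or exception between the two calls would lose it. One way to guard against that is a scope guard, sketched below under stated assumptions: the SDK headers are not included, `sketch::startTracingEvents` and `sketch::stopTracingEvents` are stubs that copy the documented signatures and only track a flag (the stub's failure on an empty path is invented), and `TraceGuard` is a hypothetical helper.

```cpp
#include <cassert>

namespace sketch {

// Stub state shared by the two stand-in functions below.
inline bool& tracingActive() {
    static bool active = false;
    return active;
}

// Stand-ins with the documented signatures; they write no file.
inline bool startTracingEvents(const char* path) {
    if (path == nullptr || *path == '\0') return false;  // invented failure case
    tracingActive() = true;
    return true;
}
inline void stopTracingEvents() { tracingActive() = false; }

// Hypothetical scope guard: stops tracing on every exit path.
class TraceGuard {
public:
    explicit TraceGuard(const char* path) : started_(startTracingEvents(path)) {}
    ~TraceGuard() {
        if (started_) stopTracingEvents();
    }
    TraceGuard(const TraceGuard&) = delete;
    TraceGuard& operator=(const TraceGuard&) = delete;
    bool started() const { return started_; }

private:
    bool started_;
};

// Runs a traced region and reports whether tracing was active inside it.
inline bool tracedRegion(const char* path) {
    TraceGuard guard(path);
    return guard.started() && tracingActive();
}

}  // namespace sketch
```

With the real functions, the guard body would stay the same and the code to be traced would run while the guard is in scope.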

◆ getModelSummary()

QBRUNTIME_EXPORT std::string mobilint::getModelSummary ( const std::string & mxq_path )

Generates a structured summary of the specified MXQ model.

Returns an overview of the model contained in the MXQ file, including:

Target NPU hardware
Supported core modes and their associated cores
The total number of model variants
For each variant:
- Input and output tensor shapes
- A list of layers with their types, output shapes, and input layer indices

The summary is returned as a human-readable string in a table and is useful for inspecting model compatibility, structure, and input/output shapes.

Parameters

[in] mxq_path Path to the MXQ model file.

Returns: A formatted string containing the model summary.

Friends

◆ ModelImpl

friend class ModelImpl

Definition at line 163 of file model_variant_handle.h.
