ModelConfig Class Reference
Runtime Library v0.30
Mobilint SDK qb
Configures the core mode and core allocation of a model for NPU inference.
#include <type.h>
Public Member Functions

 ModelConfig ()
  Default constructor. This default-constructed object is initially set to single-core mode with all NPU local cores included.
bool setSingleCoreMode (int num_cores)
  Sets the model to use single-core mode for inference with a specified number of local cores.
bool setSingleCoreMode (std::vector< CoreId > core_ids)
  Sets the model to use single-core mode for inference with a specific set of NPU local cores.
bool setMultiCoreMode (std::vector< Cluster > clusters)
  Sets the model to use multi-core mode for batch inference.
bool setGlobal4CoreMode (std::vector< Cluster > clusters)
  Sets the model to use global4-core mode for inference with a specified set of NPU clusters.
bool setGlobal8CoreMode ()
  Sets the model to use global8-core mode for inference.
CoreMode getCoreMode () const
  Gets the core mode to be applied to the model.
CoreAllocationPolicy getCoreAllocationPolicy () const
  Gets the core allocation policy to be applied to the model.
int getNumCores () const
  Gets the number of cores to be allocated for the model.
bool forceSingleNPUBundle (int npu_bundle_index)
  Forces the use of a specific NPU bundle.
int getForcedNPUBundleIndex () const
  Retrieves the index of the forced NPU bundle.
const std::vector< CoreId > & getCoreIds () const
  Returns the list of NPU CoreIds to be used for model inference.
void setAsyncPipelineEnabled (bool enable)
  Enables or disables the asynchronous pipeline required for asynchronous inference.
bool getAsyncPipelineEnabled () const
  Returns whether the asynchronous pipeline is enabled in this configuration.
 ModelConfig (int num_cores)  (deprecated)
bool includeAllCores ()  (deprecated)
bool excludeAllCores ()  (deprecated)
bool include (Cluster cluster, Core core)  (deprecated)
bool include (Cluster cluster)  (deprecated)
bool include (Core core)  (deprecated)
bool exclude (Cluster cluster, Core core)  (deprecated)
bool exclude (Cluster cluster)  (deprecated)
bool exclude (Core core)  (deprecated)
bool setGlobalCoreMode (std::vector< Cluster > clusters)  (deprecated)
bool setAutoMode (int num_cores=1)  (deprecated)
bool setManualMode ()  (deprecated)

Public Attributes

SchedulePolicy schedule_policy = SchedulePolicy::FIFO
LatencySetPolicy latency_set_policy = LatencySetPolicy::Auto
MaintenancePolicy maintenance_policy = MaintenancePolicy::Maintain
std::vector< uint64_t > early_latencies
std::vector< uint64_t > finish_latencies
Detailed Description
Configures the core mode and core allocation of a model for NPU inference.
The ModelConfig class provides methods for setting the core mode and allocating cores for NPU inference. Supported core modes are single-core, multi-core, global4-core, and global8-core. Users can also specify which cores to allocate for the model. Additionally, the configuration offers an option to enforce the use of a specific NPU bundle.
- Note
- Deprecated functions are included for backward compatibility, but it is recommended to use the newer core mode configuration methods.
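The typical flow is to default-construct a ModelConfig, select a core mode, and then hand the object to the runtime when the model is created and launched. A minimal sketch, assuming the <type.h> header and mobilint namespace shown in this reference (the umbrella header of a given SDK installation may differ):

    // Minimal sketch of configuring a model for NPU inference.
    #include <type.h>   // assumed include path, as listed at the top of this page
    #include <iostream>

    int main() {
        mobilint::ModelConfig config;   // default: single-core mode, all local cores included

        // Request single-core mode with two local cores; the runtime allocates
        // the concrete cores automatically when the model is launched.
        if (!config.setSingleCoreMode(2)) {
            std::cerr << "setSingleCoreMode failed\n";
            return 1;
        }

        std::cout << "cores to allocate: " << config.getNumCores() << "\n";

        // The configured object is then supplied to the model creation/launch
        // step of the runtime (not part of this class reference).
        return 0;
    }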
Constructor & Destructor Documentation
◆ ModelConfig()
explicit mobilint::ModelConfig::ModelConfig ( int num_cores )
deprecated
Member Function Documentation
◆ setSingleCoreMode() [1/2]
bool mobilint::ModelConfig::setSingleCoreMode ( int num_cores )
Sets the model to use single-core mode for inference with a specified number of local cores.
In single-core mode, each local core executes model inference independently. The number of cores used is specified by the num_cores parameter, and the core allocation policy is set to CoreAllocationPolicy::Auto, meaning the model will be automatically allocated to available local cores when the model is launched to the NPU, specifically when the Model::launch function is called.
- Parameters
- [in] num_cores The number of local cores to use for inference.
- Returns
- true if the mode was successfully set, false otherwise.
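For illustration, a sketch of this overload under automatic allocation (the helper name is not part of the API):

    #include <type.h>   // assumed include path for ModelConfig

    // Request four local cores; the concrete cores are chosen by the runtime at
    // Model::launch time (CoreAllocationPolicy::Auto), so getCoreIds() stays
    // empty until then.
    bool configureAutoSingleCore(mobilint::ModelConfig& config) {
        return config.setSingleCoreMode(4);
    }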
◆ setSingleCoreMode() [2/2]
bool mobilint::ModelConfig::setSingleCoreMode ( std::vector< CoreId > core_ids )
Sets the model to use single-core mode for inference with a specific set of NPU local cores.
In single-core mode, each local core executes model inference independently. The user can specify a vector of CoreIds to determine which cores to use for inference.
- Parameters
- [in] core_ids A vector of CoreIds to be used for model inference.
- Returns
- true if the mode was successfully set, false otherwise.
◆ setMultiCoreMode()
bool mobilint::ModelConfig::setMultiCoreMode ( std::vector< Cluster > clusters )
Sets the model to use multi-core mode for batch inference.
In multi-core mode on the Aries NPU, the four local cores within a cluster work together to process batch inference tasks. This mode is optimized for batch processing.
- Parameters
- [in] clusters A vector of clusters to be used for multi-core batch inference.
- Returns
- true if the mode was successfully set, false otherwise.
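A sketch of selecting one cluster for batch inference; the Cluster enumerator name used below is illustrative only and should be checked against the actual Cluster definition in type.h:

    #include <type.h>   // assumed include path for ModelConfig and Cluster
    #include <vector>

    // Configure multi-core (batch) mode on a single cluster.
    // Cluster::Cluster0 is a hypothetical enumerator name used for illustration.
    bool configureBatchMode(mobilint::ModelConfig& config) {
        std::vector<mobilint::Cluster> clusters = {mobilint::Cluster::Cluster0};
        return config.setMultiCoreMode(clusters);
    }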
◆ setGlobal4CoreMode()
bool mobilint::ModelConfig::setGlobal4CoreMode ( std::vector< Cluster > clusters )
Sets the model to use global4-core mode for inference with a specified set of NPU clusters.
The Aries NPU has two clusters, each consisting of four local cores. In global4-core mode, the four local cores within the same cluster work together to execute model inference.
- Parameters
- [in] clusters A vector of clusters to be used for model inference.
- Returns
- true if the mode was successfully set, false otherwise.
◆ setGlobal8CoreMode()
bool mobilint::ModelConfig::setGlobal8CoreMode ( )
Sets the model to use global8-core mode for inference.
The Aries NPU has two clusters, each consisting of four local cores. In global8-core mode, all eight local cores across the two clusters work together to execute model inference.
- Returns
- true if the mode was successfully set, false otherwise.
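A sketch of enabling this mode and inspecting the result through the getters documented below (the expectation that getNumCores reports 8 in this mode is an assumption, not stated in this reference):

    #include <type.h>   // assumed include path for ModelConfig
    #include <iostream>

    // Use all eight local cores across both Aries clusters for one model.
    void configureGlobal8(mobilint::ModelConfig& config) {
        if (config.setGlobal8CoreMode()) {
            // getNumCores() is expected to report 8 in this mode (assumption).
            std::cout << "cores to allocate: " << config.getNumCores() << "\n";
        }
    }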
◆ getCoreMode()
CoreMode mobilint::ModelConfig::getCoreMode ( ) const [inline]
Gets the core mode to be applied to the model.
◆ getCoreAllocationPolicy()
CoreAllocationPolicy mobilint::ModelConfig::getCoreAllocationPolicy ( ) const [inline]
Gets the core allocation policy to be applied to the model.
This reflects the core allocation policy that will be used when the model is created.
- Returns
- The CoreAllocationPolicy to be applied to the model.
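For example, a small helper (name illustrative) can distinguish automatic from explicit allocation, assuming CoreAllocationPolicy lives in the mobilint namespace alongside ModelConfig:

    #include <type.h>   // assumed include path for ModelConfig and CoreAllocationPolicy

    // True when the runtime will pick the cores itself at Model::launch time,
    // as is the case after setSingleCoreMode(int num_cores).
    bool usesAutoAllocation(const mobilint::ModelConfig& config) {
        return config.getCoreAllocationPolicy() == mobilint::CoreAllocationPolicy::Auto;
    }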
◆ getNumCores()
int mobilint::ModelConfig::getNumCores ( ) const [inline]
Gets the number of cores to be allocated for the model.
◆ forceSingleNPUBundle()
bool mobilint::ModelConfig::forceSingleNPUBundle ( int npu_bundle_index )
Forces the use of a specific NPU bundle.
This function forces the selection of a specific NPU bundle. If a non-negative index is provided, the corresponding NPU bundle is selected and runs without CPU offloading. If -1 is provided, all NPU bundles are used with CPU offloading enabled.
- Parameters
- [in] npu_bundle_index The index of the NPU bundle to force. A non-negative integer selects a specific NPU bundle (runs without CPU offloading), or -1 to enable all NPU bundles with CPU offloading.
- Returns
- true if the index is valid and the NPU bundle is successfully set, false if the index is invalid (less than -1).
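A sketch of pinning execution to one bundle and reading the index back via getForcedNPUBundleIndex (documented below); the helper name is illustrative:

    #include <type.h>   // assumed include path for ModelConfig
    #include <iostream>

    // Force NPU bundle 0: the model runs on that bundle without CPU offloading.
    // Passing -1 instead would use all bundles with CPU offloading enabled.
    void pinToBundleZero(mobilint::ModelConfig& config) {
        if (config.forceSingleNPUBundle(0)) {
            std::cout << "forced bundle index: " << config.getForcedNPUBundleIndex() << "\n";
        }
    }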
◆ getForcedNPUBundleIndex()
int mobilint::ModelConfig::getForcedNPUBundleIndex ( ) const [inline]
Retrieves the index of the forced NPU bundle.
This function returns the index of the NPU bundle that has been forced using the forceSingleNPUBundle function. If no NPU bundle is forced, the returned value will be -1.
- Returns
- The index of the forced NPU bundle, or -1 if no bundle is forced.
◆ getCoreIds()
const std::vector< CoreId > & mobilint::ModelConfig::getCoreIds ( ) const [inline]
Returns the list of NPU CoreIds to be used for model inference.
This function returns a reference to the vector of NPU CoreIds that the model will use for inference. When setSingleCoreMode(int num_cores) is called and the core allocation policy is set to CoreAllocationPolicy::Auto, it will return an empty vector.
- Returns
- A constant reference to the vector of NPU CoreIds.
◆ setAsyncPipelineEnabled()
void mobilint::ModelConfig::setAsyncPipelineEnabled ( bool enable )
Enables or disables the asynchronous pipeline required for asynchronous inference.
Call this function with enable set to true if you intend to use Model::inferAsync or Model::inferAsyncCHW, as the asynchronous pipeline is necessary for their operation.
If you are only using synchronous inference, such as Model::infer or Model::inferCHW, it is recommended to keep the asynchronous pipeline disabled to avoid unnecessary overhead.
- Parameters
- [in] enable Set to true to enable the asynchronous pipeline; set to false to disable it.
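A sketch of the intended usage (helper name illustrative):

    #include <type.h>   // assumed include path for ModelConfig

    // Enable the asynchronous pipeline only when asynchronous inference
    // (Model::inferAsync / Model::inferAsyncCHW) will be used; otherwise keep
    // it disabled to avoid unnecessary overhead.
    void prepareConfig(mobilint::ModelConfig& config, bool will_use_async) {
        config.setAsyncPipelineEnabled(will_use_async);
    }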
◆ getAsyncPipelineEnabled()
bool mobilint::ModelConfig::getAsyncPipelineEnabled ( ) const [inline]
Returns whether the asynchronous pipeline is enabled in this configuration.
◆ includeAllCores()
bool mobilint::ModelConfig::includeAllCores ( )
deprecated
◆ excludeAllCores()
bool mobilint::ModelConfig::excludeAllCores ( )
deprecated
◆ include() [1/3]
bool mobilint::ModelConfig::include ( Cluster cluster, Core core )
deprecated
◆ include() [2/3]
bool mobilint::ModelConfig::include ( Cluster cluster )
deprecated
◆ include() [3/3]
bool mobilint::ModelConfig::include ( Core core )
deprecated
◆ exclude() [1/3]
bool mobilint::ModelConfig::exclude ( Cluster cluster, Core core )
deprecated
◆ exclude() [2/3]
bool mobilint::ModelConfig::exclude ( Cluster cluster )
deprecated
◆ exclude() [3/3]
bool mobilint::ModelConfig::exclude ( Core core )
deprecated
◆ setGlobalCoreMode()
bool mobilint::ModelConfig::setGlobalCoreMode ( std::vector< Cluster > clusters )
deprecated
◆ setAutoMode()
bool mobilint::ModelConfig::setAutoMode ( int num_cores = 1 )
deprecated
◆ setManualMode()
bool mobilint::ModelConfig::setManualMode ( )
deprecated
Member Data Documentation
◆ schedule_policy
SchedulePolicy mobilint::ModelConfig::schedule_policy = SchedulePolicy::FIFO
- Deprecated
- This setting has no effect.
◆ latency_set_policy
LatencySetPolicy mobilint::ModelConfig::latency_set_policy = LatencySetPolicy::Auto
- Deprecated
- This setting has no effect.
◆ maintenance_policy
MaintenancePolicy mobilint::ModelConfig::maintenance_policy = MaintenancePolicy::Maintain
- Deprecated
- This setting has no effect.
◆ early_latencies
std::vector< uint64_t > mobilint::ModelConfig::early_latencies
- Deprecated
- This setting has no effect.
◆ finish_latencies
std::vector< uint64_t > mobilint::ModelConfig::finish_latencies
- Deprecated
- This setting has no effect.
The documentation for this class was generated from the following file:
type.h