ModelConfig Class Reference

Runtime Library: mobilint::ModelConfig Class Reference
Runtime Library v0.30
Mobilint SDK qb

Configures a core mode and core allocation of a model for NPU inference.

#include <type.h>

Public Member Functions

 ModelConfig ()
 Default constructor. A default-constructed object is initially set to single-core mode with all NPU local cores included.
bool setSingleCoreMode (int num_cores)
 Sets the model to use single-core mode for inference with a specified number of local cores.
bool setSingleCoreMode (std::vector< CoreId > core_ids)
 Sets the model to use single-core mode for inference with a specific set of NPU local cores.
bool setMultiCoreMode (std::vector< Cluster > clusters)
 Sets the model to use multi-core mode for batch inference.
bool setGlobal4CoreMode (std::vector< Cluster > clusters)
 Sets the model to use global4-core mode for inference with a specified set of NPU clusters.
bool setGlobal8CoreMode ()
 Sets the model to use global8-core mode for inference.
CoreMode getCoreMode () const
 Gets the core mode to be applied to the model.
CoreAllocationPolicy getCoreAllocationPolicy () const
 Gets the core allocation policy to be applied to the model.
int getNumCores () const
 Gets the number of cores to be allocated for the model.
bool forceSingleNPUBundle (int npu_bundle_index)
 Forces the use of a specific NPU bundle.
int getForcedNPUBundleIndex () const
 Retrieves the index of the forced NPU bundle.
const std::vector< CoreId > & getCoreIds () const
 Returns the list of NPU CoreIds to be used for model inference.
void setAsyncPipelineEnabled (bool enable)
 Enables or disables the asynchronous pipeline required for asynchronous inference.
bool getAsyncPipelineEnabled () const
 Returns whether the asynchronous pipeline is enabled in this configuration.
 ModelConfig (int num_cores)
bool includeAllCores ()
bool excludeAllCores ()
bool include (Cluster cluster, Core core)
bool include (Cluster cluster)
bool include (Core core)
bool exclude (Cluster cluster, Core core)
bool exclude (Cluster cluster)
bool exclude (Core core)
bool setGlobalCoreMode (std::vector< Cluster > clusters)
bool setAutoMode (int num_cores=1)
bool setManualMode ()

Public Attributes

SchedulePolicy schedule_policy = SchedulePolicy::FIFO
LatencySetPolicy latency_set_policy = LatencySetPolicy::Auto
MaintenancePolicy maintenance_policy = MaintenancePolicy::Maintain
std::vector< uint64_t > early_latencies
std::vector< uint64_t > finish_latencies

Detailed Description

Configures a core mode and core allocation of a model for NPU inference.

The ModelConfig class provides methods for setting a core mode and allocating cores for NPU inference. Supported core modes are single-core, multi-core, global4-core, and global8-core. Users can also specify which cores to allocate for the model. Additionally, the configuration offers an option to enforce the use of a specific NPU bundle.

Note
Deprecated functions are included for backward compatibility, but it is recommended to use the newer core mode configuration methods.
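
As a rough orientation, the sketch below maps each of the four supported core modes to the ModelConfig setter that selects it and to the number of Aries local cores that work together in that mode, as stated in the member documentation on this page. CoreModeSketch, coresWorkingTogether and setterFor are hypothetical names introduced only for illustration; the enumerators of the library's real CoreMode type are not shown in this reference.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the library's CoreMode enum. The real enumerator
// names are not listed on this page, so these are assumptions.
enum class CoreModeSketch { Single, Multi, Global4, Global8 };

// Number of Aries local cores that work together in each mode:
// single-core runs each local core independently, multi-core and
// global4-core use the four local cores of one cluster, and global8-core
// uses all eight local cores across both clusters.
int coresWorkingTogether(CoreModeSketch mode) {
    switch (mode) {
        case CoreModeSketch::Single:  return 1;
        case CoreModeSketch::Multi:   return 4;
        case CoreModeSketch::Global4: return 4;
        case CoreModeSketch::Global8: return 8;
    }
    assert(false && "unknown core mode");
    return 0;
}

// Name of the non-deprecated ModelConfig setter documented for each mode.
std::string setterFor(CoreModeSketch mode) {
    switch (mode) {
        case CoreModeSketch::Single:  return "setSingleCoreMode";
        case CoreModeSketch::Multi:   return "setMultiCoreMode";
        case CoreModeSketch::Global4: return "setGlobal4CoreMode";
        case CoreModeSketch::Global8: return "setGlobal8CoreMode";
    }
    return "";
}
```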

Definition at line 257 of file type.h.

Constructor & Destructor Documentation

◆ ModelConfig()

mobilint::ModelConfig::ModelConfig ( int num_cores)
explicit

deprecated

Member Function Documentation

◆ setSingleCoreMode() [1/2]

bool mobilint::ModelConfig::setSingleCoreMode ( int num_cores)

Sets the model to use single-core mode for inference with a specified number of local cores.

In single-core mode, each local core executes model inference independently. The num_cores parameter specifies how many cores are used. The core allocation policy is set to CoreAllocationPolicy::Auto, so the model is automatically allocated to available local cores when it is launched to the NPU, that is, when Model::launch is called.

Parameters
[in] num_cores: The number of local cores to use for inference.
Returns
true if the mode was successfully set, false otherwise.
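
The behaviour described above can be modelled with a small stand-in class. This is a sketch only, not the library's implementation: SingleCoreAutoSketch, AllocationPolicySketch and CoreIdSketch are hypothetical types that mirror the documented member names, and the rule that counts outside 1 to 8 are rejected is an assumption based on the Aries layout of two clusters with four local cores each.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; the real types are declared in the library's type.h.
// Manual is an assumed counterpart to Auto and is not documented on this page.
enum class AllocationPolicySketch { Auto, Manual };
struct CoreIdSketch { int cluster; int core; };

class SingleCoreAutoSketch {
public:
    // Mirrors the documented signature bool setSingleCoreMode(int num_cores).
    // ASSUMPTION: a count outside 1..8 is treated as invalid and returns false.
    bool setSingleCoreMode(int num_cores) {
        if (num_cores < 1 || num_cores > kAriesLocalCores) return false;
        num_cores_ = num_cores;
        policy_ = AllocationPolicySketch::Auto;  // cores are picked at launch
        core_ids_.clear();                       // no explicit core list in Auto
        return true;
    }
    int getNumCores() const { return num_cores_; }
    AllocationPolicySketch getCoreAllocationPolicy() const { return policy_; }
    const std::vector<CoreIdSketch>& getCoreIds() const { return core_ids_; }

private:
    static constexpr int kAriesLocalCores = 8;
    int num_cores_ = kAriesLocalCores;  // default: all local cores included
    AllocationPolicySketch policy_ = AllocationPolicySketch::Auto;  // assumed default
    std::vector<CoreIdSketch> core_ids_;
};

// Usage: always check the returned bool before relying on the configuration.
void demoAutoAllocation() {
    SingleCoreAutoSketch config;
    const bool ok = config.setSingleCoreMode(4);
    assert(ok && config.getNumCores() == 4);
    assert(config.getCoreIds().empty());
    (void)ok;
}
```

With the real class, the same pattern applies: call the setter, check the result, and only then pass the configuration on.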

◆ setSingleCoreMode() [2/2]

bool mobilint::ModelConfig::setSingleCoreMode ( std::vector< CoreId > core_ids)

Sets the model to use single-core mode for inference with a specific set of NPU local cores.

In single-core mode, each local core executes model inference independently. The user can specify a vector of CoreIds to determine which cores to use for inference.

Parameters
[in] core_ids: A vector of CoreIds to be used for model inference.
Returns
true if the mode was successfully set, false otherwise.
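
A minimal sketch of building an explicit core list follows. CoreIdSketch is a hypothetical stand-in for the library's CoreId, whose actual fields are not shown on this page; it is modelled here as a cluster index plus a core index, assuming the Aries layout of two clusters with four local cores each. ExplicitCoresSketch and coresOfCluster are likewise illustrative names, and rejecting an empty list is an assumption.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library's CoreId (real definition: type.h).
struct CoreIdSketch {
    int cluster;  // 0 or 1 on Aries (assumed encoding)
    int core;     // 0..3 within the cluster (assumed encoding)
};

// Builds the explicit list for the first `count` local cores of one cluster.
std::vector<CoreIdSketch> coresOfCluster(int cluster, int count) {
    assert(cluster >= 0 && cluster < 2);
    assert(count >= 0 && count <= 4);
    std::vector<CoreIdSketch> ids;
    for (int core = 0; core < count; ++core) {
        ids.push_back({cluster, core});
    }
    return ids;
}

// Stand-in mirroring bool setSingleCoreMode(std::vector<CoreId> core_ids).
class ExplicitCoresSketch {
public:
    bool setSingleCoreMode(std::vector<CoreIdSketch> core_ids) {
        if (core_ids.empty()) return false;  // ASSUMPTION: empty list is invalid
        core_ids_ = std::move(core_ids);
        return true;
    }
    const std::vector<CoreIdSketch>& getCoreIds() const { return core_ids_; }
    int getNumCores() const { return static_cast<int>(core_ids_.size()); }

private:
    std::vector<CoreIdSketch> core_ids_;
};
```

Unlike the num_cores overload, this form pins inference to the listed cores, so getCoreIds() returns the list that was passed in rather than an empty vector.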

◆ setMultiCoreMode()

bool mobilint::ModelConfig::setMultiCoreMode ( std::vector< Cluster > clusters)

Sets the model to use multi-core mode for batch inference.

In multi-core mode on the Aries NPU, the four local cores within a cluster work together to process batch inference tasks. This mode is optimized for batch processing.

Parameters
[in] clusters: A vector of clusters to be used for multi-core batch inference.
Returns
true if the mode was successfully set, false otherwise.

◆ setGlobal4CoreMode()

bool mobilint::ModelConfig::setGlobal4CoreMode ( std::vector< Cluster > clusters)

Sets the model to use global4-core mode for inference with a specified set of NPU clusters.

The Aries NPU has two clusters, each consisting of four local cores. In global4-core mode, the four local cores within the same cluster work together to execute the model inference.

Parameters
[in] clusters: A vector of clusters to be used for model inference.
Returns
true if the mode was successfully set, false otherwise.

◆ setGlobal8CoreMode()

bool mobilint::ModelConfig::setGlobal8CoreMode ( )

Sets the model to use global8-core mode for inference.

The Aries NPU has two clusters, each consisting of four local cores. In global8-core mode, all eight local cores across the two clusters work together to execute the model inference.

Returns
true if the mode was successfully set, false otherwise.

◆ getCoreMode()

CoreMode mobilint::ModelConfig::getCoreMode ( ) const
inline

Gets the core mode to be applied to the model.

This reflects the core mode that will be used when the model is created.

Returns
The CoreMode to be applied to the model.

Definition at line 336 of file type.h.

◆ getCoreAllocationPolicy()

CoreAllocationPolicy mobilint::ModelConfig::getCoreAllocationPolicy ( ) const
inline

Gets the core allocation policy to be applied to the model.

This reflects the core allocation policy that will be used when the model is created.

Returns
The CoreAllocationPolicy to be applied to the model.

Definition at line 346 of file type.h.

◆ getNumCores()

int mobilint::ModelConfig::getNumCores ( ) const
inline

Gets the number of cores to be allocated for the model.

This represents the number of cores that will be allocated for inference when the model is launched to the NPU.

Returns
The number of cores to be allocated for the model.

Definition at line 356 of file type.h.

◆ forceSingleNPUBundle()

bool mobilint::ModelConfig::forceSingleNPUBundle ( int npu_bundle_index)

Forces the use of a specific NPU bundle.

This function forces the selection of a specific NPU bundle. If a non-negative index is provided, the corresponding NPU bundle is selected and runs without CPU offloading. If -1 is provided, all NPU bundles are used with CPU offloading enabled.

Parameters
[in] npu_bundle_index: The index of the NPU bundle to force. A non-negative integer selects that NPU bundle, which then runs without CPU offloading; -1 enables all NPU bundles with CPU offloading.
Returns
true if the index is valid and the NPU bundle is successfully set, false if the index is invalid (less than -1).
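
The index rules above can be summarised in a short stand-in. This is a sketch of the documented contract only, not the library's code: BundleSelectionSketch and usesCpuOffloading are hypothetical names, and the sketch does not check whether a non-negative index refers to a bundle that actually exists in the model.

```cpp
#include <cassert>

// Hypothetical stand-in that models only the documented index rules.
class BundleSelectionSketch {
public:
    // Mirrors bool forceSingleNPUBundle(int npu_bundle_index):
    // values below -1 are invalid and leave the setting unchanged.
    bool forceSingleNPUBundle(int npu_bundle_index) {
        if (npu_bundle_index < -1) return false;
        forced_index_ = npu_bundle_index;
        return true;
    }

    // Mirrors int getForcedNPUBundleIndex() const: -1 means nothing is forced.
    int getForcedNPUBundleIndex() const { return forced_index_; }

    // Illustrative helper (not part of the documented API): CPU offloading is
    // described as enabled only when all bundles are in use, i.e. index -1.
    bool usesCpuOffloading() const { return forced_index_ == -1; }

private:
    int forced_index_ = -1;
};

// Usage: force bundle 0, then restore the default behaviour with -1.
void demoBundleSelection() {
    BundleSelectionSketch config;
    const bool forced = config.forceSingleNPUBundle(0);
    assert(forced && !config.usesCpuOffloading());
    const bool restored = config.forceSingleNPUBundle(-1);
    assert(restored && config.usesCpuOffloading());
    (void)forced;
    (void)restored;
}
```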

◆ getForcedNPUBundleIndex()

int mobilint::ModelConfig::getForcedNPUBundleIndex ( ) const
inline

Retrieves the index of the forced NPU bundle.

This function returns the index of the NPU bundle that has been forced using the forceSingleNPUBundle function. If no NPU bundle is forced, the returned value will be -1.

Returns
The index of the forced NPU bundle, or -1 if no bundle is forced.

Definition at line 384 of file type.h.

◆ getCoreIds()

const std::vector< CoreId > & mobilint::ModelConfig::getCoreIds ( ) const
inline

Returns the list of NPU CoreIds to be used for model inference.

This function returns a reference to the vector of NPU CoreIds that the model will use for inference. If the configuration was set with setSingleCoreMode(int num_cores), the core allocation policy is CoreAllocationPolicy::Auto and the returned vector is empty.

Returns
A constant reference to the vector of NPU CoreIds.

Definition at line 396 of file type.h.

◆ setAsyncPipelineEnabled()

void mobilint::ModelConfig::setAsyncPipelineEnabled ( bool enable)

Enables or disables the asynchronous pipeline required for asynchronous inference.

Call this function with enable set to true if you intend to use Model::inferAsync or Model::inferAsyncCHW, as the asynchronous pipeline is necessary for their operation.

If you are only using synchronous inference, such as Model::infer or Model::inferCHW, it is recommended to keep the asynchronous pipeline disabled to avoid unnecessary overhead.

Parameters
[in] enable: Set to true to enable the asynchronous pipeline; set to false to disable it.
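
One way to apply this guidance is to derive the flag from the inference calls an application plans to make. The sketch below assumes nothing about the library beyond the four Model function names mentioned above; InferenceCall, AsyncFlagSketch, requiresAsyncPipeline and configureFor are hypothetical names, and the disabled-by-default initial value is an assumption.

```cpp
#include <cassert>
#include <vector>

// The four inference entry points named in this section.
enum class InferenceCall { Infer, InferCHW, InferAsync, InferAsyncCHW };

// Only the asynchronous calls need the asynchronous pipeline.
bool requiresAsyncPipeline(InferenceCall call) {
    return call == InferenceCall::InferAsync ||
           call == InferenceCall::InferAsyncCHW;
}

// Hypothetical stand-in mirroring the documented setter/getter pair.
class AsyncFlagSketch {
public:
    void setAsyncPipelineEnabled(bool enable) { enabled_ = enable; }
    bool getAsyncPipelineEnabled() const { return enabled_; }

private:
    bool enabled_ = false;  // ASSUMPTION: disabled unless explicitly enabled
};

// Enables the pipeline only if at least one planned call requires it,
// which avoids the overhead for purely synchronous workloads.
void configureFor(AsyncFlagSketch& config,
                  const std::vector<InferenceCall>& planned_calls) {
    bool needed = false;
    for (InferenceCall call : planned_calls) {
        needed = needed || requiresAsyncPipeline(call);
    }
    config.setAsyncPipelineEnabled(needed);
    assert(config.getAsyncPipelineEnabled() == needed);
}
```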

◆ getAsyncPipelineEnabled()

bool mobilint::ModelConfig::getAsyncPipelineEnabled ( ) const
inline

Returns whether the asynchronous pipeline is enabled in this configuration.

Returns
true if the asynchronous pipeline is enabled; false otherwise.

Definition at line 420 of file type.h.

◆ includeAllCores()

bool mobilint::ModelConfig::includeAllCores ( )

deprecated

◆ excludeAllCores()

bool mobilint::ModelConfig::excludeAllCores ( )

deprecated

◆ include() [1/3]

bool mobilint::ModelConfig::include ( Cluster cluster, Core core )

deprecated

◆ include() [2/3]

bool mobilint::ModelConfig::include ( Cluster cluster)

deprecated

◆ include() [3/3]

bool mobilint::ModelConfig::include ( Core core)

deprecated

◆ exclude() [1/3]

bool mobilint::ModelConfig::exclude ( Cluster cluster, Core core )

deprecated

◆ exclude() [2/3]

bool mobilint::ModelConfig::exclude ( Cluster cluster)

deprecated

◆ exclude() [3/3]

bool mobilint::ModelConfig::exclude ( Core core)

deprecated

◆ setGlobalCoreMode()

bool mobilint::ModelConfig::setGlobalCoreMode ( std::vector< Cluster > clusters)

deprecated

◆ setAutoMode()

bool mobilint::ModelConfig::setAutoMode ( int num_cores = 1)

deprecated

◆ setManualMode()

bool mobilint::ModelConfig::setManualMode ( )

deprecated

Member Data Documentation

◆ schedule_policy

SchedulePolicy mobilint::ModelConfig::schedule_policy = SchedulePolicy::FIFO
Deprecated
This setting has no effect.

Definition at line 442 of file type.h.

◆ latency_set_policy

LatencySetPolicy mobilint::ModelConfig::latency_set_policy = LatencySetPolicy::Auto
Deprecated
This setting has no effect.

Definition at line 446 of file type.h.

◆ maintenance_policy

MaintenancePolicy mobilint::ModelConfig::maintenance_policy = MaintenancePolicy::Maintain
Deprecated
This setting has no effect.

Definition at line 450 of file type.h.

◆ early_latencies

std::vector<uint64_t> mobilint::ModelConfig::early_latencies
Deprecated
This setting has no effect.

Definition at line 454 of file type.h.

◆ finish_latencies

std::vector<uint64_t> mobilint::ModelConfig::finish_latencies
Deprecated
This setting has no effect.

Definition at line 458 of file type.h.


The documentation for this class was generated from the following file: