Quantization configuration


Quark Quantization Config API for ONNX

class quark.onnx.quantization.config.config.Config(global_quant_config: QuantizationConfig)

A class that encapsulates comprehensive quantization configurations for a machine learning model, allowing for detailed and hierarchical control over quantization parameters across different model components.

Parameters:

global_quant_config (QuantizationConfig) – Global quantization configuration applied to the entire model unless overridden at the layer level.

class quark.onnx.quantization.config.config.QConfig(global_config: QLayerConfig, specific_layer_config: dict[DataType, list[str]] | None = None, layer_type_config: dict[DataType | None, list[str]] | None = None, exclude: list[str | list[tuple[list[str]]]] | None = None, algo_config: list[AlgoConfig] | None = None, use_external_data_format: bool = False, **kwargs: dict[str, Any])

A class that defines quantization configuration at multiple levels (global, specific layers, specific operation types), and provides flexibility for specifying algorithm settings.

Parameters:
  • global_config (QLayerConfig) – Global quantization configuration applied to all layers unless overridden.

  • specific_layer_config (Dict[DataType, List[str]]) – Dictionary keyed by data type, where each value lists the names of the layers quantized with that data type. Overrides global_config for those layers. Default is None.

  • layer_type_config (Dict[Optional[DataType], List[str]]) – Dictionary keyed by data type (or None), where each value lists the operation types (e.g., Conv, Gemm) that use it. Overrides global_config for those operation types. Default is None.

  • exclude (List[Union[str, List[Tuple[List[str]]]]]) – List of nodes or subgraphs excluded from quantization. Default is None.

  • algo_config (List[AlgoConfig]) – One or more algorithm configurations, such as CLE, SmoothQuant, or AdaRound. Default is None.

  • use_external_data_format (bool) – Whether to use ONNX external data format when saving the quantized model. Default is False.

  • extra_options (Dict[str, Any]) – Dictionary of additional options for advanced customization and extension. Default is None.
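
The layered overrides described above can be pictured with a small stand-in. This sketch does not import Quark: SketchQConfig, resolve, and the string data-type names are hypothetical. It assumes that a per-layer entry takes precedence over a per-type entry, that a None key in layer_type_config leaves the matching operation types unquantized, and that exclude holds plain node names only (the real parameter also accepts subgraphs). None of these assumptions is confirmed by this page.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for a quantization data type; the real DataType lives in Quark.
DataTypeName = str


@dataclass
class SketchQConfig:
    """Hypothetical, simplified mirror of the QConfig parameters."""

    global_config: DataTypeName
    specific_layer_config: Optional[dict[DataTypeName, list[str]]] = None
    layer_type_config: Optional[dict[Optional[DataTypeName], list[str]]] = None
    exclude: Optional[list[str]] = None

    def resolve(self, node_name: str, op_type: str) -> Optional[DataTypeName]:
        """Return the data type chosen for a node, or None if unquantized."""
        # Excluded nodes are never quantized.
        if self.exclude and node_name in self.exclude:
            return None
        # Assumption: a per-layer override wins over a per-type override.
        for dtype, layer_names in (self.specific_layer_config or {}).items():
            if node_name in layer_names:
                return dtype
        # Assumption: a None key means "leave these op types unquantized".
        for dtype, op_types in (self.layer_type_config or {}).items():
            if op_type in op_types:
                return dtype
        # Nothing matched, so the global setting applies.
        return self.global_config


# Usage: one layer and one op type override the global setting.
cfg = SketchQConfig(
    global_config="Int8",
    specific_layer_config={"Int16": ["conv1"]},
    layer_type_config={"BFloat16": ["Gemm"]},
    exclude=["softmax_out"],
)
chosen = cfg.resolve("conv1", "Conv")
```

Keying the dictionaries by data type, as the signature does, keeps a configuration compact when many layers share the same override.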

static get_default_config(config_name: str) → Config

Retrieve the default quantization configuration by name.

This function looks up the provided config_name in the DefaultConfigMapping. If a match is found, it returns a Config object with the corresponding global quantization configuration. Otherwise, it raises a ValueError.

Args:

config_name (str): The name of the default configuration to look up, for example XINT8.

Returns:

Config: A configuration object containing the default quantization settings.

Raises:

ValueError: If the provided config_name is not found in DefaultConfigMapping.
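
The lookup-or-raise behavior described above can be sketched without importing Quark. Everything below is a hypothetical stand-in: SketchConfig, SKETCH_DEFAULT_CONFIG_MAPPING, and sketch_get_default_config are illustrative names, and the mapping entry is a placeholder rather than the real XINT8 settings.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for Config; it only carries the global settings."""

    global_quant_config: dict[str, Any]


# Hypothetical stand-in for DefaultConfigMapping. The value is a placeholder,
# not the real contents of the XINT8 default configuration.
SKETCH_DEFAULT_CONFIG_MAPPING: dict[str, dict[str, Any]] = {
    "XINT8": {"note": "placeholder settings"},
}


def sketch_get_default_config(config_name: str) -> SketchConfig:
    """Look up a named default and wrap it, or raise ValueError if unknown."""
    if config_name not in SKETCH_DEFAULT_CONFIG_MAPPING:
        raise ValueError(f"Unknown default configuration: {config_name!r}")
    return SketchConfig(
        global_quant_config=SKETCH_DEFAULT_CONFIG_MAPPING[config_name]
    )


# Usage: a known name returns a config object, an unknown name raises.
config = sketch_get_default_config("XINT8")
try:
    sketch_get_default_config("NOT_A_CONFIG")
    raised = False
except ValueError:
    raised = True
```

Raising ValueError for an unknown name, rather than falling back to a default silently, makes a misspelled configuration name surface at the call site.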