MultiModalPredictor.optimize_for_inference#

MultiModalPredictor.optimize_for_inference(providers: Optional[Union[dict, List[str]]] = None)[source]#

Optimize the predictor’s model for inference.

Under the hood, this converts the PyTorch module into an ONNX module, so that inference can leverage efficient execution providers in onnxruntime.

Parameters
  • data – Raw data used to trace and export the model. If this is None, the method checks whether a processed batch is available instead.

  • providers (dict or list of str, default=None) –

    A list of execution providers for model prediction in onnxruntime.

    By default, the providers argument is None. In that case, the method generates an ONNX module that performs inference with TensorrtExecutionProvider in onnxruntime, provided the tensorrt package is properly installed. Otherwise, onnxruntime falls back to the CUDA or CPU execution providers.
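    The default fallback order described above (TensorRT, then CUDA, then CPU) can be sketched as plain provider-selection logic. This is a simplified illustration, not the actual AutoGluon implementation; the helper name `resolve_providers` is hypothetical.

    ```python
    def resolve_providers(requested=None, available=None):
        """Sketch of provider selection: honor an explicit list, else
        prefer TensorRT, then CUDA, then CPU among available providers."""
        if available is None:
            available = ["CPUExecutionProvider"]
        if requested is not None:
            # An explicit providers argument is used as-is.
            return list(requested)
        preferred = [
            "TensorrtExecutionProvider",  # chosen when tensorrt is installed
            "CUDAExecutionProvider",
            "CPUExecutionProvider",
        ]
        return [p for p in preferred if p in available]
    ```

    The real provider list is ultimately passed to onnxruntime when the session is created, e.g. ``onnxruntime.InferenceSession(model_bytes, providers=...)``.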

Returns

onnx_module – The ONNX-based module that can replace predictor._model for model inference.

Return type

OnnxModule
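As the return description notes, the OnnxModule is a drop-in replacement for predictor._model. The swap pattern can be sketched with stand-in classes (everything here is illustrative; `PredictorStub` and `OnnxStubModule` are hypothetical, not AutoGluon classes):

```python
class OnnxStubModule:
    """Toy stand-in for the returned OnnxModule: a callable module
    that produces the same predictions via a different backend."""
    def __call__(self, batch):
        return [x * 2.0 for x in batch]  # placeholder "inference"

class PredictorStub:
    """Toy stand-in for a predictor whose _model is a callable module."""
    def __init__(self, model):
        self._model = model

    def predict(self, batch):
        return self._model(batch)

predictor = PredictorStub(model=lambda batch: [x * 2.0 for x in batch])
# The swap optimize_for_inference performs internally: predict() now
# routes through the ONNX-backed module instead of the PyTorch one.
predictor._model = OnnxStubModule()
result = predictor.predict([1.0, 2.0])
```

Because the replacement happens in place, downstream calls such as predict continue to work unchanged.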