MultiModalPredictor.evaluate
- MultiModalPredictor.evaluate(data: DataFrame | dict | list | str, query_data: list | None = None, response_data: list | None = None, id_mappings: Dict[str, Dict] | Dict[str, Series] | None = None, metrics: str | List[str] | None = None, chunk_size: int | None = 1024, similarity_type: str | None = 'cosine', cutoffs: List[int] | None = [1, 5, 10], label: str | None = None, return_pred: bool | None = False, realtime: bool | None = False, eval_tool: str | None = None)
- Evaluate the model on a given dataset.
- Parameters:
- data – A pd.DataFrame containing the same columns as the training data, or a str giving the path to the annotation file for detection. 
- query_data – Query data used for ranking. 
- response_data – Response data used for ranking. 
- id_mappings – Id-to-content mappings. The contents can be text, image, etc. This is used when data/query_data/response_data contain the query/response identifiers instead of their contents. 
- metrics – A list of metric names to report. If None, only the score for the stored _eval_metric_name is returned. 
- chunk_size – The number of response items scanned at a time. Larger values are faster but require more memory. 
- similarity_type – The function used to score similarity, either cosine or dot_prod (default: cosine). 
- cutoffs – A list of cutoff values at which to evaluate ranking (default: [1, 5, 10]). 
- label – The label column name in data. Some tasks, e.g., image-text matching, have no label column in the training data, but a label column may still be required for evaluation. 
- return_pred – Whether to return the prediction result of each row. 
- realtime – Whether to do realtime inference, which is efficient for small data (default False). If None, it is inferred from the data modalities and the number of samples. 
- eval_tool – The evaluation tool for object detection. Either “pycocotools” or “torchmetrics”. 
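The ranking parameters above (chunk_size, similarity_type, cutoffs) can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper names score, rank_responses and recall_at_cutoffs are hypothetical, the embeddings are made-up toy vectors, and recall is used only as one plausible example of a cutoff-based ranking metric.

```python
import math


def score(query, response, similarity_type="cosine"):
    """Score one query/response embedding pair (hypothetical helper)."""
    dot = sum(q * r for q, r in zip(query, response))
    if similarity_type == "dot_prod":
        return dot
    if similarity_type == "cosine":
        norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(
            sum(r * r for r in response)
        )
        return dot / norm if norm else 0.0
    raise ValueError(f"unknown similarity_type: {similarity_type}")


def rank_responses(query, responses, chunk_size=1024, similarity_type="cosine"):
    """Return response indices by descending similarity, scoring chunk by chunk."""
    scored = []
    for start in range(0, len(responses), chunk_size):
        # Only chunk_size responses are scored per step.
        chunk = responses[start:start + chunk_size]
        scored.extend(
            (score(query, r, similarity_type), start + i)
            for i, r in enumerate(chunk)
        )
    # Highest score first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


def recall_at_cutoffs(ranked, relevant, cutoffs=(1, 5, 10)):
    """Fraction of relevant items found in the top k, for each cutoff k."""
    return {k: len(set(ranked[:k]) & relevant) / len(relevant) for k in cutoffs}


# Toy data (assumed for illustration only).
query = [1.0, 0.0]
responses = [[0.0, 1.0], [0.5, 0.0], [3.0, 3.0], [-1.0, 0.0]]

ranked = rank_responses(query, responses, chunk_size=2)
recalls = recall_at_cutoffs(ranked, relevant={1, 2}, cutoffs=(1, 2))
```

With these toy vectors the two similarity types order the responses differently: cosine ignores vector length, while the dot product rewards the longer vector at index 2.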
 
- Returns:
- A dictionary with the metric names and their corresponding scores. 
- If return_pred is True, a pd.DataFrame of prediction results is also returned.
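A minimal sketch of calling evaluate and handling both return shapes follows. It assumes that with return_pred=True the scores and predictions come back together as a pair, which the description above implies but does not spell out. The wrapper evaluate_and_unpack and the StubPredictor class are hypothetical. The stub stands in for a trained MultiModalPredictor so the example runs on its own, and it returns a plain list where the real method would return a pd.DataFrame.

```python
def evaluate_and_unpack(predictor, data, metrics=None, return_pred=False):
    """Call predictor.evaluate and normalise the result to (scores, predictions)."""
    result = predictor.evaluate(data=data, metrics=metrics, return_pred=return_pred)
    # Assumption: with return_pred=True, scores and predictions arrive as a pair.
    if isinstance(result, tuple):
        scores, predictions = result
    else:
        scores, predictions = result, None
    return scores, predictions


class StubPredictor:
    """Hypothetical stand-in for a trained predictor; returns fixed scores."""

    def evaluate(self, data, metrics=None, return_pred=False):
        # metrics may be a single name, a list of names, or None.
        names = [metrics] if isinstance(metrics, str) else (metrics or ["acc"])
        scores = {name: 1.0 for name in names}
        predictions = [row["label"] for row in data]
        return (scores, predictions) if return_pred else scores


rows = [{"text": "good", "label": 1}, {"text": "bad", "label": 0}]
scores, predictions = evaluate_and_unpack(
    StubPredictor(), rows, metrics=["acc", "f1"], return_pred=True
)
```

Normalising to a pair keeps calling code the same whether or not per-row predictions were requested.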