Utility Functions
gaico.utils.prepare_results_dataframe
prepare_results_dataframe(results_dict, model_col='model_name', metric_col='metric_name', score_col='score', index_col='item_index')
Converts a nested dictionary of results into a long-format DataFrame suitable for plotting. Handles both single-item results and batch results (lists of scores).
Example Input results_dict (batch):
{
'ModelA': {'BLEU': [0.8, 0.85], 'ROUGE': [{'f1': 0.75}, {'f1': 0.78}]},
}
Example Output DataFrame:
item_index model_name metric_name score
0 0 ModelA BLEU 0.80
1 1 ModelA BLEU 0.85
2 0 ModelA ROUGE_f1 0.75
3 1 ModelA ROUGE_f1 0.78
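The reshaping shown in the example can be sketched in plain Python. This is a minimal illustration, not gaico's implementation: `to_long_rows` is a hypothetical helper, the default column names are hard-coded, and it assumes single-item results simply omit the `item_index` key. The `<metric>_<sub-score>` naming for dictionary scores is taken from the `ROUGE_f1` rows in the example output.

```python
from typing import Any, Dict, List


def to_long_rows(results: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: flatten nested results into long-format records."""
    rows: List[Dict[str, Any]] = []
    for model, metrics in results.items():
        for metric, value in metrics.items():
            # A list means batch mode; a bare value is treated as a single item.
            is_batch = isinstance(value, list)
            items = value if is_batch else [value]
            for i, item in enumerate(items):
                # Dict scores (e.g. ROUGE) expand into one row per sub-score,
                # named "<metric>_<sub-score>".
                if isinstance(item, dict):
                    parts = {f"{metric}_{k}": v for k, v in item.items()}
                else:
                    parts = {metric: item}
                for name, score in parts.items():
                    row = {"model_name": model, "metric_name": name, "score": score}
                    if is_batch:
                        # Assumption: only batch results carry an item index.
                        row = {"item_index": i, **row}
                    rows.append(row)
    return rows


rows = to_long_rows(
    {"ModelA": {"BLEU": [0.8, 0.85], "ROUGE": [{"f1": 0.75}, {"f1": 0.78}]}}
)
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` would yield a long-format frame with one row per (item, model, metric) combination, which is the shape most plotting libraries expect.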
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `results_dict` | `Dict[str, Dict[str, Any]]` | Nested dictionary of results. Scores can be single values or lists. | required |
| `model_col` | `str` | Name for the model column. | `'model_name'` |
| `metric_col` | `str` | Name for the metric column. | `'metric_name'` |
| `score_col` | `str` | Name for the score column. | `'score'` |
| `index_col` | `str` | Name for the item index column in batch mode. | `'item_index'` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | A pandas DataFrame in long format. |
gaico.utils.generate_deltas_frame
generate_deltas_frame(threshold_results, generated_texts=None, reference_texts=None, output_csv_path=None)
Generates a pandas DataFrame from the output of the threshold function, optionally including the generated and reference text strings. If output_csv_path is provided, the DataFrame is also saved to a CSV file at that path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `threshold_results` | `Dict[str, Dict[str, Any]] \| List[Dict[str, Dict[str, Any]]]` | Output from `apply_thresholds` (handles both single and batch). Single: `{"BLEU": {"score": 0.6, "threshold_applied": 0.5, "passed_threshold": True}, ...}`. Batch: `[{"BLEU": {"score": 0.6, ...}, ...}, {"BLEU": {"score": 0.4, ...}, ...}]`. | required |
| `generated_texts` | `Optional[str \| List[str]]` | Optional generated text string(s). | `None` |
| `reference_texts` | `Optional[str \| List[str]]` | Optional reference text string(s). | `None` |
| `output_csv_path` | `Optional[str]` | Optional path to save the CSV file. | `None` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | A pandas DataFrame containing the results. |
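To make the single and batch input shapes concrete, here is a minimal sketch that flattens threshold results into one record per (item, metric) pair and serializes them as CSV. It is an illustration under stated assumptions, not gaico's implementation: `threshold_rows` is a hypothetical helper, and the column names `item_index`, `metric`, `generated_text`, and `reference_text` are invented for this example and may differ from the columns the library produces. The inner keys (`score`, `threshold_applied`, `passed_threshold`) come from the single-item example in the parameter table.

```python
import csv
import io
from typing import Any, Dict, List, Optional, Union

Result = Dict[str, Dict[str, Any]]


def threshold_rows(
    threshold_results: Union[Result, List[Result]],
    generated_texts: Optional[Union[str, List[str]]] = None,
    reference_texts: Optional[Union[str, List[str]]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper; column names are assumptions, not gaico's schema."""
    # Normalize the single-item form to a batch of length one.
    batch = threshold_results if isinstance(threshold_results, list) else [threshold_results]
    gens = [generated_texts] if isinstance(generated_texts, str) else generated_texts
    refs = [reference_texts] if isinstance(reference_texts, str) else reference_texts

    rows: List[Dict[str, Any]] = []
    for i, item in enumerate(batch):
        for metric, details in item.items():
            # Spread the per-metric details (score, threshold, pass flag) into columns.
            row = {"item_index": i, "metric": metric, **details}
            if gens is not None:
                row["generated_text"] = gens[i]
            if refs is not None:
                row["reference_text"] = refs[i]
            rows.append(row)
    return rows


single = {"BLEU": {"score": 0.6, "threshold_applied": 0.5, "passed_threshold": True}}
rows = threshold_rows(single, generated_texts="a cat sat", reference_texts="the cat sat")

# Write to an in-memory buffer; a real file path would be used with open(path, "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Normalizing the single-item case to a one-element batch up front keeps the rest of the logic identical for both input shapes, which is one simple way to support the union type in the signature.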