Utility Functions
gaico.utils.prepare_results_dataframe
prepare_results_dataframe(results_dict, model_col='model_name', metric_col='metric_name', score_col='score')
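Converts a nested dictionary of results into a long-format DataFrame suitable for plotting. Each row holds one model, one metric, and one score. Nested score dictionaries are flattened by joining the metric name and the sub-score name with an underscore (for example, ROUGE and f1 become ROUGE_f1).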
Example input results_dict:

    {
        'ModelA': {'BLEU': 0.8, 'ROUGE': {'f1': 0.75}},
        'ModelB': {'BLEU': 0.7, 'ROUGE': {'f1': 0.65}}
    }

Example output DataFrame:

      model_name metric_name  score
    0     ModelA        BLEU   0.80
    1     ModelA    ROUGE_f1   0.75
    2     ModelB        BLEU   0.70
    3     ModelB    ROUGE_f1   0.65

Parameters:
Name | Type | Description | Default |
---|---|---|---|
`results_dict` | `Dict[str, Dict[str, Any]]` | Nested dictionary where keys are model names and values are dictionaries of metric names to scores (or nested score dicts). | required |
`model_col` | `str` | Name for the column containing model names in the output DataFrame. | `'model_name'` |
`metric_col` | `str` | Name for the column containing metric names in the output DataFrame. | `'metric_name'` |
`score_col` | `str` | Name for the column containing scores in the output DataFrame. | `'score'` |
Returns:
Type | Description |
---|---|
`pd.DataFrame` | A pandas DataFrame in long format. |
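
The flattening described for `prepare_results_dataframe` can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `flatten_results` is a hypothetical helper, it handles only one level of nesting, and the underscore separator is inferred from the `ROUGE_f1` row in the example output.

```python
from typing import Any, Dict, List


def flatten_results(
    results_dict: Dict[str, Dict[str, Any]],
    model_col: str = "model_name",
    metric_col: str = "metric_name",
    score_col: str = "score",
) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn nested results into long-format rows."""
    rows: List[Dict[str, Any]] = []
    for model, metrics in results_dict.items():
        for metric, value in metrics.items():
            if isinstance(value, dict):
                # Nested score dict: emit one row per sub-score and join the
                # names with "_" (assumed from ROUGE_f1 in the example).
                for sub_name, sub_value in value.items():
                    rows.append(
                        {
                            model_col: model,
                            metric_col: f"{metric}_{sub_name}",
                            score_col: sub_value,
                        }
                    )
            else:
                # Plain score: emit a single row.
                rows.append({model_col: model, metric_col: metric, score_col: value})
    return rows


results = {
    "ModelA": {"BLEU": 0.8, "ROUGE": {"f1": 0.75}},
    "ModelB": {"BLEU": 0.7, "ROUGE": {"f1": 0.65}},
}
rows = flatten_results(results)
```

Passing `rows` to `pandas.DataFrame(rows)` would yield a frame with the same three columns as the example output; the actual function may differ in how it orders rows or treats deeper nesting.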
gaico.utils.generate_deltas_frame
generate_deltas_frame(threshold_results, generated_texts=None, reference_texts=None, output_csv_path=None)
Generates a pandas DataFrame from the output of the threshold function, optionally including the generated and reference text strings. If output_csv_path is provided, the DataFrame is also saved to a CSV file at that path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`threshold_results` | `Dict[str, Dict[str, Any]] \| List[Dict[str, Dict[str, Any]]]` | Output from `apply_thresholds`, either a single result or a batch. Single: `{"BLEU": {"score": 0.6, "threshold_applied": 0.5, "passed_threshold": True}, ...}`. Batch: `[{"BLEU": {"score": 0.6, ...}, ...}, {"BLEU": {"score": 0.4, ...}, ...}]`. | required |
`generated_texts` | `Optional[str \| List[str]]` | Optional generated text string(s). | `None` |
`reference_texts` | `Optional[str \| List[str]]` | Optional reference text string(s). | `None` |
`output_csv_path` | `Optional[str]` | Optional path at which to save the CSV file. | `None` |
Returns:
Type | Description |
---|---|
`pd.DataFrame` | A pandas DataFrame containing the results. |
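
To illustrate how single and batch inputs to `generate_deltas_frame` might be handled uniformly, here is a rough standard-library sketch. It is not the library's code: `deltas_rows` is a hypothetical helper, and the row layout (one row per text pair, with columns named `<metric>_<field>`, plus `generated_text` and `reference_text`) is an assumption that the source does not specify. It also assumes the text lists are the same length as the batch.

```python
import csv
import io
from typing import Any, Dict, List, Optional, Union

ThresholdResult = Dict[str, Dict[str, Any]]


def deltas_rows(
    threshold_results: Union[ThresholdResult, List[ThresholdResult]],
    generated_texts: Optional[Union[str, List[str]]] = None,
    reference_texts: Optional[Union[str, List[str]]] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: one row per item, columns like BLEU_score."""
    # Normalize single inputs to one-element lists so both cases share a path.
    batch = [threshold_results] if isinstance(threshold_results, dict) else list(threshold_results)
    gens = [generated_texts] if isinstance(generated_texts, str) else generated_texts
    refs = [reference_texts] if isinstance(reference_texts, str) else reference_texts

    rows: List[Dict[str, Any]] = []
    for i, item in enumerate(batch):
        row: Dict[str, Any] = {}
        if gens is not None:
            row["generated_text"] = gens[i]  # assumed column name
        if refs is not None:
            row["reference_text"] = refs[i]  # assumed column name
        for metric, details in item.items():
            for field, value in details.items():
                row[f"{metric}_{field}"] = value  # assumed naming scheme
        rows.append(row)
    return rows


batch_results = [
    {"BLEU": {"score": 0.6, "threshold_applied": 0.5, "passed_threshold": True}},
    {"BLEU": {"score": 0.4, "threshold_applied": 0.5, "passed_threshold": False}},
]
rows = deltas_rows(
    batch_results,
    generated_texts=["gen one", "gen two"],
    reference_texts=["ref one", "ref two"],
)

# Write the rows as CSV to an in-memory buffer; a real file path would
# play the role of output_csv_path.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Normalizing a single result to a one-element batch keeps the row-building loop identical for both input shapes, which is one plausible way to support the two forms listed for `threshold_results`.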