Resources¶
This page collects GAICo resources, including videos, demos, examples, and version history. Use the table of contents in the left sidebar to navigate.
Video Demo¶
Watch this comprehensive demonstration of GAICo's capabilities, including:
- Setting up and running evaluations
- Comparing multiple LLM outputs
- Generating visualizations
- Working with different metric types
- Interpreting results
Interactive Demo¶
Try GAICo without installing anything:
The interactive demo allows you to:
- Upload your own LLM outputs
- Select metrics to apply
- Generate comparison visualizations
- Download results as CSV
Version History & News¶
Recent Releases¶
This section summarizes the major releases of the GAICo library, highlighting key features and providing quick start examples.
| Release | Date | Summary | Details |
|---|---|---|---|
| v0.4.0 | January 2026 | Added optional seeding for reproducible results | Full changelog → |
| v0.3.0 | August 2025 | Added multimedia metrics (image and audio) and enhancements for the Experiment class | Full changelog → |
| v0.2.0 | July 2025 | Added specialized text metrics: time-series & automated planning | Full changelog → |
| v0.1.5 | June 2025 | Initial release: generic text metrics, Experiment class, & visualizations | Full changelog → |
Example Notebooks¶
All examples are available as Jupyter notebooks that can be run locally or in Google Colab.
Quick Start Examples¶
| Notebook | Description | Open in Colab |
|---|---|---|
| quickstart.ipynb | Rapid hands-on introduction to the Experiment class | |
| example-1.ipynb | Compare multiple model outputs with a single metric | |
| example-2.ipynb | Evaluate a single model output across all available metrics | |
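For readers who want a feel for the workflow before opening a notebook, the idea behind example-1 (scoring several model outputs against one reference with a single metric) can be sketched in plain Python. This is a minimal illustration only: it does not use GAICo's API, and the function names `jaccard_similarity` and `compare_outputs`, the model names, and the sample texts are all hypothetical. The notebooks themselves show how this is done with the Experiment class.

```python
# Illustrative stand-in, NOT GAICo's API: all names and data here are hypothetical.

def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard overlap between the lowercase word sets of two strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def compare_outputs(responses: dict, reference: str) -> dict:
    """Score every model response against the same reference with one metric."""
    return {
        name: jaccard_similarity(text, reference)
        for name, text in responses.items()
    }


reference = "the cat sat on the mat"
responses = {
    "model_a": "the cat sat on the mat",
    "model_b": "a dog sat on the mat",
    "model_c": "completely unrelated text",
}

scores = compare_outputs(responses, reference)

# Print models from best to worst score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

The same loop structure extends to the example-2 scenario by holding the response fixed and iterating over several metric functions instead of several models.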
Advanced Examples¶
Browse the full examples directory for more specialized use cases:
- Working with multimedia metrics (images, audio)
- Batch processing large datasets
- Custom metric implementation
- Advanced visualization techniques
- Integration with popular LLM frameworks
Learning Resources¶
Documentation¶
- 📖 Installation Guide - Detailed setup instructions
- 🔧 Developer Guide - Contributing and development
- 🤔 FAQ - Frequently asked questions
- 📊 API Reference - Complete API documentation
External Resources¶
- Microsoft's Guide to LLM Evaluation - Inspiration for GAICo's metrics
- Hugging Face Evaluate Library - Complementary evaluation tools
- HELM Benchmark - Holistic evaluation framework
Community & Support¶
Get Help¶
- 💬 GitHub Discussions - Ask questions and share ideas
- 🐛 Issue Tracker - Report bugs or request features
- 📧 Email Support - Direct contact with the team
Contributing¶
We welcome contributions! See our Developer Guide for:
- Setting up your development environment
- Code style guidelines
- Testing requirements
- Pull request process
Publications¶
- 📄 GAICo: A Deployed and Extensible Framework for Evaluating Diverse and Multimodal Generative AI Outputs
- 📑 GAICo: Demonstrating a Unified Framework for Multi-Modal GenAI Evaluation (Demo)
Citation:
@article{gupta2025gaico,
title={GAICo: A Deployed and Extensible Framework for Evaluating Diverse and Multimodal Generative AI Outputs},
author={Gupta, Nitin and Koppisetti, Pallav and Lakkaraju, Kausik and Srivastava, Biplav},
journal={arXiv preprint arXiv:2508.16753},
year={2025}
}
Cite GAICo¶
If you use GAICo in your research or projects, please cite our work. See the Citation section above.