
Developer Guide

This guide covers everything you need to contribute to GAICo, including project structure, testing, code style, and development workflows. Use the table of contents in the left sidebar to navigate.

For development setup, see the Developer Installation section.

Project Structure

The project is organized as follows:

.
├── README.md               # Project overview and quick start
├── LICENSE                 # MIT License
├── .gitignore              # Git ignore rules
├── uv.lock                 # UV dependency lock file
├── pyproject.toml          # Project metadata and dependencies
├── project_macros.py       # Used by mkdocs-macros-plugin (documentation)
├── PYPI_DESCRIPTION.MD     # PyPI package description
├── .pre-commit-config.yaml # Pre-commit hook configuration
├── mkdocs.yml              # MkDocs documentation configuration
├── gaico/                  # Main library code
│   ├── __init__.py         # Package initialization
│   ├── base.py             # BaseMetric abstract class
│   ├── experiment.py       # Experiment class for streamlined evaluation
│   ├── metrics/            # Individual metric implementations
│   │   ├── text/           # Text-based metrics
│   │   ├── structured/     # Structured data metrics
│   │   └── multimedia/     # Image and audio metrics
│   └── utils/              # Utility functions
├── examples/               # Jupyter notebook examples
│   ├── quickstart.ipynb    # Quick introduction
│   ├── example-1.ipynb     # Multiple models, single metric
│   ├── example-2.ipynb     # Single model, all metrics
│   └── data/               # Sample data for examples
├── tests/                  # Test suite
│   ├── test_metrics/       # Metric-specific tests
│   ├── test_experiment.py  # Experiment class tests
│   └── conftest.py         # Pytest configuration
├── docs/                   # Documentation source files
│   ├── index.md            # Documentation homepage
│   ├── installation-guide.md
│   ├── developer-guide.md
│   ├── resources.md
│   ├── faq.md
│   └── news.md
├── scripts/                # Utility scripts
│   ├── deploy-docs.sh      # Documentation deployment
│   └── generate-readme.py  # README generation
└── .github/workflows/      # CI/CD workflows
    ├── deploy-docs.yml     # Documentation deployment
    └── publish-pypi.yml    # PyPI publishing

Running Tests

We use Pytest for testing. Tests are located in the tests/ directory.

Basic Test Commands

Navigate to the project root and use uv to run tests:

# Run all tests
uv run pytest

# Run with verbose output
uv run pytest -v

# Run with coverage report
uv run pytest --cov=gaico --cov-report=html

# If pytest gives import errors, use:
uv run -m pytest

Targeting Specific Tests

You can run or skip tests based on markers:

# Skip slow BERTScore tests
uv run pytest -m "not bertscore"

# Run ONLY BERTScore tests
uv run pytest -m bertscore

# Run tests for a specific file
uv run pytest tests/test_experiment.py

# Run a specific test function
uv run pytest tests/test_experiment.py::test_experiment_init

Test Markers

We use the following pytest markers:

  • bertscore: Tests for BERTScore metric (can be slow)
  • integration: Integration tests that test multiple components
  • unit: Fast unit tests for individual functions
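
Markers are attached to tests with @pytest.mark decorators, and the -m expressions above select against them. A minimal sketch of how tests might be marked (the test names and bodies are illustrative, not taken from the actual test suite):

import pytest

@pytest.mark.bertscore
def test_bertscore_metric():
    """Slow test exercising the BERTScore metric (body omitted in this sketch)."""
    ...

@pytest.mark.unit
def test_fast_helper():
    """Fast unit test; selected by `uv run pytest -m unit`."""
    assert "abc".upper() == "ABC"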

Writing Tests

When adding new features:

  1. Create tests first (TDD approach recommended)
  2. Place tests in the appropriate tests/test_*.py file
  3. Use descriptive names: test_feature_behavior_expected_result
  4. Include edge cases: Empty inputs, None values, type mismatches
  5. Add docstrings explaining what the test validates

Example test structure:

import pytest
from gaico.metrics import YourNewMetric

def test_your_metric_basic_functionality():
    """Test that YourNewMetric calculates scores correctly."""
    metric = YourNewMetric()
    generated = "test output"
    reference = "test reference"

    result = metric.calculate(generated, reference)

    assert 0 <= result <= 1, "Score should be between 0 and 1"
    assert isinstance(result, float), "Result should be a float"

@pytest.mark.parametrize("generated,reference,expected", [
    ("exact", "exact", 1.0),
    ("different", "words", 0.0),
])
def test_your_metric_edge_cases(generated, reference, expected):
    """Test edge cases for YourNewMetric."""
    metric = YourNewMetric()
    result = metric.calculate(generated, reference)
    assert result == pytest.approx(expected, rel=1e-2)

Code Style

We maintain code quality using pre-commit hooks. Configuration is in .pre-commit-config.yaml.

Pre-commit Hooks

Setup (run once after cloning):

pre-commit install

Running hooks manually:

# Run on all files
pre-commit run --all-files

# Run on staged files only
pre-commit run

# Run a specific hook
pre-commit run black --all-files

Code Style Tools

Our pre-commit hooks include:

  • Black: Code formatting (line length: 88)
  • isort: Import sorting
  • Flake8: Linting
  • mypy: Type checking (optional)
  • trailing-whitespace: Remove trailing spaces
  • end-of-file-fixer: Ensure files end with newline
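
These hooks are declared in .pre-commit-config.yaml. A minimal sketch of what such a configuration can look like (the rev pins are placeholders; the repository's actual file is authoritative):

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0        # placeholder pin
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 24.4.2        # placeholder pin
    hooks:
      - id: black
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2        # placeholder pin
    hooks:
      - id: isort
  - repo: https://github.com/pycqa/flake8
    rev: 7.0.0         # placeholder pin
    hooks:
      - id: flake8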

Style Guidelines

  • Formatting: Use Black's defaults (88 character line length)
  • Imports: Group stdlib, third-party, local (enforced by isort)
  • Type hints: Add type hints for public APIs
  • Docstrings: Use Google-style docstrings
  • Variable names:
  • snake_case for functions and variables
  • PascalCase for classes
  • UPPER_CASE for constants

Example:

from typing import List, Union

import numpy as np

from gaico.base import BaseMetric

class ExampleMetric(BaseMetric):
    """Brief description of the metric.

    More detailed explanation of what this metric does,
    its use cases, and any important notes.

    Args:
        parameter1: Description of parameter1
        parameter2: Description of parameter2

    Attributes:
        attribute1: Description of attribute1
    """

    DEFAULT_THRESHOLD = 0.5  # Class constant

    def __init__(self, parameter1: str, parameter2: int = 10):
        """Initialize the metric."""
        self.parameter1 = parameter1
        self.parameter2 = parameter2

    def calculate(
        self,
        generated_texts: Union[str, List[str]],
        reference_texts: Union[str, List[str], None] = None
    ) -> Union[float, np.ndarray]:
        """Calculate the metric score.

        Args:
            generated_texts: Model-generated output(s)
            reference_texts: Reference output(s) or None

        Returns:
            Score(s) between 0 and 1

        Raises:
            ValueError: If inputs are invalid
        """
        # Implementation here
        pass

Development Workflow

1. Create a Feature Branch

git checkout -b feature/your-feature-name

2. Make Changes

  • Write code following style guidelines
  • Add/update tests
  • Update documentation if needed

3. Run Tests and Checks

# Run tests
uv run pytest

# Run pre-commit checks
pre-commit run --all-files

# Check if documentation builds
mkdocs serve

4. Commit Changes

git add .
git commit -m "Add: Brief description of changes"

Pre-commit hooks will run automatically. Fix any issues they report.

5. Push and Create PR

git push origin feature/your-feature-name

Then create a Pull Request on GitHub.

Adding a New Metric

See our FAQ guide on adding custom metrics for detailed instructions.

Quick checklist:

  1. Create new file in gaico/metrics/[category]/
  2. Inherit from BaseMetric
  3. Implement calculate() method
  4. Add tests in tests/test_metrics/
  5. Update documentation
  6. Register in __init__.py if needed
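
A minimal sketch of steps 1–3 and 6, assuming a text metric; the file and class names below are hypothetical, and the re-export pattern in __init__.py may differ from the project's:

# gaico/metrics/text/my_metric.py  (hypothetical file)
from typing import List, Union

import numpy as np

from gaico.base import BaseMetric


class MyMetric(BaseMetric):
    """One-line description of what the metric measures."""

    def calculate(
        self,
        generated_texts: Union[str, List[str]],
        reference_texts: Union[str, List[str], None] = None,
    ) -> Union[float, np.ndarray]:
        # Replace with the real scoring logic; return score(s) in [0, 1].
        raise NotImplementedError

# gaico/metrics/__init__.py  (step 6: expose the class if users should import it directly)
# from .text.my_metric import MyMetric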

Building Documentation

We use MkDocs with the Material theme.

Local Documentation Server

# Install documentation dependencies (included in dev install)
pip install -e ".[dev]"

# Serve documentation locally
mkdocs serve

Visit http://127.0.0.1:8000 to view the docs.

Building the Static Site

# Build static site
mkdocs build

# Build and deploy to GitHub Pages (maintainers only)
mkdocs gh-deploy

Release Process

For maintainers releasing new versions:

  1. Update version in pyproject.toml
  2. Update changelog in docs/news.md and docs/resources.md
  3. Run full test suite: uv run pytest
  4. Build package: uv build
  5. Create git tag: git tag v0.x.x
  6. Push tag: git push origin v0.x.x
  7. GitHub Actions will automatically publish to PyPI
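
Put together, the maintainer-side commands look roughly like this (a sketch; v0.x.x is a placeholder for the actual version):

# After bumping the version in pyproject.toml and updating the changelog:
uv run pytest                 # full test suite
uv build                      # build the package
git tag v0.x.x                # tag the release
git push origin v0.x.x        # pushing the tag triggers publish-pypi.yml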

Getting Help

Code of Conduct

We follow a standard code of conduct:

  • Be respectful and inclusive
  • Provide constructive feedback
  • Focus on what's best for the community
  • Show empathy towards other community members