genai-impact/ecologits

Unit tests


Description

Implement unit tests in this package.

Solution

All features and cases should be explicitly verified through unit tests.

Considerations

As a reference:
https://github.com/dataforgoodfr/12_genai_impact/blob/main/tests/test_compute_impacts.py

import pytest

from genai_impact.compute_impacts import compute_llm_impact


@pytest.mark.parametrize('model_size,output_tokens', [
    (130, 1000),
    (7, 150)
])
def test_compute_impacts_is_positive(model_size: float, output_tokens: int) -> None:
    impacts = compute_llm_impact(model_size, output_tokens)
    assert impacts.energy >= 0

Additional context

N/A

I can give this issue a go, but I will need some feedback as it will be the first time ever.

@LucBERTON @samuelrince
Please find below a first proposition for 'test_compute_impacts'. Please do not hesitate to comment or amend it.

import pytest
import pandas as pd
import numpy as np

from genai_impact.compute_impacts import compute_llm_impact

@pytest.fixture(scope='module')
def input_test_data():
    return pd.DataFrame(data=[(130, 1000), (7, 150)],
                        columns=['model_size', 'output_tokens'])

@pytest.fixture(scope='module')
def impacts(input_test_data):
    # Impacts computed for the first test row; the result test below
    # recomputes them for every row.
    return compute_llm_impact(input_test_data['model_size'][0],
                              input_test_data['output_tokens'][0])

def test_compute_impacts_is_positive(impacts) -> None:
    assert impacts.energy >= 0

def test_compute_impacts_output_type(impacts) -> None:
    assert type(impacts.energy) in (float, np.float64)

def test_compute_impacts_result(input_test_data) -> None:
    for i in range(len(input_test_data)):
        impacts = compute_llm_impact(input_test_data['model_size'][i],
                                     input_test_data['output_tokens'][i])
        assert impacts.energy == 1.17e-4 * input_test_data['model_size'][i] * input_test_data['output_tokens'][i]

I think the pandas library is not (yet) used in this project, and it is a fairly large dependency.

Maybe we should not use it only for testing purposes?

Do we know if pandas will be used in the project later?

If we use CSV files to store model specs, we will probably use pandas, though here I think it is not necessary; plus it adds complexity with the data types (float vs. np.float64).
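
As a small illustration of that type complexity (a sketch, not project code): np.float64 subclasses Python's float, so an isinstance check accepts both, whereas an exact type() comparison does not.

import numpy as np

# np.float64 is a subclass of Python's float, so isinstance() accepts it,
# while an exact type() comparison still distinguishes the two.
assert isinstance(np.float64(1.0), float)
assert type(np.float64(1.0)) is not float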

I'd recommend using parametrized tests, and if we need to reuse the parametrized data we can store it in a constant for now (see the sketch after the examples below):

TEST_DATA = [...]  # to replace the data in parametrize


@pytest.mark.parametrize("model_size,output_tokens", [
    (130, 1000),
    (7, 150)
])
def test_compute_impacts_is_positive(model_size: float, output_tokens: int) -> None:
    impacts = compute_llm_impact(model_size, output_tokens)
    assert impacts.energy >= 0

@pytest.mark.parametrize("model_size,output_tokens", [
    (130, 1000),
    (7, 150)
])
def test_compute_impacts___(model_size: float, output_tokens: int) -> None:
    ...
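
For illustration, here is a minimal sketch of how the shared TEST_DATA constant could replace the inline lists in the decorators above (same data, simply factored out):

TEST_DATA = [
    (130, 1000),
    (7, 150),
]

@pytest.mark.parametrize("model_size,output_tokens", TEST_DATA)
def test_compute_impacts_is_positive(model_size: float, output_tokens: int) -> None:
    impacts = compute_llm_impact(model_size, output_tokens)
    assert impacts.energy >= 0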

Also, we can merge some asserts that should always hold into a single test:

@pytest.mark.parametrize("model_size,output_tokens", [
    (130, 1000),
    (7, 150)
])
def test_compute_impacts(model_size: float, output_tokens: int) -> None:
    impacts = compute_llm_impact(model_size, output_tokens)
    assert impacts.energy >= 0
    assert isinstance(impacts.energy, float)
    ...

Apologies for my late reply, I had a hectic week. I made the recommended changes and committed them. Please do not hesitate to comment or amend them.
I'll try to have a look at the unit tests for the wrapper part this weekend.

Thanks @domihak-project! As I mentioned in issue #8, we can wait on testing the wrapper because we may have to change the way it works.

@domihak-project I have pushed more tests if you want to take a look! They are very basic chat completions with OpenAI, Anthropic, and Mistral AI. Next step: we have to test the async and stream functions.
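
For the async part, here is a rough sketch of what such a test might look like, assuming pytest-asyncio is added as a dev dependency and that the wrapper attaches an impacts object to the response (that attribute name is an assumption here, not something confirmed in this thread):

import pytest
from openai import AsyncOpenAI

@pytest.mark.asyncio
async def test_openai_async_chat_completion() -> None:
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello world!"}],
        max_tokens=10,
    )
    # Assumption: the wrapped client exposes impacts on the response object.
    assert response.impacts.energy >= 0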

Closing this issue as basic tests are implemented.