Default top_p=1 everywhere. Why?

The behavior most users expect by default is determinism. To that end, temperature rightly defaults to 0 (or close to 0) in most places. However, top_p also needs to default to 0: the only time top_p should be 1 is when temperature is greater than 0.0, i.e., when the user explicitly wants stochasticity. To illustrate:
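
For context, here is a minimal sketch of how the two knobs interact in nucleus (top_p) sampling; the `sample` function and its inputs are illustrative, not DSPy or provider code:

    import numpy as np

    def sample(logits, temperature=1.0, top_p=1.0, rng=None):
        """Toy decoder showing how temperature and top_p interact."""
        rng = rng or np.random.default_rng()
        # temperature == 0 degenerates to greedy (argmax) decoding
        if temperature == 0.0:
            return int(np.argmax(logits))
        # temperature rescales the logits before the softmax
        probs = np.exp(np.asarray(logits) / temperature)
        probs /= probs.sum()
        # nucleus filtering: keep the smallest set of tokens whose
        # cumulative probability reaches top_p, then renormalize
        order = np.argsort(probs)[::-1]
        cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
        keep = order[:cutoff]
        return int(rng.choice(keep, p=probs[keep] / probs[keep].sum()))

    logits = [2.0, 1.0, 0.5, -1.0]
    print(sample(logits, temperature=0.0))             # always 0: greedy
    print(sample(logits, temperature=0.7, top_p=0.0))  # also always 0: nucleus of one token

With top_p = 0.0, only the single most probable token survives the filter, so decoding stays deterministic even at nonzero temperature; with top_p = 1.0, nothing is filtered at all.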

In most of the model classes I see the following:

        self.kwargs = {
            "model": model,
            "port": port,
            "url": url,
            "temperature": 0.0 if "temperature" not in kwargs else kwargs["temperature"],
            "max_tokens": 75,
            "top_p": 1.0,
            "n": 1,
            "stop": ["\n", "\n\n"],
            **kwargs,
        }
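
Note that because `**kwargs` is spread last, a user-supplied `top_p` already overrides the hard-coded 1.0; the complaint is purely about the default used when the caller passes nothing. A quick illustration of the merge order:

    user_kwargs = {"top_p": 0.9}
    merged = {"top_p": 1.0, "n": 1, **user_kwargs}
    print(merged)  # {'top_p': 0.9, 'n': 1} -- keys spread later win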

when it should instead be something like this:

        if "top_p" not in kwargs and "temperature" not in kwargs:
            # if both temperature and top_p are not set by the user
            kwargs["top_p"] = 0.0
        elif "top_p" not in kwargs:
            # if temperature is set by the user and top_p is not
            kwargs["top_p"] = 1.0
            
        self.kwargs = {
            "model": model,
            "port": port,
            "url": url,
            "temperature": 0.0 if "temperature" not in kwargs else kwargs["temperature"],
            "max_tokens": 75,
            "top_p": kwargs["top_p"],
            "n": 1,
            "stop": ["\n", "\n\n"],
            **kwargs,
        }
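
To make the proposed behavior concrete, here is the same defaulting logic pulled into a standalone helper (hypothetical, named here only for illustration), with the resulting settings for each caller scenario:

    def resolve_sampling_defaults(**kwargs):
        # standalone version of the defaulting logic above
        if "top_p" not in kwargs and "temperature" not in kwargs:
            kwargs["top_p"] = 0.0  # nothing set: stay deterministic
        elif "top_p" not in kwargs:
            kwargs["top_p"] = 1.0  # temperature set: user wants stochasticity
        kwargs.setdefault("temperature", 0.0)
        return kwargs

    print(resolve_sampling_defaults())                 # {'top_p': 0.0, 'temperature': 0.0}
    print(resolve_sampling_defaults(temperature=0.7))  # {'temperature': 0.7, 'top_p': 1.0}
    print(resolve_sampling_defaults(top_p=0.9))        # {'top_p': 0.9, 'temperature': 0.0}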

The exception would be if temperature = 0 made top_p = 1 irrelevant and kept things deterministic anyway. At least in my experiments, I have not found that to be the case.
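
For anyone who wants to reproduce that check, a minimal probe is sketched below; `generate` stands in for any prompt-to-completion callable and is an assumption, not an existing DSPy API:

    def is_deterministic(generate, prompt, n=5):
        # generate: any callable mapping a prompt string to a completion string
        outputs = {generate(prompt) for _ in range(n)}
        return len(outputs) == 1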