9.8.2 Why do I get different replies for the same request?

LLM generation is non-deterministic: the model samples from a probability distribution over possible next tokens, so the same request can produce different replies. Most models can be tuned with sampling parameters, and two commonly used parameters that control randomness and variability are:
Temperature: Higher values make replies more random, creative, and diverse; lower values give more deterministic, factual, and consistent replies.
Top P: Lower values reduce diversity and focus sampling on the most probable words; higher values allow more diverse vocabulary and phrasing.
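To make the effect of these two parameters concrete, here is a minimal sketch of temperature-scaled softmax sampling with top-p (nucleus) filtering. This is an illustrative simplification, not the implementation any particular model uses; the function name and logit values are invented for the example.

```python
import math
import random

def sample(logits, temperature=1.0, top_p=1.0):
    """Pick one token index from raw logits, illustrating how
    temperature and top-p shape the sampling distribution."""
    # Temperature scaling: lower values sharpen the distribution
    # (more deterministic), higher values flatten it (more random).
    scaled = [l / temperature for l in logits]

    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p filtering: keep the smallest set of tokens whose
    # cumulative probability reaches top_p, discard the rest.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalize over the kept tokens and sample one of them.
    mass = sum(probs[i] for i in kept)
    weights = [probs[i] / mass for i in kept]
    return random.choices(kept, weights=weights)[0]
```

With a very low temperature (or a very low top_p), the function nearly always returns the highest-logit token, which is why those settings make replies more consistent.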
These parameters can be set in the models.json file (see LLM Configuration).
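As an illustration, a models.json entry might look like the fragment below. The exact schema depends on your setup (see LLM Configuration); the field names and values here are hypothetical placeholders, not the authoritative format.

```json
{
  "models": [
    {
      "name": "example-model",
      "temperature": 0.2,
      "topP": 0.9
    }
  ]
}
```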