Troubleshooting

At its core, an LLM functions as a next-token prediction engine. It predicts the most likely next word in a sequence based on the context provided by the preceding text.

The process of generating text usually involves some level of randomness in the choice of the next token. This randomness allows the model to produce varied and creative outputs instead of the same response every time it is given the same input, making the generation process non-deterministic.
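As a minimal illustration of this sampling behavior (a toy sketch with made-up probabilities, not code from the AI Assistant or any particular LLM), the snippet below picks a next token from a small candidate distribution, once greedily and once by weighted sampling:

```python
# Toy sketch: why sampling-based generation is non-deterministic.
import random

# Hypothetical probabilities an LLM might assign to candidate next tokens
# after the prompt "The quick brown".
next_token_probs = {"fox": 0.82, "dog": 0.09, "cat": 0.06, "bear": 0.03}

def greedy_pick(probs):
    # Always returns the single most likely token -> identical output every run.
    return max(probs, key=probs.get)

def sample_pick(probs):
    # Draws a token at random, weighted by probability -> the output can vary
    # between runs, which is the non-determinism described above.
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print(greedy_pick(next_token_probs))   # always "fox"
print(sample_pick(next_token_probs))   # usually "fox", occasionally another token
```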

Why do I get unexpected replies?

There are several things you can do to improve the replies:

Prompt engineering

To increase the quality of a reply, follow these guidelines when writing a request to the LLM (an example prompt follows the list):

  • Be Specific and Clear: Clearly define your question or task to minimize ambiguity. Specific questions and tasks usually lead to more accurate and relevant responses.

  • Provide Context: Give the model adequate context to work with. This can include relevant code sections and accurate domain-specific information. Use Prompt Snippets to easily do this.

  • Use Examples: Include examples in your prompt. This can help the LLM “understand” the format or style you’re aiming for in the response.

  • Set Explicit Constraints: If you have specific constraints regarding length, format, or content, make these clear in your prompt. Save the constraints as Custom Snippets and reuse them whenever you need them.

  • Iterate and Refine: Experiment with different phrasings and structures to find what works best. Small adjustments can often lead to significant improvements in the quality of generated replies. The Chat features are designed to help you refine the prompt.
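As an illustration of these guidelines, a request along the lines of the hypothetical prompt below (the function name and scenario are made up) is specific, provides context, includes an example of the expected format, and states explicit constraints:

```
Review the function parse_packet() in the attached SystemVerilog snippet
and explain why it truncates payloads longer than 64 bytes.
Context: the payload length field is 8 bits and the testbench sends
packets of up to 256 bytes.
Example of the answer format I want:
  - Cause: <one sentence>
  - Fix: <code suggestion>
Constraints: keep the answer under 150 words and show only the lines
that need to change.
```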

Regenerate the reply

Generate a new reply from the LLM by clicking the Regenerate button above the LLM reply in the chat. Because the LLM is non-deterministic, this can produce a different response that may be more suitable. Alternatively, you can use natural language and ask the LLM to generate additional solutions for your problem.

Try a different LLM

Sometimes, a specific LLM is just not fit for the task you are trying to accomplish. You can easily change the LLM using the Set Default Language Model command, or you can regenerate just a specific reply with a different model using the Switch Language Model dropdown above the reply. If you’re using a local Ollama model, try a version of that model with more parameters if your hardware allows, or switch to a completely different model.
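For example, if you run a local model through Ollama, you can pull a larger variant of the same model family (the model names and tags below are only examples; check the Ollama library for what is actually available):

```
# Pull a larger variant of a model family you already use (example tags)
ollama pull llama3.1:8b
ollama pull llama3.1:70b

# List the models available locally
ollama list
```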

Why do I get different replies for the same request?

The LLM generation process is non-deterministic. Most models can be tuned using generation parameters; two commonly used parameters that control randomness and variability are:

  • Temperature: A higher temperature makes replies more random, creative, and diverse, while a lower temperature yields more deterministic, factual, and consistent replies.

  • Top P: Lower values reduce diversity and focus on the most probable words, while higher values lead to more diverse vocabulary and phrasing.

These parameters can be set in the models.json file (see LLM Configuration).
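The exact models.json schema is described in LLM Configuration and is not reproduced here; the sketch below (with made-up probabilities, not tied to any particular model or provider) only illustrates how the two parameters reshape a toy next-token distribution before sampling:

```python
# Toy sketch: how temperature and top_p reshape a next-token distribution.
import math

probs = {"fox": 0.70, "dog": 0.15, "cat": 0.10, "bear": 0.05}

def apply_temperature(probs, temperature):
    # Rescale probabilities: temperature < 1 sharpens the distribution
    # (more deterministic), temperature > 1 flattens it (more random).
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probs.items()}
    total = sum(scaled.values())
    return {t: v / total for t, v in scaled.items()}

def apply_top_p(probs, top_p):
    # Keep only the most probable tokens whose cumulative probability
    # reaches top_p, then renormalize; lower top_p means a narrower choice.
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {t: v / total for t, v in kept.items()}

print(apply_temperature(probs, 0.5))  # "fox" dominates even more
print(apply_temperature(probs, 1.5))  # alternatives become more likely
print(apply_top_p(probs, 0.9))        # "bear" is dropped from the candidate set
```

In practice, lowering the temperature or Top P is a common first step when replies feel too erratic.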

What should I do if I get errors or find a bug?

When unexpected things happen, the AI Assistant will notify you with a popup in the bottom right corner.

For in-depth debug information, you can check the AI Assistant log under the DVT AI Output Panel in VS Code or the DVT AI Console View in Eclipse.

There are several types of information displayed in the log:

  • debug information

  • warnings and errors

  • configuration information (e.g. API keys)

  • raw messages exchanged between the AI Assistant and the LLM provider

Note

The log contents contain sensitive information (for example, sections of code), so these logs are not collected when you report an issue in DVT IDE.