Ask HN: What's the consensus on "unit" testing LLM prompts?
2 by thiht | 0 comments on Hacker News.
LLMs are notoriously non-deterministic, which makes it hard for us developers to trust them as a tool in a backend, where we usually expect determinism. I'm in a situation where using an LLM makes sense from a technical perspective, but I'm wondering if there are good practices for testing, beyond manual testing:

- I want to ensure my prompt does what I want 100% of the time
- I want to ensure I don't get regressions as my prompt evolves, when I update the version of the LLM I use, or even if I switch to another LLM

The ideas I have in mind are:

- forcing the LLM to return JSON with a strict definition
- periodically running a fixed set of tests with my prompt and checking that I get the expected results

Are there specificities of LLM prompt testing I should be aware of? Are any good practices emerging?
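The two ideas above can be combined: validate the model's output against a strict JSON contract, then run a fixed suite of input/expected-output cases against it. Here is a minimal sketch in Python; `call_llm`, the schema keys, and the test cases are all hypothetical stand-ins (the stub returns a canned response where a real implementation would call a model API):

```python
import json

# Hypothetical strict contract: the exact keys the model must return.
EXPECTED_KEYS = {"sentiment", "confidence"}

def call_llm(prompt: str) -> str:
    # Stub standing in for a real model call (e.g. an HTTP request to an API).
    return '{"sentiment": "positive", "confidence": 0.9}'

def validate(raw: str) -> dict:
    """Parse the model output and enforce the strict JSON definition."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    assert set(data) == EXPECTED_KEYS, f"unexpected keys: {set(data)}"
    assert data["sentiment"] in {"positive", "negative", "neutral"}
    assert 0.0 <= data["confidence"] <= 1.0
    return data

# Fixed regression suite: (input text, expected field value) pairs.
CASES = [
    ("I love this product", "positive"),
]

def run_suite() -> list[bool]:
    """Run every case and report pass/fail per case."""
    results = []
    for text, expected in CASES:
        out = validate(call_llm(f"Classify sentiment: {text}"))
        results.append(out["sentiment"] == expected)
    return results
```

Because the model is non-deterministic, a suite like this is usually run repeatedly (and on every prompt or model change), tracking the pass rate over time rather than expecting a single all-green run.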
