# Promptfoo
Open-source LLM testing and evaluation framework. Compare prompt quality across models, detect regressions, and run red-team security assessments — all from a web dashboard.
## What You Can Do After Deployment
- Open your domain — the Promptfoo web UI loads immediately
- Create evaluations — define test cases and compare outputs across GPT-4, Claude, Llama, and others (see the sketch after this list)
- Run red-team assessments — automatically probe models for prompt injection and jailbreak vulnerabilities
- View results — side-by-side comparison tables with pass/fail scoring
- Export reports — share evaluation results with your team
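If you prefer to drive evaluations from code rather than the dashboard, promptfoo also ships as a Node library. Below is a minimal sketch using its `evaluate` function; the prompt text, provider IDs, and model names are illustrative, so swap in whichever providers you have API keys for.

```ts
// Minimal evaluation sketch using promptfoo's Node API.
// Assumptions: promptfoo is installed (`npm i promptfoo`) and provider keys
// are set via OPENAI_API_KEY / ANTHROPIC_API_KEY; model IDs are illustrative.
import promptfoo from 'promptfoo';

const summary = await promptfoo.evaluate({
  // Two prompt variants to A/B test; {{text}} is filled from each test's vars
  prompts: [
    'Summarize in one sentence: {{text}}',
    'You are a terse editor. Summarize: {{text}}',
  ],
  // Each provider becomes a column in the side-by-side comparison
  providers: [
    'openai:gpt-4o-mini',
    'anthropic:messages:claude-3-5-sonnet-20241022',
  ],
  tests: [
    {
      vars: { text: 'Promptfoo is an open-source LLM evaluation framework.' },
      // Assertions drive the pass/fail scoring shown in the results table
      assert: [{ type: 'icontains', value: 'promptfoo' }],
    },
  ],
});

console.log(`${summary.stats.successes} passed, ${summary.stats.failures} failed`);
```

The same configuration can live in a `promptfooconfig.yaml` file and be run with `promptfoo eval` if you'd rather stay on the command line.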
## Use Cases
- Prompt engineering and A/B testing
- Model selection and benchmarking
- LLM security and red-teaming
- CI/CD integration for prompt regression testing (see the sketch after this list)
- Cost and latency comparison across providers
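For the CI/CD use case, one lightweight pattern is a script that runs your prompt tests and fails the build on any assertion failure. A hedged sketch, again assuming the Node API shown above (running `promptfoo eval` as a pipeline step works equally well):

```ts
// CI regression gate sketch: run prompt tests and exit nonzero on failure so
// the pipeline blocks the change. The config shape mirrors promptfooconfig.yaml;
// the prompt, provider, and test data here are illustrative.
import promptfoo from 'promptfoo';

const summary = await promptfoo.evaluate({
  prompts: ['Extract only the city name from this address: {{address}}'],
  providers: ['openai:gpt-4o-mini'],
  tests: [
    {
      vars: { address: '1600 Amphitheatre Pkwy, Mountain View, CA 94043' },
      assert: [{ type: 'icontains', value: 'Mountain View' }],
    },
  ],
});

if (summary.stats.failures > 0) {
  console.error(`Prompt regression: ${summary.stats.failures} test(s) failed`);
  process.exit(1); // nonzero exit code fails the CI job
}
console.log(`All ${summary.stats.successes} prompt tests passed`);
```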
## License
MIT — [GitHub](https://github.com/promptfoo/promptfoo)