Yuki AI 客服小幫手

production env · 1 suite · 2 completed runs

Subject № 3b8391aa-fd0a-4974-8b0d-7793779264b4 PRODUCTION
Evaluation status · Normal maintenance

Eval suites maintained, all in sync

3 scenarios · 3 KB items · 1 suite
FLEET READY · ALL KINDS COVERED

27 cases · Yuki AI 客服小幫手 (bulk R1)

kb_accuracy 7
scenario_funnel 10
mixed_qa 10
uncategorized 0
01

Vital signs

[KIND × DIMENSION] vital signs — this bot's per-dim clearance vs. its baseline
Knowledge base accuracy [PASS]
Retrieval 100.0% 100.0% ≥95.0% [±5pp] +5.0 ✓
Faithfulness 85.7% 85.7% ≥77.7% [±8pp] +8.0 ✓
Answer quality 98.1% 98.1% ≥90.1% [±8pp] +8.0 ✓
Scenario invocation and completion [FAIL]
Scenario 100.0% 100.0% ≥95.0% [±5pp] +5.0 ✓
Tool use 70.0% 70.0% ≥70.0% [floor] 0.0 ✓
Answer quality 60.0% 53.7% <70.0% [floor] -16.3 ✗
Conversational competence (mixed Q&A) [FAIL]
Retrieval 100.0% 100.0%
Faithfulness 44.4% 55.6% <70.0% [floor] -14.4 ✗
Answer quality 68.0% 64.7% <70.0% [floor] -5.3 ✗
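
The signed margin in each row above appears to be the second percentage minus the threshold, and a group seems to be marked FAIL as soon as one of its thresholded rows falls short. The sketch below reproduces that arithmetic from the displayed numbers. It is an inference from the figures shown, not the dashboard's documented scoring logic: the names `DimRow`, `margin`, `passes`, `group_verdict`, and `relative_threshold` are hypothetical, and reading `[±8pp]` as "baseline minus 8 points" is an assumption that happens to match the rows where both percentages are equal.

```python
from dataclasses import dataclass


@dataclass
class DimRow:
    """One thresholded row of the vital-signs panel (hypothetical structure)."""
    name: str         # illustrative English label for the dimension
    score: float      # second percentage shown in the row
    threshold: float  # value shown after the >= or < sign


def margin(row: DimRow) -> float:
    """Signed distance from the threshold, in percentage points."""
    return round(row.score - row.threshold, 1)


def passes(row: DimRow) -> bool:
    """A row clears when its score is at or above the threshold."""
    return row.score >= row.threshold


def group_verdict(rows: list[DimRow]) -> str:
    """Assumed rule: a group passes only if every thresholded row passes."""
    return "PASS" if all(passes(r) for r in rows) else "FAIL"


def relative_threshold(baseline: float, tolerance_pp: float) -> float:
    """Assumed reading of the [±Npp] tag: baseline minus N points."""
    return round(baseline - tolerance_pp, 1)


# Values copied from the panel; rows without a threshold are omitted.
kb_accuracy = [
    DimRow("retrieval", 100.0, 95.0),
    DimRow("faithfulness", 85.7, 77.7),
    DimRow("answer quality", 98.1, 90.1),
]
scenario = [
    DimRow("scenario", 100.0, 95.0),
    DimRow("tool use", 70.0, 70.0),
    DimRow("answer quality", 53.7, 70.0),
]
mixed_qa = [
    DimRow("faithfulness", 55.6, 70.0),
    DimRow("answer quality", 64.7, 70.0),
]

if __name__ == "__main__":
    for label, rows in [("kb_accuracy", kb_accuracy),
                        ("scenario_funnel", scenario),
                        ("mixed_qa", mixed_qa)]:
        print(label, group_verdict(rows), [margin(r) for r in rows])
```

Under these assumptions the computed margins and group verdicts agree with the panel. The first percentage in each row is not used here because the page does not say what it represents.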
02

Test suites