February 16th, 2026
New
Improved
A round of polish across the app:
Separated footer buttons — dialogs, alert confirmations, and the login/signup forms now have a visually distinct footer area with a subtle border and background, matching modern shadcn conventions.
Decluttered layouts — removed unnecessary card wrappers from the eval prompt, eval runs section, danger zone, and accuracy metrics. Pages feel lighter and less boxy.
Ground truth page tidied up — consolidated the labelling page from 6 separate cards into 3 clean sections. Navigation is now integrated into the trace card, and all sidebar controls live in a single panel.
Richer example dataset — the example dataset now includes 10 trip planning conversations (up from 3) covering diverse scenarios like solo backpacking, family holidays, ski trips, and more. It also ships with 3 example evals (boolean, score, and category) so new users can see the full range of eval types immediately.
Component showcase — added a hidden page at /app/components showing all UI components used across the app. Handy as a living style guide.