Sign in with an admin account to manage users and review feedback.
Capture scored examples, nuanced rationale, transcripts, and build guidance across any software surface. Applying an example writes it into the corpus as a training example and creates a response eval regression case.
Choose from up to 100 top products, or type your own product above.
Paste screenshots here, drag images in, or upload files that demonstrate the example.
Edit any entry to immediately update retrieval/RAG output (re-indexed on save) and the dataset the next fine-tune run trains on. Toggle Train to include/exclude a row from training; unpublish to remove it from live retrieval. Use Audit dataset to inspect the exact chat examples generated from these rows.
Grouped benchmarks of response eval cases. Run a whole set to score the active taste model, then reuse the same set to compare fine-tune candidates apples-to-apples. The default taste-holdout-v1 set adopts every case minted from taste training.
Create one QLoRA run per base model with shared hyperparameters and the same dataset, so you can compare which open base model best learns your taste. Trains on AWS SageMaker.
Builds an instruction dataset from the published taste corpus (principles, annotations, and applied taste-training examples), then trains the base model with QLoRA as a managed SageMaker training job. Cheap, fast validation before scaling up.
Run the same build prompt through a raw provider, then through the same provider with tastellm guidance injected first. Use this to demo how vibe-coded or production-product outputs change when the taste API is in the loop.