Sign in with an admin account to manage users and review feedback.
Capture scored examples, nuanced rationale, transcripts, and build guidance across any software surface. Applying an example writes it into the corpus as a training example and creates a response eval regression case.
Choose from up to 100 top products, or type your own product above.
Paste screenshots here, drag images in, or upload files that demonstrate the example.
Edit any entry to immediately update retrieval/RAG output (re-indexed on save) and the dataset the next fine-tune run trains on. Toggle Train to include/exclude a row from training; unpublish to remove it from live retrieval. Use Audit dataset to inspect the exact chat examples generated from these rows.
Run mode Active model generates a fresh response from the active taste model (what the API returns today) and scores it; Stored example scores each case's saved response.
Grouped benchmarks of response eval cases. Run a whole set to score the active taste model, then reuse the same set to compare fine-tune candidates apples-to-apples. The default taste-holdout-v1 set adopts every case minted from taste training.
Past persisted runs across cases, sets, and suite runs. Use this to compare active-model runs against stored-example runs over time.
Register your proprietary model and point it at an OpenAI-compatible inference endpoint to make it the default that serves /recommend and design proposals. One model for now; register more over time (generations, and low/medium/high tiers) and activate the one that should serve.
Validating today? Use the OpenRouter preset to serve a stock Qwen through the API while your fine-tune trains — it reuses your OPENROUTER_API_KEY. For SageMaker, deploy with finetune/deploy_sagemaker_endpoint.py, then enter the endpoint name here with provider SageMaker.
Create one QLoRA run per base model with shared hyperparameters and the same dataset, so you can compare which open base model best learns your taste. Trains on AWS SageMaker.
Builds an instruction dataset from the published taste corpus (principles, annotations, and applied taste-training examples), then trains the base model with QLoRA as a managed SageMaker training job. Cheap, fast validation before scaling up.
Run the same build prompt through a raw provider, then through the same provider with tastellm guidance injected first. Use this to demo how vibe-coded or production-product outputs change when the taste API is in the loop.