Post-training is where models learn taste. The way we do it is broken.
After a model learns to predict text, humans teach it to be helpful and honest. That stage is called post-training, and it runs on a single weak signal: a person clicking which of several answers is better.
It's expensive
Standard methods (RLHF, DPO) make the model write 3 to 8 answers per prompt so a human can rank them. That's 3 to 8 times the inference cost on every prompt, before you pay the person doing the ranking.
RLHF: reinforcement learning from human feedback. DPO: direct preference optimization. The two standard ways models are aligned today.
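The cost argument above can be put as back-of-the-envelope arithmetic. This is a minimal sketch: the function name, the per-answer and per-ranking prices, and the prompt count are all hypothetical placeholders, not measured figures. Costs are kept in integer cents to avoid rounding noise.

```python
def sampling_run_cost(prompts: int, answers_per_prompt: int,
                      cents_per_answer: int, cents_per_ranking: int) -> tuple[int, int]:
    """Return (inference_cents, labeling_cents) for one ranking-based data pass.

    Assumption: every prompt needs `answers_per_prompt` generations and
    exactly one human ranking. All prices are illustrative only.
    """
    inference = prompts * answers_per_prompt * cents_per_answer
    labeling = prompts * cents_per_ranking
    return inference, labeling


# Hypothetical numbers: 1,000 prompts, 1 cent per answer, 5 cents per ranking.
single, _ = sampling_run_cost(1000, 1, 1, 5)
ranked, labeling = sampling_run_cost(1000, 4, 1, 5)

# Generating 4 candidates costs 4x the inference of generating 1,
# and the human labeling cost comes on top of that.
multiplier = ranked // single
```

With 3 to 8 candidates per prompt, the multiplier is simply 3 to 8; the labeling term is added regardless.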
It's slow
Human rankers click through millions of pairs. A single post-training run takes weeks of human labor and tens of thousands of dollars in compute, just to teach a model that one answer beat another.
Figures are estimates.
It throws away the signal
Ranking compresses a whole human reaction into one bit: A or B. A reader may smile at one phrase, frown at the next, and lose the thread halfway through, but none of that survives. The model learns “this answer won,” never “this phrase landed.”
Everything you need to align AI, built to actually capture human emotion.
Our system observes user interaction with an AI model and captures emotional and behavioral feedback signals through text, facial reaction, voice tone, attention patterns, and response behavior.
Project Objectives
Continuous Preference Mapping
Instead of relying on sparse, binary preference labels, we train models to capture continuous vocal, body, and facial reactions as high-fidelity reward vectors.
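To make "reward vector" concrete, here is a minimal sketch of one way continuous reactions could be turned into per-phrase vectors. Everything in it is an assumption for illustration: the `Reaction` record, the three channel names, the score range of -1 to 1, and the simple averaging over time windows. It is not a description of the actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    t: float       # seconds since the answer started displaying (hypothetical)
    channel: str   # "face", "voice", or "body" (hypothetical channel names)
    score: float   # assumed range: -1.0 (negative) to 1.0 (positive)


def reward_vectors(reactions, spans, channels=("face", "voice", "body")):
    """Average each channel's scores inside each phrase's time window.

    `spans` is a list of (start, end) windows, one per phrase.
    Returns one vector per phrase, ordered like `channels`.
    A channel with no readings in a window contributes 0.0.
    """
    vectors = []
    for start, end in spans:
        vec = []
        for ch in channels:
            vals = [r.score for r in reactions
                    if r.channel == ch and start <= r.t < end]
            vec.append(sum(vals) / len(vals) if vals else 0.0)
        vectors.append(vec)
    return vectors


# Two phrases: the reader reacts well to the first, poorly to the second.
reactions = [
    Reaction(0.5, "face", 1.0),
    Reaction(1.0, "face", 0.0),
    Reaction(2.5, "voice", -0.5),
]
vectors = reward_vectors(reactions, spans=[(0.0, 2.0), (2.0, 4.0)])
```

Where a ranking yields one bit per answer, this kind of representation yields one vector per phrase, which is the contrast the objective is pointing at.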
In-Context Steering (ICS)
Let users actively steer the model during critical operations. An AI tutor or safety agent adjusts its explanation the moment you look confused or alarmed.
Post-Training Safety Moat
Create massive proprietary datasets of multimodal human emotional reactions to build robust, licensable post-training preference models.
“AI alignment shouldn't be a cold mathematical equation; it should be as natural and continuous as a parent teaching a child.”
Solving Alignment Safely
A paperclip-maximizing agent would never learn to reward-hack and end humanity if, en route, it registered every emotional signal in its preference database screaming “cold, cold, cold, you are on the wrong track.” Emotions encode biological safety boundaries that evolved over millions of years.
Shape the future,
align with emotion.
Join us in building systems that understand human expression, or partner with us to align your large models safely.