Cursor Composer uses real-time RL to improve code generation from user feedback
Cursor trains Composer in production: it serves model checkpoints to users, observes how they respond, and uses those responses as reward signals for further training, shipping an improved version every 5 hours.
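
The serve, observe, reward, and ship cycle can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not Cursor's actual training system: the "checkpoint" is a pair of logits over two hypothetical response styles, the user is simulated with made-up acceptance rates, the reward is 1 for an accepted response and 0 otherwise, and the update is a plain REINFORCE-style policy-gradient step. All function names and parameters here are illustrative.

```python
import math
import random


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def serve_and_collect(logits, accept_prob, n_requests, rng):
    """Serve the current checkpoint and log (action, reward) pairs.

    accept_prob is a hypothetical stand-in for real user behavior:
    the chance that a simulated user accepts each response style.
    """
    probs = softmax(logits)
    log = []
    for _ in range(n_requests):
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < accept_prob[action] else 0.0
        log.append((action, reward))
    return log


def update_checkpoint(logits, log, lr=0.5):
    """One REINFORCE-style step using the mean reward as a baseline."""
    probs = softmax(logits)
    baseline = sum(r for _, r in log) / len(log)
    grad = [0.0] * len(logits)
    for action, reward in log:
        advantage = reward - baseline
        for k in range(len(logits)):
            indicator = 1.0 if k == action else 0.0
            # Gradient of log softmax probability of the chosen action.
            grad[k] += advantage * (indicator - probs[k])
    return [l + lr * g / len(log) for l, g in zip(logits, grad)]


def run_cycles(n_cycles=50, n_requests=500, seed=0):
    """Repeat the serve -> observe -> update -> ship loop."""
    rng = random.Random(seed)
    accept_prob = [0.2, 0.8]  # assumed acceptance rates, not real data
    logits = [0.0, 0.0]       # initial checkpoint: no preference
    for _ in range(n_cycles):
        log = serve_and_collect(logits, accept_prob, n_requests, rng)
        logits = update_checkpoint(logits, log)  # "ship" the new checkpoint
    return softmax(logits)


if __name__ == "__main__":
    print(run_cycles())
```

Each pass through the loop in `run_cycles` stands in for one release cycle: the checkpoint that was just updated is the one that serves the next batch of simulated requests, so the policy drifts toward the response style users accept more often.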