
Prompt Tuning

Prompt tuning is a parameter-efficient adaptation method that learns continuous 'soft prompt' embeddings prepended to the model's input while keeping all model weights frozen, bridging the gap between manual prompt engineering and full fine-tuning. Unlike discrete prompts written in natural language, soft prompts exist only in embedding space: they are not words, just learned vectors that condition the model. Training optimizes these vectors to elicit the desired behavior on downstream tasks.

The method is extremely parameter-efficient. A soft prompt typically contains 20-100 learned tokens, often amounting to fewer than 0.1% of the model's parameters. At inference, the soft prompt is prepended to the input embeddings and processed normally by the frozen model.

Prompt tuning scales surprisingly well: as model size increases, the gap between prompt tuning and full fine-tuning shrinks, with very large models reaching near-parity. Google's original 2021 paper showed that an 11B-parameter model with prompt tuning matched full fine-tuning on SuperGLUE. The learned prompts are not interpretable: they do not correspond to words and cannot be expressed in natural language, yet they effectively encode task-specific information that shapes model behavior. Prompt tuning works best when tasks are well-defined and the target behavior is consistent; it struggles with complex, multi-faceted tasks that may benefit from weight-level changes.
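The core mechanics can be sketched in a few lines. This is an illustrative NumPy sketch, not any library's actual API: the dimensions, names, and random initialization are all assumptions chosen for clarity. In a real setup, gradients would flow only into `soft_prompt` while everything else stays frozen.

```python
import numpy as np

# Hypothetical dimensions chosen for illustration only
vocab_size, d_model, prompt_len = 1000, 64, 20

rng = np.random.default_rng(0)

# Stand-in for the frozen model's embedding table (never updated)
embedding_table = rng.normal(size=(vocab_size, d_model))

# The soft prompt is the ONLY trainable tensor: prompt_len x d_model
soft_prompt = rng.normal(scale=0.02, size=(prompt_len, d_model))

def build_model_input(token_ids):
    """Prepend the learned soft prompt to the frozen token embeddings."""
    token_embs = embedding_table[token_ids]           # (seq_len, d_model)
    return np.concatenate([soft_prompt, token_embs])  # (prompt_len + seq_len, d_model)

# A 3-token input becomes a 23-row sequence the frozen model processes normally
x = build_model_input(np.array([5, 42, 7]))
print(x.shape)           # (23, 64)
print(soft_prompt.size)  # 1280 trainable values
```

The parameter-efficiency claim falls out of the arithmetic: here only 20 × 64 = 1,280 values are trained, and for a model with billions of frozen weights that fraction is far below 0.1%.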