Chain of thought in latent space is better than chain of thought in token space
Paper: Training Large Language Models to Reason in a Continuous Latent Space
Researchers from Meta and UC San Diego are interested in chain-of-thought reasoning.
Hmm... What’s the background?
Large language models (LLMs) have shown impressive reasoning abilities, often expressed through chain-of-thought (CoT) reasoning where they generate solutions step-by-step using natural language.
However, this reliance on language for reasoning can be limiting. Most tokens in a reasoning chain primarily contribute to fluency rather than the reasoning process itself. Moreover, some critical reasoning steps require complex planning and can be very challenging for LLMs to generate in language. Ideally, LLMs should have the freedom to reason without language constraints, translating their findings into language only when necessary.
So what is proposed in the research paper?
Here are the main insights:
The researchers introduce Coconut (Chain of Continuous Thought), a new paradigm for LLM reasoning in an unrestricted latent space.
Coconut modifies the traditional CoT process by directly feeding the last hidden state (representing a "continuous thought") back into the model as the next input embedding, rather than decoding it into a word token (see the sketch after this list).
Coconut successfully enhances the reasoning capabilities of LLMs. Unlike language-based reasoning, continuous thoughts in Coconut can encode multiple potential next steps simultaneously. This allows for a reasoning process akin to breadth-first search (BFS), where the model explores multiple options in parallel and progressively eliminates incorrect paths.
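To make the mechanism concrete, here is a minimal, illustrative sketch of the continuous-thought feedback loop using a toy decoder-only model. The `ToyLM` class, its sizes, and the number of latent steps are placeholder assumptions for illustration, not the paper's actual architecture or code.

```python
import torch
import torch.nn as nn

class ToyLM(nn.Module):
    """A tiny decoder-only stand-in for an LLM (illustrative only)."""
    def __init__(self, vocab_size=100, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward_embeds(self, inputs_embeds):
        # Causal mask: each position attends only to earlier positions.
        seq_len = inputs_embeds.size(1)
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        return self.blocks(inputs_embeds, mask=mask)  # (batch, seq_len, d_model)

model = ToyLM()
prompt_ids = torch.randint(0, 100, (1, 8))   # hypothetical question tokens
embeds = model.embed(prompt_ids)             # (1, 8, d_model)

# Standard CoT step: decode to a discrete token, then re-embed it
# (the "language bottleneck" the paper wants to avoid).
hidden = model.forward_embeds(embeds)
next_token = model.lm_head(hidden[:, -1]).argmax(-1)
embeds_cot = torch.cat([embeds, model.embed(next_token).unsqueeze(1)], dim=1)

# Coconut-style steps: skip decoding and feed the last hidden state back
# as the next input embedding, keeping the "thought" in continuous space.
num_latent_steps = 3                         # illustrative choice
for _ in range(num_latent_steps):
    hidden = model.forward_embeds(embeds)
    continuous_thought = hidden[:, -1:, :]   # (1, 1, d_model)
    embeds = torch.cat([embeds, continuous_thought], dim=1)

# After the latent steps, the model switches back to ordinary token decoding
# to express the final answer in language.
answer_logits = model.lm_head(model.forward_embeds(embeds)[:, -1])
```

The key difference is visible in the loop: a continuous thought is a full hidden-state vector rather than a single committed word, so it can implicitly keep several candidate next steps "alive" at once, which is what gives rise to the BFS-like exploration described above.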
What’s next?
While multi-stage training with language reasoning chains has proven effective, exploring better methods for learning latent reasoning, especially without language supervision, is essential.
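The multi-stage training mentioned here is a curriculum in which language reasoning steps are gradually swapped out for latent thoughts. Below is a rough sketch of how such training data could be staged; the marker names (<bot>, <eot>, <latent>), the one-latent-slot-per-removed-step ratio, and the toy example are illustrative assumptions, not the paper's exact setup.

```python
def build_stage_example(question, steps, answer, stage, thoughts_per_step=1):
    """At curriculum stage k, replace the first k language reasoning steps
    with latent-thought slots; keep the remaining steps in natural language."""
    latent_slots = ["<latent>"] * (stage * thoughts_per_step)  # assumed placeholder tokens
    remaining_steps = steps[stage:]
    return question + ["<bot>"] + latent_slots + ["<eot>"] + remaining_steps + answer

# Toy arithmetic example (hypothetical data, for illustration only).
question = ["Q:", "2+3*4", "=?"]
steps = ["3*4=12", "2+12=14"]
answer = ["A:", "14"]

for stage in range(len(steps) + 1):
    print(f"stage {stage}:", build_stage_example(question, steps, answer, stage))
```

Stage 0 is plain language CoT; by the final stage every reasoning step has been replaced by latent slots, and the model is supervised only on the question and answer while reasoning in continuous space in between.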
Learned something new? Consider sharing it!