Buffer of Thoughts (BoT) is a groundbreaking approach designed to enhance the reasoning capabilities of large language models (LLMs). By introducing a meta-buffer that stores high-level thought-templates distilled from the problem-solving processes of previously completed tasks, BoT allows LLMs to retrieve the most relevant template for a new problem and instantiate it for efficient reasoning. This method not only boosts accuracy but also significantly improves efficiency and robustness.
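To make the retrieve-and-instantiate idea concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the `ThoughtTemplate` and `MetaBuffer` names, the embedding-based cosine-similarity retrieval, and the generic `llm` / `embed` callables are all hypothetical stand-ins.

```python
from dataclasses import dataclass

# Hypothetical sketch of BoT-style template retrieval. The class names,
# the cosine-similarity retrieval scheme, and the `llm`/`embed` callables
# are illustrative assumptions, not the paper's actual API.

@dataclass
class ThoughtTemplate:
    name: str         # e.g. "constraint-satisfaction puzzle"
    description: str  # high-level summary used for retrieval
    body: str         # distilled, task-agnostic reasoning steps

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

class MetaBuffer:
    def __init__(self, embed):
        self.embed = embed  # callable: str -> embedding vector
        self.templates: list[ThoughtTemplate] = []

    def retrieve(self, problem: str) -> ThoughtTemplate | None:
        """Return the stored template most similar to the problem, if any."""
        if not self.templates:
            return None
        query = self.embed(problem)
        return max(
            self.templates,
            key=lambda t: cosine(query, self.embed(t.description)),
        )

def solve_with_bot(llm, buffer: MetaBuffer, problem: str) -> str:
    """Instantiate the retrieved template on the concrete problem."""
    template = buffer.retrieve(problem)
    if template is None:
        # No relevant template yet: fall back to ordinary prompting.
        return llm(f"Solve step by step:\n{problem}")
    prompt = (
        f"Use the following high-level reasoning template:\n{template.body}\n\n"
        f"Adapt it to solve this problem:\n{problem}"
    )
    return llm(prompt)
```

The key design point this sketch captures is that a single retrieval plus one instantiation call replaces the many exploratory LLM queries that tree- or graph-search prompting would issue.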
One of the standout features of BoT is its buffer-manager, which dynamically updates the meta-buffer as more tasks are solved. This keeps the system scalable and stable, continuously expanding its reasoning capacity as new templates accumulate. Extensive experiments on ten challenging reasoning-intensive tasks have shown substantial performance improvements, including a 51% accuracy gain on Checkmate-in-One and a 20% gain on Geometric Shapes. These results underscore BoT's superior generalization ability and robustness.
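Continuing the sketch above, a buffer-manager might work roughly as follows: after each solved task, it distills a reusable template from the solution and stores it only if no near-duplicate already exists, which keeps the buffer compact. The distillation prompt and the novelty threshold below are illustrative assumptions, not the paper's exact logic.

```python
# Hypothetical buffer-manager sketch, reusing MetaBuffer, ThoughtTemplate,
# and cosine from the retrieval sketch above. The distillation prompt and
# the novelty threshold are assumptions for illustration.

DISTILL_PROMPT = (
    "Summarize the following solution as a reusable, problem-agnostic "
    "thought-template with a short description and generic steps:\n{solution}"
)

class BufferManager:
    def __init__(self, llm, buffer: MetaBuffer, novelty_threshold: float = 0.85):
        self.llm = llm
        self.buffer = buffer
        self.novelty_threshold = novelty_threshold

    def update(self, problem: str, solution: str) -> None:
        """Distill a template from a solved task and store it if it is new."""
        distilled = self.llm(DISTILL_PROMPT.format(solution=solution))
        candidate = ThoughtTemplate(
            name=problem[:40],      # placeholder name taken from the problem text
            description=problem,
            body=distilled,
        )
        # Skip near-duplicates so the buffer stays compact and stable.
        emb = self.buffer.embed(candidate.description)
        for t in self.buffer.templates:
            if cosine(emb, self.buffer.embed(t.description)) > self.novelty_threshold:
                return
        self.buffer.templates.append(candidate)
```

In a task loop, one would call `solve_with_bot` and then `BufferManager.update` on the result, so each solved problem can enrich the meta-buffer for the next one.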
Moreover, BoT achieves these gains while being cost-effective: on average it requires only 12% of the cost of traditional multi-query prompting methods such as Tree of Thoughts and Graph of Thoughts. Notably, the combination of Llama3-8B with BoT has the potential to surpass the performance of the much larger Llama3-70B model, highlighting the efficiency and power of this approach. For those interested in exploring it further, the project details and code are available on GitHub.