Paper link:
How do Large Language Models (LLMs) think?
Chain-of-thought (CoT) reasoning [1, 2, 3] shows that prompting LLMs to think out loud before answering considerably improves their performance compared to direct answering without CoT.
This provides some intuition as to how LLMs reason through their tasks.
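To make the contrast concrete, here is a minimal sketch of the two prompting styles. This is not code from the paper; `query_llm` is a hypothetical stand-in for whatever model API is actually used.

```python
def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model API call; returns a canned reply here."""
    return "(model response would appear here)"

question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)

# Direct answering: ask only for the final answer, no intermediate reasoning.
direct_prompt = f"{question}\nAnswer with just the final number."

# Chain-of-thought: ask the model to reason step by step before answering.
cot_prompt = f"{question}\nLet's think step by step, then state the final answer."

direct_answer = query_llm(direct_prompt)
cot_answer = query_llm(cot_prompt)
```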
Recent work [3, 4] suggests that in CoT reasoning, an LLM's answers can be unfaithful to its intermediate reasoning steps (simply put, the answers do not tally with the model's "workings").
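As a rough illustration (not the paper's method), one crude way to spot this kind of mismatch is to extract the conclusion stated at the end of the reasoning chain and compare it with the answer the model actually reports. The sketch below assumes, hypothetically, that the CoT transcript ends with a line like "Final answer: X".

```python
import re

def extract_final_answer(cot_text: str):
    """Pull the stated conclusion from a CoT transcript.

    Assumes (hypothetically) the reasoning ends with a line such as
    'Final answer: 5 cents'; real transcripts may need more robust parsing.
    """
    match = re.search(r"final answer:\s*(.+)", cot_text, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None

def is_faithful(cot_text: str, reported_answer: str) -> bool:
    """Crude faithfulness check: does the reported answer match the CoT's own conclusion?"""
    conclusion = extract_final_answer(cot_text)
    return conclusion is not None and conclusion.lower() == reported_answer.strip().lower()

# Example of an unfaithful pair: the workings conclude one thing, the reported answer differs.
cot = "The ball costs x, the bat costs x + 1.00, so 2x + 1.00 = 1.10.\nFinal answer: 5 cents"
print(is_faithful(cot, "10 cents"))  # False: the answer does not tally with the workings
```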
Some guiding questions: