Anthropic’s Claude can now describe its own reasoning about 20% of the time, drastically cutting interpretability time, but it demands continuous oversight.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results