Anthropic’s Claude can now describe its own reasoning about 20% of the time, drastically cutting interpretability time, but it demands continuous oversight.