Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
That's where Anthropic's mechanistic interpretability research comes in. The final goal is an "AI brain scan" where you can literally peer into the model to find signs of potential deception.
Source: reddit · Topic: AI Moral Status · 1750452114.0 · ♥ 9
Coding Result
Dimension        Value
Responsibility   none
Reasoning        consequentialist
Policy           none
Emotion          approval
Coded at         2026-04-25T08:06:44.921194
Raw LLM Response
[
  {"id":"rdc_mytdwcd","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"rdc_mythcx4","responsibility":"user","reasoning":"virtue","policy":"unclear","emotion":"outrage"},
  {"id":"rdc_mytxtkw","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},
  {"id":"rdc_myvkshw","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},
  {"id":"rdc_mz04hp0","responsibility":"company","reasoning":"deontological","policy":"unclear","emotion":"outrage"}
]
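The raw response codes five comments in one batch, so a lookup by record id is needed to map it back to a single comment's coding result. A minimal sketch of that lookup, using the response above (the id `rdc_mytxtkw` is the record whose values match the coding result shown; the variable names are illustrative, not the tool's actual API):

```python
import json

# Raw LLM response as shown above: one JSON array coding five comments in a batch.
raw = """[
  {"id":"rdc_mytdwcd","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"rdc_mythcx4","responsibility":"user","reasoning":"virtue","policy":"unclear","emotion":"outrage"},
  {"id":"rdc_mytxtkw","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},
  {"id":"rdc_myvkshw","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"approval"},
  {"id":"rdc_mz04hp0","responsibility":"company","reasoning":"deontological","policy":"unclear","emotion":"outrage"}
]"""

# Index the batch by record id so one comment's coding can be pulled out.
records = {rec["id"]: rec for rec in json.loads(raw)}

# The record behind the coding result shown above
# (responsibility=none, reasoning=consequentialist, policy=none, emotion=approval).
rec = records["rdc_mytxtkw"]
print(rec["responsibility"], rec["reasoning"], rec["policy"], rec["emotion"])
# → none consequentialist none approval
```

Indexing by id rather than by list position guards against the model reordering records in its output.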