Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
No, I’m a pro subscriber. The o3 and o4-mini models have a noticeably higher hallucination rate than o1. This means they get things wrong a lot more… which really matters in coding where things need to be very precise. So the models often feel dumber. Comparing with Gemini 2.5 Pro, it may be a problem in the way OpenAI is training with CoT.
Source: reddit · AI Harm Incident 1746999581.0 · ♥ 16
Coding Result
Dimension       Value
Responsibility  company
Reasoning       consequentialist
Policy          unclear
Emotion         indifference
Coded at        2026-04-25T08:33:43.502452
Raw LLM Response
[
  {"id": "rdc_mrtrxfv", "responsibility": "none",    "reasoning": "unclear",          "policy": "unclear", "emotion": "indifference"},
  {"id": "rdc_mrufmer", "responsibility": "company", "reasoning": "consequentialist", "policy": "unclear", "emotion": "outrage"},
  {"id": "rdc_mru3nu7", "responsibility": "company", "reasoning": "consequentialist", "policy": "unclear", "emotion": "outrage"},
  {"id": "rdc_mrtzc7h", "responsibility": "company", "reasoning": "consequentialist", "policy": "unclear", "emotion": "resignation"},
  {"id": "rdc_mrtcwmp", "responsibility": "company", "reasoning": "consequentialist", "policy": "unclear", "emotion": "indifference"}
]
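A minimal sketch of how a coding result could be recovered from a raw batch response like the one above. This is an assumption about the pipeline, not the tool's actual implementation: the `extract_coding` helper and the choice of record id are hypothetical, and the JSON shown here is a two-record excerpt of the response.

```python
import json

# Excerpt of a raw LLM batch response (same shape as the full array above).
raw_response = '''[
  {"id":"rdc_mrtrxfv","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"},
  {"id":"rdc_mrufmer","responsibility":"company","reasoning":"consequentialist","policy":"unclear","emotion":"outrage"}
]'''

def extract_coding(raw: str, record_id: str) -> dict:
    """Parse the raw model output and return the coding for one record.

    Hypothetical helper: assumes the model returns a JSON array where each
    element carries an "id" plus the four coding dimensions.
    """
    records = json.loads(raw)
    by_id = {r["id"]: r for r in records}
    return by_id[record_id]

coding = extract_coding(raw_response, "rdc_mrufmer")
print(coding["responsibility"], coding["emotion"])  # company outrage
```

In practice a pipeline like this would also validate each dimension against its allowed label set before storing the result, since raw model output can drift from the schema.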