Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
Nightshade is trained to trick CLIP specifically. It only produces meaningful effects if images are classified with CLIP, and nobody uses CLIP to generate the image descriptions they train on. Yet the Nightshade Q&A page says this:

"I gave a Nightshaded/Glazed image to GPT/BLIP and it recognized it perfectly. Does this mean Nightshade/Glaze failed? No, it does not. Nightshade and Glaze both target image generators, which are built on diffusion architectures. Image classification, which is what you get when you ask a model to tell you what is in an image, is a completely different task. The normal properties of transferability that allow attacks or perturbations targeting one model to affect another similar model, generally does not extend to models that perform different tasks. Today's prompt extraction tools are not traditional DNN classifiers, but are still different enough architecturally from image generators to completely break transferability. To put it plainly, Nightshade and Glaze are designed to NOT affect those models. If a model identifies the contents of your shaded/glazed image perfectly, then that is correct behavior. In fact, consider the opposite. If anyone can identify a shaded image not as the original, but as the shade target, then this is a super fast and easy way to identify/filter out all Nightshaded images, and Nightshade would not be useful."

However, the whole purpose of Nightshade is to fool the classifier into wrongly classifying the image. So this section is a blatant lie: if it can't fool classifiers, it doesn't do anything. Their Q&A is deliberately misleading. Note how they provide no explanation of why the attack would still work if the classifier isn't fooled by it. The Nightshade paper itself confirms this:

"Automated image captioning. Next, we look at a defense method where model trainer completely removes the text prompt for all training data in order to remove the poison text. Once removed, model trainer can leverage existing image captioning tools [95, 96] to generate new text prompts for each training image. Similar approaches have been used to improve the data quality of poorly captioned images [97, 98]. For a poisoned dataset, we generate image captions using BLIP model [95] for all images, and train the model on generated text paired up with original images. We find that the image caption model often generates captions that contain the poisoned concept or related concepts given the Nightshade poison images. Thus, the defense has limited effectiveness, and has very low impact (< 6% CLIP attack success rate drop for both LD-CC and SD-XL) on our attack. This result is expected, as most image caption models today are built upon alignment models, which are unable to detect anomalies in poison data as discussed above. Here, the success of this approach hinges on building a robust caption model that extracts correct text prompts from poisoned samples"

This means that if you can classify the image correctly, Nightshade is useless, and the Nightshade Q&A page itself tells you that modern image classifiers correctly identify the original concepts of Nightshaded images. Also, in the paper they "show their efficacy" by simply training on wrongly captioned images, and the question of how, via Nightshade, wrongly captioned images would ever end up being trained on is never addressed. Honestly, the paper is absurdly bad and written in deliberately misleading ways.
Nobody seems to feel particularly threatened by this; otherwise there would be more research into its effectiveness. I could only find anecdotal evidence: people who trained exclusively on Nightshaded images and reported no change whatsoever, and one page that noted a small increase in aesthetic quality:

"Noise Offset. Noise offset, as described by crosslabs's article, works by adding a small non-0 number to the latent image before passing it to the diffuser. This effectively increases the most contrast possible by making the model see more light/dark colors. Glaze and Nightshade effectively add noise to the images, acting as a sort of noise offset at train time. This can explain why images generated with LoRAs trained with glazed images look better than non-glazed images."

Also, Nightshade's effects seem to be completely removed by running an image through a VAE first: the latent space doesn't have room to represent the noise Nightshade introduces, so the round trip effectively filters it out.

The reason the video creator didn't find anything disproving the effectiveness of this is that nobody takes it seriously enough to bother disproving it. You will be just as hard-pressed to find any evidence that it does anything; there have been a total of zero studies on that, too. Note how this video totally glosses over trying to find any evidence that Nightshade does work, writing it off with "I couldn't find evidence it doesn't". I could create an hour-long video ripping apart the Nightshade paper, because it's the worst paper I've read in a hot minute. I went through the whole damn thing trying to find an explanation of how this is actually supposed to result in bad image-caption pairs ending up in a dataset; apparently that isn't an important enough detail to mention in their paper. Don't be fooled by YouTubers who don't understand the technology and have never trained an AI of their own.
youtube · Viral AI Reaction · 2025-04-01T13:3… · ♥ 1
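As context for the recaptioning defense quoted in the comment above, here is a minimal sketch of that pipeline using Hugging Face's BLIP captioner. The checkpoint name and the helper function are assumptions for illustration, not the paper's actual code.

    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    # Recaptioning defense sketch: discard every original caption and
    # regenerate captions from the images alone, then train on the new pairs.
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    def recaption(image_path: str) -> str:
        """Generate a fresh caption for one (possibly poisoned) training image."""
        image = Image.open(image_path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

The paper claims BLIP still emits the poisoned concept for shaded images, while the Q&A asserts captioners recognize shaded images correctly; the commenter's point is that both cannot be true at once.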
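The noise offset trick quoted above amounts to one extra term in a diffusion training loop. This is a minimal sketch assuming a standard latent-diffusion setup with torch tensors; it is not taken from the crosslabs article itself.

    import torch

    def sample_offset_noise(latents: torch.Tensor, offset: float = 0.1) -> torch.Tensor:
        """Standard training noise plus a small per-channel DC offset.

        The broadcast offset shifts the whole latent up or down, which is what
        lets the model learn very bright/dark images. The quoted page argues
        that Glaze/Nightshade perturbations act like an accidental version of
        this at train time.
        """
        noise = torch.randn_like(latents)
        # One scalar per (batch, channel) pair, broadcast across H and W:
        dc = torch.randn(latents.shape[0], latents.shape[1], 1, 1, device=latents.device)
        return noise + offset * dc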
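The VAE filtering claim is likewise easy to test with an encode/decode round trip. A sketch, assuming the diffusers library and the sd-vae-ft-mse checkpoint; whether the reconstruction really strips the perturbation is exactly the kind of measurement the comment says nobody has published.

    import torch
    from PIL import Image
    from diffusers import AutoencoderKL
    from torchvision.transforms.functional import to_pil_image, to_tensor

    vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

    @torch.no_grad()
    def vae_roundtrip(path: str) -> Image.Image:
        """Encode an image into SD latent space and decode it back.

        Any high-frequency detail the 8x-downsampled latent cannot represent,
        which is the claim made about Nightshade's perturbation, is lost here.
        """
        img = Image.open(path).convert("RGB")
        x = to_tensor(img).unsqueeze(0) * 2 - 1  # [0, 1] -> [-1, 1]
        latents = vae.encode(x).latent_dist.sample()
        recon = vae.decode(latents).sample
        return to_pil_image((recon[0].clamp(-1, 1) + 1) / 2)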
Coding Result
Dimension       Value
--------------  ------------
Responsibility  none
Reasoning       unclear
Policy          unclear
Emotion         indifference

Coded at: 2026-04-27T06:24:59.937377
Raw LLM Response
[ {"id":"ytc_UgxtJvU9ZwmAKJm6ig94AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"}, {"id":"ytc_UgznBcUmfqC7ErGT3BZ4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"unclear","emotion":"resignation"}, {"id":"ytc_UgxagMvPeapMBsS0O9x4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"regulate","emotion":"outrage"}, {"id":"ytc_UgyVPCHZLf22r8NCMr94AaABAg","responsibility":"company","reasoning":"deontological","policy":"ban","emotion":"outrage"}, {"id":"ytc_UgygekkxkjS0bF69Uat4AaABAg","responsibility":"user","reasoning":"virtue","policy":"none","emotion":"approval"}, {"id":"ytc_Ugyg5G0-zOfHD8QOhqd4AaABAg","responsibility":"none","reasoning":"deontological","policy":"unclear","emotion":"indifference"}, {"id":"ytc_Ugz0TvhCk99fZkc5z3N4AaABAg","responsibility":"none","reasoning":"mixed","policy":"unclear","emotion":"mixed"}, {"id":"ytc_UgzS5l3ElW8qohspZ854AaABAg","responsibility":"company","reasoning":"deontological","policy":"regulate","emotion":"outrage"}, {"id":"ytc_Ugzfxq_JCN7HekFOJph4AaABAg","responsibility":"company","reasoning":"consequentialist","policy":"regulate","emotion":"fear"}, {"id":"ytc_UgxnZwupW66Q_5QdA6F4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"approval"} ]