What’s in the box
- NO
Seems like a quality product to me.
Sure, to an extent. ActivityPub is an independent protocol not controlled by lemmy or any lemmy devs, so there’s a layer of protection there. This is also a trick that can only be pulled once, because any other instances would likely defederate in response and ML would render itself irreparably untrustworthy. I don’t mean to downplay your concerns as they are valid, but I also don’t think it’s an existential threat.
I don’t agree with the “hiding the problem” notion because different instances are independently operated, and defederation is the by-design way to “fix” malignant instances (see the LW defed of hexbear and lemmygrad for exactly this kind of behavior).
As for the whole system not being safe, I’d also disagree on that point as the entire lemmy server code is licensed under a copyleft license which allows anyone with a copy of the code to modify and distribute it. Ergo, hard forking lemmy is possible. Based on the github page, over 800 individuals already have forks of the server code. Any one of them, group of them, or some other individuals entirely, could pick up lemmy development and run with it if need be.
As you say OP, the solution here is to use the fediverse model as intended and use different instances/communities. It sucks because it fragments the community, but that’s the way it is. I’ve long held the opinion that I’m grateful to the lemmy developers for building this whole thing that we all get to enjoy, but their approach to administering an instance is reprehensible and actively damaging to the relatively free and open exchange of ideas that should happen on the fediverse.
Site’s finished boss!
Again, in many instances, folks training models are using repositories of images that have been publicly shared. In many cases the person/people who assembled the image repositories are not the same people using them. I agree that reckless scraping is not responsible, but if you’re using a repository of images that’s presented as OK to use for AI training, I’d argue it’s even more ethical to strip out the Nightshaded images, because clearly the presence of Nightshade means you shouldn’t use that one. I guess we’re just going to have to agree to disagree here, because I see this as a helpful tool to specifically avoid training on images you shouldn’t be.
I don’t think most people are collecting images by hand and saying “ah yes I’m just gonna yoink this and use it in my model”. There are a plethora of sites for sharing repositories of training data, and therefore it’s pretty easy for someone training a model to unknowingly pull down some data they don’t actually have permission to use. It’s completely infeasible to check licensing by hand on what could be millions of images, so this tool makes it easy to simply not train on images that have gone through Nightshade. I fail to see how that’s unethical, as not training on the image is the whole reason the original image was put through Nightshade in the first place.
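To make the workflow concrete, here’s a minimal sketch of what “don’t train on flagged images” could look like in practice. This is purely illustrative: `looks_poisoned` is a hypothetical stand-in for whatever detection the tool actually provides (the real detector and its API are not shown in the README excerpt discussed here), stubbed out so the example is self-contained.

```python
# Hypothetical sketch: drop suspected-Nightshade images from a training
# set before use. `looks_poisoned` is a stand-in for the real detector.

def looks_poisoned(image_path: str) -> bool:
    # Stub for illustration: pretend the detector already flagged
    # this file. A real detector would inspect the image contents.
    flagged = {"cat_poisoned.png"}
    return image_path in flagged

def filter_training_set(image_paths: list[str]) -> list[str]:
    """Return only the images that pass the poison check."""
    return [p for p in image_paths if not looks_poisoned(p)]

dataset = ["cat.png", "cat_poisoned.png", "dog.png"]
clean = filter_training_set(dataset)
print(clean)  # ['cat.png', 'dog.png']
```

The point is that the filter runs once over the whole repository, so nobody has to audit millions of images by hand; anything the detector flags simply never enters the training run.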
The tagline is really poorly written IMO. From reading the README, this doesn’t outwardly appear to be a tool for bypassing an artist’s choice to use something like Nightshade, but rather it seems to detect if such a tool has been used.
I’m assuming that the use case would be to avoid training on Nightshaded images, which would actually be respecting the original artist’s decision?
This is basically what I’ve been telling people for years. Prototype in Python to get the concepts down, then when you’re serious about the project, write it in a serious language.