AniSora: Open-source anime video generation model
komiko.app
356 points by PaulineGar a year ago
Some of these are very obviously trained on webtoons and manga, probably Pixiv as well. The CG buildings and other miscellaneous artifacts make this clear. So this is obviously trained on copyrighted material.
Art is not something that can be generated like synthetic text, so it will have to be powered by human artists nearly forever, or else you will continue to end up with artifacts. It makes me wonder if artists will just be downgraded to an "AI training" position. That could even be for the best: people could draw what they like and have that input feed into a model for training, which doesn't sound too bad.
While I'm very pro-AI on anything involving trademark and copyright, it still makes me wonder what will happen to all the people who provided us with entertainment, and whether the quality will continue to increase, or whether we're going to start losing challenging styles because "it's too hard for AI" and everything will start "feeling" the same.
It doesn't feel the same as people being replaced with computers and machines; this feels like the end of a road.
It’s great that you have sympathy for illustrators, but I don’t see a big difference if the training data is a novel, a picture, a song, a piece of code, or even a piece of legal text.
Before my mom retired as a translator, she went from the typewriter to machine-assisted translation with centralised corpus databases. All the while the available work became scarcer and the wages lower.
In the end, the work we do that is heavily robotic will be done by less expensive robots.
Here’s the argument:
The output of her translations had no copyright. Language developed independently of translators.
The output of artists has copyright. Artists shape the space in which they’re generating output.
The fear now is that if we no longer have a market where people generate novel arts, that space will stagnate.
A translation is absolutely under copyright. It is a creative process after all.
This means a book can be in public domain for the original text, because it's very old, but not the translation because it's newer.
For example, Julius Caesar's "Gallic War" in the original Latin is clearly not subject to copyright, but a recent English translation will be.
So if a machine was to do the translation, should that also be considered a creative work?
If not, that would put pressure on production companies to use machines so they don't have to pay future royalties.
Well that's the real question, isn't it?
Our current best technology, LLMs, is good enough for translating an email or a meeting transcript and getting the general message across. For anything more creative, technical, or nuanced, they fall apart.
Meaning for anything of value like books, plays, movies, poetry, humans will necessarily be part of the process: coaxing, prompting, correcting...
If we consider the machine a tool, it's easy, the work would fall under copyright.
If we consider the machine the creator, then things get tricky. Are only the parts reworked/corrected under copyright? Do we consider under copyright only if a certain portion of the work was machine generated? Is the prompt under copyright, but not its output?
Without even getting into the issue of training data under copyright...
There is some movement regarding copyright of AI art, legislation being drawn up and debated in some countries. It's likely translations would be impacted by those decisions.
> So if a machine was to do the translation, should that also be considered a creative work?
No, but it will be a derivative work covered by the same copyright as the original.
The quality of human translation is better, for now.
> The output of artists has copyright.
Copyright is a very messy and divisive topic. How exactly can an artist claim ownership of a thought or an image? It is often difficult to ascertain whether a piece of art infringes on the copyright of another. There are grey areas like "fair use", which complicate this further. In many cases copyright is also abused by holders to censor art that they don't like for a myriad of unrelated reasons. And there's the argument that copyright stunts innovation. There are entire art movements and music genres that wouldn't exist if copyright was strictly enforced on art.
> Artists shape the space in which they’re generating output.
Art created by humans is not entirely original. Artists are inspired by each other, they follow trends and movements, and often tiptoe the line between copyright infringement and inspiration. Groundbreaking artists are rare, and if we consider that machines can create a practically infinite number of permutations based on their source data, it's not unthinkable that they could also create art that humans consider unique and novel, if nothing else because we're not able to trace the output to all of its source inputs. Then again, those human groundbreaking artists are also inspired by others in ways we often can't perceive. Art is never created in a vacuum. "Good artists copy; great artists steal", etc.
So I guess my point is: it doesn't make sense to apply copyright to art, but there's nothing stopping us from applying it to machine-generated art too, if we wanted to make our laws even more insane. And machine-generated art can also set trends and shape the space it's generated in.
The thing is that technology advances far more rapidly than laws do. AI is raising many questions that we'll have to answer eventually, but it will take a long time to get there. And on that path it's worth rethinking traditional laws like copyright, and considering whether we can implement a new framework that's fair towards creators without the drawbacks of the current system.
Ambiguities are not a good argument against laws that still have positive outcomes.
There are very few laws that are not giant ambiguities. Where is the line between murder, self-defense and accident? There are no lines in reality.
(A law about spectrum use, or registered real estate borders, etc. can be clear. But a large amount of law isn’t.)
Something must change regarding copyright and AI model training.
But it doesn’t have to be the law, it could be technological. Perhaps some of both, but I wouldn’t rule out a technical way to avoid the implicit or explicit incorporation of copyrighted material into models yet.
> There are very few laws that are not giant ambiguities. Where is the line between murder, self-defense and accident? There are no lines in reality.
These things are very well and precisely defined in just about every jurisdiction. The "ambiguities" arise from ascertaining the facts of the matter, and from whether a given set of facts fits within a specific set of rules.
> Something must change regarding copyright and AI model training.
Yes, but this problem is not specific to AI, it is the question of what constitutes a derivative, and that is a rather subjective matter in the light of the good ol' axiom of "nothing is new under the sun".
> These things are very well and precisely defined in just about every jurisdiction.
Yes, we have lots of wording attempting to be precise. And legal uses of terms are certainly more precise by definition and precedent than normal language.
But ambiguities about facts are only half of it. Even when all the facts appear clear, human juries have to use their subjective judgement to match what the law says, which may be clear in theory but is often subjective at the borders, against those facts. And reasonable people often differ on how to match the two in borderline cases.
We resolve both types of ambiguities case-by-case by having a jury decide, which is not going to be consistent from jury to jury but it is the best system we have. Attorneys vetting prospective jurors are very much aware that the law comes down to humans interpreting human language and concepts, none of which are truly precise, unless we are talking about objective measures (like frequency band use).
---
> it is the question of what constitutes a derivative
Yes, the legal side can adapt.
And the technical side can adapt too.
The problem isn't that material was trained on, but that the resulting model facilitates reproducing individual works (or close variations), and repurposing individuals' unique styles.
I.e., they violate fair use by using what they learn in a way that devalues others' creative efforts. Being exposed to copyrighted works available to the public is not the violation. (Even though the way training currently happens produces models that violate fair use.)
We need models that one way or another, stay within fair use once trained. Either by not training on copyrighted material, or by training on copyrighted material in a way that doesn't create models that facilitate specific reproduction and repurposing of creative works and styles.
This has already been solved for simple data problems, where memorization of particular samples can be precluded by adding noise to a dataset. Important generalities are learned, but specific samples don't leave their mark.
Obviously something more sophisticated would need to be done to preclude memorization of rich creative works and styles, but a lot of people are motivated to solve this problem.
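One concrete illustration of the "add noise" idea mentioned above is DP-SGD-style training, where each sample's gradient is clipped and noised so no single example can leave a distinctive mark on the model. The sketch below is my own minimal illustration (the function name and parameters are not from any particular library), not how video models are actually trained:

```python
import numpy as np

def privatize_gradient(grad, clip_norm=1.0, noise_scale=0.5, rng=None):
    """Clip a per-sample gradient and add Gaussian noise (DP-SGD style).

    Clipping bounds how much any one training sample can influence an
    update; the added noise masks whether a specific sample was present,
    so aggregate structure is learned without memorizing individuals.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    norm = np.linalg.norm(grad)
    # Scale down so the gradient's L2 norm never exceeds clip_norm.
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise calibrated to the clipping bound hides individual samples.
    noise = rng.normal(0.0, noise_scale * clip_norm, size=grad.shape)
    return clipped + noise
```

The same two ingredients, bounded per-sample influence plus calibrated noise, are what would need a much richer analogue to stop a generative model from memorizing a specific artwork or a single artist's style.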
It seems like your concern is about how easy it will be to create derivative and similar work, rather than a genuine concern for copyright. Do I understand correctly?
No, I am just narrowing down the problem definition to the actual damage.
Which is an approach that very much respects fair use and copyright.
Taking/obtaining value from works is ok, up until the point where damage to the value of original works happen. And that is not ok. Because copyright protects that value to incentivize the creation and sharing of works.
The problem is that models are shipping that inherently make it easy to reproduce copyrighted works, and to apply specific styles lifted from a single author's copyrighted body of work.
I am very strongly against this.
Note that prohibiting the copying of a specific, recognizable single author's style is even stricter than the fair use limits on humans. Stricter makes sense to me, because unlike humans, models are mass producers.
So I am extremely respectful of protecting copyright value.
But it is not the same thing as not training on something. It is worth exploring training algorithms that can learn useful generalities about bodies of work, without retaining biases toward the specifics of any one work, or any single author's style. That would be in the spirit of fair use: you can learn from any art, if it's publicly displayed or you have paid for a copy, but you can't create mass copiers of it.
Maybe that is impossible, but I doubt it. There are many ways to train that steer important properties of the resulting models.
Models that make it trivial to create new art deco works, consistent with the total body of art deco works: ok. Models that make it trivial to recreate Erte works, or works in a specifically recognizable Erte style: not ok.