PlagAIrism
(Please don't steal my beautiful words)
"Imitation is the sincerest form of plagiarism" - Wilde
"If I have seen further it is by standing on the shoulders of Giants." - Newton
If I wrote this just by quoting people then would it be an original work? It's original in the sense of the selection and ordering, but I could easily write, or perhaps I should say cobble together, something which is 90% the work of other people. And yet, I would argue that it could be original if the way I put those quotes together tells a story, or defines an idea, or even just is.
What is art? I'm not even going to begin to try and answer that question. What's an original work? Well that's perhaps even harder, but I'm going to give it a shot. But only for text today. I'll try and tackle pictures another time.
We all owe a debt to those who came before us. It is impossible for me to provide all the attributions for the concepts and phrases which influence my work. And do I really need to? When I reference Marvin the manically depressed android the reader will either a) know him and appreciate the reference or b) not get part of my point, or c) perhaps even realise a point is being made and if they google it, they may find one of the greatest um, trilogies, ever written. And what about the things I reference which I can't remember the origin of? Who said the unreasonable man is responsible for progress? If I misquote, is that something I should be castigated for?
Should I give a reference to all the works or even teachers who taught me specific words?
A facetious question, and yet that's what is being asked of some AI tools. It's a tricky one, because as an author I want use of my work to be acknowledged, and in the case of works I've not put out on the internet for free, such as my books, I'd want to be paid.
The problem is that two years ago these types of AIs, LLMs, were below the radar, and their developers were, shall we say, optimistic in their use of data. They scraped the internet and grabbed whatever data they could get their mitts on, feeding in as much as possible to make bigger and bigger models, in the belief (which I'll discuss a different time) that bigger was going to make better. The issues with this approach are many, but taking other people's work without asking or even crediting them, and then making billions, smacks of... disingenuousness.
I am a Wikipedian, if perhaps not as active as I have been on occasion in the past. One of the key rules is not to copy other people's works wholesale (link). In effect we're encouraged to paraphrase and then provide references (where relevant). Perhaps this is what we should be doing with AI?
Actually, I think a simpler solution should be used. Everything which is either explicitly free, or out of copyright, should be fair game for the LLMs. Anything else should require specific authorisation, on either a paid or an open source basis, from the creators of the work or their representatives. This would simplify things, and also give a 19th-century flavour to the language used by our largest LLMs. But, let's be honest, wouldn't a bit more Austen make ChatGeminAude more fun?




