TWIT 966: He's Got a Huge Corpus

Leo · February 12, 2024, 3:02am

Beep boop - this is a robot. A new show has been posted to TWiT…

What are your thoughts about today’s show? We’d love to hear from you!

Jerrysmith · February 13, 2024, 4:58am

I’m not convinced by Cathy’s arguments when it comes to how everyone should just give their written material to AI (which is basically giving it to Amazon, OpenAI, Google and so on for free so they can make money off of it), and be glad about it.

Artists are already losing jobs because someone can make their graphic design or stock photo using AI. I know concept artists who are up in the trees because their jobs are at stake. They should be happy that they’re providing free labor to multibillion dollar businesses.

Not to mention that saying you should be glad your work is being used to train AI is basically the same as saying be glad that AI is going to take your job. Why the heck would someone want to read your blog when Bing Chat can just summarize it? It already does that. Comparing that feature to Google’s search engine is disingenuous: when you make a query on Google to discover the information you need, you have to click on the URL. AI, not so much. Nice for Sam, but when people aren’t seeing those ads, those websites don’t have a reason to keep making content.

These types of pro-AI arguments are academic and show that people aren’t actually talking to the ones these AIs affect. If you don’t look at what these writers, artists, musicians, and so on are saying, then your arguments are just free PR for Google, etc.

PHolder · February 13, 2024, 6:51pm

As I said in the Discord during the show. AIs are not humans and so should not be given the same limits and rules as humans.

If we do decide to let AIs take all the jobs then we’re going to have to force them to “pay” into a fund that provides basic human income.

jmiahjones · February 14, 2024, 12:42am

I agree. I think copyright already provides a bit of guidance around this. Whether something is fair use or not depends not only on whether something is transformative, but also about the market.

I think purposes like research and personal use give a way to allow training AI, even under copyright, because the purpose is not to provide automation for any purpose for money, thus undercutting the market. Other uses of AI wouldn’t simply be banned, but they would require compensation.

I also think the argument that “having your work cannibalized without compensation vastly benefits society” is a [citation needed.]

big_D · February 14, 2024, 5:25am

Totally. I’ve been saying similar things for a while now. AI is at the moment a parasite and they are destroying the host they are feeding from.

This is especially true in current news. I don’t want to save old media, as @JeffJarvis puts it, but I do want to save independent, quality journalism. But if people are going to go to AI and ask it for a summary of the news and current events, those reporting the news are going to be out of a job, because their employers are no longer being paid to publish the news.

That means no more reporters, no more “media” and only AI is left, which is reliant on those media sites to get its news. The AI can’t itself go out and do interviews, report from the spot, where there is a war, catastrophe, accident, home coming, social event etc. The best it can do is scrape YouTube videos and hope they are real, maybe some posts on social media with no reputation for accurate reporting - and I can see companies like Meta and X keeping their floodgates closed off to other AIs, so you will have to read their AI news summaries on their platforms…

Once the media sources have closed down, AI will have to rely on itself to regurgitate and hallucinate out of its learning model. By the time people finally face the problem, there will be no news outlets, no reporters, no world-wide news services, just an AI getting dumber by the day, because it doesn’t know what is happening, or the sources it still has are unverified and not trustworthy.

Either we will have big business or political parties controlling the input to the AIs for news, or we will have totally wild speculation as the new news. Neither is conducive to rational reporting and balanced conversation and debate. It is bad enough in America with partisan reporting on TV and in many papers; real independent reporting is already few and far between, but AI is going to make it worse. We still have relatively independent reporting here in Germany. There is some bias, but they talk rationally about all sides of the political spectrum; if one side messes up, they are taken to task as much as when the other side messes up. The only heavy bias is anti-right wing and anti-left wing. Neither extreme puts itself in a good light to start with, with violence, racism and anarchy, and both sides are equally hounded by the press (AfD and Die Linke, not the normal left and right of the mainstream political spectrum). But, I digress.

If we are not careful, we will create an idiocracy, fuelled by AI misinformation.

AI companies need to learn to stop being a parasite and start being a symbiont. And, yes, where they take office and manufacturing jobs away, they need to provide for a basic income for those no longer able to find work, because there are no more jobs.

The problem is, business doesn’t see this, they just see increased profits for the companies, but, at the end of the day, the companies are irrelevant without people to buy or use the products they manufacture. But they won’t realise this, until the only people left who can buy their products are the company owners and the whole thing collapses.

The world needs a slap in the face to wake it up now and they need to work out a new model for working and compensation now, before they destroy most of the jobs. It is too late if they only react to the problem when it comes up. The time to react is now, before the problem grows out of control.

I’ve worked in IT for over 40 years and I love IT and I can see a lot of great uses for AI, as a technology, but the current AI isn’t ready for the great unwashed and it needs to be refined and improved, before it is released to the general public, but the greed and the current attitudes of Big Tech - release unfinished products and let people pay us to repair it - isn’t going to work with AI, because it is too destructive in its current guise.

Giving them equivalent status to humans under the law is the wrong way to go. They aren’t humans, they can’t even think, they are just businesses and they should be treated as such, when they want to do B2B transactions, like scraping other people’s websites and publications for data, until a new way forward can be defined that keeps the quality of the source data high - i.e. save the reporters, give them compensation, I don’t really care about Old Media itself, but regardless of whether Old Media or AI is providing the news, we still need those reporters going out there and gathering the news, otherwise society is lost, so AI companies have to come up with a model of how they are going to keep reporters reporting, how they will be compensated, or if we switch to a basic income, how they will be motivated to continue reporting.

jmiahjones · February 14, 2024, 12:25pm

It also depends on context, though. Chatbots are absolutely horrible at coming up with new strings of words, but can be decent at summary. The fact that you know where the content comes from means it should be compensated, in my mind.

Images are trickier, because there are many novel things created of quality, likely using a large portion contained in the training set. There has to be some way to identify what is in use so it can be attributed. But I agree, this is absolutely something that ML researchers routinely try to do.

AI products can be super productive for certain tasks, but I just can’t imagine people will be paying $20/mo indefinitely, at least outside of a small niche. I just wonder if AI as a product dies sooner rather than later. In that respect, it feels like these products are just a way for tech companies to grab cash to fund their other activities.

ChrisKez · February 14, 2024, 3:39pm

The impact on the market value of material is a component that others here have noted, but I think is too often missed on TWiT and other networks in discussing Fair Use. From the US Copyright office page on Fair Use:

4. Effect of the use upon the potential market for or value of the copyrighted work: Here, courts review whether, and to what extent, the unlicensed use harms the existing or future market for the copyright owner’s original work. In assessing this factor, courts consider whether the use is hurting the current market for the original work (for example, by displacing sales of the original) and/or whether the use could cause substantial harm if it were to become widespread.

Widespread use can certainly harm the future business prospects for all of the reporting that AI is happy to ingest for free and then spit back out for money. And maybe the ability for AI to ingest everything at scale is a key point of differentiation from what a single person (or even a large group of people) can do.

jmiahjones · February 14, 2024, 10:38pm

Exactly. Maybe Sarah Silverman can’t win an individual copyright suit, but a class action involving an industry of creators probably could. But by the time that happens, the damage will be done. Regulation needs to come sooner rather than later.

The open source community already recognizes this. The panelists on Floss Weekly 756 discussed their involvement in the drafts in the EU (RIP Twit Floss, though I’m happy it’s carried on).

Jerrysmith · February 15, 2024, 3:46am

Mr Jarvis and Cathy don’t offer up a way that a person, say a concept artist or a writer, is going to make money from these AIs while sharing their data with multibillionaires. Unless they can, their arguments aren’t really useful for those people.
