IM 836: I See OJ and He Looks Scared

Beep boop - this is a robot. A new show has been posted to TWiT…

What are your thoughts about today’s show? We’d love to hear from you!

@Leo I am still struggling with the “right to read” argument:

  • How does a corporation have a right to read everything for free? Even citizens can’t do that. Try walking into Barnes and Noble and sitting there reading a book every day, or walking out with a bunch of books without paying.
  • Even if corporations have to pay for the content they use in their models, how does that affect the individual’s rights? One is a multi-billion dollar corporation, the other is an actual human being; they aren’t in any way equivalent.

I have been saying for a long time that we need something like Common Crawl (I didn’t realise that it existed until Rich Skrenta was on the show). Why don’t we make all AIs use the Common Crawl database?

That would reduce the costs for website owners, who are facing huge bills just from the various AIs scouring their sites. Why not get the AI companies to fund Common Crawl properly, so that CC can do regular single passes, maybe more frequently than current budget constraints allow? It could also offer an API where, for example, news sites push their content to Common Crawl, so that the sites don’t need to be constantly crawled.

The curation would mean that the dataset is less full of misinformation and unsuitable material, and the websites would benefit too: with the push API and funding from the AI companies, Common Crawl could pay the sites for the content being pushed to it (after vetting the sites, naturally, and with regular random sampling to ensure that the information being pushed is still accurate and not AI slop or misinformation, for example). That sounds like a win for all concerned, to me.

2 Likes

I also feel that the right to read for a human being is related to a human being’s need to earn a living. As a human, you have the right to acquire knowledge from, say, a library in an effort to educate yourself so you can be employable. I don’t think we should allow a machine to be equivalent in this way. A human has a fairly limited capacity to absorb and process books and written content, whereas the machine theoretically has no limit. The human will need to keep the books around as a reference to refer back to; the machine basically incorporates the book into itself and no longer needs the source material. I think this clearly indicates how a human, while maybe technically a machine itself, is not on the same order as an LLM, and there needs to be a distinction, or else we will lose our humanity.

2 Likes