
Apple Intelligence wasn't trained on stolen YouTube videos

Apple has denied using unethically obtained data to train Apple Intelligence, but it has acknowledged using such data for another project.

On Tuesday, it was learned that an AI research lab called EleutherAI had harvested subtitles from YouTube videos without express permission from the creators. It also gathered data from Wikipedia, the British Parliament, and Enron staff emails. The data was then added to a dataset called "the Pile."

EleutherAI notes that its goal was to lower the barrier to AI development for those outside Big Tech. However, companies such as Nvidia, Salesforce, and Apple have all used the Pile to train various AI projects.

Now, Apple has spoken out, saying that while it had used the Pile, the dataset was not used for Apple Intelligence. Instead, it was used to train its open-source OpenELM models, which it released in April.

Apple has since confirmed to AppleInsider that OpenELM models don't power any of its AI or machine learning features. Instead, the tech giant claims that it created OpenELM to contribute to the research community.

Apple also notes that OpenELM models were never intended to be used for Apple Intelligence, and it says it has no plans to build any new versions of the OpenELM model.

Apple has repeatedly claimed that the data sources for its artificial intelligence projects are ethical. It's known to have paid millions to publishers and to have licensed images from photo library firms.



14 Comments

macca 13 Years · 27 comments

Correction: England doesn’t have a Parliament. It’s the British Parliament.

wdowell 15 Years · 235 comments

So that’s all OK then, as they didn’t use it on one thing but on another? 😂

(In all honesty, I don’t have a problem with scraping the internet, and YouTube’s terms don’t make it illegal, but to somehow say it’s any more OK because it wasn’t used for Apple Intelligence but for another project instead is just comical.)

wdowell 15 Years · 235 comments

macca said:
Correction: England doesn’t have a Parliament. It’s the British Parliament.

It’s not the British Parliament, it’s the UK Parliament (Northern Ireland...): https://www.parliament.uk/

Stabitha_Christie 3 Years · 582 comments

wdowell said:
So that’s all OK then, as they didn’t use it on one thing but on another? 😂

(In all honesty, I don’t have a problem with scraping the internet, and YouTube’s terms don’t make it illegal, but to somehow say it’s any more OK because it wasn’t used for Apple Intelligence but for another project instead is just comical.)

The claim that Apple is responding to is that they used it in Apple Intelligence. So they were responding to a very specific claim.


And while you may see it as comical, how you use things does matter legally. For example, I can sit around making videos using other people’s footage and music as long as it is for private use. If I do the same thing for commercial use, it is an entirely different thing. Apple didn’t release this as a commercial product, and the license excludes use as a commercial product. That keeps it on the correct side of YouTube’s terms.

 So, yeah … how you use things does in fact matter.

commentzilla 10 Years · 777 comments

wdowell said:
So that’s all OK then, as they didn’t use it on one thing but on another? 😂

... but to somehow say it’s any more ok  because it wasn’t used for Apple Intelligence but instead another  is just comical) 

Actually, it does matter under copyright law. Copyright protections are not absolute, nor were they meant to be, since they eventually expire.

But before expiration, Fair Use Doctrine must be taken into account for socially beneficial activities such as teaching, learning, and scholarship [research].

Copyright and Fair Use

https://guides.lib.uci.edu/copyright/copyright_how_to_use

Introduction

The Fair Use Doctrine protects the use of copyrighted works for socially beneficial activities such as teaching, learning, and scholarship. Courts consider four factors in deciding whether a use is Fair Use or an infringement:

  1. Purpose of the Use (learning, commentary, criticism OR commercial);
  2. Nature of the Publication (factual OR creative);
  3. Amount and Substantiality of the Whole (small OR substantial);
  4. Effect on the Market (has no effect OR replaces a sale).

Does copyright protect data?

Copyright law is, for good or for ill, an increasing concern for academics in their work. One area receiving particular attention is the copyright status of data and data representations. The Copyright Act and relevant case law are clear on copyright protection for data in the United States: there is none. This excellent site on the Copyrightability of Charts, Tables, and Graphs from the University of Michigan explains why.