yayoi kusama
Photo by Alain Nogues/Sygma/Sygma via Getty Images

Here are all the artists Midjourney allegedly uses to train its AI

A newly exposed database lists up to 16,000 artists that the company has allegedly used to train its art-generation tools, from Frida Kahlo and Yayoi Kusama to Banksy and Andy Warhol

It’s a terribly kept secret of generative AI that it depends on a vast collection of real humans’ work for its training data. Musicians like Drake and Kurt Cobain are being turned into guinea pigs for the robot musicians of the future, while A-list authors have accused the tech of “systematic theft on a mass scale”. Communities of visual artists have also fought back against models like Stable Diffusion, claiming they unethically ‘scrape’ data from sites such as DeviantArt.

Last weekend, though, the scale of the issue became even more apparent, thanks to the release of a database that allegedly contains a list of artists used to train Midjourney, one of the leading AI art generators of the moment.

Shared around on social media sites over New Year, the database comes in the form of a Google Sheets spreadsheet, listing various time periods, styles, genres, movements, mediums, and techniques that were apparently used to train the program. Causing more scandal, though, were the names of individual artists – many thousands of them – whose work was seemingly fed into the machine as part of its training process.

The list is taken from a 24-page document included as part of an amendment to a class-action complaint against Midjourney, Stability AI and DeviantArt, first filed back in January 2023. The amendment followed, with 455 pages of supplementary evidence, on November 29, 2023. Among the artists included are commercial illustrators, game systems, and digital artists, as well as blue-chip modern and contemporary artists.

Some of the biggest names include Yayoi Kusama, Frida Kahlo, Banksy, Guerrilla Girls, HR Giger, Harmony Korine, Anish Kapoor, David Hockney, Damien Hirst, Cy Twombly, Walt Disney, Picasso, Egon Schiele, Mark Rothko, Francis Bacon, and Andy Warhol (which seems kind of appropriate actually). Going a bit further back, the database also spans the likes of Matisse, Monet, and Vincent van Gogh, alongside broader art periods and “core” styles such as cottagecore, glitchcore, gorpcore, and gorecore.

Fans of the trading card game Magic: The Gathering have pointed out that the list even includes art by a six-year-old, Hyan Tran, who was invited to contribute his take on a character from the game in a 2021 fundraiser for the Seattle Children’s Hospital.

Access to the spreadsheet has been restricted since it was initially shared on social media, but the list is preserved via the Internet Archive. The amendment itself was filed in response to the dismissal of several artists’ claims against Midjourney and Stability AI on October 30, by a judge in California federal court.

This is far from the only legal battle revolving around AI image generation at present, of course. In September 2023, artist Jason Allen lost an appeal that would have allowed him to copyright his award-winning, AI-generated artwork Théâtre d’Opéra Spatial, because it “lacks human authorship”.

Allen, like other AI artists, has pledged to keep fighting for copyright law changes; at the same time, more traditional artists will no doubt maintain pressure on AI companies like Midjourney or Stability AI until a legal decision is reached. In the meantime, artists have been advised to search databases for their own names, and seek legal action if required.