Stanford University recently found that LAION-5B, a popular dataset used to train AI models, contained links to child sexual abuse material (CSAM). The dataset, used by Stable Diffusion creator Stability AI, included at least 1,679 such images scraped from social media and adult websites.
Beginning in September 2023, researchers at Stanford audited the dataset. They computed hashes, compact fingerprints of the images, and submitted them to CSAM-detection tools such as PhotoDNA, with matches verified by the Canadian Centre for Child Protection.
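The matching workflow is conceptually simple: compute a fingerprint for each image and test it against a list of fingerprints of known abusive material, without anyone needing to view the content itself. PhotoDNA is a proprietary perceptual hash available only to vetted organisations, so the minimal sketch below substitutes a plain SHA-256 digest; the directory path, the `known_hashes` set, and the helper names are all hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of_file(path: Path) -> str:
    """Compute the SHA-256 digest of a file's raw bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def match_against_known_hashes(image_dir: Path, known_hashes: set[str]) -> list[Path]:
    """Return the image files whose digests appear in a known-hash list."""
    return [
        p for p in image_dir.glob("*")
        if p.is_file() and sha256_of_file(p) in known_hashes
    ]

# Hypothetical usage: in practice, `known_hashes` would be a vetted
# list supplied by a child-safety organisation, never assembled locally.
# matches = match_against_known_hashes(Path("./images"), known_hashes)
```

Real-world systems like PhotoDNA use perceptual hashes rather than exact digests, so that resized or re-encoded copies of the same image still match; the exact-match digest here only illustrates the overall check.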
The LAION website states that the dataset does not host these images in its own repositories; it merely points to where they live on the internet. An earlier version of Google's Imagen model used a related dataset, LAION-400M, and according to the Stanford report even that older version included "a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes."
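This pointer-based design is worth making concrete. LAION-5B is distributed as metadata tables whose rows pair an image URL with its scraped caption and an image-text similarity score. The sketch below is a hypothetical illustration of such a record, with invented sample values; the actual dataset ships as parquet files with additional columns.

```python
from dataclasses import dataclass

@dataclass
class LaionRecord:
    url: str          # pointer to the image, hosted elsewhere on the web
    text: str         # the alt-text/caption scraped alongside it
    similarity: float # image-text similarity score used for filtering

# Invented sample values, for illustration only.
record = LaionRecord(
    url="https://example.com/some-image.jpg",
    text="a photo of a cat on a windowsill",
    similarity=0.31,
)
# The dataset ships only rows like this; fetching record.url is what
# actually retrieves the picture. Removing a row from the dataset
# therefore does not remove the image from the web.
```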
LAION, the group that maintains the dataset, told Bloomberg that, in line with its 'zero tolerance' policy, it quickly took the dataset offline. Stability AI told the publication that it used only a subset of the dataset, and that its guidelines against misuse of its platforms ensured it was safe.
However, the Stanford researchers warn that models trained on data containing CSAM may carry hidden harms. They are especially concerned about images of specific victims appearing repeatedly in the dataset, the report noted.
In response, state attorneys general across the US have urged the government to investigate how AI might be misused to harm children and to prevent models from generating such CSAM imagery.