Facebook’s internal documents have revealed a gap between the performance of its AI moderation tools and that of human moderators, one that has allowed a large amount of hate speech and violent content to bypass the platform’s scrutiny and reach the public.
According to the WSJ, the company tweaked its algorithm in a way that led it to ignore more user reports, while its AI moderation tools misclassified content badly: cockfighting videos were flagged as car crashes, and videos livestreamed by perpetrators of mass shootings were labeled as paintball games or a trip through a carwash.
While situations like this occur in the US and other English-speaking countries, the problem is worse in non-English-speaking countries such as Afghanistan, where the company identified just 0.23 percent of the hate speech shared on its platform.
Internal reports show that Facebook’s users want the company to take a more aggressive approach, even if that means removing a higher number of innocent posts. Yet Facebook noted that its engineers are training AI models to avoid false positives, an emphasis that lets more hate speech slip through undetected.
For more information, read the original story at Ars Technica.