How to: beat Chinese social media image-filtering

Researchers from the University of Toronto's Citizen Lab have published an extensive report on the image-filtering systems that Chinese messaging giant WeChat uses to block images containing banned political messages and other "sensitive" topics that are censored in China.

They found that the system uses two methods to decide what to censor. First, any text in an image is run through an optical character recognition (OCR) system and compared to a list of banned terms. Second, the whole image is checked to see whether it appears to match an entry on a blacklist of images.
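As a rough illustration of that two-stage decision, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the idea of an image "fingerprint" string, and the sample data are assumptions made for the example, and the OCR and image-matching steps themselves are treated as black boxes whose outputs are passed in. It is not WeChat's implementation.

```python
from typing import Iterable, Set


def contains_banned_term(ocr_text: str, banned_terms: Iterable[str]) -> bool:
    """Return True if any banned term appears in the OCR-extracted text."""
    normalized = ocr_text.casefold()
    return any(term.casefold() in normalized for term in banned_terms)


def should_block(
    ocr_text: str,
    image_fingerprint: str,
    banned_terms: Iterable[str],
    banned_fingerprints: Set[str],
) -> bool:
    """Hypothetical two-stage check: text filter first, then image blacklist."""
    if contains_banned_term(ocr_text, banned_terms):
        return True
    return image_fingerprint in banned_fingerprints


# Made-up sample data for illustration only.
terms = ["example banned phrase"]
fingerprints = {"fp-123"}

print(should_block("An EXAMPLE banned phrase here", "fp-999", terms, fingerprints))
print(should_block("harmless caption", "fp-123", terms, fingerprints))
print(should_block("harmless caption", "fp-999", terms, fingerprints))
```

The point of the sketch is that either stage alone is enough to block an image, so an evasion technique has to get past whichever stage applies to the content.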

Both systems are easy to circumvent. The researchers probed them until they were certain of how they checked user submissions, then designed and tested techniques that successfully evaded the censorship: text that matches the hue of its background is reliably missed by the OCR filter, while the visual filter can be defeated by rotating or flipping images, changing their aspect ratios, adding a variety of borders, or blurring their edges.
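The summary above does not spell out why such text slips past OCR. One plausible mechanism, which is an assumption here and not something stated above, is that an OCR pipeline reduces the image to grayscale before looking for characters, so colours that a person can tell apart may collapse to the same gray value. The sketch below uses the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114); the function names and the example colours are illustrative.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def luma(colour: RGB) -> int:
    """Grayscale value (0-255) using BT.601 weights, in integer arithmetic."""
    r, g, b = colour
    # Weights are scaled by 1000; adding 500 rounds to the nearest integer.
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def invisible_after_grayscale(text: RGB, background: RGB) -> bool:
    """True if text and background map to the same gray value."""
    return luma(text) == luma(background)


red_text = (255, 0, 0)
gray_background = (76, 76, 76)

print(luma(red_text), luma(gray_background))  # 76 76
print(invisible_after_grayscale(red_text, gray_background))  # True
print(invisible_after_grayscale((0, 0, 0), (255, 255, 255)))  # False
```

Under that assumption, red text on this particular gray background stays readable to a person but leaves no contrast for a grayscale-based recognizer to find.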

From the report:

Similarly, the visual-based algorithm was able to match sensitive images to those on a blacklist under a variety of conditions. The algorithm had translational invariance. Moreover, it detected images even after their brightness or contrast had been altered, after their colours had been inverted, and after they had been thresholded to only two colours. However, due to the way it was implemented, we found multiple ways to evade filtering:

* By mirroring or rotating the image, since the filter has no high level semantic understanding of uploaded images. However, many images lose meaning when mirrored or rotated, particularly images that contain text which may be rendered illegible.

* By changing the aspect ratio of an image, such as by stretching the image wider or taller. However, this may make objects in images look out of proportion.

* By blurring the photo, since edges appear important to the filter. However, while edges are important to WeChat's filter, they are often perceptually important for people too.

* By adding a sufficiently large border to the smallest dimensions of an image, or to both the smallest and largest dimensions, particularly if both dimensions are of equal or similar size.

* By adding a large border to the largest dimensions of an image and adding a sufficiently complex pattern to it.
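Several of the transformations listed above (mirroring, rotating, stretching, adding a border) are simple geometric operations. The sketch below shows them on a tiny grayscale image stored as a list of pixel rows, using only the Python standard library. The function names and the nested-list representation are assumptions made for illustration; a real tool would more likely use an imaging library, and blurring is left out to keep the example short. How large a change is needed to evade the filter is not something this code can tell you.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values, 0-255


def mirror(img: Image) -> Image:
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def stretch_wide(img: Image, factor: int) -> Image:
    """Change the aspect ratio by repeating every column `factor` times."""
    return [[p for p in row for _ in range(factor)] for row in img]


def add_border(img: Image, pad_x: int, pad_y: int, fill: int = 0) -> Image:
    """Pad left/right by pad_x pixels and top/bottom by pad_y pixels."""
    width = len(img[0]) + 2 * pad_x
    top = [[fill] * width for _ in range(pad_y)]
    bottom = [[fill] * width for _ in range(pad_y)]
    body = [[fill] * pad_x + list(row) + [fill] * pad_x for row in img]
    return top + body + bottom


sample = [[1, 2],
          [3, 4]]

print(mirror(sample))               # [[2, 1], [4, 3]]
print(rotate_90_clockwise(sample))  # [[3, 1], [4, 2]]
print(stretch_wide(sample, 2))      # [[1, 1, 2, 2], [3, 3, 4, 4]]
print(add_border(sample, 1, 0))     # [[0, 1, 2, 0], [0, 3, 4, 0]]
```

Passing separate `pad_x` and `pad_y` values makes it possible to pad only one dimension, which corresponds to the distinction the report draws between bordering the smallest and the largest dimensions.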