Decanonization

Meaning boxes. This is what programmers and engineers usually call the marked and tagged areas in the images used to train so-called computer vision. In the Visual Genome paper, the authors call this process of selecting specific regions of images “canonicalization”. Visual areas are canonized in computer vision, as in Art History, in museums, in the institutionalized process of seeing.

In 2021, my friend, the programmer Bernardo Fontes, and I thought of a way to “decanonize” the same training images that made Visual Genome possible. With philosopher Caroline Carrion, we wrote an essay for Rosa magazine about this experience. Read the full text here (in Portuguese).

When Fontes proposed a programming experience that would invert the game of computer vision, we immediately thought of reverse-engineering processes.

The Python code used for the images of this visual essay identifies the selected areas of each image that carry labels: small bounded regions that engineers and programmers call “meaning boxes”. Once these areas are identified, the command created by Fontes erases these nobler regions of the images instead of highlighting them.
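As a rough illustration of this inversion, and not the code Fontes actually wrote, the sketch below paints over every labeled box and leaves the rest of the image untouched. The function name erase_boxes, the box format (a dictionary with x, y, width and height in pixels, measured from the top-left corner) and the black fill are all assumptions made for the example.

```python
# Illustrative sketch only; this is not the code written by Fontes.
# Assumption: each box is a dict with "x", "y", "width" and "height" in
# pixels, where (x, y) is the top-left corner of the labeled region.

def erase_boxes(pixels, boxes, fill=(0, 0, 0)):
    """Return a copy of `pixels` with every boxed region painted over.

    `pixels` is a list of rows; each row is a list of (r, g, b) tuples.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    result = [list(row) for row in pixels]  # copy, so the input is untouched
    for box in boxes:
        # Clamp the box to the image so out-of-range annotations are harmless.
        left = max(0, box["x"])
        top = max(0, box["y"])
        right = min(width, box["x"] + box["width"])
        bottom = min(height, box["y"] + box["height"])
        for row in range(top, bottom):
            for col in range(left, right):
                result[row][col] = fill
    return result


# A 4x4 white image with one labeled 2x2 box in the top-left corner.
white = (255, 255, 255)
image = [[white] * 4 for _ in range(4)]
boxes = [{"x": 0, "y": 0, "width": 2, "height": 2}]
leftover = erase_boxes(image, boxes)
```

Here leftover keeps only what the annotation ignored: the pixels inside the box are blacked out, while everything outside it survives unchanged. A real image would first have to be loaded into this grid of pixels, a step the sketch leaves out.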

The results are leftover images, made of what is not important to computer vision. By excluding what is considered important and highlighting everything that was disregarded, this experience helps us understand something beyond the specific situations portrayed there.

Computer vision is much less the complex process of seeing, and much more a process of EXTRACTION, segmentation, separation and decontextualization. In these images, we see what was discarded during this process.