Politics of seeing

By Matti Pohjonen|February 12, 2020|AI, Digital cultures, India, Research, Social media|

“Knowledge is a practical assemblage, a ‘mechanism’ of statements and visibilities.” — Deleuze

People often ask why I bother learning the algorithms and technologies that drive today’s AI innovations – I am a digital anthropologist after all and not a hard-baked computer scientist. Should I not just focus on the bread-and-butter of qualitative research – thick description, deep contextual knowledge of cultures, in-depth understanding of the nuances of language – that SOAS is known for?

My response is usually quite simple: “This is too important to leave to computer scientists and technology corporations alone.”

Indeed, one of the most heated debates in contemporary digital cultures has to do with the politics of facial recognition, that is, the growing use of computers to automatically detect elements (such as faces or objects) in pictures and video. The London MET police, for instance, is planning on rolling out such facial recognition systems across London to identify “suspects” from video feeds. The widespread deployment of this technology has raised, among other things, critical questions about civil liberties and privacy. Concerns have also been raised about the historical racial and gender biases embedded in such systems and the links between these biases and colonial legacies globally.

To better understand what is at stake in these debates, my research has increasingly begun to explore the critical questions that technologies such as computer vision raise globally. As a part of this, we organised, for instance, a public event at the Barbican called Culture.Trace where we tried to envision a more inclusive approach to these new AI-enabled tools. We also partnered with a Singapore-based AI company, Quilt.ai, to imagine computer vision in a way “where people would be regarded with kindness, empathy and imagination rather than historical biases and stereotypes.”

Just as crucially, a growing part of my research involves “hijacking”, so to speak, some of the power from the technology companies by exploring these tools as intrinsic parts of my research. That is, instead of approaching the question of AI technology from the position of inferiority that many humanities and social science researchers unfortunately still feel (no, they are too difficult to learn, too technical for us, we cannot use them!), I am now trying to re-envision our relationship to AI technology in a way that starts from a position of strength.

If you think about it, rigorous critical thinking, contextual knowledge and the ability to ask difficult questions are becoming more valuable in society than technological know-how alone, which anybody can learn.

One thing I am trying to wrap my head around now, thus, is how – as a digital ethnographer – these new AI-enabled tools can augment my research in new and original ways. The practice of learning to use them has already raised a lot of interesting questions about how we understand the shifting production of knowledge in digitally mediated societies globally, where most of the online communication that takes place is now visual. You can see a simple example of what I mean by this below:

For instance, to explore some of these questions, I downloaded 10,000 images from Instagram related to the CAA protests in India. I then used some new techniques in computer vision and deep learning to cluster all these images together based on their visual similarity.

This then allowed me to zoom in further and look in more detail at what types of clusters are present based on their visual/topical similarity.
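The post does not say which techniques were used for the clustering step described above, so the following is only a minimal sketch under stated assumptions. It assumes each image has already been turned into a numeric embedding vector by some pretrained vision model (not shown here), and it groups those vectors by cosine similarity with a simple spherical k-means. The function names and the toy three-dimensional vectors are hypothetical, chosen purely for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cluster_embeddings(embeddings, k, iterations=20):
    """Group vectors by cosine similarity (spherical k-means)."""
    points = [normalize(v) for v in embeddings]
    # Deterministic start: take the first point, then repeatedly add the
    # point that is least similar to the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            min(points, key=lambda p: max(cosine(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each image joins its most similar centroid.
        labels = [
            max(range(k), key=lambda c: cosine(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean direction of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels


# Hypothetical toy "embeddings": two images pointing one way, two another.
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]
labels = cluster_embeddings(toy_embeddings, k=2)
print(labels)  # [0, 0, 1, 1]
```

In a real study the embeddings would have hundreds of dimensions and the number of clusters would itself be a matter of interpretation, which is where the ethnographer's contextual knowledge comes in.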

Finally, these new computer vision systems also make it possible – somewhat like the classical content analysis developed for text – to identify the types of “objects” present in these images and, furthermore, to analyse in bulk the divergent topics present in the pictures (such as crowds of people indicating protests).
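The bulk analysis described above can be sketched in the same spirit as a content-analysis tally. This is an illustration under assumptions, not the method used in the research: it presumes an object detector (not shown) has already produced, for each image, a list of label and confidence pairs. The file names, labels, scores and the 0.5 threshold are all hypothetical.

```python
from collections import Counter


def label_frequencies(detections, min_confidence=0.5):
    """Count the number of images in which each object label appears."""
    counts = Counter()
    for objects in detections.values():
        # Use a set so five people in one picture still count as one image.
        labels = {label for label, conf in objects if conf >= min_confidence}
        counts.update(labels)
    return counts


def label_shares(detections, min_confidence=0.5):
    """Express each label count as a share of all images in the corpus."""
    counts = label_frequencies(detections, min_confidence)
    total = len(detections)
    return {label: n / total for label, n in counts.items()}


# Hypothetical detector output: image name -> [(label, confidence), ...]
detections = {
    "img_001.jpg": [("person", 0.98), ("person", 0.91), ("banner", 0.75)],
    "img_002.jpg": [("person", 0.88), ("flag", 0.42)],
    "img_003.jpg": [("banner", 0.66)],
}

counts = label_frequencies(detections)
shares = label_shares(detections)
for label, n in counts.most_common():
    print(label, n, round(shares[label], 2))
```

Counting images rather than individual detections is a design choice: it answers "how widespread is this motif?" rather than "how many objects are there?". Low-confidence detections, such as the flag here, are dropped, which is one of the places where the limits of these systems shape what the analysis can see.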

All of this is, of course, still exploratory. Obviously these systems also come with their own limitations and challenges of interpretation and analysis. However, the most interesting question I am now grappling with is how the many possibilities these tools provide for visual analysis can be negotiated together with the existing theoretical insights from visual anthropology and/or media studies. The aim is to better understand the growing role images have globally, and how this role differs across parts of the world with potentially different histories behind the “practical assemblages of statements and visibilities” always present in contemporary digital cultures.