A New Algorithmic Identity: Soft Biopolitics and the Modulation of Control
John Cheney-Lippold
Abstract
Marketing and web analytic companies have implemented sophisticated
algorithms to observe, analyze, and identify users through large surveillance
networks online. These computer algorithms have the capacity to infer cat-
egories of identity upon users based largely on their web-surfing habits. In
this article I will first discuss the conceptual and theoretical work around
code, outlining its use in an analysis of online categorization practices. The
article will then approach the function of code at the level of the category,
arguing that an analysis of coded computer algorithms enables a supplement
to Foucauldian thinking around biopolitics and biopower, of what I call soft
biopower and soft biopolitics. These new conceptual devices allow us to
better understand the workings of biopower at the level of the category,
of using computer code, statistics and surveillance to construct categories
within populations according to users’ surveilled internet history. Finally,
the article will think through the nuanced ways that algorithmic inference
works as a mode of control, of processes of identification that structure and
regulate our lives online within the context of online marketing and algo-
rithmic categorization.
Key words
algorithm, biopolitics, biopower, code, control, Deleuze, Foucault, internet
Theory, Culture & Society 2011 (SAGE, Los Angeles, London, New Delhi, and Singapore), Vol. 28(6): 164–181
DOI: 10.1177/0263276411424420
Introduction
Let me begin this article with a hypothetical. You open up a new computer and fire up a web browser. You go to washingtonpost.com,
visit a couple of blogs at Wordpress and Tumblr, and go on the busi-
ness social networking site linkedin.com. Maybe you take a break from the
internet, go grab a cup of coffee, but return to watch some videos on
Hulu, check US gossip at tmz.com, and look at the weather at wunder-
ground.com. At this point you decide it might be best to go to work so you
close your computer, get dressed, and go outside. While you may proceed
with your day as if nothing has happened, something has changed about
who you are online. You have been identified. Your IP address has been
logged; you have a cookie file installed on your computer. And somewhere,
in a database far, far away, you very well may have a gender, class, and race.
This small tour of web sites provides us with a mundane example of
what I call in the title of this article a ‘new algorithmic identity’. The net-
worked infrastructure of the internet, with its technological capacity to
track user movements across different web sites and servers, has given rise
to an industry of web analytics firms that are actively amassing information
on individuals and fine-tuning computer algorithms to make sense of that
data. The product of many of these firms is a ‘new algorithmic identity’, an
identity formation that works through mathematical algorithms to infer cat-
egories of identity on otherwise anonymous beings. It uses statistical com-
monality models to determine one’s gender, class, or race in an automatic
manner at the same time as it defines the actual meaning of gender, class,
or race themselves. Ultimately, it moves the practice of identification into
an entirely digital, and thus measurable, plane.
This article will examine the consequence of some of those practices
aimed at understanding exactly what kind of user is visiting web sites, pur-
chasing products, or consuming media. Online a category like gender is
not determined by one’s genitalia or even physical appearance. Nor is it
entirely self-selected. Rather, categories of identity are being inferred upon
individuals based on their web use. Code and algorithm are the engines
behind such inference and are the axis from which I will think through
this new construction of identity and category online. We are entering an
online world where our identifications are largely made for us. A ‘new algorithmic identity’ is situated at a distance from traditional liberal politics,
removed from civil discourse via the proprietary nature of many algorithms
while simultaneously enjoying an unprecedented ubiquity in its reach to
surveil and record data about users.
In this article I will first discuss the conceptual and theoretical work
around code, outlining its use in an analysis of online categorization prac-
tices. The article will then approach the function of code at the level of the
category, arguing that an analysis of coded computer algorithms enables a
supplement to Foucauldian thinking around biopolitics and biopower, of
what I call soft biopower and soft biopolitics. These new conceptual devices allow us to better understand the workings of biopower at the level of the category.
Categorization
My argument will focus on the process of defining X, of the identification of
particular groups within populations largely through the marketing logic of
consumption. Within marketing, considerable research goes into identifying the composition of a consumer audience. In the past, consumers were discriminated according to census-data-laden geographies: rich people lived in a certain
ZIP code, poor people lived in another, and businesses could provide cou-
pons for class-differentiated products and services accordingly (Gandy,
1993). A move from demographic to psychographic categorizations followed,
where clusters of consumer types were created to better understand the
non-essentialist character of demographic-based consumption patterns (not
all white people are interested in timeshares in Boca Raton, but those who
are older and with money are) (Yankelovich and Meer, 2006; Arvidsson,
2006). An important shift in marketing happened as marketers went
online and were able to use data from search queries to create behavioral
have shifted to practices external to the body, where discipline has often
become more or less unnecessary if control can be enacted through a
series of guiding, determining, and persuasive mechanisms of power.
Instead of constructing subjectivity exclusively through disciplinary power,
with normative discourses prefiguring how we engage with and even talk
about the world, a regulative approach allows for a distance between power
and subject (Foucault, 2008). It configures life by tailoring its conditions
of possibility. Regulation predicts our lives as users by tethering the poten-
tial for alternative futures to our previous actions as users based on con-
sumption and research for consumption. Control in this formation
becomes best represented by the concept of modulation, ‘a self-deforming
cast that will continuously change from one moment to the other, or like a
sieve whose mesh will transmute from point to point’ that sees perpetual
training replacing serial encounters with singular, static institutions
(Deleuze, 1992: 4). And modulation marks a continuous control over society
that speaks to individuals in a sort of coded language, of creating not indi-
viduals but endlessly sub-dividable ‘dividuals’ (Deleuze, 1992). These divi-
duals become the axiom of control, the recipients through which power
flows as subjectivity takes a deconstructed dive into the digital era. When
we situate this process within questions around digital identity, dividuals
can be seen as those data that are aggregated to form unified subjects, of
connecting dividual parts through arbitrary closures at the moment of the
compilation of a computer program or at the result of a database query.
Dividual pieces, onto which I conceptually map the raw data obtained
by internet marketing surveillance networks, are made intelligible and thus
constitute the digital subject through code and computer algorithms. The
algorithmic process that I will focus on in this article looks at the web ana-
lytics firm Quantcast as it infers certain categorical identifications from
seemingly meaningless web data.1 As new information gets created about
an individual through the tracking of her web presence, a user’s identity
becomes more defined within the system in accord with the logic of
Quantcast’s identification algorithms. If I were to surf on Quantcast’s
member sites for several hours, my initially unknown gender would
become more concrete in the face of Quantcast’s algorithms, as exemplified
in the hypothetical that begins this article. Dividual fragments flow across
seemingly open and frictionless networks and into rigid database fields as
part of the subsumption implicit in data mining (the practice of finding pat-
terns within the chaos of raw data). As a user travels across these networks,
algorithms can topologically striate her surfing data, allocating certain web
artifacts into particular, algorithmically-defined categories like gender. The
fact that user X visits the web site CNN.com might suggest that X could
be categorized as male. And additional data could then buttress or resignify
how X is categorized. As X visits more sites like CNN.com, X’s maleness
is statistically reinforced, adding confidence to the measure that X may be
male. As X visits more sites that are unlike CNN.com, X’s maleness is statistically weakened. This shift from demographic survey to algorithm marks a move away from offline stereotypes and into a form of
statistical stereotyping.
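The statistical reinforcement just described can be sketched as a toy Bayesian update. This is emphatically not Quantcast’s actual algorithm, which remains proprietary; the per-site likelihoods below, like the two-category model of gender they assume, are invented solely for illustration.

```python
# Illustrative sketch only: a naive Bayes-style update of the belief
# that user X is "male", revised one observed site visit at a time.
# The per-site likelihoods P(visit | male), P(visit | female) are
# invented for this example, not drawn from any real model.
SITE_LIKELIHOODS = {
    "cnn.com": (0.60, 0.40),  # assumed to skew "male" in this toy model
    "tmz.com": (0.35, 0.65),  # assumed to skew "female"
}

def update_confidence(p_male, site):
    """Bayesian update of P(male) after observing one site visit."""
    p_visit_m, p_visit_f = SITE_LIKELIHOODS.get(site, (0.5, 0.5))
    evidence = p_visit_m * p_male + p_visit_f * (1.0 - p_male)
    return (p_visit_m * p_male) / evidence

p = 0.5  # an uninformative prior: the user begins ungendered
for visit in ["cnn.com", "cnn.com", "cnn.com"]:
    p = update_confidence(p, visit)  # maleness is statistically reinforced
p = update_confidence(p, "tmz.com")  # a dissimilar site weakens it again
```

Each visit to a ‘male-skewing’ site nudges the confidence measure upward; a visit to a site unlike CNN.com pulls it back down, which is what makes the categorization a modulating measure rather than a fixed attribute.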
Rather than maintaining a particular and concrete relationship to
maleness in the offline world, a cybernetic conception of gender would posi-
tion maleness as a modulating category, based on a vector and composed
of statistical assessments that define male group membership. New
information can come to light within this algorithmic system and an algo-
rithm can subsequently move to change both a category’s meaning and the
once-essential ideals ascribed to that category. Gender as a category still
holds the capacity for discipline – of serving targeted advertisements and targeted content according to the inferred digital identity of a user – but it
also is embodied by a new and flexible cybernetic categorization system.
Instead of standards of maleness defining and disciplining bodies according
to an ideal type of maleness, standards of maleness can be suggested to
users based on one’s presumed digital identity, from which the success of
identification can be measured according to ad click-through rates, page
views, and other assorted feedback mechanisms. The regulation of gender
as a category then becomes wholly embedded within the logic of consump-
tion, where categorical behaviors are statistically defined through a cyber-
netics of purchasing and research that marketers have deemed valuable for
identification and categorization.
As the specifics of Quantcast’s algorithms remain proprietary, we can
use other examples of ‘machine learning’ to help us explicate this process
in the public domain. Put simply, machine learning is the ‘technical
basis . . . used to extract information from the raw data in databases – information that is [then] expressed in a comprehensible form and can be used
for a variety of purposes’ such as prediction and inference (Witten and
Frank, 2005: xxiii). Machines are programmed to learn patterns in data
and use those correlative patterns to analyze and make assessments about
new data. Yahoo researchers, for example, have described the capacity for
an algorithm to infer demographic categories of identity (race, gender).
Based on search query logs, user profile data, and US census data, these
researchers found they can subsequently tailor results according to user cat-
egorizations. If a particular user issues the query ‘wagner’, an algorithm
can offer differentiated results as determined by that user’s gender categori-
zation. Based on the observed web habits of ‘typical’ women and men,
search results for ‘wagner’ can include sites about the composer ‘Richard
Wagner’ for women while at the same time providing sites on ‘Wagner USA’ paint supplies for men (Weber and Castillo, 2010). Similarly, researchers in
Israel have created an algorithm that predicts an author’s gender based on
lexical and syntactic features. Upon separating the written works of women
and men, statistical correlations were run to create a vector-based categorization of the two genders’ written features, through which the algorithm was able
to then identify the ‘correct’ gender of a new, non-categorized work with a
success rate of 80 percent (Koppel et al., 2002). But undergirding all of
this is the theoretical possibility that these statistical models can change
given new data. The chance that men, in the Yahoo example, stop clicking on
paint supplies and start clicking on music sites could statistically suggest
that Yahoo’s definition of ‘men’ may not be appropriate. And the possibility
that an inferred-as-male writer may not write like every man before him
can make an algorithm statistically second guess its initial vector-based cat-
egorization. Categories always have the capacity for change, off and online,
but the shift I want to bring to the world of identity is the continuous,
data-centered manner that modulates both user experience (what content
they see) as well as the categorical foundation of those identifications.
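A minimal sketch can make this categorical drift concrete. Here the category ‘male’ is nothing but the centroid of the feature vectors currently labeled male, so new data (men clicking on music rather than paint supplies, in the Yahoo example’s terms) literally moves the category’s definition. The feature axes and numbers are invented for illustration only.

```python
# Illustrative sketch only: the category "male" is defined here as the
# centroid (mean vector) of the observations currently labeled male,
# so the category's meaning shifts as new behavior is recorded. The
# feature axes (paint-supply clicks, music-site clicks) are invented.

def centroid(vectors):
    """Mean vector of the observations that currently define a category."""
    n = len(vectors)
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / n for i in range(dims))

# Yesterday's "men": heavy paint-supply clicking, little music.
male_observations = [(9.0, 1.0), (8.0, 2.0), (10.0, 0.0)]
old_male = centroid(male_observations)

# New data arrives: inferred-as-male users start clicking music sites.
male_observations += [(1.0, 9.0), (2.0, 8.0), (0.0, 10.0)]
new_male = centroid(male_observations)  # the definition of "male" has moved
```

No analyst intervenes to redefine the category; the statistical model second-guesses its own earlier vector simply by recomputing it over the accumulated data.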
which tries to predict the probability of those events (by modifying it, if nec-
essary) or at least compensate for their effects’ (Foucault, 2003a: 249).
With the construction of categories we can see how biopolitical work
controls those random events, of creating what Foucault calls ‘caesuras’, or
breaks in the biological continuum of characteristics of life (Foucault,
2003a; Macey, 2009). Biopower works through a series of dividing practices,
a process of objectification outlined in Louise Amoore’s (2006: 339) analysis
of border technology biometrics that act on populations through the catego-
rizations of ‘student’, ‘Muslim’, ‘woman’, ‘alien’, ‘immigrant’, and ‘illegal’.
But the move I want to make is to analyze the role of what I term soft biopo-
litics, of how biopolitical processes give meaning to the categorizations that
make up populations and through which we can look at the variable indirect
processes of regulation. To explicate soft biopolitics is to better understand
not just how populations are managed but the changing relationship that
categories have to populations and, most importantly, what categories actu-
ally come to mean and be redefined as.
We can also take a grounded example with a biopolitical tinge to
explore this process. Say you are categorized as a male through your use-
patterns online. Medical services and health-related advertisements could
be served to you based on that gendered categorization. This means that
those who are not categorized as male may not experience the same advertisements and opportunities, enacting the consequences of what has been
referred to as ‘software-sorting’ (Graham, 2005). But while the initial biopo-
litical relationship of subject and power would stop at this point, the intro-
duction of the notion of cybernetic categorization can provide another
perspective of Foucauldian power. Users are not categorized according to
one-off census survey data but through a process of continual interaction
with, and modification of, the categories through which biopolitics works.
In marketing, these categorizations are constructed within the logic of con-
sumer capitalism, where biopower can govern subjects at a distance, guard-
ing their apparent autonomy while ‘[optimizing] systems of difference, in
which the field is left open to fluctuating processes. . . [and] in which there
is an environmental type of intervention instead of the internal subjugation
of individuals’ (Foucault, 2008: 259–60). Category construction in itself
has become a biopolitical endeavor, where Foucauldian security dilemmas
are resolved not just by enactment or change of policy but also through an
alteration in how existing policy reaches subjects. Maleness can be con-
stantly evaluated according to feedback data, through which the definitions
of maleness can shift (to include or exclude transgendered folk, for just one
example) according to the logic of the algorithm. Policies that rely on gen-
dered categorizations then can be automatically and continuously reoriented
to address new populations.
These algorithmic definitions then supplement existing discursive
understandings of a category like gender – most fundamentally through the language of gender itself – with statistical conclusions of what an idea
like maleness actually is. The exclusivity of gender’s meaning then becomes
case of the shift from offline to online marketing, but also redefined cate-
gorical meanings in the case of cybernetic categorization (Foucault, 1980:
52). This redefinition is part of what others have argued is a shift toward
a ‘topological’ approach to genealogy, one that identifies ‘patterns of correla-
tions’ that lead to the formation of particular dispositions of unified hetero-
geneous elements (Collier, 2009). While Collier’s treatment of a ‘topology
of power’ takes these elements as a wide array of different and recombined
techniques and technologies to create a unified state of power relations, the
same logic can be applied to algorithm. Patterns of correlation can be
found in technologies of algorithmic categorization, of recombining and uni-
fying heterogeneous elements of data that have no inner necessity or
coherence.
The softer versions of biopower and biopolitics supplement the discursive production of categories’ meanings, as it is also through data and statistical analysis that conceptions of gender change – not just through
discourse and its subsequent naturalization (Foucault, 1973).3 In order to
understand the regulatory process of biopolitics we need to place additional
attention on the biopolitical construction of these categorizations. Here we
can better define soft biopower and soft biopolitics. The former signifies
the changing nature of categories that on their own regulate and manage
life. Unlike conceptions of hard biopower that regulate life through the use
of categorizations, soft biopower regulates how those categories themselves
are determined to define life. And if we describe biopolitics as Foucault
does, as ‘the endeavor . . . to rationalize the problems presented to govern-
mental practice by the phenomena characteristic of a group of living
human beings constituted as a population’, soft biopolitics constitutes the
ways that biopower defines what a population is and determines how that
population is discursively situated and developed (Foucault, 2003b: 73).
Control
I argue these defining practices of soft biopower and soft biopolitics act as
mechanisms of regulatory control. For the remainder of this article I will
understand control as ‘operating through conditional access to circuits of
consumption and civility’, interpreting control’s mark on subjects as a guid-
ing mechanism that opens and closes particular conditions of possibility
that users can encounter (Rose, 2000: 326). The feedback mechanism
required in this guiding mechanism is the process of suggestion. I define
suggestion as the opening (and consequent closing) of conditional access as
determined by how the user is categorized online.4
As categorizations get constructed by firms like Quantcast, refined by
algorithms that process online behavior, and continually improved upon as
more and more time passes and more and more web artifacts are inputted
into a database, advertisements and content are then suggested to users
according to their perceived identities. In cybernetic categorization these
groupings are always changeable, following the user and suggesting new
toward the structuring forms of suggestion at work when users that are gen-
dered (or with any other categorical identity) visit a web page.7 Such ubiq-
uity demonstrates a capacity, and the increasing use of that capacity, to
employ surveillance technologies to gather data about both users and popu-
lations in general. But this ubiquity also needs to take into account an
anti-technological deterministic perspective that allows for technological fail-
ure. Surveillance technologies do not work as designed all the time, and we
have seen with the example of CCTV that technological capacity does not
ensure a human capacity to adeptly monitor and analyze that technology.8 I
agree with Wendy Hui Kyong Chun (2006) when she argues that the inter-
net is neither a tool for freedom nor a tool for total control. Control is
never complete, and neither is our freedom. At this present moment, web
analytic firms have made it nearly impossible to not be incorporated into
surveillance networks. We are always vulnerable to some degree of surveillance’s gaze. But vulnerability, the historical point of departure when we define freedom in its most classical liberal sense, should not then be thought of as the revocation of autonomy or one’s suffocation under control (Chun, 2006).
Control has become a modulating exercise, much like Deleuze pre-
dicted, which constitutes an integral part of contemporary life as it works
not just on the body nor just on the population, but in how we define our-
selves and others (Deleuze, 1992). Deleuze is vital for exploring these pro-
cesses for the very fact that he understood the shift from ‘enclosed’
environments of disciplinary society to the open terrain of the societies of
control. Enclosure offers the idea of walls, of barriers to databases and sur-
veillance technologies. Openness describes a freedom to action that at the
same time is also vulnerable to surveillance and manipulation. And these
open mechanisms of control – the automated categorization practices and the advertisements and content targeted to those categorizations – effectively situate and define how we create and manage our own identities.
Surveillance practices have increasingly moved from a set of inflexible disci-
plinary practices that operate at the level of the individual to the statistical
regulation of categorical groupings: ‘it is not the personal identity of the
embodied individual but rather the actuarial or categorical profile of the col-
lective which is of foremost concern’ to new, unenclosed surveillance net-
works (Hier, 2003: 402).
Cybernetic categorization provides an elastic relationship to power, one
that uses the capacity of suggestion to softly persuade users towards
models of normalized behavior and identity through the constant redefini-
tion of categories of identity. If a certain set of categories ceases to effec-
tively regulate, another set can quickly be reassigned to a user, providing a
seemingly seamless experience online that still exerts a force over who that
user is. This force is not entirely benign but is instead something that tells
us who we are, what we want, and who we should be. It is removed from traditional mechanisms for resistance and ultimately requires us to conceive of freedom, in whatever form, much differently than previously imagined.
Notes
1. Quantcast (www.quantcast.com) is a web analytics company that operates as a
free service where member sites embed snippets of HTML code in each page on a web server. These snippets ‘phone home’ to Quantcast databases
every time a user visits a site. This recording of user visits is then aggregated
over time to create a history of a user’s web use across various web sites.
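The aggregation this note describes can be sketched schematically: each ‘phone home’ event is reduced to a (cookie id, site, timestamp) record, and the analytics side folds those records into an ordered per-cookie history. The record format and field names here are assumptions for illustration, not Quantcast’s actual schema.

```python
# Schematic sketch of the aggregation described in this note: each
# "phone home" is reduced to a (cookie_id, site, timestamp) record,
# and records are folded into an ordered per-cookie browsing history.
# Field names and record shape are assumptions, not Quantcast's schema.
from collections import defaultdict

def aggregate(events):
    """Group raw visit events into a per-cookie history, in visit order."""
    history = defaultdict(list)
    for cookie_id, site, ts in sorted(events, key=lambda e: e[2]):
        history[cookie_id].append(site)
    return dict(history)

events = [
    ("abc123", "washingtonpost.com", 1),
    ("abc123", "linkedin.com", 3),
    ("abc123", "tmz.com", 2),
]
profile = aggregate(events)
# profile["abc123"] == ["washingtonpost.com", "tmz.com", "linkedin.com"]
```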
2. Quantcast uses more than just web-use data to infer categories upon its users – ‘An increasing variety of inputs can be used to continually validate and refine Quantcast inference models’ – but the impact and extent of these data inputs are
unknown. I will thus focus on web use exclusively (Quantcast Corporation, 2008:
1).
3. Soft as a prefix can be elaborated by linking biopower to two similar concepts
from different literatures. In the more proximate field of computer science, soft
computing developed as programmers moved from a world of perfect computation
and into a fuzzier variety of close-enough, inexact solutions to problems (Zadeh,
1994). This mimics the shift from essential modes of identity online and toward
fuzzy conceptions of statistical belief – a user is likely male based on probability.
And in a theoretical parallel to diplomatic soft power, soft biopower works as a
more flexible form of power that uses attraction and influence to achieve its
goals. Where the brute force of biopolitical action (population control) is mediated
through definition and redefinition of the targeted category, the brute force of
hard state-power (war) is mediated through diplomatic mechanisms and argu-
ments around mutual benefit (Nye, 2002). ‘Hard’ biopower acts by dividing popu-
lations and enacting policy to control subjects through those divisions; soft
biopower acts by taking those divisions and modulating categories’ meaning so
as to best serve the rationale of hard biopower.
4. One example in particular of how one’s conditions of possibility may be
affected through targeting is what Cass Sunstein (2007) defines as ‘The Daily
Me’, an individualized array of information that pertains to the perceived or pro-
vided interests of a user, theoretically decreasing the chance that the user will
encounter news or information that may contradict his existing views about the
world.
References
Amoore, L. (2006) ‘Biometric Borders: Governing Mobilities in the War on
Terror’, Political Geography 25: 336–351.
Arvidsson, A. (2006) Brands: Meaning and Value in Media Culture. New York:
Routledge.
Bamford, J. (2001) Body of Secrets. New York: Random House.
Battelle, J. (2005) The Search: How Google and its Rivals Rewrote the Rules of
Business and Transformed Our Culture. New York: Portfolio.
Becker, K. (2009) ‘The Power of Classification: Culture, Context, Command,
Control, Communications, Computing’, in K. Becker and F. Stalder (eds) Deep
Search: The Politics of Search Beyond Google. Munich: Studienverlag &
Transaction.
Becker, K. and F. Stalder (2009) ‘Introduction’, in K. Becker and F. Stalder (eds)
Deep Search: The Politics of Search Beyond Google. Munich: Studienverlag &
Transaction.
Chun, W.H.K. (2006) Control and Freedom: Power and Paranoia in the Age of
Fiber Optics. Cambridge, MA: MIT Press.
Collier, S. (2009) ‘Topologies of Power: Foucault’s Analysis of Political
Government Beyond ‘‘Governmentality’’’, Theory, Culture & Society 26(6): 79–109.
Cramer, F. (2005) ‘Words Made Flesh: Code, Culture, Imagination’. Available at:
http://pzwart.wdka.hro.nl/mdr/research/fcramer/wordsmadeflesh (consulted
January 2010).
Deleuze, G. (1992) ‘Postscript on the Societies of Control’, October 59: 3–7.
Dick, P. (1968) Do Androids Dream of Electric Sheep? New York: Random House.
Foucault, M. (1973) The Order of Things: An Archaeology of the Human Sciences.
New York: Vintage Books.
Foucault, M. (1979) Discipline and Punish: Birth of the Prison. New York: Vintage
Books.
Foucault, M. (1980) ‘Two Lectures’, in C. Gordon (ed.) Power/Knowledge: Selected Interviews and Other Writings, 1972–1977. New York: Pantheon Books.
Foucault, M. (1982) ‘Afterword: The Subject and Power’, in H. Dreyfus and P.
Rabinow (eds) Michel Foucault: Beyond Structuralism and Hermeneutics.
Chicago: University of Chicago Press.
Foucault, M. (2003a) Society Must Be Defended: Lectures at the Collège de France, 1975–1976. New York: Picador.
Foucault, M. (2003b) Ethics: Subjectivity and Truth. New York: New Press.
Foucault, M. (2007) Security, Territory, Population: Lectures at the Collège de France, 1977–1978. New York: Palgrave Macmillan.
Foucault, M. (2008) The Birth of Biopolitics: Lectures at the Collège de France, 1978–1979. New York: Palgrave Macmillan.
Fuller, M. (2003) Behind the Blip: Essays on the Culture of Software. Brooklyn:
Autonomedia.
Fuller, M. (ed.) (2008) Software Studies: A Lexicon. Cambridge, MA: MIT Press.
Galloway, A. (2004) Protocol: How Control Exists After Decentralization.
Cambridge, MA: MIT Press.
Gandy, O. (1993) The Panoptic Sort: A Political Economy of Personal
Information. Boulder: Westview Press.
Gibson, W. (1984) Neuromancer. New York: Penguin.
Graham, S. (2005) ‘Software-Sorted Geographies’, Progress in Human Geography
29(5): 1–19.
Hier, S. (2003) ‘Probing the Surveillant Assemblage: On the Dialectics of
Surveillance Practices as Processes of Social Control’, Surveillance & Society
1(3): 399–411.
Koppel, M., S. Argamon and A. Shimoni (2002) ‘Automatically Categorizing
Written Texts by Author Gender’, Literary and Linguistic Computing 17(4): 401–412.
Lessig, L. (2006) Code: Version 2.0. New York: Basic Books.
Lomell, H.M. (2004) ‘Targeting the Unwanted: Video Surveillance and Categorical
Exclusion in Oslo, Norway’, Surveillance & Society 2(2/3).
Lyon, D. (2003a) Surveillance After September 11. Cambridge: Polity.
Lyon, D. (2003b) Surveillance as Social Sorting: Privacy, Risk and Digital
Discrimination. New York: Routledge.
Macey, D. (2009) ‘Rethinking Biopolitics, Race and Power in the Wake of
Foucault’, Theory, Culture & Society 26(6): 186–205.
McNay, L. (2009) ‘Self as Enterprise: Dilemmas of Control and Resistance in
Foucault’s The Birth of Biopolitics’, Theory, Culture & Society 26(6): 55–77.
Mosco, V. (1996) The Political Economy of Communication. London: SAGE.
Nakamura, L. (2002) Cybertypes: Race, Ethnicity, and Identity on the Internet.
New York: Routledge.
Nye, J. (2002) ‘Soft Power’, Foreign Policy 80: 153–171.
Quantcast Corporation (2008) ‘Quantcast Methodology Overview: Delivering an
Actionable Audience Service’. Available at: http://www.quantcast.com/white-
papers/quantcast-methodology.pdf (consulted February 2010).