One major trend in current technological innovation is personalization. People can look up anything of interest with unprecedented speed, and are presented with information specifically tailored to their needs, preferences, and past behaviors. To effect this personalization, massive amounts of data are continuously collected about users’ interactions with technology—what they search for, what they look at, and what they choose to share with others online. There is a tension between the usefulness of having technology anticipate your needs and the Orwellian implications of having all the data you generate collected, stored, and analyzed.

In thinking about the production of e-books, we have to recognize that these knowledge systems will increasingly incorporate knowledge about the consumers of the books. For digital books to become more intelligent and adaptive to reader characteristics, they need to collect massive amounts of data about individual readers. Other essays from this book sprint have positioned e-books as platforms for performance, platforms for expression, and platforms for community in ways that emphasize the positive role of books in modern society. We also need to recognize that digital books, like much modern computing technology, are platforms for large-scale surveillance in ways that can have problematic implications.

One area of surveillance is the intentional actions users take: books they buy, books they read, passages they underline, annotations they make, and comments or reviews they leave for the broader online community. This data can be logged and stored, and it is easy to imagine scenarios where reading books that run counter to your group's norms is discouraged by the possibility that it could be made public. Most text data will soon be automatically interpretable, and comments and annotations will be crawled and categorized. The thought of an automated aggregation of every spontaneous and potentially trivial reaction by each individual reader across several years is somewhat discomfiting. On the other hand, this data generated by intentional actions is easily interpretable by readers themselves. In today’s world, many people are comfortable sharing this kind of information about themselves with their broader community. When readers have the power to manage and curate this data as part of the way they present their identity, the collection of the data seems less ominous.

A second area of surveillance is how books are read—user reactions to the text that are less intentional but integral to the act of reading itself. Gaze data can tell us where on the page the reader is looking at any given point in time; and while eye trackers are currently expensive and cumbersome, in the near future it is entirely likely that accurate tracking will be accomplished through camera-based technologies. Physiological data can provide information about readers’ emotional reactions to particular passages, and brain data can provide information about their cognitive states. While currently these technologies are intrusive and mostly limited to research applications, they will not always be.

The implications of this second kind of data collection are sinister. If Sara is assigned a reading from a textbook, and eye tracking indicates she barely glanced at one section, is that going to have negative academic consequences? Should it? If Jane has an emotional reaction to a passage that provokes a painful memory, should that be catalogued, stored, and interpreted, even if that information is never used? If Bob is recreationally reading a book on business, and cognitive state information indicates that he does not understand an essential concept, could that information be found and held against him later in a job interview for a position as a market analyst?

The more data we collect on the reader, the more we can tailor books to their unique needs and preferences. The knowledge system of the digital book of the future includes the characteristics of the reader. Readers themselves might want to examine that data, finding that it provides them with insight into their own habits, or curate that data, finding that it enhances how they wish to present themselves online. However, the collection of data which users do not produce intentionally while reading—gaze, physiological, and brain data—will mean that every failure of understanding or frustration is permanently indexed and potentially accessible. The future book is a platform for gathering an unprecedented level of information about each individual reader that catalogs their past experiences, current abilities, and potential for future success.

    This reminds me of a 2012 piece by Evgeny Morozov: “In Soviet Russia, Book Reads You.” In it, Morozov discusses the software CourseSmart, which creates an “engagement score” that allows a teacher to see how much a student has read. He points out that there is sometimes value, even intellectual rigor, in skimming or skipping reading assignments; it is part of our development as thinkers. With regard to students, at least, overly close monitoring of their reading seems to be about short-term accountability as opposed to allowing them to develop their own learning habits.

