Shaping the Contour of Streams for Personalized TV
Mark Freeman, Nick West, Eric Gould Bear, Rachel Strickland
MONKEYmedia, Inc. & Interval Research Corporation
CHI 2000 • April 1, 2000
ABSTRACT
Is it possible to create an interactive experience that sustains viewers' emotional involvement with streaming media at the same time that it enlists their attention in determining the order of events? How does one integrate the fluid continuity of cinematic experience with the kinds of choice and control that hypermedia affords? We developed a technique called Seamless Expansion that takes advantage of advances in interactive technology and broadband networks to let streaming audiovisual content be modified without interrupting perceptual continuity. Seamless Expansion affords individuals the opportunity to personalize their interactive experiences while preserving the cinematic continuity that sustains audience engagement.
KEYWORDS
Interactive video, digital television, broadband, video pause, entertainment, hypermedia, hypertext, hypervideo, multimedia, DVD, expansion, contraction, entertainment applications, home, beyond the desktop
Figure 1 - A continuous media stream showing: (0) passive jump; (1) seamless expansion; (2) premature contraction
INTRODUCTION: CINEMATIC CONSTRUCTION AND THE PART OF THE BEHOLDING
As the lights go down and the digital sound fills the theater, you begin scooping handfuls of popcorn into your mouth and concentrating on the characters projected on the screen in front of you. The story advances, loves and lives are won and lost, vehicles explode, and you find yourself asking questions: What would have happened if the protagonist had turned at the other corner? Did that minor character who appeared in the opening scene go on to do anything interesting or funny? What would that last scene have looked like had it been shot from the spurned lover’s point of view? Maybe you even find your attention drawn to a product that sponsors have placed in the movie and wish you could find out a little more about it while it is fresh in your mind. You may be assured that you are not alone in your mental wanderings. Other members of the audience have questions too, probably different from yours. Unfortunately, even if you voice them aloud to your fellow audience members, your questions remain unanswered by the film. There is no way that filmmakers can tailor movies to address the curiosities of individual viewers.
Or can they? Digital video technology holds the potential for transforming cinematic construction into a process influenced by the interests and attentions of individual viewers. Until recently, any film or video experience, regardless of content, recording approach, or the producer's intention, needed to be once-and-for-all monolithically constructed for one-way linear playback. With today's technology, it becomes possible to create interactive experiences that invite individual participants to pursue the threads of their own interest and evolving attention. Seamless Expansion is the phrase we adopted to describe a technique that enables viewers to modify the flow of an audiovisual data stream at the time of its display. In contrast to hyperlinks, which let people browse a data space by jumping between discrete points, Seamless Expansion employs the syntax of cinematic construction to sustain perceptual continuity among non-contiguous, overlapping, and parallel segments of a unified media stream.
Cinematically speaking, "montage" refers to the ordering of motion picture sequences in time. Traced by most historians to the films of D.W. Griffith (such as "Birth of a Nation" (1915) and "Intolerance" (1916)), montage is widely regarded as the singular development which gave birth to film as an art, transforming it from mere technology to an expressive medium in its own right through the creation of a language. Filmmakers since Griffith have extended this syntax to serve a variety of dramatic purposes. So-called "invisible montage" uses multiple camera angles to synthetically construct the representation of a continuous action. "Parallel montage" alternates shots from two discrete locations to convey simultaneity. An "accelerated montage" sequence suggests increasing velocity through a series of shots of decreasing duration. The "montage of attraction," introduced by Soviet filmmakers in the 1920s, advanced those filmmakers’ realization that the juxtaposition and sequence of images convey a meaning which is not inherent in the individual images.
The artistry of the film editor, which relies on a keen sense of pacing, is an exercise of determining precisely how long to hold a given shot — and not a split second longer — for maximum effect, and how to join it with adjacent shots in order to move the story along. In fact, there is no absolute formula. If film were a more flexible medium, it might be inclined to extend or abbreviate particular scenes to suit the occasion of the screening and the mood of the audience — just as a skillful storyteller elaborates or condenses elements of the story, or a jazz orchestra extends a riff to support an improvised solo.
As the illusion of motion in film and video depends on the perceptual phenomenon called "persistence of vision," so does montage rely on people’s psychological habit of "projective construction" [3, 5]. While the film editor makes sense of a sequence by tailoring its drift, rhythm, and pacing from shot to shot, the viewer contributes another level of sense to the experience by bridging gaps, inferring meaning from juxtapositions, and ascribing causal linkages to incidental connections. Projective construction, this proclivity of our minds to strive to fill in any blanks in the narrative or thematic structure of the media, also plays an important role in shaping the cinematic experience afforded by Seamless Expansion.
The principle of Seamless Expansion exploits the ellipsis of the cinematic cut with all the film editor’s logic of perceptual and implied continuity contained therein. What differentiates the experience from that of watching a conventional movie is that the viewer can choose whether to expand or contract particular sequences. When combined with technologies and hardware such as those used in interactive television, DVD, TiVo, and ReplayTV, the technique expands the power of the home viewer to play with the media, not simply react to it.
Figure 2 - Insertion Model of Seamless Expansion
INTERACTING WITH MEDIA: PRECURSORS AND PREMONITIONS
The idea of allowing the audience to direct the flow of the narrative or to pursue sidebars and tangential themes on their own has not been limited to computer-based media. For example, during the 1980s, a kind of book called "Choose-Your-Own-Adventure" enjoyed some popularity among adolescents. These adventure stories were organized around a constellation of critical points, at each of which the reader was supposed to decide the course of action for the characters by selecting from a multiple choice menu and skipping directly to the relevant passage in the book. The plot structures of these stories exemplify branching more than flowing, however. And although the narrative might take various turns, enabling the reader to find out "What would have happened if the protagonist had taken a different corridor?", this genre was not designed to provide for open-ended exploration along thematic lines. The reader could not, for example, during a scene that takes place in a castle, turn to a different section to learn more about castles and fortifications. The limited choice that such a story actually affords is revealed from the very outset by the artifact that binds it. A reader need merely take the book in hand and size up its contents by flipping the pages to discern that the permutations of plot are numbered.
The introduction of hypertext to the Internet several years ago gave birth to the World Wide Web. This global publishing system not only enables people to browse and retrieve information across a vast array of media types, it also lets them formulate their own search criteria to a high degree of specificity, through use of the associative links that are generated by authors of the individual hypertext documents. And everyone is welcome to post new documents with links of their own. The hypertext protocol which has been implemented in the Web, augmented by search engines and portals, supports an information space that is readily extensible and seemingly inexhaustible. Yet interaction with the Web might be said to approximate a hop, skip, close-your-eyes-and-jump experience, with discrete segments of media bracketed between static pauses, in contrast to the continuously flowing stream of media that Seamless Expansion affords.
A third precursor of Seamless Expansion, the video laserdisc game, came into its own during the 1980s, like the books mentioned previously. Laserdisc games, such as Dragon’s Lair, introduced cinematic visuals and montage to the repertoire of the game arcade. They also represented the first attempt to engage video game enthusiasts with something resembling a story, rather than simply offering players the opportunity to eat things or blow them up. Like other experiments with branching fiction, these laserdisc games sought to force a dubious marriage between interactivity and conventional narrative’s linear form, overlooking that disjunction of temporality and logic which underlies the natural drift of interaction. Whereas players are enticed to entertain an illusion of countless destinies crossed with secret passageways, the truth is that the elaborate plot has been entirely pre-constructed. The player is allowed to make local choices, on behalf of an oft-hapless hero, that produce inconsequential or predictable effects, as he is surreptitiously lured along a cleverly constrained path. The solution of a series of gratuitous puzzles leads to the "one true ending." Interactivity amounts to a subterfuge for concealing a form that has been specified by the author.
Compared with traditional television, which has encouraged passive viewing rather than receptive viewing, interactive media require an increased cognitive investment on the part of participants. Making decisions about the order of events is likely to result in disengagement by interfering with one’s concentration on the events themselves. Adventure novels and laserdisc games both sought to answer the question of how to keep the audience engaged by giving people a personal stake in the outcome of the plot. Role-playing enhances a viewer’s emotional and intellectual involvement with the story. On the other hand, neither of these two models addresses the issue of enabling thematic searches and open-ended explorations; the participant is locked into following some finite set of sparely articulated branches, that often reconnect at decision points, to one of several predetermined conclusions. Web sites support individuals’ threads of interest and encourage pattern discovery, but the still images, scrolling text, and short video clips displayed in small windows do not elicit an attitude of cinematic engagement. Nothing further happens until the viewer makes the next move, and every selection is punctuated by a wait. Seamless Expansion offers a solution for balancing people’s disposition toward interactive participation with their capacity for attentive reception, because it sustains cinematic continuity throughout the experience. This technique builds on strengths of its forerunners, and it makes the viewer a partner in the process of cinematic construction.
FITTING MOVIES WITH AFFORDANCES FOR ADJUSTABLE PLAY
To produce a conventional film or television program, an editor typically takes ten or more hours of footage and whittles it down to one. This means structuring the media such that the flow would go from segment A to segment C, as in Figure 2. Seamless Expansion lets the user call for segment B to be dropped into the continuously playing stream. Rather than insert a trip to the kitchen with video pause technology, viewers may choose to insert additional segments of the movie. In the grammar of montage, such an expansion can assume one of two forms: Either the tail of a shot is extended to a later logical cutting point, should the viewer wish to see more; or else an additional shot (or sequence of shots), pertaining to the moment of interest, is introduced to fill in the cut.
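The insertion model described above can be sketched in a few lines. This is only an illustrative sketch, not the prototype's code: the `Segment` class, the `play_order` function, and the durations are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One piece of the media stream (hypothetical structure)."""
    name: str
    duration_s: float
    # Optional segments that may be dropped in right after this one.
    expansions: list = field(default_factory=list)


def play_order(main_stream, chosen):
    """Return segment names in playback order.

    `chosen` is the set of expansion names the viewer asked for;
    expansions that were not requested are simply skipped.
    """
    order = []
    for segment in main_stream:
        order.append(segment.name)
        for extra in segment.expansions:
            if extra.name in chosen:
                order.append(extra.name)
    return order


# Default flow goes A -> C; segment B is available as an expansion of A.
seg_b = Segment("B", 45.0)
stream = [Segment("A", 60.0, [seg_b]), Segment("C", 90.0)]

default_order = play_order(stream, set())
expanded_order = play_order(stream, {"B"})
```

With no request the stream plays A then C; when the viewer responds to the cue, B is spliced between them without changing the segments on either side.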
Figure 3 - Sketch of Seamless Expansion Interaction Technique
Figure 3 illustrates how this progression of events might appear to an observer. The storyboard depicts a media segment containing a choosing juncture — the duration of play when a viewer might decide to expand the sequence. A visual cue, so labeled, is seen against the basic continuous play segment. By responding to the cue, the viewer invokes an expansion. The cue itself can take various forms: it may be a phrase of text, a superimposed graphic element, or a visually highlighted object in the scene.
Figure 4 - TV Pause Model
Once the segment has been selected, it expands to fill the screen, as illustrated in the third and fourth frames of the storyboard. The viewer may watch the expanded content until it finishes, or choose to contract the expanded section, according to her discretion. In either case, the expanded stream will re-converge with the main program at some point. We take advantage of the video pause ability and hard drive cache that are provided by systems such as TiVo and ReplayTV to insert content expansions (see Figure 4). The cache from the hard drive can also be used to enable contraction. As in Figures 1 and 2, the continuous media flow typically goes from A to C, with Seamless Expansion allowing the insertion of segment B. Sometimes, however, the viewer may prefer to condense a segment from A:B:C to A:C. This becomes possible once the hard drive has accumulated enough material such that, while the normal broadcast sequence proceeds into detail on some topic, one of the alternate streams may contain a segment that would typically diverge from the main thread at a later point.
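The dependence of contraction on cached material can be illustrated with a small sketch. It assumes, purely for illustration, that the cache holds one contiguous stretch of the program measured in seconds; the function name `contract` and its parameters are hypothetical.

```python
def contract(position_s, rejoin_at_s, cached_until_s):
    """Return the playback position after a contraction request.

    The jump ahead succeeds only if the cache already holds the rejoin
    point (assumed model: the cache covers the program contiguously up
    to `cached_until_s`). Otherwise playback continues where it is.
    """
    if position_s < rejoin_at_s <= cached_until_s:
        return rejoin_at_s
    return position_s


# Enough material cached: the viewer skips from inside B to the start of C.
jumped = contract(position_s=100.0, rejoin_at_s=160.0, cached_until_s=200.0)

# Cache has not reached C yet: the request has no effect.
stayed = contract(position_s=100.0, rejoin_at_s=160.0, cached_until_s=120.0)
```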
Figure 5 - Prototype Interactions: (a) & (b) expand to more detail, (c) & (d) contract or jump ahead.
For entering and exiting expansions, we wanted viewers’ notifications to the system to be casual, almost effortless. The more conspicuous and cumbersome the action, the greater the risk of losing engagement. Thus, in our prototype we experimented with a quick side-to-side movement of the mouse or trackball, as illustrated in Figure 5, instead of clicking. Similar gestures would need to be developed for remote-controlled devices.
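One way a quick side-to-side movement might be recognized is sketched below. The thresholds (`min_dx`, `max_dt`), the sample format, and the idea of reporting a direction are all assumptions made for this example; the text does not specify how the prototype detected the gesture or which direction triggered which action.

```python
def classify_flick(samples, min_dx=80, max_dt=0.25):
    """Classify a pointer trace as a quick horizontal flick.

    samples: list of (time_s, x_px) tuples in time order.
    Returns "right", "left", or None when no movement of at least
    `min_dx` pixels happens within `max_dt` seconds.
    """
    for i, (t0, x0) in enumerate(samples):
        for t1, x1 in samples[i + 1:]:
            if t1 - t0 > max_dt:
                break  # too slow to count as a flick from this start point
            dx = x1 - x0
            if dx >= min_dx:
                return "right"
            if dx <= -min_dx:
                return "left"
    return None


fast_right = classify_flick([(0.00, 100), (0.05, 140), (0.10, 200)])
fast_left = classify_flick([(0.0, 300), (0.1, 200)])
slow_drift = classify_flick([(0.0, 100), (0.5, 200)])
```

A slow drift of the pointer is ignored, so ordinary hand movement would not trigger an expansion by accident.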
CUES FOR EXPANSION
Although we have described a visual cue in the previous example, our prototypes actually experimented with three different types of cues as staging points for viewer choices. Any of these three allows an alert viewer to assemble another piece of the montage. The first, object expansion, roughly corresponds to one of the visual cues mentioned earlier. A character in the continuous media stream may hold aloft, for example, a basketball, and the basketball becomes a cue for the audience to select an expansion pertaining to this basketball.
The second kind of juncture is topical. This type of cue does not feature any object which is visible in the scene, although it could still employ a visual cue. The audience might also be able to discern that a choosing juncture has arrived by closely listening to the dialogue. This type of expansion would work best when tying together interrelated threads in the media as a whole. Characters discussing dogs (without any dogs being present in the scene) might allow for expansion to other segments of the continuous media that contain dogs, or those characters discussing dogs at different times. A third type of cue is tangential. Tangential cues provide a jumping off point to segments whose relation to the primary gist of the continuous play sequence is indirect. For instance, the characters discussing dogs in the current example might provide a connection to a segment on the history of canine domestication, even though the program itself might be a documentary portrait of a blind musician.
Although we speak of three "kinds" of expansion, these categories by no means have discrete boundaries. A cue can be both an object and tangential, for example. The difference is not so much in kind as in degree.
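Because the three kinds of cue overlap, a data model for them is more naturally a set of flags than a single exclusive type. The sketch below shows one possible representation; the `CueKind` and `Cue` names, the fields, and the timings are hypothetical and not taken from the prototype.

```python
from dataclasses import dataclass
from enum import Flag, auto


class CueKind(Flag):
    """Cue categories; a cue may carry more than one."""
    OBJECT = auto()
    TOPICAL = auto()
    TANGENTIAL = auto()


@dataclass
class Cue:
    label: str
    kinds: CueKind
    start_s: float  # choosing juncture opens
    end_s: float    # choosing juncture closes

    def is_active(self, t_s):
        """True while the viewer may still respond to this cue."""
        return self.start_s <= t_s <= self.end_s


# A visible basketball that leads to loosely related material is both
# an object cue and a tangential cue.
basketball = Cue("basketball", CueKind.OBJECT | CueKind.TANGENTIAL, 12.0, 18.0)
```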
Figure 6 - This diagram shows the expanding universe of subject material that we are able to include in the concept of a "program" by using the concept of seamless expansion. In the most restrictive concept of a "program", the subject matter is strictly determined by the author(s) of the program, usually for purposes of keeping the program within the prescribed time limits. Since Seamless Expansion allows us to return to program flow irrespective of time limits, we can now include other styles of authorship in the program material. Immediately outside the tightly-authored program lies material that had to be excluded originally because of time constraints (here labeled "Director's Cut"). Outside this set lies the set of material that is created by the program's authors, but is clearly intended as ancillary material to the main program flow. The outermost set in this diagram is the set of material that is not created by the authors of the original program, but is triggered solely by viewer interest. These (apparently) tangential links can now be included as "program" material thanks to seamless expansion.
The pattern of expansion may be visually represented as a set of concentric circles, as in Figure 6. At the center of the diagram lies the main video program, or what might be regarded as the normal broadcast. The first level of expansion denotes the "director's cut", a version which contains additional scenes that the producers have deleted in order to fit the television time slot. The next level contains other material by the program authors, but not originally intended for the program where we began. These might be scenes that are only marginally related to the program material – scenes from the cutting room floor, for example, or bloopers. The outer level of the diagram consists of material displayed through user-triggered links. The media segments opened at this level have not been created by the original authors of the program material.
The route through this stream system does not necessarily traverse every bend. One expansion might preclude others. For instance, expanding to the director's cut might bypass certain links to other authored material, but Seamless Expansion makes it possible to revisit the same material multiple times and discover something new each time.
HOME AUDIENCE APPLICATIONS
Digital Television (DTV) and other digital video technologies are poised to redefine the way that viewers relate to content. DTV gives broadcasters several options for deploying bandwidth. They may, for example, beam one high-definition signal into homes, or they may split the signal into four separate channels, as depicted in Figure 7. These four simultaneous streams readily support Seamless Expansion. While one channel carries the normal broadcast, the other three could carry expansion material.
These three alternate streams lend flexibility to television viewing. Just as technologies such as TiVo and ReplayTV permit viewers to take phone calls or grab a refreshment from the kitchen, Seamless Expansion further allows a crossover between the timed viewing of television and self-scheduled viewing of the Web. For example, suppose that a nature show normally starts at 8:00 p.m. and lasts until 9:00. With Seamless Expansion, viewers can wait to begin the show at 8:30 and extend it to 10:00 as they choose to explore details in the story more thoroughly. The show that normally starts in the 9:00 p.m. slot will, in the meantime, cache to the hard drive until they elect to view it.
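The arithmetic of the nature show example can be worked through in a short sketch. The function name `viewing_plan` and its parameters are hypothetical, and the sketch assumes that the next show begins the moment the first broadcast ends and that the viewer skips nothing. Finishing at 10:00 after starting a 60-minute show at 8:30 implies 30 minutes spent in expansions.

```python
from datetime import datetime, timedelta


def viewing_plan(broadcast_start, program_len, viewer_start, expansion_time):
    """Return (viewer_end, backlog) for a time-shifted, expanded viewing.

    backlog is how much of the following broadcast has accumulated in the
    cache by the time the viewer finishes (assumes the next show starts
    exactly when this broadcast ends).
    """
    viewer_end = viewer_start + program_len + expansion_time
    broadcast_end = broadcast_start + program_len
    backlog = max(viewer_end - broadcast_end, timedelta(0))
    return viewer_end, backlog


viewer_end, backlog = viewing_plan(
    broadcast_start=datetime(2000, 4, 1, 20, 0),   # 8:00 p.m.
    program_len=timedelta(minutes=60),
    viewer_start=datetime(2000, 4, 1, 20, 30),     # viewer begins at 8:30
    expansion_time=timedelta(minutes=30),          # time spent in expansions
)
```

Under these assumptions the viewer finishes at 10:00 p.m. with a full hour of the 9:00 show waiting in the cache.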
As a corollary of this effect, Seamless Expansion enables a convergence between navigation techniques used in television viewing and those employed in Web browsing. Channel surfing on television consists of rapidly accessing multiple streams, not necessarily pursuing thematic threads, but more typically sampling a variety of unrelated content. While tuning in to check the program on channel 5, the viewer misses whatever flowed on channel 6 in the meantime. By contrast, Web browsing allows a viewer to pursue a tangential interest without losing the page that’s presently displayed. Seamless Expansion encourages exploration because the cached material prevents relevant content from getting lost.
TECHNIQUES FOR CONTENT PROVIDERS
We have noted previously that the pause function afforded by ReplayTV and TiVo offers new convenience to the viewer. These technologies also afford content authors an opportunity to insert program material from the source of the transmission. It is important to note that Seamless Expansion can be performed from the back end as well as the front end — that is to say that it can be initiated by the author (or broadcaster or programmer) as well as by the viewer. We typically think of interactivity as something triggered solely by the viewer's actions and solely under the viewer's control as regards time and pacing. This is not, however, necessarily the case. Any party can insert expansions into the interaction, based on conditions that can be specified beforehand, or on the fly, or some combination of both contexts. Program material can be tailored to specific personal profiles, or inserted to readjust program time schedules after a session of Seamless Expansion by a viewer.
Figure 7 - TV Pause with Expansions Model: As time passes, viewers accumulate opportunities to expand into cached material at any of several points in video time.
This application of server-side expansion can solve some of the problems that will arise when program splits and jumps become prevalent in interactive television. For example, imagine a case where an advertiser wants to broadcast an advertisement that allows viewers to branch off to find out more information about a product, or to place an order. Assuming that the advertiser has bought a 30-second spot, events may unfold in such a manner that the viewer spends an extended amount of time completing a purchase or finding out more about the product. Even if the entire purchase or exploration consumes as little as five minutes, the viewer will have missed a significant portion of the regular program. Neither the viewer nor the programmer desires this outcome.
Figure 8 - Staggered Expandable Interstitial Model
Utilizing four channels broadcasting in parallel, the broadcaster can stagger the transmission streams of the program, as illustrated in Figure 8. This staggered expandable interstitial model differs from that of Figure 7 in that the broadcaster beams the same sequence at staggered time intervals, rather than transmitting alternative scenarios on each beam. In this model, the wait for the regular program varies with the amount of time the viewer has spent in the Seamless Expansion. If the transmissions come in 5-minute waves and the viewer spends 3 minutes in the Seamless Expansion, the program will resume where the viewer left off in 2 minutes. If, however, the viewer returns in 2.5 minutes, the program content can seamlessly expand to fill the extra thirty seconds with pre-cached material. If the viewer returns to the main program after 8 minutes, the viewer will have the same 2 minutes of interstitial material to watch as in the first case.
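The waiting times in the staggered model follow from simple modular arithmetic, sketched below. The function name `wait_for_resume` is hypothetical, and the sketch assumes that a copy of the program reaches the viewer's departure point once every stagger interval, counted from the moment the viewer left.

```python
def wait_for_resume(time_away_min, stagger_min=5.0):
    """Minutes until the next staggered stream reaches the point where
    the viewer left the program.

    Assumes a stream passes that point every `stagger_min` minutes,
    starting from the moment the viewer departed.
    """
    return (-time_away_min) % stagger_min


wait_after_3 = wait_for_resume(3.0)      # 2.0 minutes
wait_after_2_5 = wait_for_resume(2.5)    # 2.5 minutes
wait_after_8 = wait_for_resume(8.0)      # 2.0 minutes, same as the first case
```

The 2.5-minute case leaves a 2.5-minute gap, which is the 2 minutes of the first case plus the extra thirty seconds that pre-cached material would fill.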
USER TESTING AND OPEN ISSUES
Problems can arise when viewers are challenged to participate in shaping the order of events that they are watching. When designers introduce interactivity, they run the risk of disrupting concentration and compromising emotional engagement, redirecting people’s attention to the business of decision making and control. In order to gain insight regarding these issues, we commissioned an audience response study to help us evaluate a prototype application that we had built with Seamless Expansion.
Fourteen subjects participated in the study. In its basic, unexpanded form, the video material in our prototype resembled a collection of edited documentary sequences, such as one might expect to see on public broadcast television. Although the testers explicitly prompted subjects to regard our prototype as a new form of television, the media was displayed on a computer screen. This may explain the approach that most participants took; they tended to interact with the media as if it were a computer program. Some seemed to infer that Seamless Expansion had been conceived as a reference tool and responded accordingly. That is, they did not attempt to follow the thread of the narrative, and they spent more time learning to master the interface than absorbing and processing content.
Another important lesson learned was that expansion could be too seamless. Participants sometimes could not ascertain that an expansion or contraction had commenced. Furthermore, they were not always certain if their actions had triggered one. For future implementations, it may be worth considering how to design transitions that are more easily discernible.
The results of the study raise a provocative question. How can the media manifest Seamless Expansion if the segues are apparent? Decades of television have evidently conditioned viewers to expect stronger cues for their attention. One group of subjects said that they would prefer to watch the normal broadcast first (analogous to the innermost circle in Figure 6), then watch the program again using Seamless Expansion as a tool for deeper inquiry. Others seemed amenable to the idea of receiving programs encoded with Seamless Expansion by means of such devices as "Interactive Television".
Subjects did not fail to observe that Seamless Expansion could enhance many types of programming. For example, news programs could present snapshots that can expand on demand, soap operas could offer expansion flashbacks to missed episodes, talk shows could offer expansions to related media when the hosts are movie stars or recording artists, sporting events could offer expansion to stats on players, and cooking shows could offer viewers a chance to shop for ingredients used in recipes.
Today’s video editing tools and multimedia authoring systems do not make it easy to shape media with Seamless Expansions, as we learned in the course of building a prototype. So-called nonlinear editors, presently regarded as the state of the art, are designed for producing linear results. What kind of system would let you construct the motion picture for Jorge Luis Borges’s conception of time in "The Garden of Forking Paths," for example? We have yet to devise an approach to time-based media that helps producers visualize, align, and manage the connections among simultaneous events. Until the process for authoring polylinear streams becomes simpler, the new digital video medium will continue to fall back on conventional forms.
ACKNOWLEDGMENTS
We thank David Liddle, Bill Verplank and Marc Davis for encouragement; Brian Williams, Serge Lourier and Baldo Faieta for code support; Laura Strong and Golan Levin for visual design; Jeramy Bassermann and David Hannibal for sound design; Sasha Wizansky, Becky Fuson and Janna Buckmaster Bear for video annotation; Kevin George and Gilles Tasse for video editing; Naoko Amemiya for Japanese localization; and Diane Schiano, Kim Kessler, Josh Loftus and the Aquarium group at Interval for usability testing.
REFERENCES
1. Bazin, A. Evolution of the language of cinema, circa 1950. Translated by Hugh Gray in What is Cinema? University of California Press, Berkeley and Los Angeles, 1967, 17-22.
2. Elliot, E. and Davenport, G. Video streamer, in CHI '94 Conference Companion (Boston, MA, April 1994), ACM Press, 65-66.
3. Gombrich, E.H. Art and Illusion: A Study in the Psychology of Pictorial Representation, 1956. Second Edition. Princeton University Press, 1961.
4. Gould, E. Relativity controller: reflecting user perspective in document spaces, in ACM INTERCHI '93 Adjunct Proceedings (Amsterdam, The Netherlands, April 24-29, 1993), ACM Press, 125-126.
5. Sampat, K. and Kembel, J. Intel Corporation. User interface, method, and apparatus selecting and playing channels having video, audio and/or text streams. 1993. U.S. Pat. 5,557,724.
6. Strickland, R. Notes on projective construction, presented in the panel on Interactive Narrative, CHI 91.
7. Wilson, B. The documentary film as scientific inscription, in Theorizing Documentary, edited by Renov, M. Routledge, New York, 1993.
8. Yeo, B., Yeung, M., Wolf, W. and Liu, B. The Trustees of Princeton University. Method and apparatus for video browsing based on content and structure. 1995. U.S. Pat. 5,708,767.