
Communicating Beasts

The Coming Communications Convergence

 

Katrina Glerum
6 June 2001

Flashing each other public spaces
in rambling artistic conversations
passing baubles of words, beauty, glee, and context

I've found that my friends and I habitually send single links to articles or websites as seamless parts of our ongoing conversations. Offering support for a position perhaps, giving each other tools or jokes, or maybe simply pulling a rabbit out of a hat for the pure magic of it.

What a transformative change in dialogue this is, adopted without hesitation or consideration as soon as it became available.

Consider the following:

  • Cell phones, answering machines, voicemail, faxes
  • Beepers, text-based pagers, SMS (aka "texting"), email pagers (like Blackberry), locators, infrared message "beaming" (Newton to Newton, Palm to Palm)
  • Email, email lists
  • Text chatrooms, voice chatrooms, video chatrooms
  • News groups, discussion boards
  • Web pages
  • Instant messengers, collaborative groupware
  • Video tapes/disks, video clips, Web cams, video conferencing
  • Virtual worlds, realtime multiplayer games, simulators

This is just the past twenty years! Most of these communication technologies have entered our lives in the past three years with an adoption rate far higher than any other sort of technology, and the list keeps growing. Plainly, as a species, we love to talk more than virtually anything else we do. Give us a new chatter toy and we'll put it into practice with nary a blink.

What Can We Predict from This?

It's all a bit confusing and we've accumulated a barrel of redundancies along the way. Clearly, convergence is coming, but it won't be total convergence. Why? Because we've come up with some new ways to make messages that we like very much, and as communicating beasts, once we've put our grubby little minds on a useful nuance we're unlikely to give it up.

Any device or software manufacturer who misses this point is going to have a product that flops, because it's not so much about technological progress as it is about intuitively useful communication strategies.

The key to predicting where we're going comes from asking: What are our brains good at doing? And when it comes to communication, from the pause for a punchline to the length of a sigh, there's nothing our brains are more adept at grasping and utilizing than time. So, taking as many of our wonderful communication methods as I could think of, I mapped them against one another using two different ways of looking at time, which I called Time-to-Target (or conversational speed) and Dimension.

  1. Time-to-Target is a gauge of how much time it takes from when you send a message to when you expect the recipient/viewer to experience it. It runs from Immediate to Permanent.
  2. Dimension refers to the type of communication produced (graphics, text, video and audio) and how time relates to experiencing it. It runs in general from Static to Linear.

Looked at this way, all sorts of interesting results suggest themselves. Not only have our new toys dramatically expanded our conversational arsenals (giving us whole new categories of conversational speed), but we can make some pretty good guesses about where to expect both new products and convergence. Essentially, wherever you see a gap in the chart, expect a product.

Time-to-Target (left to right): Immediate - - Delayed - - Permanent
Dimension (top to bottom): Static to Linear

Each row below runs from Immediate to Permanent; cells are separated by "|" and an empty cell is marked "(gap)".

Picture (Static): whiteboarding | photo "beaming" | (gap) | (gap) | (gap) | painting, sculpture, photo albums

Text: text chat | instant messengers, SMS "texting", message "beaming", beepers, locators | email, email lists, faxes, email pagers | newsgroups, discussion boards | webpages, mail, newspapers | articles, reports, books

Video: video chat, virtual worlds, video conference, video phone, live TV, personal meeting | web cams | (gap) | TV news | video tapes / clips, TV shows, movies, video games | plays

Audio (Linear): phone, voice chat, mobile phone, call-in radio | (gap) | answering machines, voicemail | radio news, MP3 sites | radio reports, radio plays | (gap)

Time-to-Target: AKA Conversational Speed

Let's put time-to-target, or "conversational speed," into practical terms. What's the difference between talking to someone on the phone and leaving them a voicemail? What's the difference between a newspaper article and a book? Or for a more subtle question, why might you prefer a discussion in email over a discussion on a message board? Conversational speed relates to how long you think it will take your message to reach someone, and carries at least two broad (but easily falsifiable) implications: the shorter the time-to-target, then a) the shorter the production time of the message and b) the sooner the presumed expiration date on what you've said.

  • At the immediate end of this spectrum, let's say you're talking with someone in person or on the phone. This is happening in-time, so for the most part you expect that person to listen to you (immediate reception) with largely undivided attention. The same goes for an online chat session. While we know that anything can be recorded, either mechanically or in a person's memory, in general we act with the assumption that these messages have the shortest life spans and think the least before making them.
  • With instant messages, if the recipient is present, you expect a reasonably speedy read, but not immediate—perhaps within a few minutes if they're present at the computer or cell phone. Because of this expectation an etiquette has sprung up of indicating to others when you're away so that your friends don't find themselves writing to thin air.
  • With a personal email, if your recipient reads it in the next few hours, or even within a day or two, that's generally OK. After a week unread, though, we may be annoyed.
  • With discussion boards, no one is really expected to read a message at all, but if they do it can be any time in the next few days or weeks—although all too soon it goes stale.
  • At the most delayed end of conversational speed, if you write a book or report for example, you are trying to create communications that withstand time. As a result we generally take the longest time and effort to produce these.

There are other ways we could be thinking about what's going on here, like whether a message is public vs. private, professional vs. informal, interactive vs. unidirectional, or mobile vs. stationary, which are interesting in their own right. However, these distinctions are more likely to be dictated by culture or technology than by time, which is in a sense how our brains innately perceive the world.

Likewise, from a pure communication standpoint, distinctions like device or mobility can be irrelevant. This is not to say that usability, convenience and other technological hurdles don't influence adoption. Rather, I'm just making the point that, for example, instant messaging and short message systems are essentially the same thing, with device dependence being the only key difference. In other words, for instant messages you might be seated at your desk, while for SMS you would be using your cell phone, but we use them in much the same way. Thus we should expect those two communication methods to converge so we can "text" to our AOL buddy lists on our Palm portable phones.

From the gaps in this chart, we can make a pretty good stab at what types of communication have yet to be developed. Eventually I expect that every gap will be filled. Video mail is clearly in the cards. How about picture mail where I snap a photo and it shows up in your inbox a minute later? Can I get replies from you on my camera? Will we enjoy audio or video instant messaging? Will audio or video discussion boards ever take off?
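As a rough illustration of reading gaps off the chart, here is a minimal Python sketch. It assumes a six-column reading of the chart (index 0 is the Immediate end, index 5 the Permanent end); the `CHART` layout, the column count, and the `find_gaps` helper are all hypothetical, introduced only for this example.

```python
# Hypothetical encoding of the chart. Each row is a "dimension"; each row
# holds six time-to-target cells, from Immediate (index 0) to Permanent
# (index 5). The six-column layout is an assumption, not the author's spec.
CHART = {
    "Picture": [
        ["whiteboarding"],
        ['photo "beaming"'],
        [],
        [],
        [],
        ["painting", "sculpture", "photo albums"],
    ],
    "Text": [
        ["text chat"],
        ["instant messengers", "SMS", 'message "beaming"', "beepers", "locators"],
        ["email", "email lists", "faxes", "email pagers"],
        ["newsgroups", "discussion boards"],
        ["webpages", "mail", "newspapers"],
        ["articles", "reports", "books"],
    ],
    "Video": [
        ["video chat", "virtual worlds", "video conference",
         "video phone", "live TV", "personal meeting"],
        ["web cams"],
        [],
        ["TV news"],
        ["video tapes / clips", "TV shows", "movies", "video games"],
        ["plays"],
    ],
    "Audio": [
        ["phone", "voice chat", "mobile phone", "call-in radio"],
        [],
        ["answering machines", "voicemail"],
        ["radio news", "MP3 sites"],
        ["radio reports", "radio plays"],
        [],
    ],
}


def find_gaps(chart):
    """Return (dimension, column_index) pairs for every empty cell."""
    return [
        (dimension, column)
        for dimension, cells in chart.items()
        for column, cell in enumerate(cells)
        if not cell
    ]


if __name__ == "__main__":
    for dimension, column in find_gaps(CHART):
        print(f"gap: {dimension}, column {column}")
```

Under this reading, column 2 is where email sits in the Text row, so the empty Picture and Video cells in that column correspond to the picture mail and video mail guessed at above.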

Dimensions of Communication

For the reasons why a new method may or may not be successful, it's useful to think about what I'm calling the "dimension" of communication. There are a few major distinctions at work here such as

  1. the difference between representational (written) and actual (oral) language,
  2. the sensory difference between visible and audible communication, and
  3. the richness, or depth, of information in the message.

None of the four categories (pictures, text, video or audio) are entirely cut and dried, because they all contain aspects of one another and they can all be combined. A video can have audio, a graphic might have text, text has graphical elements like font, or might be written by hand instead of by machine. Still, when we get right down to it, we find again that underlying these categories are the variations that our brains are particularly well designed for, namely, language, the representation of abstract concepts, sensory input, time and space.

So, while there are surely other ways to order these categories, I found the dimension of time the most interesting. From top to bottom (pictures, text, video, audio), think of this like geometry where at the picture end of the spectrum you have zero dimensions (a point) and at the audio end of the spectrum you have one dimension (a line). In other words, a snapshot permits completely non-linear access, while a tune is unalterably linear, requiring a duration and a direction.

This is critical because it affects how we experience a message. You can easily scan a picture, leaping from one place to another looking for what interests you and lingering there as long as you choose. You can do the same with text, flipping through a magazine for example, or pausing to appreciate a phrase. So while text is static, like anything that can be printed, because it is language it also has a linear order that pictures don't. Just try reading this sentence backwards and making sense out of it. ("It of out sense making and backwards sentence this reading try just.")

You can only scan video a little by speeding it up and running it forwards or backwards. You can "pause" or represent video by chopping it into stills, thumbnails, etc. But you can't really scan audio at all. You either hear something or you don't. If you skip past something you can never know what you missed. Speeding sound up moves it quickly out of our comprehension, while pausing or running it backwards changes any meaning of a sound completely. Into what exactly? We don't know, but it's entirely possible that hiding a satanic message backwards in your rock song is a total waste of time, unless you're trying to terrorize the people who worry about such things.

If we wanted to get really ambitious, there's a third dimension to consider. Our brains are as naturally adept with space as they are with time. We regularly use two-dimensional space to communicate, through graphical layout, for instance, or stereo headphones. We use three-dimensional space some too, for example through architecture or by putting a cover on a book. However, given how good our heads are with space, and how fluently we use it in person (when I can wave my hands around or bop you on the head) I suspect that today's 3D games, smart houses, and virtual reality environments are only the first baby steps towards where we're going with this stuff.

Behavior

Both dimension and speed affect conversational behavior, which means we can probably predict what people will like. For example, pure audio or video might make for poor discussion boards, because most people prefer to skim a discussion looking for points of interest to focus on, rather than reading every post all the way through. It gets to me that I never know until after I've finished listening to the whole phone message whether it was entirely useless, so I'd never abide an audio-only message board. However, if I were skimming a bunch of messages and came across one I particularly enjoyed, I might take a moment to listen to the poster speak it in their own voice. Since voices carry far more information (emotion, humor) than text alone, and faces more than voices alone, chances are that the entire message board experience would be so much richer for everyone that the boards' popularity would spike. Thus, I'd guess audio or video message boards will really take off once we have better transcription programs, because then I could speak my piece and let the software automatically turn my message into text before posting both voice and text together to the list.

For instant messages though, a quick sound or video clip might suit me just as well as text. A photo message might be exactly what I want sometimes too, but having grown accustomed to the greater information I get from text, I probably wouldn't buy a device that only brought me pictures but no language.

As both of these examples show, we can't talk about behavior without bringing up informational richness (complexity), or the amount of information being conveyed in any message. This is because there's a whole set of tradeoffs we need to make as we move up or down the complexity scale (from the NASA flight simulator to a stop sign).

How much attention does a message require, for example? Can we listen to it with half an ear or does it call for that earnest eye-contact routine? How much time does it take to produce? How much time does it take to receive? Is a response required? Can we put it off? How much context is required to understand it? How trustworthy is it?

The beauty of text is that it strips out all the complexity of facial expression and tone of voice from language (which itself is already an abstraction of thought) and leaves the bare words. That's the problem with text too, depending on what you're trying to achieve. It means that text is the most impersonal mode of communication, allowing people to use it as a shield to hide their emotions. It also means that text is the easiest type of communication to cut, paste, forward, reuse, and alter. Although ultimately you can do this with graphics, video and audio too, it's harder. This means that text is the least trustworthy form of communication. Generally we could also say that messages grow more untrustworthy with greater time-to-target due to the greater effort and time it takes to produce them and the higher rewards that they earn.

A familiar example of how complexity and time-to-target relate to trust is plagiarism. We worry about this most with books or reports. To date, the best guard against a phony or stolen book has been the reputation of a publisher. We should expect that as self-publishing increases so will plagiarism because the reward for ripping off someone else's work is high when it includes eliminating the effort of creating it yourself. While plagiarism isn't as big a problem in motion pictures because it's hard to steal someone's scene outright (unless you're a cute kid), we know that anything can be, and is, doctored. It doesn't bother us too much, but it does explain the popularity of reality TV because sometimes audiences appreciate the higher degree of trust they feel by watching and listening to real people in unscripted situations.

For more immediate communications think of email. How much can you trust that an email was written by whoever sent it? It's probably more reliable than a webpage, especially if you can gauge how long it took the sender to produce it. Plagiarism isn't a high concern because the reward for doing it is generally fairly low. But in truth we never really know. If I forward an email to you, often the original message header information will go along with the forward and you can inspect it if you like. But what if I decided to write an email to someone and construct a false header to pretend I was forwarding your message? Nothing could detect my deception and your reputation could be damaged (or enhanced) at my whim.

Bring this into realtime and consider two friends in a text chat. Plagiarism is almost irrelevant at this point. Trust is highest because there are fewer things the writers can fake due to the speed required to respond to one another and the context they presumably share. But still there's no way to be absolutely sure who's on the other end when all you have to go on is text on a screen. Pick up the phone though (or better yet, turn on the web cam) and the game's up when the friends hear one another's voices.

Combine all these ideas together and you get a typical strategy in the corporate world. If you want to make a point of something (clearer emotion), but you don't want a permanent record to haunt you, it's generally safer to use voicemail (which typically gets expunged fairly quickly because the files are fat with information and yet difficult to scan) than email (which sits around tidy and small and is easily scanned or analysed). On the flip side, if you need to cover your ass, it's wiser to create a paper trail with email.

Greater Complexity and Greater Convergence

What's amazing is that we have so dramatically increased the conversational complexity of our lives without exploding in confusion. If our roads suddenly grew 10 times more complicated in a decade, or we developed twenty new ways to cook food, we would probably be screaming our hearts out at the frustration of modern life. But as communicating beasts we've absorbed all this change with less grumbling than glee, and are ready for more.

This is our gift. We intuitively incorporate new conversational techniques and strategies as soon as we see them, and automatically fiddle with aspects of time, information and context to send messages of varying depth and significance, all with very little reflection on how we know what to do.

So where is this taking us? Are we going to demand a reduction in complexity? Or are we like pro wrestlers shouting 'Bring it on!' to ever more tools and variables?

I'd argue both. We want it all. All that complexity of capability without the redundancy of application and device. Because we want the complexity, we are willing to accept a cacophony of separate inboxes and address books and recording devices and publishing tools and transmission devices today. Nevertheless, communication is so much our instinctual birthright that we are operating under a constant compulsion to bring our tools and methods in line with how our brains perceive the world: in time, space and relation to others.

Fundamentally, our minds reside at the center of our world views, with communication being the only way we can relate to other minds. Ultimately therefore, this picture is our model for convergence—one tool through which we conduct and manage all our communications with others.

Essentially this is convergence of device, and we're clearly on our way. Personal computers can help us produce, send, receive or experience almost any message in our chart, just not necessarily within the same application. I believe the PC and its software will eventually morph into a full communications station, with a subset of your records available to you over the net anywhere, anytime. As handheld devices get more powerful, the mobile/stationary distinction will go away to a great degree, although major qualitative differences will persist. In other words, your home communication station is going to allow richer creation and display capabilities than anything mobile might, mostly for reasons having to do with size.

However, I wouldn't hold your breath waiting for any of these new forms of communication to converge itself into the sunset. Now that we have email and voicemail and instant messages and message boards and the whole chattering pantheon, we'll just be keeping them all thank-you-very-much.

 

 

Everything in these pages is original work unless otherwise mentioned.
Copyright Katrina Glerum 1998, 1999, 2000, 2001.