Monday, August 15, 2011

James Gleick, The Information (2011)

If there is one book that touches on more of the posts in this blog than any other, it is James Gleick's The Information. It covers much of the same territory as the two books reviewed in this blog's first two posts --- Seth Lloyd's Programming the Universe (August 17, 2009 post) and Charles Seife's Decoding the Universe (August 23, 2009 post) --- including the physical nature of information, the relevance of the laws of thermodynamics to information theory, the exploits of World War II codebreakers, and the story of the information theory developed by Claude Shannon and others. Gleick also covers the history of information storage and transmission from the beginning of recorded history to Google. And there is even a tip of the hat to John Banville's The Infinities (see March 28, 2010 post, where I noted the novel's connections to Lloyd and Seife).

Much of The Information covers the history of the transmission of information from Sumerian times to the present. The medium evolves from markings in the earth, to stone tablets, to paper wrapped in scrolls, to paper bound in book form, to rhythmic pulses of drums, to pulses of energy over electrical wire, to pulses of energy through the air. With Claude Shannon's revolutionary 1948 paper on a mathematical theory of communication, whose purpose was to address the engineering problem of "reproducing at one point either exactly or approximately a message selected at another point," the way we look at information transmission changed. As described in the August 23, 2009 post on Charles Seife's Decoding the Universe, Shannon understood that information could be measured and quantized, which enabled him to determine how much information could be transmitted through a given channel. Semantics and meaning are irrelevant to this exercise, but the measurements were extremely useful in solving various problems in the telephone communications of the day, and they have likewise been useful in developing greater bandwidth, so that more information can be transmitted from one point to another in a given time. Shannon also understood that information is physical and subject to the laws of thermodynamics just like other physical features of the universe, particularly the second law pertaining to entropy. (See August 17, 2009 post). Shannon further addressed the problem of "noise" in the communication channel that corrupts or interferes with information in transmission, and he developed mathematical means for filtering out that noise to determine probabilistically that the information received was the same as the information sent.
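
Shannon's measure is easy to see in a few lines of code. Here is a minimal sketch in Python (my own illustration, not anything taken from Gleick or from Shannon's paper): it estimates the entropy of a message, in bits per symbol, from the frequencies of its characters, which is the quantity that bounds how compactly the message can be encoded for transmission.

from collections import Counter
from math import log2

def shannon_entropy(message):
    """Entropy in bits per symbol, estimated from symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * log2(n / total) for n in counts.values())

msg = "in the beginning was the word"   # an arbitrary example message
h = shannon_entropy(msg)
print(f"{h:.2f} bits per symbol")                       # entropy per character
print(f"{h * len(msg):.0f} bits to encode the message") # lower bound for this simple model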

I have previously complained that whoever said that entropy refers to disorder did the world a disservice (see August 17, 2009 post), and Gleick confirms that much confusion has ensued in our discourse over the second law of thermodynamics because of the way entropy has been characterized. For those looking for some clarity about the principle of entropy, Gleick is worth reading. As initially conceived, entropy was a measure of the unavailability of energy for work because of dissipation. It referred to a thermodynamic condition in which separate hot and cold bodies (whether liquid or gas) could generate steam that could be converted to energy for work; but once the hot and cold bodies mixed, the mixture reached a uniform temperature, the energy in the steam dissipated, and it became unavailable for work. The thermodynamic condition that previously existed was said to have "dissipated," and there was no more energy available for work. The word "disorder" entered the picture because the previous condition, in which hot and cold bodies were separated and capable of generating energy for work, was considered "orderly," while the homogeneous mixture that could not generate energy for work was considered "disorderly." The second law of thermodynamics stands for the proposition that the universe "tends" to flow from the orderly to the disorderly. Ironically, the "orderly" arrangement is the "less likely" macrostate, and the "disorderly" one the "more likely" macrostate. Probability theory therefore enters the discussion: the homogeneous, mixed state is simply far more probable than a state in which hot and cold remain neatly separated.
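
To make the probability point concrete, here is a minimal sketch (my own illustration, not Gleick's): treat each of N gas molecules as sitting in either the left or right half of a box, and count how many microstates correspond to each macrostate (the number of molecules on the left). The evenly mixed macrostates dwarf the segregated ones, which is why the "disorderly" mixture is overwhelmingly more likely.

from math import comb

N = 20  # number of molecules; the effect grows astronomically with larger N
total = 2 ** N  # every molecule independently left or right

for left in (0, N // 4, N // 2):
    microstates = comb(N, left)   # ways to put exactly `left` molecules on the left side
    print(f"{left:2d} on the left: {microstates:8d} microstates "
          f"(probability {microstates / total:.6f})")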

Shannon's insight was that entropy was relevant to all dynamic systems, not just to thermodynamics. To the physicist, entropy is a measure of uncertainty about the state of a physical system. To the information theorist, entropy is a measure of uncertainty about a message: how unsure we are that the message received is the same one that was transmitted. Ironically, notes Gleick, humans, contrary to physical laws, seem bent on curbing entropy, making the environment in which they live more orderly. We build things; we compile; we categorize; we separate. As Schrödinger noted, living things feed on negative entropy. The second law contemplates a closed system, and the earth is not a closed system. Life feeds upon the energy and negative entropy flowing into the earth's system. Similarly, humans have proven creative at dealing with probabilities and at discerning information from communications that are not certain in order to make messages more certain, and information theory provides us the tools to do just that.

In contrast to Seth Lloyd's and Charles Seife's works on the same subject, Gleick spends more time chronicling the technological development of information transmission. As I read on, I found myself thinking about a different subject, one to which Gleick gives only passing attention: memory. The subject of memory has been a continuous visitor to this blog (see posts of September 9, 2010, September 28, 2010, November 27, 2010, and April 8, 2011). We tend to think of memory in terms of an information storage device. A 500-page book stores roughly two million bits of information. A computer's hard drive or a CD-ROM stores vastly more. These devices do not store words or images; they store information that encodes words or images, and they also contain an electrical means for retrieving the images and words that are encoded. Thus a system for retrieving memory is essential to memory; otherwise it has no utility. This is entirely consistent with the description of the human brain's "architecture of memory" given by Antonio Damasio. (See April 8, 2011 post). By Damasio's reckoning, that architecture evolved because of storage capacity limitations on our ability to retain images of prior events. Instead of files of images and words, memory is found in the connections between cells in the brain. (See post of November 27, 2010). Shannon's research influenced the development of cognitive psychology, Gleick reports, because it became useful in understanding the limitations on our capacity to receive and store information, short-term and long-term memory, pattern recognition, attention, and problem solving.

Gleick does briefly allude to another subject that has received attention in prior posts: the human mind's capacity for storytelling and for providing an explanation of what we experience, along the lines described by Michael Gazzaniga in Human. (See May 22, 2011 post and June 12 and June 28, 2011 posts). Storytelling has to be linked to memory formation and retrieval. Recording information --- on sticks, on the walls of caves, whatever --- says Gleick, "served as aids to memory." Homer's poetry, composed and sung in a mnemonic meter, "served first and foremost as an aid to memory." I think storytelling evolved in part to preserve our memory, but the irony is that those same stories can lead to distortions of the memory we try to preserve. Information is corrupted and does not copy accurately.

We have heard about this same sort of phenomenon in the context of the genome. (See November 27, 2010 post). The ultimate information storage device is the cell. The cell stores genetic material --- genes --- each of which is composed of bits of information. At some point in evolutionary history, it became possible to communicate the information stored in the cell through replication: information contained in one storage device was copied into another. Gleick quotes George Gamow: "The nucleus of a living cell is a storehouse of information," adding that it is a transmitter of information from which stems the continuity of life. Gamow described the study of genetics as the study of "the language of the cells." Matt Ridley (see November 27, 2010 post) would agree. When biologists discovered redundancy in the codons of the genetic code, they realized it was nature's way of dealing with "noise" in the transmission of information by providing tolerance for errors in transmission.
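
A minimal sketch of that redundancy (the codon assignments below are from the standard genetic code, but the mutation example is my own illustration, not Gleick's): several codons map to the same amino acid, so many single-letter copying errors, especially in the third position, leave the resulting protein unchanged.

# A small excerpt of the standard genetic code: the synonymous codons for
# leucine and glycine show how the third base can often mutate without consequence.
CODON_TABLE = {
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
}

original = "CUU"
mutated  = "CUC"   # a copying error in the third position

tolerated = CODON_TABLE[original] == CODON_TABLE[mutated]
print(f"{original} -> {CODON_TABLE[original]}, {mutated} -> {CODON_TABLE[mutated]}; "
      f"error tolerated: {tolerated}")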

At some point in human evolution, external storage of information became important for survival. Gleick describes some of the earliest known examples of external information storage --- Sumerian cuneiform tablets recording contracts, governance documents, and business transactions. Interestingly, these "documents" memorialize agreements --- matters of consensus between two or more persons. These forms of external information storage were undoubtedly significant to early economic and political socialization. Human development of external information storage devices is unparalleled in the animal kingdom: from tablets, to papyrus, to paper, to the construction of libraries, to film and vinyl media, punch cards, optical and magnetic media, and semiconductors. But external information storage did not begin with our species. As Hölldobler and Wilson documented in the case of the social insects, certain ants secrete pheromones in the soil so that colony ants can follow leaders from the nest to food sources and back. (See post of November 4, 2009). Soil becomes an information storage device. Urination by animals sometimes serves a social purpose, marking territory; trees, plants, and soil become information storage devices. But this merely confirms my point about storytelling serving as an aid to memory. Sometimes that storytelling is an accurate but memorable account of what really happened --- and we call that non-fiction. And other times it is exaggerated and manipulated to make it memorable --- and we call that fiction.

I want to close with two quotations that Gleick recites that say a little bit about what this book is about. From Claude Shannon: "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning." As Gleick recounts, Shannon was not much interested in meaning, only in being able to determine that the message sent was exactly or approximately the same as the message received. And "approximately" was good enough for Shannon, because his mathematics could filter out the noise and establish, probabilistically, what information had been sent. If this sounds like the quantum world described in Quantum (see previous post), well, it is. And Heisenberg's uncertainty principle and Gödel's incompleteness theorem figure into information theory.
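
To get a feel for how redundancy lets a receiver recover a message probabilistically from noise, here is a minimal sketch (my own illustration, using a simple repetition code rather than any scheme Shannon or Gleick describes): each bit is sent three times over a channel that randomly flips bits, and the receiver takes a majority vote within each group of three.

import random

random.seed(0)
FLIP_PROBABILITY = 0.1   # chance the channel corrupts any single transmitted bit

def send(bits, repeats=3):
    """Encode each bit by repetition, then pass it through the noisy channel."""
    encoded = [b for b in bits for _ in range(repeats)]
    return [b ^ 1 if random.random() < FLIP_PROBABILITY else b for b in encoded]

def receive(noisy, repeats=3):
    """Decode by majority vote over each group of repeated bits."""
    groups = [noisy[i:i + repeats] for i in range(0, len(noisy), repeats)]
    return [1 if sum(g) > repeats // 2 else 0 for g in groups]

message = [1, 0, 1, 1, 0, 0, 1, 0]
decoded = receive(send(message))
print("sent:    ", message)
print("decoded: ", decoded)
print("recovered exactly:", decoded == message)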

Gleick also quotes Seth Lloyd, whose book Programming the Universe inaugurated this blog two years ago: "The more energy, the faster the bits flip. Earth, air, fire, and water in the end are all made of energy, but the different forms they take are determined by information. To do anything requires energy. To specify what is done requires information." This is a statement about the physicality of information, a subject I have alluded to many times over the past two years in various posts.
