Re: [asa] Information and knowledge

From: David Opderbeck <dopderbeck@gmail.com>
Date: Thu Apr 12 2007 - 21:22:12 EDT

Let me just dump in a little more background on my interest in this:
there's an interesting debate in the legal scholarship about whether
biotechnology patents should be treated differently than other chemical
patents. Part of this debate revolves around the nature of genetic
"information." Here's an excerpt from a recent paper on this by a leading
patent law scholar (Dan L. Burk, The Problem of Process in Biotechnology, 43
Hous. L. Rev. 561 (2006)). The part I'm quoting is kind of long, so to cut
to the chase, Burk says at the end of this section: *These considerations
of molecular architecture and information theory allow us to reformulate the
information patenting argument to account for the recurrent peculiarities of
biotechnology process patents. Correctly articulated, the argument should
observe that it is the information flow that is of interest in
biotechnology, and hence of interest in biotechnology patenting.*

Interestingly, I think Burk recognized where his argument was leading in
relation to ID theory, even though his paper has nothing to do with ID or
any such thing, so somewhere he dumped in a gratuitous footnote bashing ID.

In my own scholarship on biotechnology patents and intellectual property in
general, I try to critique some of the epistemological assumptions that
underlie some aspects of approaches like Burk's. That's one reason why I
focus on the ontology of "information" -- knowledge and "information," I
think, are in many ways socially constructed and not person-, mind-, or
medium-independent.

Here is the longer excerpt from Burk:

The argument that biotechnology patents are essentially drawn to information
proves too much, in part because it plays fast and loose with the term
"information." To the extent that the argument employs the term in a
meaningful way, it might equally be said that all chemical patents--and
indeed perhaps all patents--are in fact drawn to "information." And in the
sense that this argument uses *583 the term "information," the embodiment
claimed in a patent is always an excuse for controlling the underlying
information.
This version of the "information patents" argument in part suffers from what
has been called the "DNA mystique"; the trope that DNA constitutes a "master
molecule" or "blueprint" for the living organism. [FN109] The argument
assumes that genetic information is hierarchically lodged in the DNA
molecule. But one might equally plausibly argue for mRNA, [FN110] or DNA
polymerase, [FN111] or peptidyl transferase [FN112] as the "master molecule"
of the cell. Critical genetic information is lodged in the architecture of
each of these molecules, as well as many others. Each molecule "knows" and
"recognizes" the other molecules with which it interacts, effectively
distributing information across a series of interconnected chemical
pathways. Proper cellular functions are dependent on the joint and several
functionality of such molecules.

DNA is of course a key component in the expression of genetic information,
but it is not the only informational component and it does not exist in
isolation. [FN113] It rather functions within an interactive structural
apparatus that as a whole forms an information transfer system. [FN114]
Rather than comparisons to blueprints and the like, DNA might better be
compared to a cog in a machine, something like Babbage's famous "difference
engine," the conceptual precursor to modern computing, which was intended to
accomplish complicated numerical calculations by means of mechanical gears.
[FN115] DNA is an admittedly complex cog in a nanometer scale machine, but
one that nonetheless physically interoperates with other structural
mechanisms, and which is essentially inoperative outside its functional
matrix. Due to its size, DNA can carry a very large amount of structural
information, but this structural encoding is similarly the case for *584 all
biological macromolecules and indeed is at some greater or lesser degree
true of all chemical structures.

Adding some rigor to the use of the term "information" in the "information
patents" argument would go far toward correcting the problems I have
identified, but a rigorous reformulation of the argument would likely
require elucidation of a theory of information flow well beyond the scope of
this paper. [FN116] For present purposes, it is sufficient to consider one
widely employed measure of information transfer from communications theory
and the physical sciences. This measure of information was famously
articulated in the information "channel" equations developed by Claude
Shannon, which ultimately define the transmission of information in terms of
the uncertainty, or the entropy of a system. [FN117] According to Shannon's
theorems, all communications channels have some carrying capacity, or
bandwidth. [FN118] So long as a particular message remains within the
carrying capacity of the channel, it can be transmitted losslessly by use of
certain error-correction mechanisms. [FN119] Additionally, the informative
content of a message is directly related to the degree of uncertainty
regarding its content. Predictable messages have a low information content;
unexpected or surprising messages have a high information content. [FN120]
Resolution of uncertainty functions as a measure of information. [FN121]

Although Shannon was considering primarily electronic communications when he
developed these theories, his work maps out a general theory of information
and is not limited to electronic communications or for that matter to other
human systems such as spoken or written communications. [FN122] Since
biological macromolecules carry a type of message, Shannon's theories yield
important insights regarding the nature and *585 functions of the code and
of the channels carrying those messages. Shannon's work on information
theory has been applied with considerable success to biological systems at
the molecular level. Information theory has been used to define the
information carrying capacity or bandwidth of macromolecules and their
interactions in cellular processes. [FN123] Information theory has also
shown considerable explanatory power in illuminating the character and
variety of error correction mechanisms that exist in cellular processes for
transcription, translation, and replication. [FN124]

In terms of molecular biology, information is encoded in the architecture or
structure of molecules. Information flow within a cell--or for that matter,
between cells--occurs via the interaction of particular configurations of
molecular structure with complementary configurations of molecular
structure. [FN125] Biological molecules interact and encode information not
only via the spatial exclusions of their molecular form--which is to say,
the spaces occupied by their repulsive electron shells--but also via the
extended secondary, tertiary, and quaternary structures formed by the
macromolecular chains, the clustered arrays of water molecules surrounding
the macromolecules, the clouds of charged ions that macromolecules draw in
their wake. These interlocking physical structures are the Shannonian
"channels" by which information is conveyed from molecule to molecule.
[FN126] *586 Although this is not quite what Marshall McLuhan meant in his
famous aphorism, in biotechnology the medium is quite literally the message.
[FN127]

And it is here that the naïve version of the argument that biotechnology
patents are information patents goes wrong; the argument assumes that the
information is somehow separable from the molecule. The familiar "CATG" of
DNA, which seems to display the information carried by the molecule, is
merely a notational shorthand for a set of spatial configurations, as for
that matter is the more extensive letter code for designating amino acid
residues in a polypeptide chain. Such information is not by itself of
interest, at least not the information represented in human-readable
notation. However superficially appealing it may be to regard macromolecular
information as the string of ATGCs or RGDSs in a database, it is the
three-dimensional configuration of the molecule, as well as its associated
physical structures, taken in the context of a complex molecular system,
that encodes biological information.

As genomics shades into proteomics and beyond, [FN128] molecular biologists
are well aware that most of this extended molecular architecture is not
reflected in their databases and may well be unpredictable based solely on a
knowledge of primary macromolecular sequence. [FN129] Primary sequence data
is only a small part of the story, though over time ongoing research will
doubtless flesh out other aspects of the story. But the point here is that
primary sequence data, and for that matter even more elaborate molecular
mapping, is at best only a human-readable shorthand for a particular spatial
structure, and it is the spatial structure that acts to transfer information
among the components of biological systems.

*587 This in some ways turns the argument from information patenting on its
head; the problem is not that the information embedded in the molecule is
valuable and the actual molecule is itself superfluous. Rather, since
information is encoded as molecular structure, the information is only
useful when embodied in such structures, which is to say that, ultimately,
no one is really interested in strings of human-readable letters--they are
instead interested in what can be done with the structures such letters
represent. And that in turn means that by necessity they must be interested
in building physical informational structures--the molecules that are the
conduit for information transfer.

These considerations of molecular architecture and information theory allow
us to reformulate the information patenting argument to account for the
recurrent peculiarities of biotechnology process patents. Correctly
articulated, the argument should observe that it is the information flow
that is of interest in biotechnology, and hence of interest in biotechnology
patenting.
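
As a side note on the Shannon measure Burk invokes above (predictable
messages carry little information, surprising ones carry more), here is a
minimal illustrative Python sketch. It is not from Burk's paper. The function
names are hypothetical, and estimating symbol probabilities from the observed
letter frequencies of a short string is a simplifying assumption. It also
scores only the letter string, which is the human-readable shorthand Burk
distinguishes from the molecule itself.

```python
import math
from collections import Counter


def surprisal(probability: float) -> float:
    """Information, in bits, gained on observing an event of this probability.

    Rare (surprising) events yield more bits than likely (predictable) ones.
    """
    return -math.log2(probability)


def shannon_entropy(message: str) -> float:
    """Average information per symbol, in bits.

    Assumption: each symbol's probability is estimated from its observed
    frequency in the message itself.
    """
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical toy strings, not real sequence data.
    for text in ("AAAAAAAA", "ACGTACGT", "AAAAAAAT"):
        print(text, round(shannon_entropy(text), 3))
```

Under these assumptions, a string of one repeated letter scores 0 bits per
symbol, and a string that uses A, C, G, and T equally often scores 2 bits per
symbol, the maximum for a four-letter alphabet.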

On 4/12/07, David Opderbeck <dopderbeck@gmail.com> wrote:
>
> If the receiver is a computer then Shannon applies; if the receiver is
biological it doesn't. That's because genetic engineering techniques do a
transform of the DNA before replication.
>
> But so does a computer. Even if the information is transmitted from one
computer to another, it comes off of one kind of medium (say, a hard drive)
and may end up in a different kind of medium (say, a flash drive -- or a
human brain). The computer / biological distinction seems completely
arbitrary to me.
>
> Let's say it's computer injection mold instructions for a pen that get
transmitted; then the pen would be properly considered the receiver but not
the source. The coding must be such to create an exact copy of the original.
>
> Let's say a device takes a three dimensional scan of the pen all the way
down to the molecular level, the scan data is uploaded to a computer and fed
to a three-dimensional printer, and another pen is "printed." It's not
teleportation because the original pen still sits on the desk; now there are
two pens, the original and a copy. It seems to me the pen is then the
source and the receiver. (So, ok, we don't have Star Trek replicators yet,
but we do have 3D scanners that will scan an object into CAD software and 3D
"printers" that will fabricate a tangible object from the CAD design, so the
concept isn't necessarily impossible.)
>
> We object to ID because it misapplies information theory.
>
> I don't necessarily disagree with you here, but I think I disagree with
your reasons. Your reasons seem to be an a priori assumption -- that we
can't admit to a "coder" of biological information in DNA but rather we must
understand DNA as something entirely material. Or if not -- what am I
missing?
>
> My objection is that information theory in general is taken too far when
it leads to the view that information is truly ontologically a separate,
medium-independent property.
>
> On 4/12/07, Rich Blinne < rich.blinne@gmail.com> wrote:
> >
>
> > The sequence itself is Shannon information because it's just a sequence
of ACTG. Note Randy:
> >
> > Information about the genome and its sequence of course is classical
information. This is transmitted.
> >
> > If the receiver is a computer then Shannon applies; if the receiver is
biological it doesn't. That's because genetic engineering techniques do a
transform of the DNA before replication.
> >
> > Take another example. If I take a digital photograph of a pen and
transmit it over the net then information is transmitted in the Shannon
sense. Does the pen itself have information? Nope, because the pen does not
get reconstructed. Let's say it's computer injection mold instructions for a
pen that get transmitted; then the pen would be properly considered the
receiver but not the source. The coding must be such to create an exact copy
of the original. So, we can use portions of a DESCRIPTION of the genome to
create proteins. But that's no different than a chemical engineer observing
a reaction, transmitting a description of the reaction, and having the
reaction replicated.
> >
> > When Francis Collins uses the phrase the genetic code is the instruction
set for life it's an over-simplified illustration.
> >
> >
> > On 4/12/07, David Opderbeck < dopderbeck@gmail.com> wrote:
> >
> > >
> > > Genetic information is transferred through replication, not through
transmission via a channel. There's a fundamental difference.
> > >
> > > I understand that is true in an organism qua organism. But genetic
information can be extracted from an organism and transmitted over a
channel. Look again at that gene synthesis link I posted (
http://www.blueheronbio.com/genemaker/technology.html) and note what is
happening. Sequence data is extracted from an organism. It is transmitted
(via a website, no less) to the synthesis company. The synthesis company
runs it through some computational algorithms and then constructs synthetic
DNA. The synthetic DNA can be used, say, to "instruct" a cloned
microorganism to express an enzyme that digests chemical waste.
> > >
> > > It seems to me that this is a quite clear example of genetic
information being transmitted over a channel. It is a set of instructions
that people are sending around, manipulating, and then inserting into the
"hardware" ("wetware") of a clone as instructions for what functions the
clone must perform.
> > >
> > > I get the sense that you all "protest too much" to the notion that
genetic information can be Shannon information because of the ID
implications of that notion.
> > >
> > > Or maybe I'm being completely dense. How is extracting a gene
sequence into a set of A,C,T, and G's, transmitting that data over the
internet, and then reassembling it into a biological substrate for insertion
into a clone not the transmission of "information?"
> > >
> > > As far as I know, which I admit isn't very far, the concept of Shannon
information is employed widely in biotechnology and bioinformatics.
> > >
> > > See, e.g., this paper:
http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/9262/29416/01332413.pdf:
> > >
> > >
> > > Shannon information in complete genomes
> > > Chang-Heng Chang; Li-Ching Hsieh; Ta-Yuan Chen; Hong-Da Chen; Liaofu
Luo; Hoong-Chien Lee
> > > Computational Systems Bioinformatics Conference, 2004. CSB 2004.
Proceedings. 2004 IEEE
> > > Volume , Issue , 16-19 Aug. 2004 Page(s): 20 - 30
> > > Digital Object Identifier 10.1109/CSB.2004.1332413
> > > Summary: Shannon information in the genomes of all completely
sequenced prokaryotes and eukaryotes are measured in word lengths of two to
ten letters. It is found that in a scale-dependent way, the Shannon
information in complete genomes are much greater than that in matching
random sequences - thousands of times greater in the case of short words.
Furthermore, with the exception of the 14 chromosomes of Plasmodium
falciparum, the Shannon information in all available complete genomes belong
to a universality class given by an extremely simple formula. The data are
consistent with a model for genome growth composed of two main ingredients:
random segmental duplications that increase the Shannon information in a
scale-independent way, and random point mutations that preferentially
reduces the larger-scale Shannon information. The inference drawn from the
present study is that the large-scale and coarse-grained growth of genomes
was selectively neutral and this suggests an independent corroboration of
Kimura's neutral theory of evolution.
> > >
> > >
> > >
> > >
> > > On 4/12/07, Randy Isaac <randyisaac@comcast.net > wrote:
> > > >
> > > >
> > > > I used the word "teleportation" too loosely for you to extract all
that. Strictly speaking, teleportation involves coherence over a long
distance between entangled quantum systems so that there is a one-to-one
correlation of the states of the relevant particles. I shouldn't have tried
to extrapolate the meaning. It's just an analogy there.
> > > >
> > > > To be honest, I don't know what you are saying here. (not sure I
know what I'm saying either, for that matter!!) Let me try again.
> > > >
> > > > It may be useful to think of the various types of 'information.' The
word is often used indiscriminately. Three of the different uses of the word
are:
> > > >
> > > > 1. Information capability or capacity. This would be the total
number of physical elements which can embody information. Like 80GB on your
hard drive. Or 10^80 as the amount of information in the universe since that
is the number of fundamental particles thought to be in the universe (or at
least it was way back when I went to school)
> > > >
> > > > 2. Information as meaning or a message. This is the message that is
being conveyed through some physical channel.
> > > >
> > > > 3. Information as complexity. This is the configuration of a
physical entity, not the meaning or message ascribed to it. The amount of
information required to describe a physical configuration is a measure of
its complexity. The description should not be confused with the complexity
of the system itself.
> > > >
> > > >
> > > > My point about genetic 'information' vs message 'information' is as
follows:
> > > >
> > > > Genetic information is really complexity. It is a particular
configuration. This should not be confused with our description of that
complexity. Any computer code or information transmitted by sentient beings,
human or otherwise, involves assigning a meaning to a particular physical
configuration. That is fundamentally different from the genetic code where a
particular physical configuration has a function but not an assigned
meaning.
> > > >
> > > > Genetic information is transferred through replication, not through
transmission via a channel. There's a fundamental difference. Shannon talks
about noisy channels and limits of information transfer through those
channels. Genetic replication is quite different and doesn't follow those
theorems.
> > > >
> > > > To me, the conclusion of all this is that genetic information, while
having a lot of similarities to anthropogenic computer code, is
fundamentally different from any information transmitted by sentient beings.
It is therefore not appropriate to infer an intelligent designer from an
analogy between genetic information and human information.
> > > >
> > > >
> > > >
> > > > Randy
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: David Opderbeck
> > > > To: Randy Isaac
> > > > Cc: asa@calvin.edu
> > > > Sent: Thursday, April 12, 2007 9:33 AM
> > > > Subject: Re: [asa] Information and knowledge
> > > >
> > > >
> > > > Randy said: Your sci-fi example doesn't negate the argument. What
you describe is really teleportation, in a sense. Information about the
genome could in principle be sufficiently complete that it could be
reconstructed. That information which is teleported is indeed
Shannon-information. The DNA itself isn't information of that type.
> > > >
> > > > Maybe I'm being dense, but this seems to me different only in degree
from the notion of Shannon information in computing. My laptop's hard
drive comprises a platter with many small magnetic regions that encode
bits of data. Those bits of data can be extracted from the platter /
magnetic medium and transferred to an array of transistor cells on my USB
flash drive. The same bits of data can be extracted from the flash chip and
transferred onto the capacitors of the temporary DRAM memory on the
workstation in a classroom. Then I can teach a class, and hopefully,
between students dozing, IM'ing, surfing the web, and daydreaming, at least
some of the same data can be transferred into the "wetware" medium of my
students' brains.
> > > >
> > > > Certainly I haven't in this process reconstructed all the
information on my laptop's hard drive and transferred it to my students'
brains -- not even all the information that was on my hard drive concerning
my lecture, since at least some of the laptop-resident information is
specific to the medium on which it resides. But, from the perspective of
information theory, I don't think you'd say I merely "teleported" my lecture
from the laptop to my students. There was a relatively lossless transfer of
some information over a series of communications channels.
> > > >
> > > > Likewise, I don't see why extracting information from a genetic
sequence -- say, a group of genes responsible for regulating the expression
of an enzyme that breaks down industrial waste -- transferring that
information to a computer medium, and then "printing" that information to a
set of synthetic genes for insertion into a biological waste management
device, would be a form of "teleportation" rather than a transfer of Shannon
information across a communications channel to different media. I don't see
why this would be merely a transfer of information "about" a genome any more
than taking my lecture notes off of the hard drive and teaching a class
would be merely a transfer of information "about" my brain or my hard drive
-- unless the whole project of information theory is simply misplaced as an
ontological matter. (I also don't think, BTW, that the wetware "printer" is
entirely in the realm of science fiction anymore.)
> > > >
> > > > On 4/9/07, Randy Isaac <randyisaac@comcast.net> wrote:
> > > > >
> > > > >
> > > > > Dave,
> > > > > The argument is a little different from what you are citing.
I'm not saying that genetic information isn't Shannon-type information
because it isn't medium-independent. Rather, it isn't medium-independent
because it isn't Shannon-information. That is merely the easiest way to see
the ramification of it. It's the fundamental definition of information and
complexity. Complexity can be thought of as the amount of information
required to describe an object or any entity. Complexity even applies to
information itself. Data compression is least efficient in the most complex
information streams. The so-called genetic code is the information we use to
describe the genome.
> > > > >
> > > > > Your sci-fi example doesn't negate the argument. What you
describe is really teleportation, in a sense. Information about the genome
could in principle be sufficiently complete that it could be reconstructed.
That information which is teleported is indeed Shannon-information. The DNA
itself isn't information of that type.
> > > > >
> > > > > The novelty of DNA is that, unlike virtually everything else
in our universe, it is self-replicating. That replication, with an
infinitesimal but non-zero error rate, is incredibly potent as a means for
generating additional complexity. Other inanimate objects can and do also
become more complex--that's entropy, if you will--but nothing comes close to
the effectiveness of self-replication.
> > > > >
> > > > > Randy
> > > > >
> > > > > ----- Original Message -----
> > > > > From: David Opderbeck
> > > > > To: Randy Isaac
> > > > > Cc: asa@calvin.edu
> > > > > Sent: Sunday, April 08, 2007 7:40 PM
> > > > > Subject: Re: [asa] Information and knowledge
> > > > >
> > > > >
> > > > > Randy, I think you're alluding here to a really important and
usually overlooked aspect of the ID discussion: the ontology of
information. Bill Dembski, following in the footsteps of communications
and cybernetics theorists who've built on Shannon, views information as a
sort of ontic entity apart from matter and energy (at least that is how I
understand the implications of Dembski's ideas). This idea can't be
dismissed lightly -- it is being built into a discipline, the Philosophy of
Information, that has nothing to do with ID, and it underlies much
contemporary sociological and legal theory concerning social norms and law
regarding communications, the Internet, and other types of information.
> > > > >
> > > > > Personally, my present view is that it's misguided to think of
information as something ontologically separate from matter and energy. I
think this reflects a sort of Cartesian dualism that I'm keen to avoid in
both theology and legal theory. But I'm not so sure it's as simple
as arguing that genetic information isn't Shannon information just because
genetic information doesn't appear at present to be medium-independent.
It's not impossible to imagine a biotechnology scenario in which genetic
information can be extracted from an organismal genome, stored on a
computing device, and then "printed" to a "wet ware" printer to produce a
synthetic medicine, body part, organism, etc. After all, who'd've thunk
fifty years ago that today we'd be walking around with gigabytes of data on
pocket flash drives?
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>

Received on Thu Apr 12 21:22:44 2007
