Bootstrapping and Semantic Assumptions
3.6. Starting from the Agent
This partial order on semantic assumptions induces a partial order on agents as well, if we can find a way to associate to each agent its semantic assumptions.
“We think we got it,” we tell the stranger, who has patiently been waiting for our elucubrations. “Could you pop your agent open? We need to have a look at its semantic assumptions.”
Inside the agent, we expected to see an ordered shelf of rulebooks, possibly written in Chinese, describing the agent’s axioms and rules, and some generic inference engine.
Instead, the agent’s internals look like an endless series of leaky pipes, from which many whirring electric components protrude. In a corner, a squirrel is busy exercising in a running wheel. “The squirrel just provides energy to the system. It is part of our company’s commitment to more ecologically friendly bootstrapping agents. Sometimes it stops unpredictably, and you need to be patient and give it a few minutes. If it stops for more than five minutes, you might have to replace the squirrel.”
This is not a problem, I think, because at Caltech we do have a large supply of squirrels. “We can live with the biological component,”—catching squirrels will be a much more productive endeavor than the average project for Caltech undergrads—“but where can we find this agent’s semantic assumptions?” “I am sorry, but this agent has not been programmed with explicit semantic assumptions, nor does it do explicit manipulation of symbols. It’s more like a neural network—but not exactly so...”
Suppose that we have an agent, described algorithmically. Can we backtrack from the agent’s algorithmic description to the agent’s semantic assumptions?
As an example, consider the task of learning the sensor geometry, which is a basic task for bootstrapping agents. It is assumed that the observations come from the sampling of a spatial field on some manifold. For example, the pixels of a camera sample the plenoptic function (the field) on the visual sphere (the manifold). One approach to sensor geometry reconstruction consists of finding a measure of similarity between each pair of sensels, and then using an embedding algorithm that takes the similarities as input. The basic idea is that if sensels are close in space, then their measured values will be more similar.

sensor geometry: The metric or topological arrangement of a sensor’s sensels in space.
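The two-step recipe just described (pairwise similarities, then an embedding) can be sketched on synthetic data. The sketch below is an illustration of ours, not the thesis’s implementation: it assumes a toy traveling sinusoidal field sampled by sensels on a line, uses the sample correlation as the similarity, and classical multidimensional scaling as the embedding algorithm.

```python
import numpy as np

# Hypothetical setup: n sensels at unknown positions on a line observe a
# traveling sinusoidal field, so sensels that are close in space record
# similar time series.
n, T = 12, 5000
positions = np.linspace(0.0, 0.4, n)       # ground truth, hidden from the agent
t = np.linspace(0.0, 10.0, T, endpoint=False)
Y = np.sin(2 * np.pi * (t[None, :] - positions[:, None]))   # n x T observations

# Step 1: similarity between every pair of sensels (here: sample correlation).
R = np.corrcoef(Y)

# Step 2: convert similarities to dissimilarities and embed them.
# Classical MDS, one plausible embedding algorithm:
D2 = np.clip(2.0 * (1.0 - R), 0.0, None)   # squared distances from similarities
J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
B = -0.5 * J @ D2 @ J                      # double-centered Gram matrix
w, V = np.linalg.eigh(B)                   # eigenvalues in ascending order
coord = V[:, -1] * np.sqrt(max(w[-1], 0.0))  # top component: a 1-D embedding

# The embedding recovers the sensel geometry only up to a symmetry
# (here, reflection and scale), so we inspect the recovered ordering.
order = np.argsort(coord)
print(order)   # either 0..n-1 or its reverse, depending on the eigenvector sign
```

Note that the embedding can only succeed up to the symmetries of the similarity measure, which is exactly the point developed in the rest of this section.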
There are many possible choices for the similarity measure, with various tradeoffs in space/computation requirements and robustness to noise. Here is a representative sample of plausible similarity measures that an agent might use. They are written as functions R(yi, yj), with the understanding that they are all statistics of the observed sensel values over some time period.
• The simplest choice is using the sample correlation:
R1(yi, yj) = corr(yi, yj). (3.1)
• Some agent might think it is useful to use the absolute value of the correlation:
R2(yi, yj) = |corr(yi, yj)|, (3.2)

reasoning that, if one consistently observes yi = −yj, the sensels should be considered spatially close, because they are observing the same signal, just with the opposite sign.
• An agent, if it had taken more than just an introductory statistics class, could use the Spearman correlation as the similarity measure:

R3(yi, yj) = |spear(yi, yj)|. (3.3)

embedding problem: The problem of reconstructing the positions of a set of points in a metric space given their distances.
• An agent with more computation at its disposal might use the normalized variation of information V1(yi; yj) (Definition B.13):

R4(yi, yj) = 1 − V1(yi; yj). (3.4)

This is a proper metric for random variables, derived from the mutual information, which satisfies 0 ≤ V1(yi; yj) ≤ 1.
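All four measures can be computed directly from sampled data. The sketch below uses only NumPy; the Spearman correlation is implemented as the Pearson correlation of the ranks, and the normalized variation of information is estimated with a simple plug-in estimator on a discretized joint histogram (the binning is an illustrative choice of ours, not prescribed by the text).

```python
import numpy as np

def corr(a, b):
    """Sample (Pearson) correlation, as in (3.1)."""
    return np.corrcoef(a, b)[0, 1]

def spear(a, b):
    """Spearman correlation: Pearson correlation of the ranks (no ties assumed)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def norm_vi(a, b, bins=16):
    """Normalized variation of information V1 = 1 - I(X;Y)/H(X,Y),
    estimated from a joint histogram (a simple plug-in estimator)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    h = lambda q: -(q[q > 0] * np.log(q[q > 0])).sum()   # entropy in nats
    mi = h(px) + h(py) - h(p)
    return 1.0 - mi / h(p)

def similarities(yi, yj):
    return {
        "R1": corr(yi, yj),            # (3.1)
        "R2": abs(corr(yi, yj)),       # (3.2)
        "R3": abs(spear(yi, yj)),      # (3.3)
        "R4": 1.0 - norm_vi(yi, yj),   # (3.4)
    }

# Two sensels observing the same signal with opposite sign:
rng = np.random.default_rng(0)
y = rng.standard_normal(4000)
s = similarities(y, -y)
# R1 is -1, while R2, R3 and (up to discretization) R4 are close to 1:
# the sign-blind measures declare the two sensels maximally similar.
```

The final example makes concrete the reasoning after (3.2): only R1 is fooled by a sign flip.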
One can associate a group of symmetries to each of these similarity measures. For example, each of the similarity measures Rk would be unchanged if all the data were to be multiplied by −1:

Rk(−yi, −yj) = Rk(yi, yj).

Formally, one says that all similarity measures are invariant to the action of the group (±1, ×). Here is the rest of the analysis for the other measures.
• The absolute value of the correlation (3.2) is invariant to the action of the group

Aff(R)^n = Aff(R) × · · · × Aff(R),

which corresponds to an affine map acting independently on each sensel.
invariant function: A function that is preserved by the action of a group (f(g·x) = f(x)).
See Definition C.39.
affine map: Geometrically, an affine map preserves straight lines, but not angles or lengths. See Definition D.3.
• The simple correlation (3.1) is invariant only to orientation-preserving affine maps acting separately on each variable:

Aff+(R)^n = Aff+(R) × · · · × Aff+(R).
• The Spearman correlation is invariant to all monotonic homeomorphisms Homeo+(R)^n (Lemma B.11). Taking the absolute value of the Spearman correlation makes the similarity measure invariant to all homeomorphisms.
• The similarity measure 1 − V1(yi; yj) is invariant to all invertible piecewise-continuous maps PieceHomeo(R)^n (Remark B.12).
These results are summarized in Table 3.2.
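These invariances can also be checked empirically on sampled data. In the sketch below (the particular transformations and parameters are illustrative choices of ours), one variable is passed through an orientation-preserving affine map, an orientation-reversing one, and a monotone nonlinear map, and the measures are compared before and after.

```python
import numpy as np

rng = np.random.default_rng(1)
yi = rng.standard_normal(2000)
yj = 0.7 * yi + 0.3 * rng.standard_normal(2000)   # two correlated sensel signals

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

def spear(a, b):
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

# Orientation-preserving affine map on one variable: corr is unchanged.
assert np.isclose(corr(yi, yj), corr(3.0 * yi + 5.0, yj))

# Orientation-reversing affine map: corr flips sign, so only |corr| survives.
assert np.isclose(corr(yi, yj), -corr(-2.0 * yi + 1.0, yj))
assert np.isclose(abs(corr(yi, yj)), abs(corr(-2.0 * yi + 1.0, yj)))

# Monotone nonlinear map (not affine): corr changes, but spear does not,
# because a strictly increasing map leaves the ranks untouched.
assert not np.isclose(corr(yi, yj), corr(np.exp(yi), yj))
assert np.isclose(spear(yi, yj), spear(np.exp(yi), yj))
```

Each assertion mirrors one row of Table 3.2: the larger the group a measure tolerates, the fewer assumptions the agent makes about how the sensel values were produced.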
Just as in the previous example, we can use the subgroup partial order to order the groups into a lattice (Figure 3.1a), and this induces a partial order on the similarity measures (Figure 3.1b). Moreover, we can go from the groups to the semantic assumptions by using the information in Table 3.1.
lattice: A partially ordered set with a “top” and a “bottom” element, each comparable to all elements of the set. See Definition A.11.
Table 3.2. Some similarity measures and their respective symmetry groups.

similarity measure    symmetry group
|corr(yi, yj)|        Aff(R)^n
corr(yi, yj)          Aff+(R)^n
spear(yi, yj)         Homeo+(R)^n
|spear(yi, yj)|       Homeo(R)^n
1 − V1(yi; yj)        PieceHomeo(R)^n

To each similarity measure among sensels (left column) we can associate its invariance group (right column). In addition to the groups displayed, every measure is invariant to the reflection (yi, yj) ↦ (−yi, −yj).
Thus, the symmetries of the algorithm used by the agent ultimately define which se- mantic assumptions the agent is making about the data.
Figure 3.1. (a) Partial order on the groups given by the subgroup relation: Aff+(R)^n ≤ Aff(R)^n ≤ PieceHomeo(R)^n, and Aff+(R)^n ≤ Homeo+(R)^n ≤ Homeo(R)^n ≤ PieceHomeo(R)^n. (b) Induced partial order on the similarity measures: corr ≤ |corr| ≤ 1 − V1, and corr ≤ spear ≤ |spear| ≤ 1 − V1.