A blog about belief

Archive for the ‘Mathematics’ Category

The Language of Mathematics (1)

In Computer Science, Literature, Mathematics on June 18, 2012 at 7:37 pm

I’m working on a software project (more soon) that involves a notation that is interpreted by computers.  As a way of specifying the language formally, I’m trying out parsing expression grammars, a relatively new alternative to the methods that have been traditionally used to define the syntax of programming languages, like context-free grammars.  I’ve been reading the original paper in which Bryan Ford introduces PEGs, and something struck me about the way in which it builds up to the mathematical definition of the idea.  The paper begins with an “informal” explanation that starts with an example of a PEG written in ASCII text, like you would use as the input to a program:

# Hierarchical syntax
Grammar <- Spacing Definition+ EndOfFile
Definition <- Identifier LEFTARROW Expression
Expression <- Sequence (SLASH Sequence)*
Sequence <- Prefix*
Prefix <- (AND / NOT)? Suffix
Suffix <- Primary (QUESTION / STAR / PLUS)?
Primary <- Identifier !LEFTARROW
         / OPEN Expression CLOSE
         / Literal / Class / DOT

[…]

Although the paper explains what it means in a very prosaic way, placing it in historical context and comparing PEGs’ practical implications with those of other types of grammar, this bit of ASCII text seems intuitively like the most formal thing in the paper.  The mathematical definition of the construct is set off much less from the text of the article than the ASCII example, which is in a fixed-width font and embedded as “Figure 1.”  The definition begins:

Definition: A parsing expression grammar (PEG) is a 4-tuple G = (V_N, V_T, R, e_S), where V_N is a finite set of nonterminal symbols, V_T is a finite set of terminal symbols, R is a finite set of rules, e_S is a parsing expression termed the start expression, and V_N ∩ V_T = ∅. Each rule r ∈ R is a pair (A, e), which we write A ← e, where A ∈ V_N and e is a parsing expression. For any nonterminal A, there is exactly one e such that A ← e ∈ R. R is therefore a function from nonterminals to expressions, and we write R(A) to denote the unique expression e such that A ← e ∈ R.

One reason why the informal explanation is informal in comparison with this is that it describes the syntax of PEGs using a PEG, making it a circular definition.  But two things jump out at me.

  1. The mathematical notations in the formal definition are interpolated into a paragraph of written English, while the informal definition describes the syntax of the system in a way that a computer could understand.
  2. It would be much harder to see what the formal definition is doing without reading the informal one first.  If the paper had started talking about 4-tuples right off the bat, it would be unclear in what sense the objects it defines could be considered “rules” and “parsing expressions.”  There is something, a sort of mathematical anamnesis, that the reader takes away from the circular definition at the beginning of the paper that makes it possible to see the meaning of the more rigorous math that follows, in a sense of the word “meaning” that is not yet clear to me.
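What makes a PEG’s ordered choice different from the unordered alternation of a context-free grammar can be made concrete with a toy interpreter.  The following is my own sketch, not anything from Ford’s paper; the combinator names are invented for illustration.  Each parsing expression is a function that takes the input text and a position and returns the new position on success, or None on failure:

```python
def literal(s):
    # match the exact string s at the current position
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def sequence(*exprs):
    # e1 e2 ... : every expression must match, in order
    def parse(text, pos):
        for e in exprs:
            pos = e(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*exprs):
    # e1 / e2 : ordered choice — the first alternative that
    # succeeds wins, and later alternatives are never tried
    def parse(text, pos):
        for e in exprs:
            result = e(text, pos)
            if result is not None:
                return result
        return None
    return parse

def star(expr):
    # e* : greedy repetition; matches as many times as possible
    def parse(text, pos):
        while True:
            nxt = expr(text, pos)
            if nxt is None:
                return pos
            pos = nxt
    return parse

# "a" / "ab" — under ordered choice the second alternative is
# effectively dead code, because "a" commits first
g = choice(literal("a"), literal("ab"))
print(g("ab", 0))   # 1, not 2: only "a" was consumed
```

In a CFG, the alternatives `a | ab` are symmetric and the grammar is ambiguous on the input “ab”; in a PEG, the `/` operator imposes a priority, which is one reason PEGs can be parsed deterministically.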

The “interplay of chain stimulations”

In Mathematics, Philosophy on July 3, 2011 at 10:00 pm

In Word and Object, W.V.O. Quine talks about the way in which the mind revises its web of beliefs as if this process occurs in an unconscious way:

[…]  Prediction is in effect the conjectural anticipation of further sensory evidence for a foregone conclusion.  When a prediction comes out wrong, what we have is a divergent and troublesome sensory stimulation that tends to inhibit that once foregone conclusion, and so to extinguish the sentence-to-sentence conditionings that led to the prediction.  Thus it is that theories wither when their predictions fail.

In an extreme case, the theory may consist in such firmly conditioned connections between two sentences that it withstands the failure of a prediction or two.  We find ourselves excusing the failure of prediction as a mistake in observation or a result of unexplained interference.  The tail thus comes, in an extremity, to wag the dog.

The sifting of evidence would seem from recent remarks to be a strangely passive affair, apart from the effort to intercept helpful stimuli: we just try to be as sensitively responsive as possible to the ensuing interplay of chain stimulations.  What conscious policy does one follow, then, when not simply passive toward this interanimation of sentences?  Consciously the quest seems to be for the simplest story.  Yet this supposed quality of simplicity is more easily sensed than described.  Perhaps our vaunted sense of simplicity, or of likeliest explanation, is in many cases just a feeling of conviction attaching to the blind resultant of the interplay of chain stimulations in their various strengths.  (§5)

Mercier & Sperber’s argumentative theory of reasoning could offer an answer to this conundrum.  To be sure, Quine is not suggesting an argumentative theory – by “story” he means theory, not argument.  But he is only able to hesitantly claim that the conscious part of cognition has the function of preserving the simplicity of theories.  Even this operation appears to occur at the level of intuition, and what purpose conscious reasoning has left is unclear.  In the argumentative theory of reasoning, the “interplay of chain stimulations” by which contrary evidence tugs at our theoretical ideas would be a part of the intuitive track of cognition.  The function of conscious reasoning would be not to oversee this intuitive process, but to come up with good ways of verbalizing its results.  Conscious reasoning would not, in the normal course of things, involve changing the web of belief at all – instead its purpose would be to look for paths along the web that link the particular beliefs one anticipates having to defend to sentences that others might be willing to take as premises.

The argumentative theory claims to explain the confirmation bias by thus reconceiving the function of conscious reasoning, but Quine suggests (in the second paragraph I quoted) that a confirmation bias of sorts can occur in what I have assigned to the intuitive track of cognition as well.  Sometimes our theoretical ideas have become so ingrained that we “excuse” contrary observations.  As far as I can tell, Mercier & Sperber’s argumentative theory would not explain this sort of confirmation bias.

To the extent that it serves the preference for simplicity, an intuitive confirmation bias is not fundamentally irrational, because at least in certain situations selectively ignoring evidence in the name of simplicity can result in better predictions.  This has proven true experimentally in the field of machine learning.  Suppose that we have a plane on which each point is either red or green.  Given a finite number of observations about the colors of particular points, we wish to come up with a way of predicting the color of any point on the plane.  One way of doing this is to produce a function that divides the plane into two sections, red and green.  If we can draw a straight line that correctly divides all of our observations, red on one side and green on the other, then we have a very simple model that, assuming that the set of observations we used to derive it is representative and sufficiently large, is likely to work well.  However, if it is necessary to draw a very complex, squiggly line to correctly account for all of the observations (if we are required to use a learning machine with a high VC dimension), then it is often better to choose a simpler function even if it makes the wrong prediction for a few of our observed cases.  Overfitting can lead to the creation of models that deviate from the general pattern in order to account for what might actually be random noise in the observational data.

In the same way, if we attempted to account for every possible bit of contrary evidence in the revision of our mental theories, our ability to make useful predictions with them would be confounded.  We will always encounter deviations from what we expect, and at least some of these will be caused by factors that we will never come across enough data to model correctly.  In such cases, we are better off allowing our too-simple theories to stand.
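The red/green thought experiment can be simulated in a few lines.  This is my own toy construction, not taken from any particular experiment: the “simple” model is just the straight line, while the “complex” model memorizes every observation (a 1-nearest-neighbour rule), noise included, which is one way of drawing a squiggly boundary through all the training points:

```python
import random

random.seed(0)

def true_color(x, y):
    # ground truth: a straight line divides the plane
    return "red" if x + y > 0 else "green"

# training observations, with a little label noise
train = []
for _ in range(200):
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    label = true_color(x, y)
    if random.random() < 0.1:           # 10% of labels are flipped
        label = "red" if label == "green" else "green"
    train.append((x, y, label))

# simple model: the straight line itself, ignoring the noisy points
def simple_model(x, y):
    return "red" if x + y > 0 else "green"

# complex model: memorize every observation; predicts the label of
# the nearest training point, so it fits the training data perfectly
def complex_model(x, y):
    _, _, label = min(train, key=lambda p: (p[0] - x)**2 + (p[1] - y)**2)
    return label

def accuracy(model, points):
    return sum(model(x, y) == lab for x, y, lab in points) / len(points)

# fresh test points, labelled by the true rule
test = [(x, y, true_color(x, y))
        for x, y in ((random.uniform(-1, 1), random.uniform(-1, 1))
                     for _ in range(500))]

print("train accuracy:", accuracy(simple_model, train),
      accuracy(complex_model, train))
print("test accuracy: ", accuracy(simple_model, test),
      accuracy(complex_model, test))
```

The memorizing model scores 100% on the training data, because it “excuses” nothing; but on fresh points near the flipped observations it repeats the noise, while the simple line, which got some training cases wrong, predicts the new points perfectly.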