A blog about belief

Posts Tagged ‘formal languages’

The Ontic Web

In Computer Science, Culture on August 7, 2012 at 8:04 pm

Recently I’ve been reading about RDF, an attempt by the World Wide Web Consortium to create a standard way of representing information about “resources,” their word for things.  I’m no fan of XML (a relative of RDF that provides a way to store every type of information in the same horrible HTML-like syntax), and RDF certainly shares its tendency to complicate people’s jobs.  But although the broadness of RDF’s goals all but guarantees its unwieldiness, I’m beginning to think that there is a need for a computer-processable way of writing “ontologies” that goes beyond the interoperability concerns that motivated the RDF project.

It’s almost too easy to do the postmodernist critique of totalizing schemes with systems like RDF.  The example used in the primer for the OWL 2 Web Ontology Language, a commonly used extension to RDF, is a system for describing family relationships.  Using OWL’s vocabulary for talking about the types of relationships that can hold between things, the authors define what it is to be a parent, a sibling, and so forth, in statements like this:

EquivalentClasses( :Person :Human )

The authors claim that they do not

intend this example to be representative of the sorts of domains OWL should be used for, or as a canonical example of good modeling with OWL, or a correct representation of the rather complex, shifting, and culturally dependent domain of families. Instead, we intend it to be a rather simple exhibition of various features of OWL.

Sure enough, we get to the zinger a few sections in.

Frequently, the information that two individuals are interconnected by a certain property allows to draw further conclusions about the individuals themselves. In particular, one might infer class memberships.  For instance, the statement that B is the wife of A obviously implies that B is a woman while A is a man.

Even when they’re only used as examples, categorization schemes tend to turn into power plays.  Think how a person who just married her girlfriend would feel reading that.

But information modeling isn’t all retrograde.  There’s an admirable example in Sam Hughes’s very funny essay about how database engineers will have to adapt to gay marriage.  And there is more to RDF than what I would Heideggerianly call ontics—the description of categories and subcategories of things.

One type of program that people have developed for RDF is the inference engine, which attempts to mimic human reasoning by drawing conclusions from the knowledge represented in files.  Whether or not these engines will lead to a serious AI, people have put them to use straightaway for a quite different purpose, that of checking the consistency of their work while putting ontologies together.  This is a different application of the technology from that of defining standard vocabularies to enable different software systems to work together, which is where RDF has found the most use (and which is admittedly very important).  It has less to do with the finished product (the ontology file) than with what we learn in the process of writing it, and with the input that the computer is able to give to the writer as revision proceeds.
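As a rough illustration of that kind of feedback, here is a toy sketch in Python. It is not RDF or OWL tooling. The triples, the `infer` and `inconsistencies` functions, and the names `hasWife`, `Man`, `Woman`, `mary`, and `anne` are all assumptions made up for the example, which borrows the primer's wife rule quoted above. The sketch derives class memberships from a property's declared domain and range, then reports any individual that ends up in two classes declared disjoint.

```python
# Toy forward-chaining reasoner over (subject, predicate, object) triples.
# Every name here is an illustrative assumption, not part of RDF or OWL.

DOMAIN = {"hasWife": "Man"}      # the subject of hasWife is inferred to be a Man
RANGE = {"hasWife": "Woman"}     # the object of hasWife is inferred to be a Woman
DISJOINT = [("Man", "Woman")]    # no individual may belong to both classes


def infer(triples):
    """Return the input triples plus every class membership they imply."""
    facts = set(triples)
    changed = True
    while changed:                       # repeat until no new fact appears
        changed = False
        for s, p, o in list(facts):
            new = set()
            if p in DOMAIN:
                new.add((s, "type", DOMAIN[p]))
            if p in RANGE:
                new.add((o, "type", RANGE[p]))
            if not new <= facts:
                facts |= new
                changed = True
    return facts


def inconsistencies(facts):
    """List (individual, class_a, class_b) for members of two disjoint classes."""
    found = []
    for a, b in DISJOINT:
        members_a = {s for s, p, o in facts if p == "type" and o == a}
        members_b = {s for s, p, o in facts if p == "type" and o == b}
        for individual in sorted(members_a & members_b):
            found.append((individual, a, b))
    return found


data = [("mary", "type", "Woman"), ("mary", "hasWife", "anne")]
facts = infer(data)
print(inconsistencies(facts))   # [('mary', 'Man', 'Woman')]
```

The report does not say which statement is at fault. It only tells the writer that the data and the declared rules cannot both stand, which is the kind of input that prompts a revision of the ontology.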

The Language of Mathematics (1)

In Computer Science, Literature, Mathematics on June 18, 2012 at 7:37 pm

I’m working on a software project (more soon) that involves a notation that is interpreted by computers.  As a way of specifying the language formally, I’m trying out parsing expression grammars (PEGs), a relatively new alternative to context-free grammars and the other methods traditionally used to define the syntax of programming languages.  I’ve been reading the original paper in which Bryan Ford introduces PEGs, and something struck me about the way it builds up to the mathematical definition of the idea.  The paper begins with an “informal” explanation that starts with an example of a PEG written in ASCII text, as you would use for the input to a program:

# Hierarchical syntax
Grammar <- Spacing Definition+ EndOfFile
Definition <- Identifier LEFTARROW Expression
Expression <- Sequence (SLASH Sequence)*
Sequence <- Prefix*
Prefix <- (AND / NOT)? Suffix
Suffix <- Primary (QUESTION / STAR / PLUS)?
Primary <- Identifier !LEFTARROW
         / OPEN Expression CLOSE
         / Literal / Class / DOT


Although the paper explains what the grammar means in a very prosaic way, placing it in historical context and comparing the practical implications of PEGs with those of other types of grammar, this bit of ASCII text seems intuitively like the most formal thing in the paper.  The mathematical definition of the construct is set off from the text of the article much less than the ASCII example, which is in a fixed-width font and embedded as “Figure 1.”  The definition begins:

Definition: A parsing expression grammar (PEG) is a 4-tuple G = (V_N, V_T, R, e_S), where V_N is a finite set of nonterminal symbols, V_T is a finite set of terminal symbols, R is a finite set of rules, e_S is a parsing expression termed the start expression, and V_N ∩ V_T = ∅. Each rule r ∈ R is a pair (A, e), which we write A ← e, where A ∈ V_N and e is a parsing expression. For any nonterminal A, there is exactly one e such that A ← e ∈ R. R is therefore a function from nonterminals to expressions, and we write R(A) to denote the unique expression e such that A ← e ∈ R.

One reason why the informal explanation is informal in comparison with this is that it describes the syntax of PEGs using a PEG, making it a circular definition.  But two things jump out at me.

  1. The mathematical notations in the formal definition are interpolated into a paragraph of written English, while the informal definition describes the syntax of the system in a way that a computer could understand.
  2. It would be much harder to see what the formal definition is doing without reading the informal one first.  If the paper had started talking about 4-tuples right off the bat, it would be unclear in what sense the objects it defines could be considered “rules” and “parsing expressions.”  There is something, a sort of mathematical anamnesis, that the reader takes away from the circular definition at the beginning of the paper that makes it possible to see the meaning of the more rigorous math that follows, in a sense of the word “meaning” that is not yet clear to me.
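
To make the 4-tuple in the formal definition a little more concrete, here is a minimal sketch of how it might be written down in Python. The tuple encoding of parsing expressions, the `PEG` class, the `match` function, and the toy grammar are my own illustrative choices and do not come from Ford's paper. The sketch covers only a handful of operators, makes no attempt to reproduce the paper's formal semantics, and would recurse forever on a left-recursive rule.

```python
from dataclasses import dataclass

# Parsing expressions as plain tuples (an encoding chosen for this sketch):
#   ("empty",)           the empty string
#   ("lit", "a")         a terminal from V_T
#   ("ref", "A")         a nonterminal from V_N
#   ("seq", e1, e2)      e1 followed by e2
#   ("choice", e1, e2)   prioritized choice: try e1, then e2 only if e1 fails
#   ("not", e)           succeed, consuming nothing, only if e fails


@dataclass
class PEG:
    nonterminals: set   # V_N
    terminals: set      # V_T
    rules: dict         # R: a dict, so each nonterminal maps to exactly one e
    start: tuple        # e_S

    def __post_init__(self):
        if self.nonterminals & self.terminals:
            raise ValueError("V_N and V_T must be disjoint")
        if set(self.rules) != self.nonterminals:
            raise ValueError("R must give one expression per nonterminal")


def match(g, e, text, pos=0):
    """Return the position reached after matching e at pos, or None on failure."""
    kind = e[0]
    if kind == "empty":
        return pos
    if kind == "lit":
        return pos + len(e[1]) if text.startswith(e[1], pos) else None
    if kind == "ref":
        return match(g, g.rules[e[1]], text, pos)      # look up R(A)
    if kind == "seq":
        mid = match(g, e[1], text, pos)
        return None if mid is None else match(g, e[2], text, mid)
    if kind == "choice":
        first = match(g, e[1], text, pos)
        return first if first is not None else match(g, e[2], text, pos)
    if kind == "not":
        return pos if match(g, e[1], text, pos) is None else None
    raise ValueError(f"unknown expression kind: {kind}")


# Toy grammar:  A <- 'a' A 'b' / (empty),  start expression:  A !('a' / 'b')
a, b = ("lit", "a"), ("lit", "b")
g = PEG(
    nonterminals={"A"},
    terminals={"a", "b"},
    rules={"A": ("choice", ("seq", a, ("seq", ("ref", "A"), b)), ("empty",))},
    start=("seq", ("ref", "A"), ("not", ("choice", a, b))),
)

print(match(g, g.start, "aabb"))   # 4
print(match(g, g.start, "aab"))    # None
```

Using a dict for R gets "exactly one e such that A ← e ∈ R" for free, since a key cannot map to two values. That says something about how much of the definition is really a description of a data structure.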