Tuesday, February 20, 2007

Algebraic databases

If we look at spatial logic from a certain level of generality, it appears that we have a construction general enough to work for a wide variety of algebras. More specifically, if CA = (A0,...,AN;R) is an algebra with Ai denoting operators of arity i (thus, A0 are the constants, A1 are the unary operations, etc) with term language T(CA) and relations R the we have a logic over CA whose syntax and semantics are simultaneously specified by

  • Boolean operations
[| true |] = T(CA)
[| ~φ |] = T(CA)\[| φ |]
[| φ1 & φ2 |] = [| φ1 |] ∩ [| φ2 |]
  • Operations induced from the algebra
a ∈ Ak ⇒ [| a(φ1,...,φk) |] = { b ∈ T(CA) | ∃ ai ∈ [| φi |]. b ≡ a(a1,...,ak) } where is the smallest congruence containing R

Example (free monoid):
  • m ::= e | c | m*m generates the language L(m) which when quotiented by the identity relations (m*e = m = e*m), L(m)/=, gives the free monoid with generators, c, and identity e
  • Boolean operations:
[| true |] = L(m)
[| ~φ |] = L(m)\[| φ |]
[| φ1 & φ2 |] = [| φ1 |] ∩ [| φ2 |]
  • Operations induced from the monoid
[| e |] = { m ∈ L(m) | m ≡ e }
[| c |] = { m ∈ L(m) | m ≡ c }
[| φi * φ2 |] = { m ∈ L(m) | ∃ mi ∈ [| φi |]. m ≡ m1*m2 }

  • Example usage: ~e & ~(~e*~e) denotes primality; it picks out nontrivial elements that cannot be factored into nontrivial elements.
This is all a straightforward generalization of the spatial logic construction (restricted to the composition operator). Specifically, the spatial case is the monoid of processes and parallel composition with 0 as identity. Notice, however, that the "ambient" boolean operations are provided because the lattice construction is over the powerset of L(m) (or generally, T(CA)). Nothing prevents us, though, from using other lattice constructions, for example, ℘(L(m)*) where S* denotes the set of finite length tuples of elements from S (i.e. i Si) with concatenation of tuples as a monoid operation (distinct from any operation that may be operating on the algebra). This gives us a quantale. Thus, we have a place to interpret linear formulae as the ambient logical operations. More generally, we can build appropriate lattice structure over the term language to interpret the "ambient" or "embedding" logic into which we embed the operations induced from the algebra.

My interest runs as follows. Spse CA a (generalized) algebra(ic data type). i submit that with an ambient or embedding logic of classical connectives we may auto-generate a read-only database+query language where the contents of the store are elements inhabiting CA; and, we can auto-generate a database+query language + update semantics with the appropriate set-indexed-linear lattice structure.

Practical applications?
  • To address a semantics for an XQuery with update you run this construction over the Schema schema using a set-indexed linear lattice structure.
  • Suppose that someone has determined that the geometry of protein sequences is best represented by elements of Clifford algebra. Then you can auto-generate a database for query and update of geometric information regarding protein sequences directly from the Clifford algebra specification.
Challenges?
  • The main challenge i see is that a word problem hiding inside the definitions for the logical definitions induced from the algebraic operations.

No comments: