
The Algebraic Approach to Formal Language Theory

M. W. Hopkins1

Abstract
The algebraic approach to formal language and automata theory is a continuation of the earliest traditions in these fields, in which languages, transductions and other computations were represented as expressions (e.g. regular expressions) in suitably-defined algebras; grammars, automata and transitions as relational systems over these algebras, with the expressions as their solutions. Following such results as the algebraic reformulation of the Parikh Theorem, the possibility has been recognized that other classical results of formal language and automata theory may be similarly recast.

A foundation for such a reformulation is provided here, centering on the construction of a complete lattice of algebras linked by a network of adjunctions. Each algebra is a dioid (or idempotent semiring) additive and distributive over a distinguished subfamily of subsets. The hierarchy includes quantales and many of the Kleene algebras that have been considered in the literature.

The subset families mirror the classical notion of language family, with representatives for types 0, 2 and 3 in the Chomsky hierarchy. Notable features of our development include the generalization of grammars to arbitrary monoids (including type 1 grammars), a unified foundation for languages and transductions, and the generalization of regular expressions to expression algebras for types 2 and 0 grammars.
Keywords: Kleene, Language, Chomsky Hierarchy, Context-Free, Grammar,
Transduction, Regular Expression, Context-Free Expression, Rational,
Monoid, Semigroup, Dioid, Quantale, Ideal, Semiring, Adjunction, Monad,
T-Algebra, Eilenberg-Moore, Tensor Product, Monoidal Category.

1. The Algebraic Approach

In the standard formulation of formal languages and automata, which we will henceforth refer to as the classical theory, a language is regarded as a subset of a free monoid M = X∗,2 though more general monoids may sometimes be considered, e.g. Parikh vectors over commutative monoids. Different families of languages over an alphabet X are then identified as distinguished families of subsets of a monoid X∗.

This is an extended version of [16, 17]. Significant new material has been added, including clarification regarding the nature of the “context-sensitive” subset family M ↦ SM and the “Turing” subset family M ↦ T M.
Email address: federation2005@netzero.com (M. W. Hopkins)
1 UW-Milwaukee alumnus, not presently affiliated with any institution.

Preprint submitted to Elsevier January 16, 2012
This specificity appears to extend to grammars: curiously, the literature seems to lack any notion of a grammar over anything other than a free monoid. The most significant way in which this absence is felt is in the unfortunate duplication of formalisms. Since a transduction between alphabets X and Y is just a subset of the product monoid X∗ × Y∗, a consequence of generalizing grammars and language families to arbitrary monoids is that we bring both languages and transductions into a unified formalism. Then, for instance, we no longer need to regard “context free languages” separately from “simple syntax directed translations”.
The essence of what may be termed the Algebraic Approach is the following. A language or transduction is no longer regarded as a set of words or word pairs constructed out of one or more alphabets, but as an element of an algebraic structure. First, we start by generalizing free monoids X∗, or direct products of such monoids X∗ × Y∗, to arbitrary monoids M. The elementary objects in this algebra are the product m, m′ ∈ M ↦ mm′ ∈ M and the identity 1 ∈ M. These generalize the classical notions of concatenation (or sequencing) and empty word (or the do-nothing operation), respectively. As a result, the classical notions of language and transduction are generalized to monoid subsets.
However, the process does not stop there. The power set PM of the monoid M can also be viewed as a monoid. The product on M lifts to a product on PM by

A, B ∈ PM ↦ AB = {ab ∈ M : a ∈ A, b ∈ B}.

The resulting monoid contains a copy of M, the unit morphism

ηM : a ∈ M ↦ {a} ∈ PM

giving us the embedding into PM by virtue of the identities {a}{b} = {ab} and {a}{1} = {a} = {1}{a}. For this reason, ηM is conventionally treated as an inclusion, with elements of M freely interspersed with those of PM.3
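In concrete terms, the lifted product and the unit morphism can be computed directly for finite subsets. The following Python sketch is our own illustration, not part of the development; it models a free monoid by strings under concatenation, with the empty string playing the role of the identity 1:

```python
def lift_product(A, B):
    """The monoid product lifted to the power set PM: AB = {ab : a in A, b in B}."""
    return {a + b for a in A for b in B}

def eta(m):
    """The unit morphism eta_M, sending m to the singleton {m}."""
    return {m}

# Singleton products mirror the monoid product: {a}{b} = {ab},
# and {1} (here the empty word "") is the unit of the lifted monoid.
A = {"ab", "c"}
assert lift_product(eta("a"), eta("b")) == eta("ab")
assert lift_product(A, {""}) == A == lift_product({""}, A)
```

The same two functions work verbatim for any monoid once `+` is replaced by that monoid's product.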
The result of this extension is to expand our arsenal of primitive concepts.
Alongside the sequencing and do-nothing operations are new operators 0 = ∅
and A + B = A ∪ B which give us generalizations, respectively, of the concepts
of failure and non-deterministic branch. The ordering relation A ≥ B ⇔ A ⊇ B may then be thought of as a precursor of the concepts of derivability or transformation A → B. With this perspective, a grammar or automaton now becomes a way of writing down a system of inequalities. The principle of finite derivability is encoded by the requirement that the object (language, transduction, etc.) be the minimal solution to the corresponding system.

2 “Let F(VT) be the free monoid generated by [the terminals] VT, i.e. the set of all strings in the vocabulary VT. A language is, then, a subset of F(VT)”, [6, p. 8].
3 This gives rise to conventions such as aU b = {aub : u ∈ U }, which we will observe.

Thus, we may consider more general algebras which contain at least the structure of a partially ordered monoid, closed under the least upper bounds of its finite subsets4 and distributive with respect to its least upper bound operator. This replaces the classical formulation, which treated languages and transductions just as elements, respectively, of PX∗ and P(X∗ × Y∗).
However, the definitions in the classical theory are cast almost entirely in
set-theoretic terms, as are the arguments for the corresponding theorems, even
though the ideas and the results frequently have a purely algebraic or categorical
flavor that can be stated in such fashion, with both an increase in transparency
and generality. As a result, the full potential of the results arrived at classically
is missed. This discrepancy is what the algebraic approach seeks to rectify.
From its inception, the Applications of Kleene Algebra conference has recognized the possibility of such a foundation:
“Recent algebraic versions of classical results in formal language theory, e.g.
Parikh’s theorem [15], point to the exciting possibility of a general algebraic
theory that subsumes classical combinatorial automata and formal language
theory [pointing] to a much more general, purely axiomatic theory in the spirit
of modern algebra.”5
Taking a significant step in this direction, the beginnings of such a reformulation are provided here, bringing fully to bear the power of monads and adjunctions. At its foundation lies a complete lattice of algebras, each containing the structure of a dioid (or idempotent semiring) additive and distributive over a distinguished family of subsets.
At one extreme, the algebra PM possesses both unlimited additivity and distributivity. It is the archetype of the quantale with a unit {1}; in fact, it is the free quantale extension of the monoid M. At the opposite extreme, the algebra
FM of finite subsets contains only additivity and distributivity with respect
to finite subsets. This is another way of describing the dioid, or idempotent
semiring. Classically, it corresponds to the family FM of finite languages: the
free dioid extension of M .
More generally, each subset family is associated with a monad connecting
it to the category of monoids and generalizes the classical language family and
transduction family. The range of possible families mirrors the classical language
hierarchy, and includes representatives for types 0, 2 and 3 in the Chomsky
hierarchy, as well as families that go beyond type 0: oracles. The elements of
this formalism are developed in section 3 and its extensions and applications
in section 4. The inclusion of types 0, 2 and 3 of the Chomsky hierarchy is
discussed in section 6.
A notable feature of our development is the generalization of grammars to
arbitrary monoids that also incorporates a reformulation of type 1 grammars.

4 We will refer to the closure under least upper bounds here and below as additivity.
5 Programme introduction, Applications of Kleene Algebra, Schloss Dagstuhl, February
2001.

The treatment of generalized grammars is the topic of section 2. The formulation
that emerges brings to the forefront the roles played by free extensions and tensor
products.
Each member of the algebra hierarchy may be regarded as a subset of a
quantale subject to a restriction on distributivity and additivity. A network of
adjunctions between the different algebras can be defined which gives realization
to this idea. An outline of the general construction is given in section 5.
Many of these developments were foreshadowed by Kozen [19], where implicit
use was made of the monad concept to develop a hierarchical relation between
different varieties of Kleene algebras. Earlier work has been carried out by
Conway [7] in the study of the algebra that came to be known as the quantale,
the *-continuous Kleene Algebra, and the “countably-closed dioid”. In addition,
as we will see at the end of section 4, the Chomsky-Schützenberger theorem [6]
should also be considered as an early precursor to these developments.
The quantale emerged in the 1980’s in quantum physics (hence the name),
particularly in the study of C*-algebras and von Neumann algebras. Both the
quantale and dioid have also played a role in non-linear dynamics, linear logic,
Penrose tilings, discrete event systems ([1, 2, 11, 14, 30, 31, 34, 37]; see also
[13] and references contained therein), and related fields (e.g. see Maslov, et
al. [27]). Our use of these algebras in the setting of formal language theory reveals what appears to be a cross-connection with the mathematics used in the foundations of both classical and quantum physics, perhaps bringing us one step closer to realizing von Neumann’s goal of providing a unified foundation for both physics and automata theory.6 This correspondence is furthered
with the introduction of the “polycyclic dioids”, matrix algebras and braided
monoidal categories (which are used prominently in differential geometry and
knot theory).
In section 7, we will discuss the extension of the dioid hierarchy to semigroups, semirings and power series algebras, along with other issues that the limitations of time and space prevented us from addressing more fully here.
In the following, we will assume familiarity with semigroups, monoids, partial orderings, semi-lattices and lattices. In addition, we will assume basic
familiarity with categories, and the related concepts of functors and natural
transformations. References include [4, 8, 13], and for category theory, [26].
Two appendix sections have been added to provide further detail on the adjunction and tensor product constructions used in sections 2, 4 and 7, and to establish the notational conventions used for the categorical algebras involved. Also included is a brief self-contained treatment rendering tensor categories, adjunctions and monads as categorical algebras in the spirit of [18, part I]. Sections 2.1, 2.8, 4.1 and 4.5 also bear the strong imprint of [18].

6 “We are very far from possessing a theory of automata which deserves that name, that

is, a properly mathematical-logical theory” [35]. It was von Neumann who earlier established
the foundational roles played by Boolean logic and Quantum logic, respectively, in classical
and quantum physics.

A more detailed treatment of adjunctions, monads and co-monads may be
found in [26]. Tensor categories are discussed further in [25, 29]. After reading
sections 2.8, 4.10 and 7.6, a deeper appreciation should emerge of the important role that they play in formal language and automata theory. In particular,
in section 4.10 we shall find that the tensor product plays a crucial role in
the algebraic reformulation of the Chomsky-Schützenberger Theorem, replacing
the intersection + erasure construction that is used in classical theory for this
theorem.

2. Generalized Grammars
Classically, a grammar over the alphabet X affixes a set Q of indeterminates, called non-terminals, to the free monoid X∗ to obtain a set (X ∪ Q)∗ of configurations. This requires the assumption X ∩ Q = ∅. A finite subset H ⊆ (X ∪ Q)∗ × (X ∪ Q)∗ of phrase structure rules is used to generate a transition relation over (X ∪ Q)∗. A starting configuration S ∈ (X ∪ Q)∗ is identified and the language is defined as the set of all the words in X∗ derivable from S by a finite number of applications of transitions α → β, for (α, β) ∈ H, to subwords in the present configuration. The classical theory usually assumes, further, that the starting configuration is one of the variables, S ∈ Q, though this restriction is not essential.
The grammar expression (α1 → β1 , . . . , αn → βn , S) is introduced here as
a means to denote the subset that results from this grammar, where H =
{(α1 , β1 ) , . . . , (αn , βn )}.

2.1. Free Extensions of Monoids



The assumption X ∩ Q = ∅ reduces (X ∪ Q)∗, algebraically, to the free extension of X∗ by the set Q. Thus, when we generalize from X∗ to arbitrary monoids M, the configuration set becomes the free extension M[Q].
In an algebraic theory, the free extension M[Q] of an algebra M by a set Q is defined uniquely by the following universal property: (1) there is an algebra homomorphism ιM,Q : M → M[Q] and (2) a map σM,Q : Q → M[Q], such that (3) for each homomorphism f : M → M′ and map σ : Q → M′ to an algebra M′ there is a unique homomorphism ⟨f, σ⟩ : M[Q] → M′ such that

⟨f, σ⟩ ◦ ιM,Q = f,  ⟨f, σ⟩ ◦ σM,Q = σ. (1)

The uniqueness requirement is captured equivalently by the properties

⟨ιM,Q, σM,Q⟩ = 1M[Q],  f′ ◦ ⟨f, σ⟩ = ⟨f′ ◦ f, f′ ◦ σ⟩, (2)

where f′ : M′ → M″ is another homomorphism.7

7 To make this truly a categorical algebra requires making explicit the forgetful functor M̂ : Monoid → Set, writing σM,Q : Q → M̂M[Q], σ : Q → M̂M′ and replacing the second of equations 1 and 2, respectively, by M̂⟨f, σ⟩ ◦ σM,Q = σ and f′ ◦ ⟨f, σ⟩ = ⟨f′ ◦ f, M̂f′ ◦ σ⟩. Here and in section 4.1 we will take the shortcut of treating M̂ as an identity functor.

For monoids, the universal property is realized by taking elements α ∈ M[Q] to be words of the form α = m0q1m1 . . . qnmn, where m0, m1, . . . , mn ∈ M, q1, . . . , qn ∈ Q and n = deg(α) ≥ 0 is the degree of the word. The product is defined by

(m0q1m1 . . . qnmn)(m′0q′1m′1 . . . q′n′m′n′) = m0q1m1 . . . qn(mnm′0)q′1m′1 . . . q′n′m′n′ (3)

with deg(αβ) = deg(α) + deg(β). The homomorphism ιM,Q : m ∈ M → m ∈ M[Q] embeds M into M[Q] as words of degree 0, while σM,Q : q ∈ Q → 1q1 maps the set Q into M[Q] as words of degree 1. The map determined by the third criterion of the universal property is explicitly given by

⟨f, σ⟩(m0q1m1 . . . qnmn) = f(m0)σ(q1)f(m1) . . . σ(qn)f(mn), (4)

where m0, m1, . . . , mn ∈ M and q1, . . . , qn ∈ Q.
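The word representation and the product of equation (3) are straightforward to realize concretely. Below is a minimal Python sketch using our own encoding (not the paper's): a word m0 q1 m1 . . . qn mn is stored as an alternating list whose even positions hold elements of M and whose odd positions hold variables from Q.

```python
def word_product(u, v, mprod):
    """Multiply two M[Q]-words as in equation (3): fuse the boundary
    elements mn * m0' of M and keep everything else in order."""
    return u[:-1] + [mprod(u[-1], v[0])] + v[1:]

def degree(u):
    """deg(u) = the number of variable occurrences in the word."""
    return len(u) // 2

# Demonstration with the free monoid M = X* (strings under concatenation,
# with "" playing the role of the identity 1).
concat = lambda a, b: a + b
u = ["a", "q", "b"]   # the word a q b, of degree 1
v = ["c", "r", ""]    # the word c r 1, of degree 1
assert word_product(u, v, concat) == ["a", "q", "bc", "r", ""]
assert degree(word_product(u, v, concat)) == degree(u) + degree(v)
```

Since only `mprod` touches elements of M, the same encoding works over any monoid.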
The following isomorphisms are a direct consequence of the universal property:

M[Q][Q′] ≅ M[Q ∪ Q′] (Q ∩ Q′ = ∅),  M ≅ M[∅],  1[Q] ≅ Q∗ (5)

where 1 = {0} is the 1-element monoid. For instance,

⟨⟨ιM,Q∪Q′, σM,Q∪Q′|Q⟩, σM,Q∪Q′|Q′⟩ : M[Q][Q′] → M[Q ∪ Q′],
⟨ιM[Q],Q′ ◦ ιM,Q, (ιM[Q],Q′ ◦ σM,Q) ∪ σM[Q],Q′⟩ : M[Q ∪ Q′] → M[Q][Q′],

which one can verify are inverses. Other consequences of the universal property include ∅∗ ≅ 1[∅] ≅ 1 and

(X∗)[Q] ≅ 1[X][Q] ≅ 1[X ∪ Q] ≅ (X ∪ Q)∗ (X ∩ Q = ∅). (6)

2.2. Grammars over Monoids


A grammar over a monoid M may then be defined as a structure G =
(Q, S, H) composed of a set of variables, Q; the starting configuration S ∈ M [Q]
and a relation H ⊆ M [Q] × M [Q]. For most of what follows, we will assume
H is finite.
The case M = X ∗ yields grammars in the classical theory for languages over
an alphabet X, while M = X ∗ × Y ∗ yields transductions between alphabets X
and Y . Other examples might be conceived of, where M represents a construc-
tion language for graphical or multimedia displays; for instance, a typesetting,
hypertext or word processing language, or (a more interesting example) the
commutative monoid that underlies the 2-dimensional symbolic language used
in the Laws of Form [33] for Boolean algebra.
The closure of H under reflexivity, transitivity and products yields the transition relation → over M[Q]. The relation is uniquely defined as the minimal relation generated by the rules R: if α ∈ M[Q] then α → α, and HR: if γ → δαε and (α, β) ∈ H then γ → δβε. The length of a derivation is the number of applications of the HR rule. As a consequence, we also have C: if α → β and α′ → β′ then αα′ → ββ′, and T: if α → β and β → γ then α → γ. Other consequences include H: if (α, β) ∈ H then α → β, and HL: if (α, β) ∈ H and γβδ → ε then γαδ → ε. In addition, one may verify that C, T and H also follow from R and HL, so that one can equivalently define derivations inductively by repeated applications of HL to R.
Corresponding to each α ∈ M [Q] is the subset [α] ≡ {m ∈ M : α → m} of
elements of M derivable from the configuration α. Generalizing the grammar
expression notation to grammars over monoids, this may be written8

[α] = (α1 → β1 , . . . , αn → βn , α)

where H = {(α1 , β1 ) , . . . , (αn , βn )}. Where clarity requires it, we write [α]G or
[α]H in place of [α] and α →G β or α →H β in place of α → β. By virtue of
C, we also have the inclusion [α] [β] ⊆ [αβ], and from R and T it follows that
[α] ⊇ [β], whenever α → β. Finally, the language generated by the grammar is
L (G) = [S] ⊆ M .
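For a free monoid M = X∗, the transition relation and the sets [α] can be approximated by exhaustive rewriting. The following Python sketch is our illustration only: symbols are single characters, configurations are strings, and the derivation length is bounded since [α] is infinite in general.

```python
def one_step(config, H):
    """All configurations obtained from `config` by one application of a
    rule (alpha, beta) in H to a subword (rule HR, with M = X*)."""
    out = set()
    for alpha, beta in H:
        start = 0
        while True:
            i = config.find(alpha, start)
            if i < 0:
                break
            out.add(config[:i] + beta + config[i + len(alpha):])
            start = i + 1
    return out

def derivable(alpha, H, terminals, max_steps):
    """[alpha] restricted to derivations of length <= max_steps: the words
    over `terminals` reachable from the configuration alpha."""
    seen, frontier = {alpha}, {alpha}
    for _ in range(max_steps):
        frontier = {c for f in frontier for c in one_step(f, H)} - seen
        seen |= frontier
    return {w for w in seen if all(ch in terminals for ch in w)}

# The grammar expression (S -> aSb, S -> 1, S), generating {a^n b^n : n >= 0}.
H = [("S", "aSb"), ("S", "")]
L = derivable("S", H, set("ab"), 6)
assert {"", "ab", "aabb"} <= L and "aab" not in L
```

The bound on derivation length makes this a semi-decision sketch, mirroring the principle of finite derivability: every member of [α] is witnessed at some finite bound.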
An important property, that we will need in the following, shows the invariance of a grammar with respect to renaming of the variables in Q.

Lemma 2.1 (Substitution Invariance). Let G = (Q, S, H) be a grammar over a monoid M, σ : Q → Q′ a bijection and

G′ = (Q′, σ′(S), {(σ′(α), σ′(β)) : (α, β) ∈ H}),

where the bijection σ′ = ⟨ιM,Q′, σM,Q′ ◦ σ⟩ : M[Q] → M[Q′] is the extension of σ to M[Q]. Then α →G β iff σ′(α) →G′ σ′(β), for all α, β ∈ M[Q]. Moreover, [α]G = [σ′(α)]G′ for all α ∈ M[Q]. In particular, L(G) = L(G′).

Proof. We may verify that σ′(α) →G′ σ′(β) follows from α →G β by induction over the length of derivations in G. The converse property follows by considering the inverse σ′⁻¹. The remaining statements then follow directly. For m ∈ M, we have m ∈ [α]G iff α →G m iff σ′(α) →G′ σ′(m) = m iff m ∈ [σ′(α)]G′. From this, it follows that L(G) = [S]G = [σ′(S)]G′ = L(G′).
Another important property recovers the classical definition of a grammar:
requiring the starting configuration to be a single variable – moreover, requiring
the variable’s appearance to be limited to the left hand side of only one rule.
Lemma 2.2 (Start Variable Normalization). Let

Ĝ = (Q ∪ {Ŝ}, Ŝ, H ∪ {(Ŝ, S)})

be a grammar obtained from a grammar G = (Q, S, H) over a monoid M by adding a new variable Ŝ ∉ M ∪ Q and a new rule Ŝ → S. Then L(Ĝ) = L(G).

8 Strictly speaking, this expression is ambiguous – one needs to state explicitly which free extension monoid is being used. Note the first of equations 5.

Proof. We have, immediately, that [α]Ĝ = [α]G for any α ∈ M[Q], since only rules from H can be used in any derivation from α. In addition, since the only rule involving Ŝ is Ŝ → S, it follows that [Ŝ]Ĝ = [S]Ĝ = [S]G, from which we obtain the result.
As in the classical theory, derivations may be refined to a form that incorporates a “cursor”. As the following result shows, this applies here as well.

Theorem 2.3. Let G = (Q, S, H) be a grammar over a monoid M containing a generating subset X. Then α →H β if and only if ·α →′H ·β, where →′H is the closure under C, R and T of the following one-step derivations: (Shift Left) ·z →′H z· and (Shift Right) z· →′H ·z for z ∈ X ∪ Q, and (Generate) ·α →′H ·β for (α, β) ∈ H.

Proof. One direction is immediate: if ·α →′H ·β then α →H β. For the converse, we argue inductively. For R, the result is immediate. Therefore, assume α →H β = γηε by HR, with α →H γδε and (δ, η) ∈ H. By inductive hypothesis, ·α →′H ·γδε; by repetitions of Shift Left, ·γδε →′H γ·δε; by application of Generate, γ·δε →′H γ·ηε; and by repetitions of Shift Right, γ·ηε →′H ·γηε. Thus ·α →′H ·β.

2.3. Contextual Grammars over Monoids


A grammar G = (Q, S, H) over a monoid M is contextual if9 H ⊆ Q+ ×
M [Q]. For such grammars, it follows that [m]H = {m}, when m ∈ M . The
family T M is defined as the set of all languages L (G) given by contextual
grammars G = (Q, S, H) over M, where H is finite. Note that10 it is not enough to merely require that deg(α) > 0 for (α, β) ∈ H. Classically, this condition, or the more general condition α ≠ 1, suffices to allow the grammar to be converted to contextual form. In contrast, when a monoid is not free, the condition is too general – it slips in a non-recursive element. If factoring is not recursive in M, then the matching of the left-hand sides of phrase structure rules may not be either. Such a grammar may also fail to satisfy substitution invariance (i.e., the Composition Lemma, lemma 6.4) precisely because of this difficulty.
The conversion of type 0 grammars to contextual form consists of replacing
all terminals x on the left-hand sides of phrase structure rules by non-terminals
x̄ and adding new rules x̄ → x, as needed. A generalization of this construction, suitable for arbitrary monoids, is captured by the following theorem.
Theorem 2.4. Let G = (Q, S, H) be a grammar over a monoid M, where S ∈ Q, and let σX : X → M be a map from a set X, where X ∩ M[Q] = ∅. Suppose Ḡ = (X ∪ Q, S, H̄) is a grammar over M, such that (1) σ(ᾱ) →H σ(β̄) whenever (ᾱ, β̄) ∈ H̄, where σ : M[X ∪ Q] → M[Q] is the canonical extension

9 Q+ ≡ Q∗ − {1}, here, denotes the non-empty sequences from Q.


10 This is a correction to an assertion made in [16].

of σX; and (2) for all (α, β) ∈ H, m, m′ ∈ M and β̄ ∈ σ⁻¹(mβm′), there exists ᾱ ∈ σ⁻¹(mαm′) such that ᾱ →H̄ β̄. Then [φ]H = [φ]H̄ for φ ∈ Q+. In particular, L(G) = L(Ḡ).

Proof. Using (1), an easy inductive argument shows that σ(ᾱ) →H σ(β̄) whenever ᾱ →H̄ β̄. From this, it follows that [ᾱ]H̄ ⊆ [σ(ᾱ)]H. From (2), we may show that: (3) if α →H σ(β̄) then there exists ᾱ ∈ σ⁻¹(α) such that ᾱ →H̄ β̄. From (3) it follows that [α]H = ⋃ {[ᾱ]H̄ : ᾱ ∈ σ⁻¹(α)}. From this, in turn, we obtain our results.

The proof of (3) is also by induction. In the case R, we have α = σ(β̄), and we may take ᾱ = β̄ and use R to show that ᾱ →H̄ β̄. For the case HR, we have α →H γεδ, (ε, η) ∈ H and γηδ = β. Factor γ = γ′m and δ = m′δ′, where m, m′ ∈ M, γ′ ∈ {1} ∪ M[Q]Q and δ′ ∈ {1} ∪ QM[Q]. Then it follows that β̄ = γ̄η̄δ̄, where σ(γ̄) = γ′, σ(η̄) = mηm′ and σ(δ̄) = δ′. By (2), there exists ε̄ ∈ σ⁻¹(mεm′) such that ε̄ →H̄ η̄. Since σ(γ̄ε̄δ̄) = γ′mεm′δ′ = γεδ, then by inductive hypothesis, there exists ᾱ ∈ σ⁻¹(α) such that ᾱ →H̄ γ̄ε̄δ̄. Upon application of C, we get γ̄ε̄δ̄ →H̄ γ̄η̄δ̄ = β̄, and upon application of T, we have ᾱ →H̄ β̄, thus establishing the result.
This recovers, as a special case, the classical conversion to contextual form of a grammar G = (Q, S, H) over M = X∗, whose rules (α, β) ∈ H are subject to the restriction α ≠ 1. We define X̄ = {x̄ : x ∈ X} and σX(x̄) = x for x ∈ X. Denoting the canonical extension of σX⁻¹ by σ̄ : (X ∪ Q)∗ → (X̄ ∪ Q)∗, we set H̄ = H̄0 ∪ H̄1, where

H̄0 = {(σ̄(α), σ̄(β)) : (α, β) ∈ H},  H̄1 = {(x̄, x) : x ∈ X}.

Then condition (1) is satisfied since σ ◦ σ̄ = 1(X∪Q)∗. To prove condition (2), we first note that σ̄(α) →H̄0 σ̄(β) follows from α →H β and that σ̄(β) →H̄1 β̄ for any β̄ ∈ σ⁻¹(β). Thus, if α →H σ(β̄), we take ᾱ = σ̄(α) and conclude ᾱ = σ̄(α) →H̄0 σ̄(σ(β̄)) →H̄1 β̄. Thus ᾱ →H̄ β̄.
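The construction H̄ = H̄0 ∪ H̄1 is easy to carry out mechanically for M = X∗. The Python sketch below uses our own encoding, not the paper's: symbols are single characters, and a barred terminal x̄ is modeled by the uppercase form of x, which assumes lowercase terminals and no colliding variable names.

```python
def to_contextual(H, X):
    """The construction H_bar = H0 u H1: bar every terminal on both sides
    of each rule, then add the 'unbarring' rules x_bar -> x."""
    bar = {x: x.upper() for x in X}   # fresh variable for each terminal
    bar_word = lambda w: "".join(bar.get(ch, ch) for ch in w)
    H0 = [(bar_word(a), bar_word(b)) for a, b in H]
    H1 = [(bar[x], x) for x in X]
    return H0 + H1

# A type 0 grammar with a terminal occurring on a left-hand side.
H, X = [("S", "aSa"), ("aSa", "b")], {"a", "b"}
H2 = to_contextual(H, X)
assert ("ASA", "B") in H2 and ("A", "a") in H2
# No left-hand side mentions a terminal, so the resulting grammar is contextual.
assert all(not set(lhs) & X for lhs, _ in H2)
```

Every left-hand side of the new grammar lies in the variables alone, which is precisely the contextual condition H ⊆ Q+ × M[Q] restricted to this free-monoid setting.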

2.4. Context-Sensitive and Non-Contracting Grammars over Monoids


A monotonic or non-contracting grammar, classically, is a contextual grammar, where a further restriction is placed on each (α, β) ∈ H ⊆ Q+ × (X ∪ Q)∗: that ln(α) ≤ ln(β), where ln(α) denotes the length of a word α ∈ (X ∪ Q)∗. The additional restriction to rules of the form αqβ → αγβ, where q ∈ Q and γ ≠ 1, distinguishes the subfamily of grammars known as context-sensitive. The equivalence between the two types of grammars, in the classical setting, is well-established [24]. Thus, we are able to define the family SX∗ of context-sensitive subsets of X∗.

For both cases, the production of the empty word 1 is excluded. Therefore, an explicit stipulation is made classically to allow for its inclusion. In particular, if (α1 → β1, . . . , αn → βn, α) is the grammar expression for a non-contracting or context-sensitive language, then the following modification is permitted

(α1 → β1, . . . , αn → βn, S → α, S → 1, S)

where a new start symbol S ∉ M ∪ Q is added to Q.
Because of the explicit reference to length, the generalization of context-sensitive grammars to monoids other than free monoids is not as straightforward. However, a generalization may be found if we require that the monoid family M ↦ SM be well-behaved under non-erasing substitutions. In particular, if X ⊆ M − {1} is a generating subset of the monoid M then, under the canonical homomorphism σX : X∗ → M, we should expect that SM = {σ̃X(L) : L ∈ SX∗}.11 A grammar over M is non-contracting with respect to X if its rules are of the form (σX,Q(α), σX,Q(β)), where ln(α) ≤ ln(β) in (X ∪ Q)∗ and σX,Q : (X ∪ Q)∗ → M[Q] is the canonical homomorphism of the generating subset X ∪ Q → M[Q].
The requirement 1 ∉ X is necessary to satisfy the length and non-erasing restrictions. In order to show that this produces a well-defined family, we also need to ensure its independence with respect to the choice of a generating subset X ⊆ M − {1}. That is, suppose Y ⊆ M − {1} is another generating subset of M, with a canonical homomorphism σY : Y∗ → M. Then we may convert a grammar G = (Q, S, H) that is non-contracting with respect to X to one Ḡ = (Q̄, S, H̄) that is non-contracting with respect to Y by adding new variables x̄ for each x ∈ X, replacing each symbol from X in the original grammar with the corresponding variable, and then adding new rules x̄ → wx, where σY(wx) = σX(x). That is, we define Q̄ = Q ∪ {x̄ : x ∈ X} and

H̄ = {(h(α), h(β)) : (α, β) ∈ H} ∪ {(x̄, wx) : x ∈ X}

where h : (X ∪ Q)∗ → Q̄∗ is the monoid homomorphism defined inductively by h(x) = x̄ for x ∈ X and h(q) = q for q ∈ Q. Then it follows that α →G β iff h(α) →Ḡ h(β) and that σX([α]G) = σY([h(α)]Ḡ) for all α ∈ (X ∪ Q)∗.

The length requirement is also satisfied since ln(h(α)) = ln(α) and 1 = ln(x̄) ≤ ln(wx). The latter property is where we specifically require that 1 ∉ X. This gives us the family of non-contracting languages over a monoid M, which we designate SM.
Another result of this transformation is to convert the rules to what is known as standard form:

H ⊆ (Q+ × Q+) ∪ (Q × M).

Classically, the further normalization of a non-contracting grammar from this form to a context-sensitive grammar is carried out with the introduction of enough extra non-terminals to allow each phrase structure rule to be broken down into a series of context-sensitive steps. Hence, the rules in the subset Q+ × Q+ of the form

q1 q2 . . . qm−1 qm → r1 r2 . . . rm−1 rm α

11 Here, and in the following, we will denote the image of a function f on a set A by f̃(A) ≡ {f(a) : a ∈ A}.

are broken down to the following rules
q1 q2 . . . qm−1 qm → Z1 q2 . . . qm−1 qm , Zm → rm α,
Zi qi+1 → Zi Zi+1 , Zi Zi+1 → ri Zi+1 (i = 1, . . . , m − 1) ,
with the introduction of the new symbols Z1 , . . . , Zm . This construction does
not require M to be a free monoid, so it can be applied generally. Thus, the
equivalence of non-contracting and context-sensitive grammars generalizes to
arbitrary monoids.
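The breakdown scheme above can be generated mechanically. Here is a Python sketch (our illustration; symbols are strings, words are tuples, and the fresh symbols are named Z1, . . . , Zm):

```python
def breakdown(lhs, rhs_prefix, alpha):
    """Break a standard-form rule q1..qm -> r1..rm alpha into the
    context-sensitive steps of the text, using fresh symbols Z1..Zm."""
    m = len(lhs)
    Z = [f"Z{i + 1}" for i in range(m)]
    rules = [(tuple(lhs), (Z[0],) + tuple(lhs[1:]))]             # q1..qm -> Z1 q2..qm
    for i in range(m - 1):
        rules.append(((Z[i], lhs[i + 1]), (Z[i], Z[i + 1])))     # Zi q(i+1) -> Zi Z(i+1)
        rules.append(((Z[i], Z[i + 1]), (rhs_prefix[i], Z[i + 1])))  # Zi Z(i+1) -> ri Z(i+1)
    rules.append(((Z[m - 1],), (rhs_prefix[m - 1],) + tuple(alpha)))  # Zm -> rm alpha
    return rules

rules = breakdown(["q1", "q2"], ["r1", "r2"], ["a"])
assert (("q1", "q2"), ("Z1", "q2")) in rules
assert (("Z2",), ("r2", "a")) in rules
# Each step rewrites one symbol within its context or lengthens the word,
# so every generated rule is non-contracting.
assert all(len(l) <= len(r) for l, r in rules)
```

Note that nothing in this construction consults the monoid's elements, which is exactly why it applies beyond free monoids.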

2.5. Context-Free Grammars over Monoids


Generalizing the classical definition, grammars subject to the restriction
H ⊆ Q × M [Q] are called context-free, and the family CM of context-free
subsets of a monoid M consists of the subsets L (G) ⊆ M generated by a
context-free grammar G = (Q, S, H), for finite H. Such a language can always
be represented by a grammar expression of the form
(q1 → β1 , . . . , qn → βn , S)
where q1 , . . . , qn , S ∈ Q and β1 , . . . , βn ∈ M [Q].
As we will see in the following section, each rule q → z1z2 . . . zn can be equivalently described as an algebraic relation over the quantale PM by [q] ⊇ [z1][z2] . . . [zn]. Therefore, though the classical terminology is well-established, a more apt term for these grammars – which we will temporarily adopt here – is algebraic. A better characterization of context-freeness, in its place, is the independence of the sets [α] from their context. That is, while for general grammars we have [α][β] ⊆ [αβ], the inclusion becomes an equality for context-free grammars.
To this end, we call a grammar separable if (1) whenever m → β, where m ∈ M, then β = m, and (2) whenever αβ → γ, then there exists a factoring γ = α′β′ such that α → α′ and β → β′. For separable grammars, we may show the reverse inclusion [αβ] ⊆ [α][β] as follows. Suppose that w ∈ [αβ]. Then αβ → w. Using separability, we may factor w = w0w1 such that α → w0 and β → w1. Since w ∈ M, then w0, w1 ∈ M. From this, it follows that w0 ∈ [α], w1 ∈ [β]. Therefore, w = w0w1 ∈ [α][β]. Thus, separability implies the context-freeness of the sets [α].
The classical definition of context-freeness is recovered, in part, by the following result.
Theorem 2.5. Algebraic grammars are separable.
Proof. This result follows by inductive argument. In case R, αβ → γ = αβ, and separability is trivial. For the case HR, we have a derivation αβ →H δqε, with q ∈ Q, (q, η) ∈ H and δηε = γ. Then, by inductive hypothesis, we have a factoring δqε = α′β′, where α →H α′, β →H β′. Since q is prime in M[Q], the only possibilities are either α′ = δqθ and ε = θβ′, or δ = α′θ and β′ = θqε. In the former case, by HR, we have α →H α″ = δηθ, with α″β′ = δηθβ′ = δηε. In the latter case, again by HR, we have β →H β″ = θηε, with α′β″ = α′θηε = δηε, thus completing the induction.

Going in the opposite direction, given a separable grammar G = (Q, S, H), one may define the possibly infinite set

H̄1 = {(q, β) ∈ Q × M[Q] : β ∈ [q]},

and show that the grammar Ḡ = (Q, S, H̄1) yields the same transition relation over M[Q] as does G. However, we can go further and find a generating subset H̄ ⊆ H̄1 that is finite, if H is finite.
Theorem 2.6 (Conversion of separable grammars to algebraic form). If G = (Q, S, H) is a separable grammar, with H finite, then there exists an algebraic grammar Ḡ = (Q, S, H̄) such that H̄ is finite and [α]G = [α]Ḡ for all α ∈ M[Q].

Proof. For finite H, a finite subset H̄ ⊆ H̄1 can be found by systematic reduction of the non-algebraic rules. Taking any (αβ, γ) ∈ H, by separability we have a factoring γ = α′β′, such that α →H α′ and β →H β′. So, we replace (αβ, γ) by (α, α′) and (β, β′). The remaining non-algebraic rules are of the form m →H β, where m ∈ M. By definition of separability, we have β = m, so the rule becomes redundant and can be removed. The result is a grammar in algebraic form, H̄ ⊆ Q × M[Q].
Denoting the family of subsets of the monoid M recognized by separable grammars by CM, we recover the classical convention of identifying context-free grammars as algebraic – but only up to an equivalence conversion. In the following, unless otherwise specified, we will assume that all context-free grammars are reduced to algebraic form.
Finally, we note that in the classical theory derivations can be reduced to left-most or right-most form. With the following specialization of theorem 2.3, we show the same applies not just to algebraic grammars, but to separable grammars.
Theorem 2.7. Let G = (Q, S, H) be a separable grammar over a monoid M
containing a generating subset X. Then α →H m ∈ M if and only if ·α →ᴸH m·,
where →ᴸH is the closure under C, R and T of the following one-step derivations:
(Shift) ·x →ᴸH x· for x ∈ X, and (Generate) ·α →ᴸH ·β for (α, β) ∈ H.

Proof. One direction is immediate: if α · β →ᴸH γ · δ then αβ →H γδ. For the
converse, we make use of the fact that derivations can be defined by repeated
application of HL to R and apply an inductive argument to show that m′ ·
αβ →ᴸH m′m · β, whenever α →H m ∈ M . For R, we have α = m and apply
Shift repeatedly. For HL , assume α = γδε, with (δ, η) ∈ H and γηε →H m.
By separability, it follows that m = cd with c, d ∈ M , γ →H c and ηε →H d.
Then m′ · γδεβ →ᴸH m′c · δεβ by inductive hypothesis, m′c · δεβ →ᴸH m′c · ηεβ by
application of Generate, and m′c · ηεβ →ᴸH m′cd · β by inductive hypothesis.

2.6. Fixed-Point Closure


For context-free grammars we may use separability to recover the classical
correspondence between grammars and systems of fixed point relations.

Theorem 2.8 (Separable grammars as fixed point systems). Let G = (Q, S, H)
be a separable grammar over a monoid M . Then (q = [q] : q ∈ Q) is the least
solution to the system (ᾱ ⊇ β̄ : (α, β) ∈ H) of fixed-point relations over the
quantale PM , where we define m̄ = {m} for m ∈ M , q̄ = q for q ∈ Q, and
extend the bar map multiplicatively over M [Q].
Proof. That ᾱ = [α] is a solution for α ∈ M [Q] follows immediately by defi-
nition of separability. For any other solution, an induction over the length of
derivations allows us to establish ᾱ ⊇ β̄, whenever α →H β. Therefore, for each
m ∈ [α]H , we have ᾱ ⊇ m̄ = {m}. Thus, ᾱ ⊇ [α].
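Theorem 2.8's least-solution characterization has a direct computational reading: over the quantale PX∗, the least solution can be reached by Kleene iteration from the bottom element ∅. Below is a minimal Python sketch; the example grammar, the MAX_LEN truncation (used so the iteration terminates on finite sets), and all identifiers are illustrative assumptions rather than anything from the paper.

```python
# Kleene iteration for the least solution of a context-free system over P(X*).
# Rules map each variable to its alternatives; each alternative is a tuple of
# terminals and variables.  Sets are truncated at MAX_LEN so iteration halts.

MAX_LEN = 4

rules = {"S": [("a", "S", "b"), ()]}     # S -> a S b | 1

def concat(*langs):
    out = {""}
    for lang in langs:
        out = {u + v for u in out for v in lang if len(u + v) <= MAX_LEN}
    return out

def step(sol):
    new = {}
    for q, alts in rules.items():
        val = set(sol[q])                # keep what we already derived
        for alt in alts:
            factors = [sol[s] if s in rules else {s} for s in alt]
            val |= concat(*factors)
        new[q] = val
    return new

sol = {q: set() for q in rules}          # bottom of the lattice
while (nxt := step(sol)) != sol:         # ascend to the least fixed point
    sol = nxt

print(sorted(sol["S"]))                  # ['', 'aabb', 'ab']
```

The truncation makes the ascending chain finite, so the loop necessarily stabilizes at the length-bounded part of the least solution.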

2.7. One-Sided Linear and Regular Grammars over Monoids


As in the classical theory, the left-linear and right-linear grammars may
be defined by the respective restrictions H ⊆ Q × (M ∪ QM ) and H ⊆ Q ×
(M ∪ M Q). Together, these are referred to as one-sided linear grammars. It
is a standard result of the theory of Kleene algebras [20, 21] that the Kleene
operations
{1} ,  A, B ↦ AB,  ∅,  A ∪ B,  A∗ = ⋃n≥0 Aⁿ

suffice to embody all solutions to one-sided linear systems over the Kleene alge-
bra RM of rational subsets of the monoid M , that this family is the free Kleene
extension of M , and that every member is the least fixed point solution to a
finite one-sided linear system. Hence, RM is the family of all languages over
M recognized by one-sided linear grammars.
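In the special case of a single right-linear relation q ⊇ Aq ∪ B over PX∗, the least solution is A∗B. A small length-truncated Python check of this identity (the sets A and B and the MAX_LEN cutoff are illustrative assumptions):

```python
# Least solution of the right-linear relation  q ⊇ A·q ∪ B  over P(X*)
# is q = A*·B.  We check this with length-truncated sets.

MAX_LEN = 6
A, B = {"ab"}, {"c", ""}

def cat(u_set, v_set):
    return {u + v for u in u_set for v in v_set if len(u + v) <= MAX_LEN}

def star(u_set):
    out, frontier = {""}, {""}
    while frontier:
        frontier = cat(frontier, u_set) - out
        out |= frontier
    return out

# Direct iteration of q -> A·q ∪ B from the bottom element ∅:
q = set()
while (nxt := cat(A, q) | B) != q:
    q = nxt

assert q == cat(star(A), B)    # A*B is the least fixed point
print(sorted(q))
```

The same scheme handles systems of several variables by iterating all equations jointly, which is exactly how the Kleene-algebra solution of one-sided linear systems is usually presented.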

2.8. Direct Products and Transductions


One of the most significant oversights of the classical theory is to treat
languages and transductions, and their corresponding automata and transducers,
separately in parallel, instead of within a common framework. Thus, instead of
there being a general formalism for the context-free subsets over monoids, one
encounters separate formalisms: context-free languages and push-down trans-
ductions (equivalently: simple syntax directed translations). One of the main
advantages, therefore, in generalizing grammars to arbitrary monoids is to re-
move this redundancy.
The Cartesian product M × N of two monoids M and N may be endowed
with the structure of a monoid, by defining the product (m, n) (m′, n′) =
(mm′, nn′); the result is known as the direct product. This is an application
of a more general construction, the tensor product, which is discussed more
fully in the appendix. Applied to monoids, the construction entails that
the direct product is the monoid (unique up to isomorphism) satisfying the
universal property that: (1) there be monoid homomorphisms
LM,N : m ∈ M ↦ (m, 1) ∈ M × N ;  RM,N : n ∈ N ↦ (1, n) ∈ M × N

that commute with one another, LM,N (m) RM,N (n) = RM,N (n) LM,N (m) for
(m, n) ∈ M × N ; and (2) for any other monoid homomorphisms f : M → P

and g : N → P that commute with each other, there is a unique monoid
homomorphism ⟨f, g⟩ : (m, n) ∈ M × N ↦ f (m) g (n) ∈ P such that

⟨f, g⟩ ◦ LM,N = f,  ⟨f, g⟩ ◦ RM,N = g. (7)

This uniqueness is equivalently stated by the following two properties,

h ◦ ⟨f, g⟩ = ⟨h ◦ f, h ◦ g⟩ ,  ⟨LM,N , RM,N ⟩ = 1M ×N , (8)

where h : P → P ′ is another monoid homomorphism.


These elements may be combined with the one-element monoid 1, which
may be characterized as the initial object in the category of monoids. This is
the object defined by the universal property that to each monoid M there be
a unique monoid homomorphism &M : 1 → M . The uniqueness is equivalently
given by the following conditions

f ◦ &M = &N , &1 = 11

where f : M → N is a monoid homomorphism. As described in the appendix,


the combination of the tensor product construction and initial object yields what
is known as a braided monoidal category. Applied to the category of monoids,
this entails the following isomorphisms,

(M × N ) × P ≅ M × (N × P ) ,
M × 1 ≅ M ≅ 1 × M,
M × N ≅ N × M.

The subsets of M × N give us descriptions of transductions between two


monoids M and N . Classically, this characterization is restricted to direct
products X ∗ × Y ∗ of free monoids, with the rational, context-free and Turing
subsets yielding, respectively, the rational transductions, push-down transduc-
tions (equivalently, simple syntax directed translations) and Turing transduc-
tions (equivalently, recursively enumerable relations between X ∗ and Y ∗ ). In
particular, the family C (X ∗ × Y ∗ ) gives us the arena for parsing theory.
Since the statement of the relation between transduction families and their
respective subset families over the direct product monoids appears to be absent
from classical theory, we will provide an outline of how it may be arrived at
only for the rational and context-free subsets of X ∗ × Y ∗ . To establish the
equivalence of Turing transductions to the family T (X ∗ × Y ∗ ) and to generalize
these results to transductions over arbitrary monoids requires that we expand
our treatment by also generalizing automata and transducers to general monoids
– which lies outside the scope of this paper.
The classical definition of a finite transducer over the monoid X ∗ × Y ∗ [36,
section 7.2, p. 340] is a structure T = (Q, X, Y, H, S, F ) with S ∈ Q, F ⊆ Q.
The configurations comprise the set Y ∗ QX ∗ (with the assumption Y ∩ Q = ∅).
Transition relations are generated by a finite subset H ⊆ Q×(X ∪ {1})×Q×Y ∗ ,
with each (q, w, q 0 , v) ∈ H yielding the one-step transition qw → vq 0 . The

relation →, itself, is defined as the closure under C, R and T of the one-step
transitions. Finally, the subset generated by the transducer is
L (T ) = {(w, v) ∈ X ∗ × Y ∗ : Sw → vf, f ∈ F } .
It follows, by inductive argument, that each transition vqw̄ → v̄q′w can
be factored as vqw′w → vv′q′w and then treated as the closure under C of
the transition qw′ → v′q′. The same information is contained in each of the
following

q →W (w′, v′) q′ ,  q (w′, v′) →R q′

where the write →W and read →R transitions are each defined as relations over
(X ∗ × Y ∗ ) [Q]. These, in turn, may be generated from the corresponding one-
step transitions

HW = {(q, (w, v) q′) : (q, w, v, q′) ∈ H} ,
HR = {(q (w, v) , q′) : (q, w, v, q′) ∈ H}
where we add transitions f →α 1 for f ∈ F and α ∈ {W, R}. By inductive ar-
gument, we obtain the following correspondences: (w, v) ∈ L (T ) iff S →W (w, v)
iff S (w, v) →R 1. In particular, G = (Q, S, HW ) is a right-linear grammar over
X ∗ × Y ∗ for which L (G) = L (T ). The other relation →R defines a transduc-
tion over (X ∗ × Y ∗ ) × 1, while →W can, itself, be viewed as a transduction over
1 × (X ∗ × Y ∗ ).
Going in the reverse direction, a right-linear grammar over X ∗ × Y ∗ can be
readily transformed into the form HW . First, we factor words (w, v) ∈ X ∗ × Y ∗
into products z0 . . . zk−1 of k ≥ 0 words z0 , . . . , zk−1 ∈ (X ∪ {1}) × Y ∗ . Then
each rule of the form q → (w, v) q̄ (where q̄ ∈ Q ∪ {1}) is decomposed into a set
of rules qi → zi qi+1 for i = 0, . . . , k − 1, with q0 = q and qk = q̄, adding new
variables q1 , . . . , qk−1 as needed. Thus, we have the following result:
Theorem 2.9. The finite transductions between alphabets X and Y consist of
the rational subsets R (X ∗ × Y ∗ ).
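The factoring of a right-linear rule into letter-level rules can be sketched in a few lines of Python. The helper names, the use of "" for the unit 1, and the particular split chosen (all of v attached to the first factor) are illustrative assumptions:

```python
# Decompose a right-linear rule  q -> (w, v) q_bar  over X* x Y* into rules
# whose pair labels lie in (X ∪ {1}) x Y*, introducing fresh states as needed.
# Pairs are tuples of strings; "" plays the role of the unit 1.

def factor(w, v):
    # Split (w, v) into k >= 0 factors from (X ∪ {1}) x Y*.
    if not w:
        return [("", v)] if v else []
    return [(w[0], v)] + [(x, "") for x in w[1:]]

def decompose(q, w, v, q_bar, fresh):
    zs = factor(w, v)
    if not zs:                       # (w, v) = (1, 1): keep a unit rule
        return [(q, ("", ""), q_bar)]
    states = [q] + [next(fresh) for _ in zs[:-1]] + [q_bar]
    return [(states[i], zs[i], states[i + 1]) for i in range(len(zs))]

fresh = iter(f"q{i}" for i in range(100))
for rule in decompose("S", "ab", "xyz", "T", fresh):
    print(rule)
# ('S', ('a', 'xyz'), 'q0')
# ('q0', ('b', ''), 'T')
```

Applying this to every rule of a right-linear grammar over X∗ × Y∗ produces a grammar in the form HW, which is the direction of Theorem 2.9 going from grammars back to transducers.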
The equivalence between push-down transductions and simple syntax di-
rected translations (SSDTs) is well-established classically (e.g., [36, Theorems
7.4.1, 7.4.2, pp. 353–354]). Classically, a grammar for an SSDT over a monoid
X ∗ × Y ∗ ([36, section 7.3, p. 349]) is given by a structure G = (Q, X, Y, H, S)

with S ∈ Q and configurations restricted to a finite subset H ⊆ Q × (Q ∪ X) ×

(Q ∪ Y ) whose elements (q, α, β) ∈ H are further restricted by the condition
that the words α, β have the forms α = w0 q1 w1 . . . qn wn and β = v0 q1 v1 . . . qn vn ,
with q1 , . . . , qn ∈ Q, w0 w1 . . . wn ∈ X ∗ , v0 v1 . . . vn ∈ Y ∗ and n ≥ 0.
Configurations (α, β) are subject to a similar restriction, with the starting
configuration being (S, S).
Thus, each such (q, α, β) ∈ H may be equivalently defined as a context-free
rule q → (w0 , v0 ) q1 (w1 , v1 ) . . . (wn , vn ) qn over the monoid X ∗ ×Y ∗ , with a sim-
ilar conversion applied to each configuration (α, β). With this correspondence,
each SSDT grammar is described equivalently as a context-free grammar over
the product monoid, and vice versa. Therefore, we arrive at the following result:

Theorem 2.10. The push-down transductions between alphabets X and Y are
given by the family C (X ∗ × Y ∗ ) of context-free subsets of the product monoid.
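The rule conversion described here is mechanical; the following Python sketch illustrates it under the simplifying assumption that every terminal and variable is a single character (all names are hypothetical):

```python
# Convert an SSDT rule (q, α, β) — with α = w0 q1 w1 ... qn wn over X ∪ Q and
# β = v0 q1 v1 ... qn vn over Y ∪ Q — into a context-free rule over X* x Y*:
#     q -> (w0, v0) q1 (w1, v1) ... qn (wn, vn)

def split(word, variables):
    # Split w0 q1 w1 ... qn wn into ([w0, ..., wn], [q1, ..., qn]).
    chunks, vs, cur = [], [], ""
    for sym in word:
        if sym in variables:
            chunks.append(cur); vs.append(sym); cur = ""
        else:
            cur += sym
    chunks.append(cur)
    return chunks, vs

def ssdt_to_cf(q, alpha, beta, variables):
    ws, _ = split(alpha, variables)
    us, vs = split(beta, variables)
    assert split(alpha, variables)[1] == vs, "same variables, same order"
    body = [(ws[0], us[0])]
    for i, var in enumerate(vs):
        body += [var, (ws[i + 1], us[i + 1])]
    return (q, body)

print(ssdt_to_cf("S", "aSb", "xSy", {"S", "T"}))
# ('S', [('a', 'x'), 'S', ('b', 'y')])
```

The inverse direction, reading a context-free rule over the product monoid back as an SSDT rule, is the same bookkeeping run in reverse.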

3. The Dioid-Quantale Hierarchy

With the generalization of grammars to arbitrary monoids, the power set
PM of a monoid M is a quantale that assumes the central role occupied in
the classical theory by the free quantales PX ∗ , while also subsuming the
quantales P (X ∗ × Y ∗ ) in which transductions reside.
Partially ordered by subset inclusion, PM is closed under the least upper
bounds ΣY ≡ ⋃ Y of arbitrary subsets Y ⊆ PM . Because of this closure,
the sum is an infinitary idempotent operation Y ∈ PPM ↦ ΣY ∈ PM ,
possessing infinite distributivity, Σ (YY′) = (ΣY) (ΣY′), thus making it a
monoid homomorphism in its own right.
The resulting structure PM is an instance of a quantale;12 it is characterized
as the free quantale extension of M . More generally, a quantale D is a
partially ordered monoid in which every subset U ⊆ D has a least upper
bound ΣU ∈ D with infinite distributivity

Σ (U U ′) = (ΣU ) (ΣU ′) (9)

for U, U ′ ⊆ D, making the sum U ∈ PD ↦ ΣU ∈ D a monoid homo-
morphism. Moreover, because the least upper bound also possesses infinite
associativity

Σ (⋃ Y) = Σ {ΣU : U ∈ Y} (10)

for Y ∈ PPD, the sum operator is also a quantale homomorphism.

3.1. From Semirings to Dioids


In a similar way, an algebra may be associated with the family FM of finite
subsets of a monoid M . Then we have only the finitary form of the addition
operator, and the zero, resulting in a dioid. Such an algebra may be defined as
follows: starting with an algebra D containing the structure of a semiring with
idempotent addition a + a = a, a partial ordering a ≥ b is recovered by the
definition ∃x : a = x + b, or equivalently by a = a + b.
With respect to this relation, a + b becomes the least upper bound of {a, b}
in D, characterized by the property that x ≥ a + b iff x ≥ a and x ≥ b; and
the additive identity 0 becomes the minimal element of D, characterized
by the property x ≥ 0.13 In this way, the description of a dioid D is reduced

12 Generally, one distinguishes between quantales with or without the multiplicative unit 1.

Our focus, here, shall reside largely with the latter variety, the unital quantales, which we
shall, for brevity, refer to as just quantales.
13 A consequence is that the dioid operations (a, b) 7→ ab and (a, b) 7→ a + b are both

monotonic (see, e.g., [13]).

to that of a partially ordered monoid in which every finite subset U ⊆ D has a
least upper bound ΣU ∈ D with

Σ {u1 , . . . , un } = 0 + u1 + . . . + un (n ≥ 0),

which is assumed to be finitely distributive with respect to the product. The zero
and distributivity properties are then equivalently characterized by equation 9
restricted to U, U ′ ∈ FD, which shows that the sum operator is a monoid
homomorphism from FD to D. In addition, because associativity 10 holds for
finite families Y ∈ FFD, the sum operator is also a dioid homomorphism.
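The recovery of the partial order from idempotent addition can be checked concretely in the dioid of finite languages, where + is union and the empty language is 0; a minimal Python sketch (all names are illustrative):

```python
# In an idempotent semiring, the order  a >= b  iff  a = a + b.
# Illustrated in the dioid of finite languages: + is union, · is concatenation.

def geq(a, b):
    return a == a | b            # a >= b  iff  a = a + b

a = {"x", "y", "xy"}
b = {"x", "y"}
zero = set()                     # additive identity = minimal element

assert geq(a, b) and not geq(b, a)
assert geq(a, zero)              # x >= 0 for every x
# a + b is the least upper bound: x >= a+b iff x >= a and x >= b
x = {"x", "y", "xy", "z"}
assert geq(x, a | b) == (geq(x, a) and geq(x, b))
print("order axioms check out")
```

Nothing here depends on the elements being languages; any idempotent semiring yields the same order in the same way.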
A shift in the point of view away from semirings thus occurs when we regard
the ordering relation as the primitive operation, rather than the semiring addi-
tion, and it is more closely aligned to how formal languages were regarded prior
to the advent of the power series formalism. The partial ordering generalizes the
phenomena of reducibility, derivability, transformation, etc. The addition oper-
ator generalizes the phenomenon of non-deterministic branching, and the additive
identity generalizes the notion of failure. The resulting shift in viewpoint places
primacy on partially ordered algebras for languages where (a) concatenation and
the empty word are captured by the underlying monoid structure, and (b) the
non-deterministic elements (derivability, branching, failure) are captured by the
ordering relation and least upper bound.
What the quantale and dioid both have in common is that the sum operation
and unit ηM : m ∈ M ↦ {m} ⊆ M produce left adjoints which inject the
monoid into the dioid FM and quantale PM . Each family of subsets may also
be thought of as a language family, with FM generalizing the classical concept
of finite languages, and PM the general languages.

3.2. Kleene Algebras and Regular Expressions


We may expand on the view of the dioid as an algebra for non-deterministic
processes by including the Kleene star operator a ↦ a∗ , to capture the notion
of iteration or unbounded repetition. In the classical setting such an operator
may be defined as

U ∈ PM ↦ U ∗ = {1} ∪ ⋃n>0 U ⁿ ∈ PM,

or the monoid closure of U ⊆ M . Though the resulting monoid is normally
denoted U ∗ , it need not actually be a free monoid. Where clarity requires it, we
will denote the monoid closure by ⟨U ⟩M .
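Monoid closure is computed by plain saturation. The following Python sketch runs it in a finite monoid (2 × 2 Boolean matrices under Boolean matrix product, an illustrative choice), where the closure is exact and no truncation is needed:

```python
# Monoid closure <U>_M of a subset U of a monoid M, by saturation.
# Here M is the (finite) monoid of 2x2 Boolean matrices under Boolean product,
# so the closure is computed exactly, with no length truncation.

def bmul(a, b):
    return tuple(
        tuple(any(a[i][k] and b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

ONE = ((True, False), (False, True))     # identity matrix = the unit

def closure(u_set):
    out, frontier = {ONE}, {ONE}
    while frontier:
        frontier = {bmul(p, u) for p in frontier for u in u_set} - out
        out |= frontier
    return out

U = {((False, True), (True, False))}     # the swap matrix s, with s·s = 1
print(len(closure(U)))                   # <U> = {1, s}, so this prints 2
```

Over a free monoid the same loop computes U∗, but it must then be truncated (say, by word length) since the closure is infinite whenever U contains a nonempty word.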
For a monoid M , the minimal structure containing the product, sum and
star operations; the injection η̃M (M ) of the underlying monoid M ; and the
distinguished constants ∅ and {1} is a Kleene algebra. As discussed in section
2.7, this structure is equivalently described as the family RM of the rational
subsets of the monoid M .
The Kleene star U ∗ may be characterized, in terms of the ordering relation,
as the least upper bound of all the powers U ⁿ, as U ∗ = Σn≥0 U ⁿ. This identity
can be combined with distributivity to yield what is known as the *-continuity

property Σn≥0 ABⁿC = AB ∗ C. As shown in [20], a *-continuous Kleene algebra
D is equivalently defined as a partially ordered monoid satisfying distributivity
9 for rational subsets U, U ′ ∈ RD. The Kleene algebra homomorphisms
are described equivalently as maps that preserve the Kleene operators, or as
monoid homomorphisms that preserve least upper bounds for rational subsets.
Moreover, since RD is closed under rational least upper bounds, associativity
10 follows for Y ∈ RRD.
Thus, just as the quantale is associated with the classical family of general
languages, and the dioid with the family of finite languages, the *-continuous
Kleene algebra is associated with the classical family of regular languages, i.e.,
the languages corresponding to type 3 grammars in the Chomsky hierarchy.

3.3. Natural Families and Natural Dioids


Other subset families can be considered as candidates for the application
of properties 9 and 10. For instance, the family ωM of the countable subsets
of a monoid M clearly satisfies 9 and 10. The resulting algebra is a “closed
semiring”: a dioid possessing countable distributivity and least upper bounds of
countable subsets. More generally, a hierarchy of dioid algebras can be obtained
by restricting the domain of the operator (Σ) : AD → D, for a dioid D, to
a distinguished family AD ⊆ PD of subsets. Such dioids we will refer to as
A-dioids; and the hierarchy of algebras, collectively, as natural dioids.
To implement this idea, we first define a natural family to be a correspon-
dence which, when given a monoid M , yields A0 : a subset family AM ⊆ PM ,
which A1 : contains all the finite subsets FM ⊆ AM . By requiring A2 : that
AM be closed under products, we endow AM with the structure of a monoid
partially ordered by subset inclusion. Therefore, we may consider the monoid
AAM . By also requiring that A3 : AM be closed under unions from AAM , we
endow AM with the further structure of an A-dioid.
Finally, we require that A4 : AM be closed under monoid homomorphisms
f : M → M 0 ; that is, for any U ∈ AM , f˜ (U ) ≡ {f (m) ∈ M 0 : m ∈ U } ∈
AM 0 . As a result, the correspondence A : Monoid → Monoid becomes an
endofunctor on the category of monoids and monoid homomorphisms. More
generally, as the following definition shows, the functor allows us to raise the
category of monoids to categories of natural dioids.

Definition 3.1 (The Natural Dioid Category DA). An A-dioid D is a par-
tially ordered monoid that has A-additivity, i.e., ΣU ∈ D for all U ∈ AD; and
(strong) A-distributivity: Σ (U U ′) = (ΣU ) (ΣU ′), for U, U ′ ∈ AD. Equiva-
lently, we may replace this by weak A-distributivity14 : Σ dU d′ = d (ΣU ) d′ for
d, d′ ∈ D and U ∈ AD. An A-morphism f : D → D′ is a monotonic monoid
homomorphism for which Σ f̃ (U ) = f (ΣU ), for all U ∈ AD. Together the
category of A-dioids and A-morphisms is denoted DA.

14 In [16], the names for the two forms of distributivity were inadvertently swapped.

What we have done is complete the construction of what is known as a T-
Algebra [26] or an Eilenberg-Moore Algebra. The category DA is referred to
as an Eilenberg-Moore category.15 The functor A can therefore be regarded as
a map between the categories Monoid and DA. Since the algebras in DA
contain the structure of monoids, what we actually have is an adjunction
with a forgetful functor Â : DA → Monoid that reduces each A-dioid D to
its underlying monoid ÂD = D and each A-morphism F : D → D′ to its
underlying monoid homomorphism ÂF = F : ÂD → ÂD′. The adjunction
relation is established by the following theorem.
Theorem 3.2. The functor A : Monoid → DA and the forgetful functor
Â : DA → Monoid form an adjunction pair (A, Â).

Proof. An adjunction that (up to equivalence) arises from a T-algebra construc-
tion is referred to as monadic. This is what we are actually verifying. In the
following, we will use the notation and conventions defined in the appendix.
The proof that A is actually a functor requires showing closure under identity
functions and composition: 1̃M = 1AM and (f ◦ g)˜ = f̃ ◦ g̃, which are both trivial
consequences of set theory.
The one-to-one correspondence between monoid homomorphisms f : M → ÂD
and A-morphisms F : AM → D is defined as follows:

f ∗ : U ∈ AM ↦ Σ f̃ (U ) ∈ D,  F∗ : m ∈ M ↦ F ({m}) ∈ ÂD.

The identities (f ∗ )∗ = f and (F∗ )∗ = F immediately follow.
The naturalness of this correspondence is shown as follows. Let F : D → D′
be an A-morphism between A-dioids D and D′, let h : M ′ → M and g : M →
ÂD be monoid homomorphisms. Then for each U ∈ AM ′, we have

(ÂF ◦ g ◦ h)∗ (U ) = Σ F̃ (g̃ (h̃ (U ))) = F (Σ g̃ (h̃ (U )))
= F (g ∗ (h̃ (U ))) = (F ◦ g ∗ ◦ Ah) (U ) .

This results in a hierarchy of monads. For each natural family A there is an
adjunction pair (A, Â) that extends the category of monoids to the category of
A-dioids. The unit of the adjunction is the polymorphic function (i.e., natural
transformation) η : IMonoid → Â ◦ A with ηM (m) = {m} ∈ AM for m ∈ M ,
which is given by A1 . The monad product µ : (Â ◦ A)² → Â ◦ A is the union
µM (Y) = ⋃ Y ∈ AM for Y ∈ AAM , which is given by A3 . Closely related to
the product is the co-unit ε : A ◦ Â → IDA given by εD (U ) = ΣU ∈ D for

15 The subcategory of DA consisting only of the free A-extensions AM of monoids M gives

us what is called a Kleisli category.

U ∈ AD. Finally, related to the unit is the co-product δ : A ◦ Â → (A ◦ Â)²,
defined by δD (U ) = {{u} : u ∈ U } ∈ AAD for U ∈ AD.
Elementary consequences of this construction, which are generally true for
T-algebras, are the following. Detailed proofs may be found in [16, theorems 1,
2, 3, 4].
Theorem 3.3. Let M 7→ AM be a natural family. Then
• AM is an A-dioid for any monoid M .
• (Σ) : AD → D is an A-morphism for any A-dioid D.
• Every monoid homomorphism f : M → N lifts to an A-morphism f˜ :
AM → AN .

• (The Universal Property). The free A-dioid extension of a monoid M is


AM .
Equivalently, the universal property may be stated as: (a) ηM : M → AM
is a monoid homomorphism, and (b) to each monoid homomorphism f : M →
ÂD to an A-dioid D corresponds a unique A-morphism f ∗ : AM → D such
that f = f ∗ ◦ ηM . As an illustration of the categorical algebra developed in
the appendix for adjunctions, the uniqueness can be established as follows. If
f = F ◦ ηM = ÂF ◦ ηM , then noting the identity ηM = (1AM )∗ , it follows that
(ηM )∗ = 1AM . Therefore, f ∗ = (ÂF ◦ ηM )∗ = F ◦ (ηM )∗ = F .
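The factorization f = f∗ ∘ ηM can be observed concretely. In the sketch below, A = F (finite subsets), M is a free monoid of strings, and D is the dioid of finite subsets of ℕ whose product is elementwise addition and whose sum Σ is union; the choice of f as the length map, and all names, are illustrative assumptions:

```python
# The universal property of the free A-dioid AM (here A = F, finite subsets):
# a monoid homomorphism f : M -> D into an A-dioid lifts uniquely to the
# A-morphism f*(U) = Σ f~(U), with f = f* ∘ η, where η(m) = {m}.
# Sketch: M = strings under concatenation, D = finite subsets of ℕ, whose
# product is elementwise + and whose sum Σ is union.

def f(word):                       # monoid hom: f(uv) = f(u)·f(v)
    return frozenset({len(word)})

def d_prod(a, b):                  # the product of the dioid D
    return frozenset({x + y for x in a for y in b})

def f_star(U):                     # the lift F(X*) -> D: Σ of the image of U
    return frozenset().union(*(f(w) for w in U))

def eta(m):                        # the unit m |-> {m}
    return frozenset({m})

assert f("abc") == f_star(eta("abc"))                    # f = f* ∘ η
U, V = {"a", "bb"}, {"", "ccc"}
UV = {u + v for u in U for v in V}
assert f_star(UV) == d_prod(f_star(U), f_star(V))        # f* is multiplicative
print(sorted(f_star(UV)))
```

Any other monoid homomorphism into an A-dioid factors the same way, which is the content of the uniqueness argument just given.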
Other results that are not general consequences of the T-algebra construc-
tion, but specific to our construction are the following.
Theorem 3.4. Natural families respect submonoid ordering: if M ⊆ M 0 , then
AM ⊆ AM 0 .
Proof. This is the direct result of applying A4 to the inclusion homomorphism
m ∈ M 7→ m ∈ M 0 .
Note that we do not necessarily have the stronger property AM = AM ′ ∩
PM . That is, even if U ⊆ M is an A-subset of M ′ ⊇ M , it need not follow
that U ∈ AM . The question of whether or not the stronger property holds is
unresolved.

Theorem 3.5 (Hierarchical Completeness). Natural families form a complete


lattice with top P (AM = PM ) and bottom F (AM = FM ).
Proof. Let Z be a family of natural families, and define (∧Z) M = ⋂A∈Z AM .
If Z = ∅, we define ∧Z = P. Otherwise, suppose Z ≠ ∅. Properties A0 ,
A1 , A2 and A4 are then easily verified for ∧Z. Property A3 , however, is not as
immediate. For, supposing that A ∈ Z, we then have (∧Z) M = ⋂A∈Z AM ⊆ AM .
To complete the proof, we need to make use of the preservation of submonoid

ordering under natural families. Then, we may write A (∧Z) M ⊆ AAM . Thus,
for any family of subsets Y ∈ (∧Z) (∧Z) M , we have that

Y ∈ ⋂A∈Z A (∧Z) M ⊆ A (∧Z) M ⊆ AAM.

Thus, by A3 , ⋃ Y ∈ AM . Since A ∈ Z was arbitrarily chosen, this shows that
⋃ Y ∈ ⋂A∈Z AM = (∧Z) M . Thus ∧Z satisfies property A3 .
The lattice ordering relation is a restriction of the following to natural fami-
lies, while finite meets are directly expressible in terms of intersections as follows:

Definition 3.6. Let M ↦ AM and M ↦ A′M denote two subset families.
Then we write A ≤ A′ and A′ ≥ A if and only if ∀M : AM ⊆ A′M . The meet
is defined by A ∧ A′ : M ↦ AM ∩ A′M .

3.4. A-Topology
As preconditions to both forms of A-distributivity and to A-continuity, we
required A-additivity. However, the properties can be generalized to partially
ordered monoids and made independent of additivity, with the following defini-
tion.

Definition 3.7. Let M be a partially ordered monoid and write m ≥ U if
m ∈ M is an upper bound of a subset U ⊆ M . Then M is strongly A-separable
if for all m ≥ U U ′, there exist u ≥ U and u′ ≥ U ′ such that m ≥ uu′, where
m ∈ M and U, U ′ ∈ AM ; M is weakly A-separable if for all m ≥ aU b, there
exists u ≥ U , such that m ≥ aub, where a, b ∈ M and U ∈ AM . Finally, an
order-preserving monoid homomorphism f : M → M ′ to a partially ordered
monoid M ′ is A-continuous if, for all m′ ≥ f̃ (U ), there exists u ≥ U , such that
m′ ≥ f (u), where U ∈ AM and m′ ∈ M ′.
In an A-dioid D, the condition u ≥ U is equivalent to u ≥ ΣU , when
U ∈ AD. Therefore, one may verify that, when restricted to the category DA,
the strong and weak forms of A-separability are equivalent to one another and
to both forms of A-distributivity, while A-continuity equivalently defines an
A-morphism.
The family AM can be endowed with a topological structure with respect
to which a monoid homomorphism f : M → M 0 is A-continuous if and only if
f˜ : AM → AM 0 is continuous. A sufficient condition for this correspondence is
given by the following:

Definition 3.8. A partially ordered monoid M is A-directed if the set of upper
bounds of each U ∈ AM forms a non-empty downward-directed subset of M
(i.e., for every two upper bounds u1 , u2 ≥ U there is a third upper bound u ≥ U
such that u1 , u2 ≥ u). For such monoids, it follows that the neighborhoods
Ax M = {U ∈ AM : U ≤ x} generate a topology that we will adopt as the A-
topology of AM . The open sets are defined as arbitrary unions of neighborhoods.

We can then state the following results:
Theorem 3.9. Let M, M ′ be A-directed partially ordered monoids. Then f :
M → M ′ is A-continuous if and only if f̃ : AM → AM ′ is continuous. In
addition, M is weakly A-separable if and only if, for each a, b ∈ M , the function
U ∈ AM ↦ aU b ∈ AM is continuous; and M is strongly A-separable if and only
if the product U, V ∈ AM ↦ U V ∈ AM is continuous.
Proof. Both directions of each of the three correspondences rely on the equiv-
alence x ≥ U ⇔ U ∈ Ax M . This is illustrated for A-continuity, the other
correspondences being similar.16
First, suppose f : M → M ′ is A-continuous and O ⊆ AM ′ is an open set.
Let U ∈ f̃ −1 (O); i.e., f̃ (U ) ∈ O. Then, there is a neighborhood Ay M ′ ⊆ O
containing f̃ (U ); or, equivalently, y ≥ f̃ (U ). By A-continuity, we have y ≥
f (u), for some u ≥ U . Since f is monotonic, f̃ (Au M ) ⊆ Ay M ′, while
U ∈ Au M . Thus, f̃ −1 (O) is open.
Next, suppose f̃ : AM → AM ′ is continuous. Let U ∈ AM and f̃ (U ) ≤ y.
Then f̃ (U ) ∈ Ay M ′. Therefore, by continuity, there is a neighborhood Au M
containing U , such that f̃ (Au M ) ⊆ Ay M ′. In particular, since {u} ∈ Au M ,
then {f (u)} ∈ Ay M ′, or f (u) ≤ y. Since U ∈ Au M , then u ≥ U . Thus, we
establish the A-continuity of f .

3.5. Examples
Example 3.10. From our prior discussion of generalized grammars, we have
the Chomsky hierarchy, which consists of the families RM , CM , SM and T M ,
ordered by R ≤ C ≤ S ≤ T . In the following, we will show that each
is a natural family, except S.

Example 3.11. We may define the following cardinality-limited subset families

Fk M = {U ⊆ M : card (U ) < k} ,  Pc M = {U ⊆ M : card (U ) ≤ c},

for each regular transfinite cardinal k and transfinite cardinal c. The closure of
each of these families under products, unions and monoid homomorphisms are
easy consequences of set theory17 , thus proving the naturality of each family.
As special cases, we have the natural families F = Fℵ0 and ω = Pℵ0 of finite
and countable subsets, respectively.

Example 3.12 (Finite Generativity). Each A can be restricted to the sub-
family formed from finitely generated submonoids:

A0 M = ⋃X∈F M A ⟨X⟩M .

16 For the continuity of the product, we use the product topology on AM × AM .


17 For Fk , property A3 is the definition of a regular cardinal.

The largest such family is P0 . For the other natural families, we have A0 ≤
A ∧ P0 in the lattice ordering of natural families, with equality if A satisfies the
strong version of submonoid ordering.18

4. Closure Properties and Constructions

4.1. Free Extensions and Substitution Expressions


The description of free extensions we gave in section 2.1 as a categorical
algebra applies also to the category DA of A-dioids. Here, the corresponding
operations are the A-morphism ιD,Q : D → D [Q], the map σD,Q : Q → D [Q],
and the construction of a unique A-morphism hf, σi : D [Q] → D0 from each
A-morphism f : D → D0 and map σ : Q → D0 , satisfying equations 1 and 2.
Roughly speaking, a construction for the free extension D [Q] of the A-dioid
D by the set Q may be obtained from the monoid free extension ÂD [Q] of
the monoid ÂD, defining the product by equation 3 and subjecting it to the
identities
Σ (U q . . . q′ U ′) = (ΣU ) q . . . q′ (ΣU ′) , (11)

for all U, . . . , U 0 ∈ AD and q, . . . , q 0 ∈ Q. The necessity of relations 11 is de-


termined by A-distributivity. The condition suffices to ensure the A-continuity
of ιD,Q and to show that equation 4 yields a monoid homomorphism ⟨f, σ⟩ :
D [Q] = ÂD [Q] → ÂD′ = D′ that preserves relations 11.
In more precise terms, (AD) [Q] may be defined as the free A-extension
A (ÂD [Q]), subject to the relations

Σ (Yq . . . q′ Y′) = (⋃ Y) q . . . q′ (⋃ Y′) , (12)

for Y, . . . , Y′ ∈ AA (ÂD [Q]) and q, . . . , q′ ∈ Q. The A-morphism ιAD,Q =
ηÂD[Q] ◦ ιÂD,Q and the substitution σAD,Q = ηÂD[Q] ◦ σÂD,Q provide us with
a realization of part of the universal property. For the remainder of the
universal property, we define the construction ⟨f, σ⟩ = ⟨Âf, σ⟩∗ , which pre-
serves the relations 12 and is therefore well-defined as a function on D [Q].
The subsets U ∈ AD are represented through the co-product by δ (U ′) =
{{ιÂD,Q (u)} : u ∈ U } ∈ AA (ÂD [Q]), where U ′ = ι̃ÂD,Q (U ) ∈ A (ÂD [Q]),
while ΣU is represented by ΣU ′, since ⋃ δ (U ′) = U ′. Therefore, relations 11
are realized as a special case of 12.
As a consequence of the free extension universal property, isomorphisms
analogous to equation 5 may be obtained, with the relation 1 [Q] ≅ Q∗ replaced

18 See the discussion following theorem 3.4. The proof of A3 is analogous to that in theorem
3.5.

by 2 [Q] ≅ A (Q∗ ), where 2 = P1 = {∅, {0}} is the 2-element dioid. Similarly,
the following isomorphisms analogous to equation 6 hold:

(AX ∗ ) [Q] ≅ 2 [X] [Q] ≅ 2 [X ∪ Q] ≅ A ((X ∪ Q)∗ ) (X ∩ Q = ∅). (13)

Finally, the relation to the monoid free extension is established by the following
theorem.
Theorem 4.1. A (M [Q]) ∼
= (AM ) [Q].
Proof. The isomorphism is given explicitly in terms of the categorical
algebra by the following:

⟨(ιAM,Q )∗ , σAM,Q ⟩∗ : A (M [Q]) → (AM ) [Q] ,
⟨AιM,Q , ηM [Q] ◦ σM,Q ⟩ : (AM ) [Q] → A (M [Q]) ,

which one may verify are inverses. Based on these considerations, we may define
the substitution expression.

Definition 4.2 (Substitution Expression). Suppose D is a natural dioid,
L ∈ D [Q] and σ : Q = {q1 , . . . , qk } → D for some k ≥ 0. Then,

(q1 = σ (q1 ) , . . . , qk = σ (qk ) , L) ≡ ⟨1D , σ⟩ (L) ∈ D.

4.2. A5 : Closure under Substitutions


The product closure property A2 is well-known in the classical theory and
readily established in that setting for the finite, countable, regular, context-free,
context-sensitive and Turing subsets of a free monoid. A similar observation ap-
plies to the homomorphism closure property A4 , with the proviso that context-
sensitive subsets only possess closure under what are referred to as non-erasing
monoid homomorphisms.
In contrast, the union closure property A3 is decidedly non-classical. How-
ever, there is a classical property closely related to it that also happens to
subsume A4 . This corresponds to the classical concept of substitution. Given
two monoids M and M 0 , a substitution σ : M → PM 0 may be thought of as
the generalization of a map that replaces terminals by languages. Each element
of M is replaced by a subset of M 0 . Reflecting the hierarchy of natural families
A is a similar hierarchy of substitutions, as given by the following definition.

Definition 4.3. Let M and M 0 be monoids. A monoid homomorphism σ :


M → PM 0 is called a substitution. If, in addition, AM 0 ⊆ PM 0 is any family
of subsets such that σ (m) ∈ AM 0 , for each m ∈ M , then σ is an A-substitution.

Each substitution σ : M → PM ′ extends uniquely to a map between the
respective power sets of the monoids, given by σ ∗ (U ) = ⋃m∈U σ (m) ∈ PM ′ for
U ∈ PM . This extension distributes over unions: σ ∗ (⋃ Y) = ⋃U ∈Y σ ∗ (U ), for
Y ∈ PPM . Therefore, it is also a quantale homomorphism. This leads to the
following result.

Theorem 4.4. A substitution σ : M → PM 0 determines and is uniquely deter-
mined by a quantale homomorphism φ : PM → PM 0 such that φ ({m}) = σ (m)
for m ∈ M .
Proof. In the forward direction, for Y ∈ PPM, we have

σ∗(⋃Y) = ⋃U∈Y ⋃m∈U σ(m) = ⋃m∈⋃Y σ(m) = ⋃U∈Y σ∗(U).

For m ∈ M, we have σ∗({m}) = ⋃m′∈{m} σ(m′) = σ(m). Conversely, suppose φ : PM → PM′ is a quantale homomorphism satisfying the condition of the theorem. Then, making use of the invariance property, for each U ∈ PM, we have

φ(U) = ⋃m∈U φ({m}) = ⋃m∈U σ(m) = σ∗(U).
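As an illustrative aside (not part of the paper's development), the extension σ∗ can be computed mechanically when the monoids are free and all languages finite. In the following sketch, words are Python strings and the helper names `concat` and `sigma_star` are our own:

```python
# A hedged sketch: a substitution sigma on a free monoid, replacing each
# generator by a finite language, and its extension sigma* to languages.

def concat(U, V):
    """Product of two languages: all concatenations uv."""
    return {u + v for u in U for v in V}

def sigma_star(sigma, U):
    """Extension of sigma to a language U: the union, over words w in U,
    of the products of the images of the letters of w."""
    result = set()
    for w in U:
        images = {""}            # the image of the empty word is {1}
        for letter in w:
            images = concat(images, sigma[letter])
        result |= images
    return result

sigma = {"a": {"x", "xy"}, "b": {"z"}}
# sigma* distributes over unions, as in the proof above:
U1, U2 = {"ab"}, {"b"}
assert sigma_star(sigma, U1 | U2) == sigma_star(sigma, U1) | sigma_star(sigma, U2)
```

Since σ∗ also preserves products and the unit on such finite data, it behaves as the quantale homomorphism of Theorem 4.4 restricted to this setting.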

With these preliminaries, we may then state the following property, A5: A respects A-substitutions. That is, if σ : M → AM′ is an A-substitution, then σ∗(U) ∈ AM′ for all U ∈ AM. The equivalence of A5 to A3 and A4 is then established by the following result.
Theorem 4.5. Suppose M 7→ AM is a subset family satisfying A0 , A1 and
A2 . Then A3 and A4 are equivalent to A5 .
Proof. In the following, let M and M′ be monoids.
First, we will prove A0, A2, A3, A4 → A5. Suppose σ : M → AM′ is an A-substitution and U ∈ AM. Then, by A0 and A2, AM′ ⊆ PM′ is a monoid and σ : M → AM′ a monoid homomorphism. By A4, it follows that σ̃(U) ∈ AAM′. In turn, by A3, it follows that σ∗(U) = ⋃σ̃(U) ∈ AM′. The conversion σ ↦ σ∗ is just the natural bijection already given to us by the adjunction A : Monoid → DA.
Second, we will prove that A1, A5 → A4. Suppose f : M → M′ is a monoid homomorphism and U ∈ AM. Then by A1, A > F; therefore σ : m ∈ M ↦ {f(m)} ∈ FM′ ⊆ AM′ is an A-substitution. Applying A5, it follows that σ∗(U) ∈ AM′. But σ∗(U) = ⋃m∈U {f(m)} = f̃(U). Therefore, f̃(U) ∈ AM′.
Finally, we will prove that A0, A2, A5 → A3. Suppose Y ∈ AAM. Then, the identity σ = 1AM : AM → AM is an A-substitution by A0 and a monoid homomorphism by A2. Therefore, from A5 it follows that σ∗(Y) ∈ AM. But σ∗(Y) = ⋃U∈Y U = ⋃Y. Therefore, ⋃Y ∈ AM.

4.3. A6 : Surjectivity and Closure under Inverse Morphisms


A partial recapitulation of the classical property of closure under inverse
homomorphisms can be obtained with the following property.
Theorem 4.6 (A6 ). If the monoid homomorphism f : M → N is surjective,
then so is the A-morphism f˜ : AM → AN .

Proof. The property of surjectivity may be stated solely in terms of the properties of homomorphisms in the following way: given homomorphisms g, h : N → P to another monoid P, if g ◦ f = h ◦ f then g = h. Surjectivity for f̃ is proven analogously.
Therefore, assume G, H : AN → D are A-morphisms such that G ◦ f̃ = H ◦ f̃. Then, making use of the categorical algebra associated with adjunctions, we have

G∗ ◦ f = (G ◦ Af)∗ = (H ◦ Af)∗ = H∗ ◦ f → G∗ = H∗ → G = H,

the first inference following by the surjectivity of f, the second following since G = (G∗)∗ and H = (H∗)∗. This conclusion can also be regarded as an application of the universal property (cf. [16, theorem 9]).
As a consequence, we find that natural families respect inverse morphisms
in the following sense:
Theorem 4.7. Let A be a natural family. If f : M → N is a surjective monoid homomorphism and V ∈ AN, then V = f̃(U) for some U ∈ AM. Moreover, there is a monoid N̂, a surjective map σ : N̂ → N, and a factoring σ = f ◦ φ into φ : N̂ → M and f, such that each V ∈ AN may be expressed as σ̃(V̂) for some V̂ ∈ AN̂ where φ̃(V̂) ∈ AM.

Proof. The first statement follows directly from A6 . For the second part, let
Y ⊆ N be any generating subset of the monoid N . The universal property for
free monoids then associates a canonical monoid homomorphism σ : N̂ = Y ∗ →
N with the inclusion σ : Y → N . This maps the free monoid Y ∗ generated by
the set Y onto the closure of that set within N , which (by assumption) is just
N , itself.
Let V ∈ AN. Since σ : N̂ → N is surjective, there exists V̂ ∈ AN̂ such that V = σ̃(V̂). To define φ, for each y ∈ Y, we need to choose an element m ∈ M such that f(m) = σ(y), and then define φ(y) = m. The remainder of the theorem then follows by property A4.

4.4. A7 : Finite Generativity


In general, if Y is an infinite set, the second part of theorem 4.7 requires the
Axiom of Choice. However, for certain natural families, we will always be able
to express a subset V ∈ AN as V ∈ AY ∗ for some finite subset Y ⊆ N . This
is captured by the Finite Generativity property,19 A7 : every A-subset is an A-
subset of a finitely generated submonoid. That is, if U ∈ AM then U ∈ A hXiM
for some finite X ⊆ M . It is clear that A satisfies A7 if and only if A = A0 (see
example 3.12). In addition, if A satisfies the strong form of submonoid ordering
(discussed following theorem 3.4), this condition is equivalent to A 6 P0 .

19 This property was adopted by [12, p. 356, following lemma 2.5] as a condition for context-

free subsets of arbitrary free monoids. Though it was also mentioned in [16], it was not
discussed in any detail there.

Theorem 4.8. F, R, C, S, T each satisfy A7 .
Proof. Let M be a monoid. For U ∈ FM , we take X = U . Then U ∈ F hXiM .
For the other cases, suppose Y is a generating subset of M . Then U is given by a
grammar expression U = (α0 → α1 , . . . , α2n−2 → α2n−1 , α2n ) of the appropriate
type over hY iM = M . Then, we may write the component words in the form
αi = zi0 zi1 . . . zini , where 0 6 i 6 2n, ni > 0 and zij ∈ Y ∪ Q for 0 6 j 6 ni .
Then
X = Y ∩ {zij : 0 6 i 6 2n, 0 6 j 6 ni }
gives us a suitable choice for the finite set X. Since the rules α2i → α2i+1 and
starting configuration α2n are all over hXiM [Q] ⊆ hY iM [Q] = M [Q], it follows
that the grammar expression yields a subset U ⊆ hXiM of the appropriate
type.

4.5. Finite Dioids, Initial and Terminal Objects, Syntactic Monoids


All finite dioids have a complete lattice ordering. Therefore, each category
DA contains the same finite algebras. The smallest dioid 1 = {0} is uniquely
defined by the identity 0 = 1. It has a trivial product 00 = 0, ordering 0 6 0
and sum 0 + 0 = 0 and is a terminal object in each natural dioid category. That
is, it satisfies the universal property that for each algebra D there is a unique
morphism ∗D : D → 1. The uniqueness is given by the conditions ∗1 = 11 and
∗D0 ◦ F = ∗D where F : D → D0 is a morphism. Since 0 = 1 in 1, no dioid
morphism f : 1 → D exists, except for D = 1.
The next smallest dioid 2 = {0, 1} has a product 00 = 01 = 10 = 0 and
11 = 1, an ordering 0 < 1 and a sum 0 + 0 = 0 and 0 + 1 = 1 + 0 = 1 + 1 = 1.
This is the initial object and satisfies a universal property analogous to the initial
monoid 1, discussed in section 2.8. Denoting the unique morphism to the algebra
D by &D : 2 → D, we have &2 = 12 and F ◦ &D = &D0 , for each morphism
F : D → D0 . Moreover, it follows from the respective universal properties for
the monoid 1 and dioid 2, that A&M ◦ &A1 = &AM and Â&D ◦ &Â2 = &ÂD .
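As a quick computational aside (our own sketch, not the paper's), the tables just given for 2 = {0, 1} can be checked to satisfy the dioid (idempotent semiring) axioms:

```python
# A hedged sketch: verify that 2 = {0,1}, with sum 0+0=0, 0+1=1+0=1+1=1
# (logical or) and product 11=1, 00=01=10=0 (logical and), is a dioid.
import itertools

def add(a, b):
    return a | b

def mul(a, b):
    return a & b

els = [0, 1]
for a, b, c in itertools.product(els, repeat=3):
    assert mul(mul(a, b), c) == mul(a, mul(b, c))           # associative product
    assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))   # left distributivity
    assert mul(add(a, b), c) == add(mul(a, c), mul(b, c))   # right distributivity
assert all(add(a, a) == a for a in els)                     # idempotent sum
assert all(mul(1, a) == a == mul(a, 1) for a in els)        # 1 is the unit
assert all(mul(0, a) == 0 == mul(a, 0) for a in els)        # 0 annihilates
```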
The dioids of cardinality 3 are defined, respectively, by the presentations 3₀ = ⟨x : x² = x < 1⟩, 3₁ = ⟨x : 1 < x = x²⟩ and 3₂ = ⟨x : x² = 0, x < 1⟩.
There are 21 dioids D = {0, 1, x, y} of order 4. Letting A, B range over
{0, 1}, they are described as follows. Five are ordered as a Boolean lattice, four
satisfying the identities x + 1 = y and x2 = Ax + B, and the fifth 2 ⊕ 2 is the
direct sum, which is defined uniquely by the identity x + y = 1.
The remaining 16 dioids are on a linear ordering. The two with the ordering
0 < 1 < x < y are given by x2 = x + Ay. One has the ordering 0 < x < 1 < y
and is given by x2 = 0, while the other four with the same ordering are given
by x2 = x, xy = x + Ay and yx = x + By.
Finally, the remaining 9 all have the ordering 0 < x < y < 1. One is
uniquely defined by y 2 = 0, two are defined by y 2 = x and y 3 = Ax, four by
y 2 = y, x2 = 0, xy = Ax and yx = Bx, and the last two by y 2 = y, x2 = x,
xy = x + Ay = yx.

Other examples may be constructed from the syntactic monoids X∗/L ≡ {w/L : w ∈ X∗} associated with regular languages L ∈ RX∗, where

w/L ≡ {w′ ∈ X∗ : (∀u, v ∈ X∗)(uwv ∈ L ↔ uw′v ∈ L)}.

For each natural family A, A (M/L) is a finite A-dioid, since M/L is finite.
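As a computational aside (our own sketch, not taken from the text), the syntactic monoid of a regular language can be obtained as the transition monoid of its minimal automaton; here for the assumed example L = (xx)∗ over {x}:

```python
# A hedged sketch: the syntactic monoid X*/L computed as the transition
# monoid of a minimal DFA for L. State transformations are tuples t,
# with t[s] the image of state s.

def transition_monoid(num_states, generators):
    """Close the generator transformations under composition."""
    ident = tuple(range(num_states))
    monoid = {ident}
    frontier = [ident]
    while frontier:
        f = frontier.pop()
        for g in generators:
            h = tuple(g[f[s]] for s in range(num_states))   # apply f, then g
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid

# Minimal DFA for L = (xx)* over {x}: state 0 (even count, accepting)
# and state 1 (odd count); reading x swaps the two states.
M = transition_monoid(2, [(1, 0)])
assert M == {(0, 1), (1, 0)}    # X*/L is the two-element group
```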
More generally, each finite dioid D may be associated with a partition of X∗, where X = D − {0, 1} is taken as a generating subset of D and σ : X∗ → D as the canonical monoid homomorphism to D. A finite state automaton A with start state 1 ∈ D, final state F ∉ D, and transitions d → dx on x ∈ X and d → F on d̂ for d ∈ D yields a partition L(A) = Σd∈D Ld d̂ into mutually disjoint subsets L₀, L₁ ∈ RX∗, and Lx = σ⁻¹(x) ∈ RX∗ for x ∈ X.
For instance, associated with the direct sum 2 ⊕ 2 is the partition L₁ = 1, Lx = xx∗ and Ly = yy∗. Associated with the dioid given by 0 < x < y < 1, y² = x and y³ = 0 is the partition Lx = x + yy, Ly = y and L₁ = 1.

4.6. Tensor Products


As was true with free extensions, the description we gave of tensor products
in section 2.8 as a categorical algebra may also be applied to the category DA,
with both tensor product constructions applications of Appendix B.
The resulting algebra may be termed the tensor product of the A-dioids A and B, written A ⊗ B or A ⊗A B for clarity. It is succinctly described as the free common extension of the two dioids in which the elements of A and B commute with one another.
The corresponding statement of the universal property gives us the factor A-morphisms LA,B : A → A ⊗ B and RA,B : B → A ⊗ B, which commute with one another, and a unique A-morphism ⟨F, G⟩ : A ⊗ B → C for each pair of A-morphisms F : A → C and G : B → C that commute with one another, such that ⟨F, G⟩ ◦ LA,B = F and ⟨F, G⟩ ◦ RA,B = G. Again, as was the case with the tensor product construction for monoids, the uniqueness is equivalently stated by the conditions that ⟨LA,B, RA,B⟩ = 1A⊗B and H ◦ ⟨F, G⟩ = ⟨H ◦ F, H ◦ G⟩, where H : C → D is another A-morphism.
Following the approach we used in section 4.1, we may first pose a rough description of a suitable construction for D ⊗ D′ as the direct monoid product ÂD × ÂD′, subject to the relations

Σ (U × U′) = (Σ U, Σ U′),   (14)

for U ∈ AD and U′ ∈ AD′. In a similar way, a more precise description may be refined from this by taking the tensor product D ⊗A D′ to be the free monoid extension A(ÂD × ÂD′) subject to the identities

ΣŪ∈Y,Ū′∈Y′ Ū × Ū′ = (⋃Y) × (⋃Y′),   (15)

where Y ∈ AA(ÂD) and Y′ ∈ AA(ÂD′). The factor morphisms may then be represented by ÂLD,D′ = ηÂD×ÂD′ ◦ LÂD,ÂD′ and ÂRD,D′ = ηÂD×ÂD′ ◦ RÂD,ÂD′, while the morphism ⟨F, G⟩ = ⟨ÂF, ÂG⟩∗ preserves the relations 15 and is therefore well-defined as a function on D ⊗A D′, thus giving us a realization of the universal property. Again, we recover the naive relations 14 through the co-product δ on A(ÂD × ÂD′), by associating each U ∈ AD and U′ ∈ AD′ respectively with δ(Ū) and δ(Ū′), where Ū = {(u, 1) : u ∈ U} and Ū′ = {(1, u′) : u′ ∈ U′}. Under this correspondence, 14 becomes a special case of 15.
With this background in mind, we may simply treat A ⊗ B as a common
(mutually commutative) extension of the two algebras, writing products of a ∈ A
and b ∈ B as ab = LA,B (a) RA,B (b) ∈ A ⊗ B.
As an application of the categorical algebra developed in Appendix B, we
arrive at the following two results:

Theorem 4.9. DA is a braided monoidal category, in which the following are


natural isomorphisms:

(D ⊗A D0 ) ⊗A D00 ∼
= D ⊗A (D0 ⊗A D00 ) ,
D ⊗A 2 ∼=D∼ = 2 ⊗A D,
D ⊗A D0 ∼
= D0 ⊗A D.

Theorem 4.10 (Transduction Theorem). The adjunction A : Monoid → DA


is a monoidal adjunction, with A (M × M 0 ) ∼
= AM ⊗A AM 0 a natural isomor-
phism.

Finally, we have the following property:

Theorem 4.11. For any dioid D, D ⊗ 1 = 1 = 1 ⊗ D.
Proof. Since there is no dioid morphism from 1 other than to itself, the dioid morphisms L1,D : 1 → 1 ⊗ D and RD,1 : 1 → D ⊗ 1 immediately imply the stated results.

4.7. A8 : Closure under Involution and Reversal


A monoid anti-homomorphism f : M → M′ is a map with the properties f(mm′) = f(m′) f(m), for m, m′ ∈ M, and f(1) = 1. In particular, if the map is a bijection onto M′ = M with f⁻¹ = f, then we may write f(m) = m† and term the operation an involution. An involution is therefore defined by the properties

1† = 1,   (mm′)† = m′† m†,   (m†)† = m.
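On a free monoid, the archetypal involution is word reversal; the following small sketch (ours, not the paper's) checks the three defining identities at the level of languages:

```python
# A hedged sketch: word reversal as an involution, extended elementwise
# to languages over a free monoid.

def concat(U, V):
    """Product of two languages: all concatenations uv."""
    return {u + v for u in U for v in V}

def dagger(U):
    """Elementwise reversal of a language."""
    return {u[::-1] for u in U}

U, V = {"ab", "c"}, {"de"}
assert dagger({""}) == {""}                                  # 1† = 1
assert dagger(concat(U, V)) == concat(dagger(V), dagger(U))  # (mm')† = m'† m†
assert dagger(dagger(U)) == U                                # (m†)† = m
```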
A similar definition applies to A-dioids, where we also require the preservation of the sum, (Σ U)† = Σu∈U u†, for U ∈ AM. However, in order for this operation to be meaningful, we need to know that the sum can actually be defined. That is, we need the property A8: under an involution m ∈ M ↦ m† ∈ M, U† ≡ {u† ∈ M : u ∈ U} ∈ AM for each U ∈ AM, thus yielding an involution on AM.
An equivalent statement of A8 is that for anti-homomorphisms f : M → M′, if U ∈ AM, then f̃(U) ∈ AM′. For, let X ⊆ M be a generating subset of M and σ : X∗ → M the canonical surjection onto M. Then we may define the homomorphism fσ : w ∈ X∗ ↦ f(σ(w†)) ∈ M′, where w ∈ X∗ ↦ w† ∈ X∗ is the reversal operation. Given U ∈ AM, by surjectivity, we can find a Û ∈ AX∗ such that σ̃(Û) = U. By A8, Û† ∈ AX∗. Therefore, by A4, f̃(U) = f̃σ(Û†) ∈ AM′.
Example 4.12 (Free extensions & tensor products). We can apply the
constructions used for free extensions and direct products of monoids to anti-
homomorphisms. Thus, if f : M → P and g : N → P are monoid anti-
homomorphisms and σ : Q → P a map, then we can define hf, σi : M [Q] →
P and hf, gi : M × N → P in the obvious way, by hf, σi (mq . . . q 0 m0 ) =
f (m0 ) σ (q 0 ) . . . σ (q) f (m) and hf, gi (m, n) = f (m) g (n). The free extension
constructor satisfies equations 1 and 2, while the direct product constructor
satisfies equations 7 and 8.
For this to be extended to A-morphisms requires A to satisfy A8 closure. Assuming this is the case, then given A-involutions iD : d ∈ D ↦ d† ∈ D and iD′ : d′ ∈ D′ ↦ d′† ∈ D′ on A-dioids D and D′, the monoid involution on ÂD[Q] may be extended to an A-involution iD[Q] on D[Q] by (U q . . . q′ U′)† = U′† q′ . . . q U†, where U, . . . , U′ ∈ AD, and the monoid involution on ÂD × ÂD′ to an A-involution iD⊗AD′ on D ⊗A D′ by (U × U′)† = U† × U′†, where U ∈ AD and U′ ∈ AD′. They satisfy the properties

iD[Q] ◦ ιD,Q = ιD,Q ◦ iD,   iD[Q] ◦ σD,Q = σD,Q,
iD⊗AD′ ◦ LD,D′ = LD,D′ ◦ iD,   iD⊗AD′ ◦ RD,D′ = RD,D′ ◦ iD′.

More generally, given anti A-morphisms F : D → D″ and F′ : D′ → D″ and a map σ : Q → D″, we may use the involutions to define ⟨F, σ⟩ = iD[Q] ◦ ⟨F ◦ iD, σ⟩ and ⟨F, F′⟩ = iD⊗AD′ ◦ ⟨F ◦ iD, F′ ◦ iD′⟩. It follows, then, that iD[Q] = ⟨ιD,Q ◦ iD, σD,Q⟩ and iD⊗AD′ = iD ⊗A iD′ = ⟨LD,D′ ◦ iD, RD,D′ ◦ iD′⟩.
For the cardinality-limited natural families F and ω, for P, and more generally for Fk and Pc, and for the finite generativity class P0, involution closure A8 holds. In addition, by the following theorem, we have involution closure for the members of the Chomsky hierarchy.
Theorem 4.13. R, C, S, T each satisfy A8.
Proof. This follows by the following elementary argument. Let G = (Q, S, H) be a grammar over a monoid M that has an involution m ↦ m†. Then, extending the involution to M[Q], and defining the grammar G† = (Q, S†, H†) by H† = {(α†, β†) : (α, β) ∈ H}, it follows that α →G β iff α† →G† β†. Therefore, ([α]G)† = [α†]G† and L(G†) = L(G)†. Moreover, the type of grammar (regular, context-free, context-sensitive) is preserved by the conversion. Thus, it follows that R, C, S and T all satisfy A8.

4.8. A9 : Matrix Closure, Relations and Matrix Algebras
The relations over a set X form an involutive quantale P (X × X) ordered
by subset inclusion. The product R, R0 7→ R ◦ R0 is relational composition,
defined by (x, x0 ) ∈ R ◦ R0 iff (x, x00 ) ∈ R and (x00 , x0 ) ∈ R0 for some x00 ∈ X.
The identity is IX = {(x, x) : x ∈ X}. The involution R 7→ R† is the relational
transpose, defined by (x, x0 ) ∈ R† iff (x0 , x) ∈ R.
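These operations are directly executable on finite relations; a small sketch (our own, not the paper's) checking the unit, involution, and anti-homomorphism laws just stated:

```python
# A hedged sketch: the involutive quantale of relations over a finite
# set, with composition, identity and transpose as defined above.

def compose(R, S):
    """Relational composition R o S."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def transpose(R):
    """Relational transpose R†."""
    return {(y, x) for (x, y) in R}

def identity(X):
    """The identity relation I_X."""
    return {(x, x) for x in X}

X = {0, 1, 2}
R = {(0, 1), (1, 2)}
S = {(1, 1), (2, 0)}
assert compose(identity(X), R) == R == compose(R, identity(X))
assert transpose(transpose(R)) == R
assert transpose(compose(R, S)) == compose(transpose(S), transpose(R))
assert compose(R, S) == {(0, 1), (1, 0)}
```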
By A3 , the natural family A (X × X) is closed under A-sums and is therefore
an A-dioid. For finite ordinals20 n, A (n × n) = P (n × n), with a one-to-one
correspondence to 2n×n given by A ∈ 2n×n 7→ {(i, j) ∈ n × n : Aij = 1}. This
is the finite dioid of n × n binary matrices. More generally, let D be an A-dioid
and define the algebra Mn D ≡ 2n×n ⊗A D, denoting it Mn,A D if A needs to be
made explicit. Noting the isomorphism Mn 2 ∼ = 2n×n , we also write Mn = 2n×n .
The tensor product universal property for Mn D takes on the following form: suppose D′ is an A-dioid containing a subset {zij : i, j ∈ n} such that21

zij zkl = δjk zil (i, j, k, l ∈ n),   Σi∈n zii = 1.   (16)

Then for each A-morphism F : D → D′ that commutes with the zij, there is a unique A-morphism Mn F : Mn D → D′ such that Mn F(LMn,D(R)) = Σ(i,j)∈R zij for R ∈ Mn and Mn F(RMn,D(d)) = F(d), for d ∈ D.
Since M0 and M1 are respectively the 1 and 2 element dioids, the isomorphisms M0 ≅ 1 and M1 ≅ 2 are immediate. Thus, it also follows that M0 D ≅ 1 and M1 D ≅ D. We can also establish Mm ⊗ Mn ≅ Mmn, and more generally Mm(Mn D) ≅ Mmn D, by using the bijection (i, k) ∈ m × n ↦ ni + k ∈ mn to establish P(m × m) × P(n × n) ≅ P(mn × mn) as a monoid homomorphism. Since this correspondence is preserved by unions, this also gives us the isomorphism Mm ⊗ Mn ≅ Mmn.
Closely related to Mn D, the set Dn×n has the structure of a partially ordered monoid with the matrix product (AB)ik = Σj∈n Aij Bjk, the identity Iij = δij, and the ordering relation A > B iff Aij > Bij, for all i, j ∈ n. An involution on D can be extended to an involution A ∈ Dn×n ↦ A† ∈ Dn×n by defining (A†)ij = (Aji)†, for each i, j.
When least upper bounds are defined on the sets Uij ≡ {Aij : A ∈ U} for each i, j, where U ∈ ADn×n, then it follows by the definition of the ordering relation that (Σ U)ij = Σ Uij = ΣA∈U Aij. At minimum, this makes Dn×n a dioid.
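A generic sketch (ours, not the paper's) of the matrix product over an arbitrary dioid, with the dioid operations passed as parameters; instantiated at D = 2 it reproduces relational composition:

```python
# A hedged sketch: the partially ordered monoid D^{n x n}, with the
# matrix product (AB)_ik = sum_j A_ij B_jk and the transpose involution.

def mat_mul(add, mul, zero, A, B):
    """Matrix product over a dioid given by (add, mul, zero)."""
    n = len(A)
    out = [[zero] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            acc = zero
            for j in range(n):
                acc = add(acc, mul(A[i][j], B[j][k]))
            out[i][k] = acc
    return out

def mat_transpose(A):
    """The involution (A†)_ij = (A_ji)† for a trivial entry involution."""
    return [list(col) for col in zip(*A)]

# Over the dioid 2 (sum = or, product = and), matrices are relations and
# the matrix product is relational composition.
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # the relation {(0,1), (1,2)}
B = [[0, 0, 0], [0, 1, 0], [1, 0, 0]]   # the relation {(1,1), (2,0)}
C = mat_mul(lambda a, b: a | b, lambda a, b: a & b, 0, A, B)
assert C == [[0, 1, 0], [1, 0, 0], [0, 0, 0]]   # the relation {(0,1), (1,0)}
```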
Denoting the unit matrix for row i and column j by eij , and noting the
identity Uij ekl = eki U ejl , we have Uij ekl ∈ ADn×n . Summing over k = l ∈ n,
it also follows that Uij I ∈ ADn×n . However, to further conclude Uij ∈ AD
requires the matrix closure condition on A – A9 : if U ∈ ADn×n then Uij ∈ AD
for all i, j.

20 Note, we are identifying each ordinal with the set of the ordinals that precede it; thus,

n = {0, . . . , n − 1}, for finite ordinals n.


21 δ
ij denotes the Kroenecker delta, which is defined by δij = 0 if i 6= j and δii = 1.

In the absence of matrix closure, addition is subject to the limitations of least upper bounds on D. However, we may define infinite A-sums within Mn D and use the dioid homomorphism φ : A ∈ Dn×n ↦ Σi,j∈n Aij (i, j) ∈ Mn D to interpret infinite A-sums over Dn×n. If each of the Uij ∈ AD for U ∈ ADn×n, then the inverse map fulfills the universal property for Mn,A D and the correspondence becomes an A-isomorphism Mn D ≅ Dn×n. Otherwise, we're limited to only asserting the existence of the sum Σ φ(Uij) ≡ Σ φ̃(Uij I) ∈ Mn D.
These considerations lead to a second form of the universal property for Mn D. Suppose, again, there is a subset {zij} ⊆ D′ satisfying equation 16, and suppose F : Dn×n → D′ is a dioid homomorphism satisfying Σ F̃(Uij I) ∈ D′ for all U ∈ ADn×n and all i, j. Then there is a unique A-morphism F̄ : Mn D → D′ such that F̄(Σ φ̃(U)) = Σi,j (Σ F̃(Uij I)) zij for all U ∈ ADn×n.
The correspondence 2c×c ≅ P(X × X) also applies to relations over infinite sets X with cardinality c, since infinite addition is defined on 2c×c. Therefore, we can define the A-dioid Mc D ≡ 2c×c ⊗A D. If A > Pc, where c is a transfinite cardinal, then an A-dioid D has closure under infinite sums of cardinality c. We can then define the matrix algebra Dc×c. Again, the same considerations apply about the closure of sums over ADc×c. However, since infinitary addition is defined on 2c×c, we still have the extension of Dc×c to the A-dioid Mc D ≡ 2c×c ⊗A D.

4.9. The Polycyclic Dioids


In each category DA is an involutive algebra Cn (denoted Cn,A when necessary) defined as the A-dioid with generators pi, qi for i ∈ n, subject to the relations

pi qj = δij (i, j ∈ n),   Σi∈n qi pi = 1.   (17)

The involution is defined by pi† = qi, for i ∈ n. For C2, writing b = p0, d = q0, p = p1 and q = q1, the defining relations become

bd = 1 = pq,   bq = 0 = pd,   db + qp = 1.   (18)

A homomorphism Cn → C2 is given by pn−1 ↦ pⁿ⁻¹ and qn−1 ↦ qⁿ⁻¹; and for i < n − 1, pi ↦ bpⁱ and qi ↦ qⁱd.
A remarkable property of Cn is the isomorphism Cn ≅ Mn Cn given by a ∈ Cn ↦ â ∈ Mn Cn, where âij = pi a qj. The inverse A ∈ Mn Cn ↦ Σi,j qi Aij pj may be obtained by applying the universal property for Mn Cn with zij = qi pj ∈ Cn, p̂i = Σj∈n qj pi pj ∈ Cn and q̂j = Σi∈n qi qj pi. It is easily verified that pi ↦ p̂i and qj ↦ q̂j generate a well-defined A-morphism from Cn to itself and that the p̂i and q̂j commute with the zij, thus fulfilling the conditions of the universal property for the A-morphism Mn Cn → Cn. We may also treat Mn ⊆ Cn by identifying eij = qi pj.
Of particular interest are the tensor products Cn,A ⊗A D, on which the universal property for the tensor product assumes the following form: let D′ be an A-dioid containing subsets {⟨0|, . . . , ⟨n − 1|}, {|0⟩, . . . , |n − 1⟩} satisfying the equations22

⟨i| |j⟩ = δij (i, j ∈ n),   Σi∈n |i⟩ ⟨i| = 1.   (19)

Then any A-morphism f : D → D′ that commutes with the ⟨i| and |j⟩ extends uniquely to an A-morphism fn : Cn,A ⊗A D → D′, such that fn(LCn,D(pi)) = ⟨i|, fn(LCn,D(qj)) = |j⟩ and fn(RCn,D(d)) = f(d).
For countably-additive and countably-distributive dioids D, a matrix representation of Cn,A ⊗A D in 2ω×ω can be given by

⟨i| = Σa∈ω ea(na+i),   |j⟩ = Σb∈ω e(nb+j)b,   d ∈ D ↦ dI = Σa∈ω d eaa,   (20)
where i, j ∈ n. This leads to the following interpretation of Cn ⊗ D. Each


a ∈ ω may be regarded as an infinite sequence of base n digits that trail off in
0’s. Since e0a pi = e0(na+i) and e0(na+i) qj = δij e0a , the effect of pi is to add
the digit i to the front of the sequence, and the effect of qj is to remove the
first element of the sequence after testing for its equality to j. We may think of
this sequence either as encoding a stack or – by analogy with dynamics – as a
classical many-body system. For i > 0, the pi may be thought of as a “push i”
or “create i” operation, and the qj as a “pop and test for j” or “annihilate j”
operation. The combination q0 p0 is then the “empty stack” or “vacuum” test.
A dual interpretation of the generators may be obtained with the identities
pi e(nb+j)0 = δij eb0 and qj eb0 = e(nb+j)0 , with the pi as “pop and test for i” and
qj as “push j”.23
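The push/pop reading above is easy to make concrete on the base-n encoding of stacks by natural numbers; in this sketch (our own, not the paper's), the annihilating outcome 0 of a failed test is modeled by None:

```python
# A hedged sketch: p_i and q_j acting on omega, with a natural number a
# read as a base-n digit sequence (leading digit = top of stack).

def push(n, i, a):
    """p_i: prepend the digit i to the sequence encoded by a."""
    return n * a + i

def pop(n, j, a):
    """q_j: remove the leading digit if it equals j; otherwise the
    product annihilates (p_i q_j = 0 for i != j), modeled as None."""
    return a // n if a % n == j else None

n = 2
a = 0                        # the empty stack ("vacuum")
a = push(n, 1, a)            # stack: [1]
a = push(n, 0, a)            # stack: [0, 1]
assert pop(n, 1, a) is None  # wrong top digit: the test annihilates
a = pop(n, 0, a)             # matching pop: push then pop is the identity
assert pop(n, 1, a) == 0     # back to the empty stack
```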

4.10. The Algebraic Chomsky-Schützenberger Theorems


Since R̃Cn,D(D) is mapped to DI under the matrix representation just discussed, this shows that RCn,D : D → Cn ⊗ D is an embedding. However, the analogue of matrix closure A9 for Cn ⊗ D ⊆ Mℵ0 D fails to hold. It is at this point that the power of the polycyclic dioids emerges. For, as the following
results show, the commutant of Cn,R in Cn,R ⊗R D contains sums from the CD
subsets of an R-dioid, while the commutant of Cn,C in Cn,C ⊗C D̄ contains sums
from the T D̄ subsets of a C-dioid. In this way, a *-continuous Kleene algebra
may be extended to an algebra closed under recursively enumerable subsets. For

22 The appearance of the bra hi| and ket |ji, also satisfying these identities, and commonly

seen with Hilbert spaces for quantum theory, should not be taken as an analogy drawn with
quantum theory, but with linear Hilbert space representations. The synonymous identification
“Hilbert space” = “quantum theory” is a widespread misconception: classical systems also
have Hilbert space representations.
23 The analogy drawn here is this: though normally regarded as a vector in a Hilbert space

H, the bra can also be regarded as the “creation” operator in the many-body extension H∗
of H, with the ket playing the role of the annihilation operator. The many-body space H∗
(called the Maxwell-Boltzmann Fock space) is equivalently described as the space of all stack
configurations formed from 1-body states in H.

instance, the algebra of regular expressions in classical theory is transformed by
these means into algebras for context-free expressions and “Turing expressions”.
Theorem 4.14 (Chomsky-Schützenberger Theorem). Let D be an A-dioid,
with A > R. Then every U ∈ CD has a least upper bound in C2 ⊗ D.

Proof. Let G = (Q, S, H) be a separable grammar for U over a submonoid ⟨X⟩D generated by a finite subset X ⊆ D. Assume that 0 ∉ Q ∪ X and let Z = {0} ∪ Q ∪ X, with n the cardinality of Z. We then map Cn into C2 with pz ∈ Cn ↦ ⟨z| ∈ C2 and qz ∈ Cn ↦ |z⟩ ∈ C2 for z ∈ Z, and extend this notation to Z∗ by writing |z . . . z′⟩ = |z⟩ . . . |z′⟩ and ⟨z . . . z′| = ⟨z′| . . . ⟨z| (note the reversal). Finally, we define

X̂ = Σx∈X |x⟩ x,   Ĥ = Σ(α,β)∈H |α⟩ ⟨β|,   Û = ⟨0| ⟨S| (X̂ + Ĥ)∗ |0⟩,

referring to Û as a Chomsky-Schützenberger kernel of U. Then it is verified by inductive argument that

⟨0| ⟨α| (X̂ + Ĥ)ⁿ = Σ·α→ⁿm·β ⟨0| ⟨β| m

for n > 0, where →ⁿ refers to n applications of Shift and Generate (see section 2.5). Applying *-continuity on the left and the definition of →ᴸH on the right, it follows that

⟨0| ⟨α| (X̂ + Ĥ)∗ = Σ·α→ᴸHm·β ⟨0| ⟨β| m.

Finally, noting that ·α →ᴸH m· iff m ∈ [α]H, it follows that

⟨0| ⟨α| (X̂ + Ĥ)∗ |0⟩ = Σ [α]H.

Thus, Û = Σ U.
As the following theorem shows, these results may be extended to type 0
grammars.
Theorem 4.15. Let D be an A-dioid with A > R. Then every U ∈ T D has a
least upper bound in C2 ⊗ C2 ⊗ D.
Proof. Applying the same notation from the previous theorem, we assume here that G is a contextual grammar for the subset U ∈ T D. We map a second copy of Cn into C2 with pz ↦ (z] and qz ↦ [z), for z ∈ Z, and write (z . . . z′] = (z] . . . (z′] and [z . . . z′) = [z′) . . . [z). Note that the reversal, this time, is on the closing brackets. Finally, we define

L = Σz∈Z |z⟩ (z],   R = Σz∈Z [z) ⟨z|,   Û = (0] ⟨0| ⟨S| (L + R + Ĥ)∗ X̂∗ [0) |0⟩.

Then it follows by inductive argument that24

(0] ⟨0| (α] ⟨β| (L + R + Ĥ)ⁿ = Σα·β→ⁿγ·δ (0] ⟨0| (γ] ⟨δ|

where →ⁿ now denotes n > 0 applications of Shift Left, Shift Right and Generate (see section 2.3). Then, again, applying *-continuity on the left and the definition of →′H on the right, we get

(0] ⟨0| (α] ⟨β| (L + R + Ĥ)∗ = Σα·β→′Hγ·δ (0] ⟨0| (γ] ⟨δ|

Application on the right by X̂∗ [0) |0⟩ filters out δ ∉ {⟨x| : x ∈ X} and all γ ≠ 1. Therefore, noting that ·α →′H ·m iff α →H m, it follows that

(0] ⟨0| (α] (L + R + Ĥ)∗ X̂∗ [0) |0⟩ = Σα→H m m = Σ [α]H.

Setting α = S, it follows that Û = Σ U.
These two results capture, in algebraic form, the classical result due origi-
nally to and named after Chomsky and Schützenberger [5, 6]. A self-contained
proof was published in ([22], Supplementary Lecture G, 197-199). A similar
result, in the context of power series algebras, is Theorem 4.5 in [32]. In [6], one
sees reference to a traditional form of grammatical analysis in which substruc-
tures are explicitly marked, e.g.
[S [NP the dog ]NP [VP ate [NP the homework ]NP ]VP ]S
where a noun phrase (NP), verb phrase (VP) and sentence (S) are marked
within labeled brackets. The classical theorem is established with a different
annotation convention, still, in which the terminals are also mapped to brackets.
In contrast, our choice of Chomsky-Schützenberger kernel corresponds to the
following annotation
[S ]S [VP [NP ]NP the dog ]VP [NP ate ]NP the homework
where all the opening brackets for a phrase are moved up front in reverse order,
and the closing brackets are moved to the head of the phrase – which is the
order characteristic for top-down parsers. To better show this, the parse may
itself be parsed
[S (]S [VP [NP ) (]NP the dog ) (]VP [NP ate ) (]NP the homework )
As the following example shows, other annotations, corresponding to other
parsing or analysis methods can be used as the basis for deriving Chomsky-
Schützenberger kernels.

24 This is where we require the grammar to be contextual. Otherwise, L and R may give us

the wrong factoring for m ∈ M in rules of the form α = γmδ → β.

Example 4.16. The archetypical context-free language is the Dyck language
Dn , which consists of the properly nested bracket sequences formed from n > 0
different types of brackets. Our particular interest is in its interleaves with other
sets. Let M be a monoid containing a finite subset Y ⊆ M and the (i, )i for all i ∈ n. The interleaves of Dn and ⟨Y⟩M form a subset Dn(Y) = L(G0) ⊆ M given by the context-free grammar G0, containing the start symbol N and the rules N → (i N )i N for i ∈ n, N → yN for y ∈ Y, and N → 1.
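A membership test following this grammar can be sketched with a stack (our own sketch; the token encoding, with brackets as pairs and elements of Y passing through freely, is an assumption):

```python
# A hedged sketch: membership in the interleaved Dyck language D_n(Y),
# per the rules N -> (_i N )_i N, N -> y N, N -> 1. A word is a list of
# tokens: ('open', i), ('close', i), or an element of Y.

def in_dyck(word, n, Y):
    stack = []
    for t in word:
        if t in Y:
            continue                        # N -> y N: letters pass through
        kind, i = t
        if kind == 'open' and 0 <= i < n:
            stack.append(i)                 # N -> (_i N )_i N
        elif kind == 'close' and stack and stack[-1] == i:
            stack.pop()                     # close the innermost bracket
        else:
            return False                    # mismatched or stray bracket
    return not stack                        # all brackets properly nested

Y = {'a', 'b'}
w = [('open', 0), 'a', ('open', 1), ('close', 1), 'b', ('close', 0)]
assert in_dyck(w, 2, Y)
assert not in_dyck([('close', 0)], 2, Y)
```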

Lemma 4.17. Using the notation of example 4.16, let D′ be an A-dioid with A > R. Suppose Y ⊆ D′ is a finite subset and (i, )i ∈ D′ for all i ∈ n. Then Dn(Y) has a least upper bound in Cn ⊗A D′ given by

Σ Dn(Y) = ⟨0| ( Σi∈n (i ⟨i| + Σj∈n |j⟩ )j + Σy∈Y y )∗ |0⟩.

Proof. Let H denote the set of phrase structure rules of G0. Leftmost derivations ·N →ᴸH β · γ may only use the Shift and Generate rules in the following combinations: ·N →ᴸH ·(i N )i N →ᴸH (i ·N )i N for i ∈ n, and ·N →ᴸH ·yN →ᴸH y · N for y ∈ Y. The Generate rule ·N →ᴸH · only occurs in the context ·N )j →ᴸH ·)j →ᴸH )j ·, for j ∈ n, or as the final rule in the derivation sequence. Therefore, we may replace the one-step rules of →ᴸH by ·N → (i ·N )i N for i ∈ n, ·N → y · N for y ∈ Y, and ·N )j → )j ·, for j ∈ n. Let →D denote the closure under R, T and C of these one-step rules. By inductive argument, it then follows that ·N →ᴸH m· iff ·N →D m · N. Also, by inductive argument, it follows that

⟨0| ( Σi∈n (i ⟨i| + Σj∈n |j⟩ )j + Σy∈Y y )ⁿ = Σ·N→ⁿ m·N)z...N)z′N ⟨0| ⟨z . . . z′| m

where n > 0 and →ⁿ denotes the number of applications of the revised one-step rules. Thus,

⟨0| ( Σi∈n (i ⟨i| + Σj∈n |j⟩ )j + Σy∈Y y )∗ |0⟩ = Σ·N→D m·N m = Σ L(G0).

The algebras C2 ⊗ D and C2 ⊗ C2 ⊗ D respectively contain the C-closure D̄ of D and the T-closure D̿ of D. Therefore, within C2 ⊗ D is an algebra for context-free expressions, while C2 ⊗ C2 ⊗ D goes one step further, containing an algebra for Turing expressions. What the following theorem shows is that we can factor the two relations into D → D̄ → D̿.

Theorem 4.18. Let D̄ be an A-dioid, with A > C. Then every U ∈ T D̄ has a


least upper bound in C2 ⊗ D̄.

Proof. Adopting the notation of theorems 4.14 and 4.15, let U = L(G) be given by a contextual grammar, as before, and define n = Σ L(G′) ∈ C2 ⊗ D̄, where L(G′) ∈ C(C2 ⊗ D̄) is given by a grammar G′ with start symbol N and rules N → |q⟩ N ⟨q| N for q ∈ Q, N → |α⟩ ⟨β| N for (α, β) ∈ H and N → 1. Finally, we set Û = ⟨0| ⟨S| n X̂∗ |0⟩.
Applying lemma 4.17, we replace the ⟨z| and |z⟩ in the lemma respectively by (z] and [z) from a new copy of C2, and set Y = {|α⟩ ⟨β| : (α, β) ∈ H} and (z = |z⟩, )z = ⟨z| for z ∈ Z. Then we may write

n = (0] ( Σz∈Z |z⟩ (z] + Σz∈Z [z) ⟨z| + Σ(α,β)∈H |α⟩ ⟨β| )∗ [0)
  = (0] ( L + R + Ĥ )∗ [0).

Thus,

⟨0| ⟨β| n X̂∗ |0⟩ = (0] ⟨0| ⟨β| ( L + R + Ĥ )∗ X̂∗ |0⟩ [0) = Σ [β]H,

by which we obtain the result.

In the following section, we will establish the existence of the closures D̄ and D̿, with a general construction that freely extends an A-dioid D to its B-closure Q^B_A D for B > A. For A = R, the respective closures are D̄ = Q^C_R D and D̿ = Q^T_R D, and it will indeed be the case that D̿ = Q^T_C D̄ and that Q^T_R = Q^T_C ◦ Q^C_R.

5. A Network of Adjunctions

Classically, the process of algebraization ended abruptly at the type 3 level


in the Chomsky hierarchy: the regular languages and their corresponding algebra of regular expressions. Attempts were made to extend this process to the type 2 level (i.e., context-free expressions) [12, 28, 38], but these did not find particularly fruitful applications; e.g., no algebraic reformulation of parsing theory.
A significant step, however, in this direction had already been taken early on
[6] – the Chomsky-Schützenberger theorem for context-free languages, but its
algebraic underpinning (which theorems 4.14, 4.15 and 4.18 have revealed) re-
mained obscured up to the present. No theory of context-free expressions arose
from it. In recent times, we’ve begun to see renewed progress in this direction
[10]; and as we have seen at the end of the previous section we have the basis
for going much further still, although the full development of these ideas lies
beyond the scope of this paper.
Much of what stood in the way may have been the difficulty in clarifying
the algebraic foundation underlying the theory of regular expressions. In what
algebra(s) do these objects live? As was noted in [19], the landscape is populated
by a diversity of candidates that seem to get in the way of clearly answering the
question. However, what we now recognize is that it was not a multiplicity of

answers that was uncovered, but an algebraic analogue of the classical notion
of language hierarchy.
It is a hierarchy linked by a network of adjunctions. Through a construction
by ideals [7, 19], a series of adjunctions was established in which each of the
families in the hierarchy R < ω < P was completed to the families higher in the
hierarchy. Here, we will generalize these results, devising a general construction
for adjunctions Q^B_A : DA → DB between any two of the categories, where A ≤ B.

5.1. Ideals and Construction by Ideals


Though not clearly stated in [17], the construction is actually generic to the category of partially ordered monoids and order-preserving monoid homomorphisms, which we denote DI for reasons to be made apparent below. Recalling our notation from section 3.4: for a partially ordered monoid M, we write u ≥ U for the upper bounds u ∈ M of a U ⊆ M. In addition, we define U′ = {m ∈ M : (∀u ≥ U) u ≥ m} and the interval ⟨u⟩ = {m ∈ M : m ≤ u}. Then, it is an immediate consequence that for A-separable monoids aU′b ⊆ (aUb)′ and U′ = ⟨Σ U⟩, whenever Σ U is defined. For A-continuous functions between partially ordered monoids, we also have f̃(U′) ⊆ f̃(U)′.25
An A-ideal I ⊆ M of a partially ordered monoid is defined by the property I1: aUb ⊆ I → aU′b ⊆ I, whenever a, b ∈ M and U ∈ AM; and A[M] denotes the set of all the A-ideals of M. It is easily verified from I1 that A[M] is closed under arbitrary intersection (with M ∈ A[M] as a maximal element), so that we can define the ideal closure ⟨U⟩_A ⊇ U as the intersection of all ideals that contain U ⊆ M. For A-separable monoids M, the minimal ideals all reduce to intervals: ⟨∅⟩_A = ⟨0⟩ and ⟨{m}⟩_A = ⟨m⟩.
In the special case A = F, the condition I1 reduces equivalently to I2: that I be a union of intervals, IF1: 0 ∈ I and IF2: a, b ∈ I → a + b ∈ I, thus recovering the definition of semi-lattice ideals. For A = R, these three conditions along with IR1: (∀n ≥ 0) ab^n c ∈ I → ab*c ∈ I serve to equivalently characterize I1, thus recovering the definition in [19] of a *-continuous Kleene algebra ideal.
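To make the ideal conditions concrete, here is a minimal sketch (an illustrative assumption, not a construction from the text) that computes the F-ideal closure ⟨U⟩_F by fixed-point iteration in a small finite dioid: the subsets of a two-element set, with union as + and intersection as product, so that the natural order d ≤ d′ is set inclusion.

```python
from itertools import combinations

# A small finite dioid D (an illustrative choice): subsets of {p, q}
# with union as + and intersection as multiplication, so the natural
# order d <= d' is set inclusion.
POINTS = ("p", "q")
D = [frozenset(c) for r in range(len(POINTS) + 1)
     for c in combinations(POINTS, r)]

def ideal_closure_F(U):
    """Least F-ideal of D containing U: add 0 (IF1), close under the
    sum a + b = a | b (IF2), and close downward so the result is a
    union of intervals (I2)."""
    I = set(U) | {frozenset()}            # IF1: 0 is the empty set
    changed = True
    while changed:
        changed = False
        for a in list(I):
            for b in list(I):
                if a | b not in I:        # IF2: a, b in I -> a + b in I
                    I.add(a | b)
                    changed = True
            for m in D:                   # I2: downward closure
                if m <= a and m not in I:
                    I.add(m)
                    changed = True
    return I
```

In this dioid, the closure of {{p}} is the interval ⟨{p}⟩, while the closure of {{p}, {q}} is the interval below the sum {p} + {q} = {p, q}, i.e. the whole four-element lattice, matching ⟨U⟩_A = ⟨Σ U⟩.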
For A-dioids D, criterion I1 reduces equivalently to I2 and I3: that for each U ∈ AD, U ⊆ I → Σ U ∈ I. Thus, the ideals generated by the subsets U ∈ AD all reduce to intervals ⟨U⟩_A = ⟨Σ U⟩, giving us an embedding of the partially ordered monoid structure of D into A[D]. This sets the stage for the extension of the A-dioid to closure under the sums taken over natural families B ≥ A.
We can go further, in fact, and make the ideals A[M] of a partially ordered monoid M into a quantale with the product given by U, V ↦ ⟨UV⟩_A for U, V ∈ PM, and the sum by Σ Y = ⟨∪Y⟩_A, for Y ∈ PPM. An A-continuous function f : M → M′ between partially ordered monoids is then extended to a quantale morphism f_A : A[M] → A[M′] with the definition f_A(U) = ⟨f̃(U)⟩_A. The

25 aU′b ⊆ (aUb)′ and f̃(U′) ⊆ f̃(U)′ are established in [17] in the proofs of Corollary 1 and Lemma 6, respectively.

well-definedness of these operations is a consequence of the identities [17, lemmas 2, 4, 6]:

⟨f̃(U)⟩_A = ⟨f̃(⟨U⟩_A)⟩_A,
⟨UV⟩_A = ⟨⟨U⟩_A ⟨V⟩_A⟩_A,   ⟨∪Y⟩_A = ⟨∪_{U∈Y} ⟨U⟩_A⟩_A,

for U, V ∈ PM, Y ∈ PPM and A-continuous morphisms f : M → M′. It also follows that, with respect to the subset ordering on A[M], the sum Σ Y is the least upper bound of Y ⊆ A[M] in A[M].
Together, this is enough to define a functor Q_A : DA → DP which maps A-dioids D to A[D] and A-morphisms f : D → D′ to f_A : A[D] → A[D′]. By virtue of the one-to-one embedding d ∈ D ↦ ⟨d⟩ ∈ A[D], the functor is the left adjoint of the forgetful functor Q^A_P : DP → DA. However, instead of establishing these properties directly, we can generalize the construction to yield a network of functors.

5.2. The Adjunction Network


The functors Q^B_A : DA → DB are defined by restricting the ideal family to B-generated ideals. This definition is generic to partially ordered monoids M and we can write Q^B_A M = {⟨U⟩_A ∈ A[M] : U ∈ BM}. The resulting algebra is a B-dioid; in particular, closure under sums Σ Z over Z ∈ B Q^B_A M is a result that requires surjectivity A6 to pull Z back to Z = {⟨U⟩_A : U ∈ Y} for some Y ∈ BBM, combined with A3. Similarly, the restriction of f_A to Q^B_A M, where f : M → M′ is an A-continuous morphism, yields a B-morphism into Q^B_A M′, which is also a result that requires surjectivity, combined with A4.


When Q^B_A is restricted to A-dioids D for B ≥ A, then by virtue of the embedding U ∈ AD ↦ ⟨U⟩_A = ⟨Σ U⟩, the resulting algebra Q^B_A D is an extension of D, itself.
For B-dioids, when B ≥ A, the ideals in Q^A_B D all reduce to intervals, which shows that Q^A_B D is isomorphic to D. In addition, for a B-continuous morphism f : D → D′, Q^A_B f : ⟨d⟩ ∈ Q^A_B D ↦ ⟨f(d)⟩ ∈ Q^A_B D′ reduces to an equivalent of the function f. Thus, Q^A_B : DB → DA is the forgetful functor. These results are captured by the following theorems [17, theorem 14, corollary 6, theorem 15, respectively].
Theorem 5.1. Suppose A ≤ B. Then Q^A_B : DB → DA is the forgetful functor. In particular, Q^A_A is the identity functor on DA.

Theorem 5.2. Let A, B be natural families with A ≤ B. Then Q^B_A ∘ Q^A_B is the identity functor on DB.

Theorem 5.3 (Adjunction Theorem). Suppose A ≤ B. Then Q^B_A : DA → DB is the left adjoint of Q^A_B : DB → DA.
The natural correspondence between A-morphisms f : A → Q^A_B B and B-morphisms f* : Q^B_A A → B is given by f*(⟨U⟩_A) = Σ f̃(U).26 The extra

26 Since f is only an A-morphism, one needs to show that Σ f̃(U) is actually defined. This is done in [17, lemma 7], where it is also shown that Σ f̃(U) = Σ f̃(⟨U⟩_A).

summation is needed because our construction by ideals replaces individual
elements by intervals.
Finally, the adjunctions defined here involve left adjoints of forgetful functors. However, since the forgetful functors are closed under composition, and the composition of adjunctions is also an adjunction, the following two theorems result from the uniqueness of left adjoints [26, corollary 1, p. 83] (cf. [17, corollary 7]).
Theorem 5.4. Let A, D, B be natural families with A ≤ D ≤ B. Then

Q^A_D ∘ Q^D_B = Q^A_B  and  Q^B_D ∘ Q^D_A = Q^B_A.

Theorem 5.5. Let A, B be natural families with A ≤ B. Then

B = Q^B_A ∘ A  and  B̂ = Â ∘ Q^A_B.

5.3. Downward Extension to I


To have the structure of a monad does not require including all finite sets in
natural families. We actually only need the unit map m ∈ M 7→ {m} ∈ AM .
Therefore, A1 can be weakened by only requiring singleton sets to be members
of a natural family, thus resulting in the condition A1W : η (m) = {m} ∈ AM ,
for each m ∈ M .
Then, the minimal family F is replaced by the “singleton” family IM =
{{m} : m ∈ M }, resulting in an extension of the hierarchy downward to a family
of algebras that have partially ordered monoid structures, but not necessarily an
upper semilattice structure. Its minimal member is the category DI of partially
ordered monoids with monotonic monoid homomorphisms.
The corresponding functor I : Monoid → DI takes a monoid M and maps it to the same underlying set M, equipped with the flat ordering, m ≤ m′ ⇔ m = m′. Conversely, the forgetful functor Î takes a partially ordered monoid and gives us the underlying monoid sans the partial ordering. Correspondingly, we have the factorings A = Q^A_I ∘ I and Â = Î ∘ Q^I_A.

5.4. Adjunction Closure Results


Since each of the natural dioid categories shares the same finite algebras, the isomorphisms Q^B_A D ≅ D follow for any finite dioid D. In particular, the terminal dioid 1 and the initial dioid 2 remain the same in each category DA, up to isomorphism.
When A ≤ B, then, as discussed in Appendix B, the functors Q^B_A, Q^A_B form a monoidal adjunction pair with natural isomorphisms

Q^B_A(D ⊗_A D′) ≅ Q^B_A D ⊗_B Q^B_A D′,   Q^B_A M_{n,A} D ≅ M_{n,B} Q^B_A D,

where D, D′ are A-dioids. The second of these follows from Q^B_A M_n = M_n, since the finite dioid M_n = 2^{n×n} is the same in every category.

In addition, when A ≤ B, then for free extensions we have an isomorphism Q^B_A(D[Q]) ≅ (Q^B_A D)[Q], where D is an A-dioid. This is implemented by the following two mutually inverse morphisms

⟨ι_{Q^B_A D, Q}, h ∘ σ_{Q^B_A D, Q}⟩* : Q^B_A(D[Q]) → (Q^B_A D)[Q],
⟨Q^B_A ι_{D,Q}, k⁻¹ ∘ η ∘ σ_{D,Q}⟩ : (Q^B_A D)[Q] → Q^B_A(D[Q]),

where

h : (Q^B_A D)[Q] → Q^A_B((Q^B_A D)[Q]),   k : Q^B_A(D[Q]) → Q^A_B(Q^B_A(D[Q]))

are identity maps defined by the forgetful functor Q^A_B, and η = η_{D[Q]} : D[Q] → Q^A_B Q^B_A(D[Q]) is the unit morphism for D[Q].

6. Inclusion of the Chomsky Hierarchy


At this point, we have not fully resolved which of the subset families of example 3.10 actually constitute natural families. Properties A0 and A1 are true, by construction, for each subset family in examples 3.10, 3.11 and 3.12. Similarly, A2, A3 and A4 are well-known and easily proven in the cases of examples 3.11 (and easy to show for examples 3.12). However, for the families R, C, S and T of example 3.10, A3 is neither obvious nor well-known, while A2 and A4 require further clarification. This is what we will address here.

6.1. R and *-Continuous Kleene Algebras


A *-continuous Kleene algebra is a dioid which is R-additive. By definition,
the rational subsets RM of a monoid M are the closure of FM under the
product, finite union and Kleene star. Therefore A2 is satisfied, so that we need
only prove A3 and A4 , or equivalently A5 .
Theorem 6.1. The subset family M 7→ RM satisfies A5 .
Proof. This may be shown by induction. Let σ : M → RN be an R-substitution from a monoid M to the rational subsets of a monoid N. For finite sets U ∈ FM, we immediately have σ*(U) = ∪_{u∈U} σ(u) ∈ RN, since RN is closed under finite unions. Moreover, we may show that σ* preserves unions and products, since

σ*(∪Y) = ∪_{u∈U∈Y} σ(u) = ∪_{U∈Y} σ*(U)   (Y ⊆ PM),

and for U, V ⊆ M,

σ*(UV) = ∪_{u∈U, v∈V} σ(uv) = (∪_{u∈U} σ(u))(∪_{v∈V} σ(v)) = σ*(U) σ*(V).

From this, it follows that σ* preserves Kleene stars,

σ*(U*) = σ*(∪_{n≥0} U^n) = ∪_{n≥0} σ*(U^n) = ∪_{n≥0} (σ*(U))^n = (σ*(U))*.

Consequently, if we let U, V ∈ RM and assume by inductive hypothesis that σ*(U), σ*(V) ∈ RN, then it follows that

σ*(UV) = σ*(U) σ*(V) ∈ RN,
σ*(U ∪ V) = σ*(U) ∪ σ*(V) ∈ RN,
σ*(U*) = (σ*(U))* ∈ RN,

since RN is closed under products, finite unions and Kleene stars.
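The structural recursion in this proof can be mirrored directly on regular-expression syntax. The following sketch (an illustrative encoding, not the paper's: rational subsets of a free monoid represented as a small regex AST) extends a substitution σ given on generators to σ* by recursion, preserving products, unions and stars exactly as above.

```python
from dataclasses import dataclass

# Rational subsets of a free monoid as regular-expression ASTs.
# substitute(e, sigma) is the extension sigma* of a substitution sigma
# given on generators: structural recursion preserving products,
# unions and Kleene stars.

@dataclass(frozen=True)
class Sym:
    ch: str

@dataclass(frozen=True)
class Cat:
    left: object
    right: object

@dataclass(frozen=True)
class Alt:
    left: object
    right: object

@dataclass(frozen=True)
class Star:
    arg: object

def substitute(e, sigma):
    if isinstance(e, Sym):
        return sigma[e.ch]                          # a generator
    if isinstance(e, Cat):                          # sigma*(U V)
        return Cat(substitute(e.left, sigma), substitute(e.right, sigma))
    if isinstance(e, Alt):                          # sigma*(U + V)
        return Alt(substitute(e.left, sigma), substitute(e.right, sigma))
    if isinstance(e, Star):                         # sigma*(U*)
        return Star(substitute(e.arg, sigma))
    raise TypeError(f"not a regular expression: {e!r}")
```

For instance, substituting a ↦ (c + d) and b ↦ e into (ab)* yields ((c + d)e)*, the image expression computed purely by the homomorphic clauses of the proof.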


Corollary 6.2. The *-continuous Kleene algebra of finite transductions between
alphabets X and Y is given, up to isomorphism, by RX ∗ ⊗R RY ∗ .

Proof. This is an elementary consequence of theorem 4.10 applied to theorem


2.9.

6.2. The Composition Lemma and Nested Grammar Expressions


There remains the issue of A2 , A3 and A4 with respect to C, S and T . For
A3 and A4 , it turns out, again, to be more useful to establish A5 , instead. We
do this explicitly here for the subset family M 7→ CM , closely following the
development of the analogous result in the classical theory (cf. [36, theorem
9.2.2]).

Example 6.3 (Nested Grammar Expressions). However, before we go on


with this result, it will help if we illustrate the composition process. For context-
free grammars, we are seeking to establish the ability to take a grammar ex-
pression with nested grammar subexpressions

S → SS,
S → (A → e, A → f Af, AgA) S (B → BBh, B → i, BB),
S → (C → Cj, C → k, CC),
S → S (D → lD, D → m, DnD) S,

and convert it to a context-free grammar expression. First, we adopt the con-


vention that the embedded grammar expressions are to each be treated as the
results of substitutions, so that we start out by rewriting this as the substitution
expression

a = (A → e, A → f Af, AgA),
b = (B → BBh, B → i, BB),
c = (C → Cj, C → k, CC),
d = (D → lD, D → m, DnD),
S → SS, S → aSb, S → c, S → SdS.
At this point, we’d like to reduce this to a normal form in which the embedded grammar rules have been “factored out”. Here, this means simply moving all the embedded rules to the head of the overall grammar expression

A → e, A → f Af, B → BBh, B → i,
C → Cj, C → k, D → lD, D → m,
S → SS, S → (AgA) S (BB), S → (CC), S → S (DnD) S.
More generally, we need to also resolve the clashes in the names of bound
variables of two or more subexpressions with each other or with the main ex-
pression27 – hence, the need for the technical lemma, Substitution Invariance
(lemma 2.1). For technical reasons, we also replace composite starting configu-
rations by fresh variables, making use of Start Variable Normalization (lemma
2.2). Modulo the technical adjustments, the composition lemma may then be
stated with the assumption that the main grammar expression and each of the
grammar sub-expressions involved in substitutions make use of mutually dis-
joint sets of variables, and that each expression has been suitably normalized
with respect to their start variables.
For the following lemma, let M be a monoid and G = (Q, S, H) a context-free grammar over a submonoid ⟨X⟩_M generated by a finite subset X ⊆ M.
Let σ : M → CN be a context-free substitution to the monoid N . For each
x ∈ X, let Gx = (Qx , Sx , Hx ) be a context-free grammar over N such that
L (Gx ) = σ (x); with the sets Q and Qx for each x ∈ X all mutually disjoint.
In addition, it is assumed that each component grammar Gx has been placed
in a form in which Sx ∈ Qx is a single variable that appears nowhere on the
right-hand side of any rule in Hx and in only one rule on the left. We also
require S ∈ Q to be a single variable, and that the production in Gx involving
Sx has at least one variable on the right-hand side (if necessary, by applying the
start variable normalization a second time).
Lemma 6.4 (The Composition Lemma). Define the composition of the grammars by the following grammar over N

G′ = (Q ∪ ∪_{x∈X} Q_x, S, H̄ ∪ ∪_{x∈X} H_x),

where the homomorphism σ̄ : (Q ∪ {S_x : x ∈ X})* → M[Q] is given by σ̄(S_x) = x for x ∈ X and σ̄(q) = q for q ∈ Q; and where

H̄ ⊆ Q × (Q ∪ {S_x : x ∈ X})*

is a finite set of rules chosen such that

H = {(q, σ̄(β̄)) : (q, β̄) ∈ H̄}.

Then L(G′) = σ*(L(G)).

27 In particular, notice that we’re not substituting bound variables into the grammar expres-

sion. With the convention we’re restricting ourselves to here, the scope of the non-terminals
in a grammar expression is local.

Proof. It is an easy induction to show, for each x ∈ X, that α →_{Gx} β iff α →_{G′} β, where α, β ∈ N[Q_x]. This uses the mutual disjointness of the sets Q_x: the only rules that can apply here are those from H_x. From this, it follows that

[S_x]_{G′} = [S_x]_{Gx} = L(G_x) = σ(x)   (x ∈ X).

Moreover, by induction, making use of the context-freeness of G′ ([α]_{G′}[β]_{G′} = [αβ]_{G′}, for α, β ∈ (Q ∪ {S_x : x ∈ X})*), we have [w̄]_{G′} = σ(σ̄(w̄)), for w̄ ∈ {S_x : x ∈ X}*.
In a similar way, one may verify that σ̄(α) →_G σ̄(β) ⇔ α →_{G′} β for α, β ∈ (Q ∪ {S_x : x ∈ X})*. This is where we require the assumptions made about S_x in the grammar G_x. Making use of the disjointness of the set Q from all the other sets Q_x, it then follows that

[q]_{G′} = ∪_{σ̄(w̄)∈[q]_G} [w̄]_{G′} = ∪_{σ̄(w̄)∈[q]_G} σ(σ̄(w̄)) = ∪_{w∈[q]_G} σ(w) = σ*([q]_G)

for q ∈ Q, since occurrences of variables of Q in a configuration α must be handled by the rules from H̄. Thus, we have

L(G′) = [S]_{G′} = σ*([S]_G) = σ*(L(G)).
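For word grammars, the construction of G′ can be sketched as follows (a hypothetical representation: rules as a dict from variables to lists of right-hand-side tuples, with variable sets assumed mutually disjoint, as the lemma requires). Each terminal x with a component grammar is replaced by that grammar's start variable S_x and the component rules are adjoined; a bounded breadth-first derivation search then lets us spot-check L(G′) = σ*(L(G)) on small examples.

```python
from collections import deque

# A context-free grammar over a free monoid: `rules` maps each
# variable to a list of right-hand sides (tuples of symbols); the
# variables are exactly the keys of `rules`, everything else is
# terminal.  `compose` follows the Composition Lemma: each terminal x
# that has a component grammar G_x is replaced by the start variable
# S_x of G_x, and the rules of G_x are adjoined wholesale.

def compose(rules, start, components):
    new_rules = {}
    for q, rhss in rules.items():
        new_rules[q] = [
            tuple(components[s][1] if s in components else s for s in rhs)
            for rhs in rhss]
    for sub_rules, _ in components.values():
        new_rules.update(sub_rules)
    return new_rules, start

def generate(rules, start, max_len):
    """All terminal strings of length <= max_len, found by leftmost
    expansion of sentential forms, capped in size so the search is
    finite (adequate for small spot checks, not a decision procedure)."""
    seen = {(start,)}
    queue = deque(seen)
    out = set()
    while queue:
        form = queue.popleft()
        i = next((j for j, s in enumerate(form) if s in rules), None)
        if i is None:                      # purely terminal form
            if len(form) <= max_len:
                out.add("".join(form))
            continue
        for rhs in rules[form[i]]:
            nxt = form[:i] + rhs + form[i + 1:]
            terminals = sum(1 for s in nxt if s not in rules)
            if terminals <= max_len and len(nxt) <= 2 * max_len \
                    and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return out
```

For instance, composing the grammar S → SS, S → a with the component grammar A → bAc, A → 1 for σ(a) yields a grammar whose generated strings of length at most 4 are 1, bc, bbcc and bcbc, i.e. σ*(L(G)) truncated to that length.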

6.3. C
With these preliminaries established, we then have the following corollary.
Corollary 6.5. C is a natural family.
Proof. The Composition Lemma shows that C satisfies A5 . Property A0 is true
for all languages generated by general grammars, and A1 follows easily: a finite
subset U ∈ FM of a monoid has a regular grammar ({S} , S, {S} × U ).
The proof of A2 closely follows that in the classical theory. Given subsets
L (G1 ) , L (G2 ) ⊆ M generated by context-free grammars Gi = (Qi , Si , Hi ) over
M (i = 1, 2),
G = (Q1 ∪ Q2 ∪ {S} , S, H1 ∪ H2 ∪ {(S, S1 S2 )})
is a grammar for the product, provided that S ∉ Q1 ∪ Q2 ∪ M. We may then use the property [αβ] = [α][β] to show that

L(G) = [S1 S2]_G = [S1]_G [S2]_G = [S1]_{G1} [S2]_{G2} = L(G1) L(G2).

Example 6.6 (Products of Grammar Expressions). This process is illus-


trated schematically by the following:
(q → β, …, q′ → β′, S)(q̄ → β̄, …, q̄′ → β̄′, S′)
= (q → β, …, q′ → β′, q̄ → β̄, …, q̄′ → β̄′, SS′).


It is assumed, here, that the sets {q, . . . , q 0 } and {q̄, . . . , q̄ 0 } are mutually disjoint.
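A sketch of the same product construction (hypothetical dict-based representation, with the disjointness assumption checked rather than assumed):

```python
# Product of two grammar expressions, following the schematic above:
# pool the rules of both grammars and add a fresh start variable whose
# single rule derives the concatenation S1 S2 of the two starts.

def product_grammar(rules1, start1, rules2, start2, fresh="S*"):
    assert fresh not in rules1 and fresh not in rules2
    assert not set(rules1) & set(rules2)   # mutual disjointness of variables
    rules = dict(rules1)
    rules.update(rules2)
    rules[fresh] = [(start1, start2)]      # fresh -> S1 S2
    return rules, fresh
```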

Corollary 6.7. As a C-dioid, the algebra of push-down transductions between alphabets X and Y is given, up to isomorphism, by CX* ⊗_C CY*.
Proof. This follows from theorem 2.10, by application of theorem 4.10.
Corollary 6.8. As a *-continuous Kleene algebra, the algebra of push-down transductions between alphabets X and Y is contained in the commutant of C_{n,R} in C_{n,R} ⊗_R RX* ⊗_R RY*.
Proof. Using the Transduction Theorem (theorem 4.10) and the results of section 5.4, the C-closure of RX* ⊗_R RY* is

Q^C_R(RX* ⊗_R RY*) ≅ Q^C_R R(X* × Y*) ≅ C(X* × Y*).

The result follows by application of theorem 4.14.

6.4. S and T
Though the composition lemma and product construction are formulated
explicitly for C, they can be refined to make them applicable to S and T – with
one important proviso to be noted below.
First, the product construction works with general grammars. However, for
context-sensitive and non-contracting grammars, special consideration must be
paid to the cases where the erasing rules α1 → 1 or α2 → 1 are present. In
the former case, we replace the rule with S → α2 , while in the latter case, we
replace it with S → α1 . If both erasing rules are present, then we must also
add S → 1 to the overall grammar.
Second, the homomorphism property A3 may be shown directly as follows.
Given a homomorphism f : M → N , we extend it to a homomorphism f :
M [Q] → N [Q] by defining f : q ∈ Q 7→ q ∈ N [Q]. Then, given a language
L (G) ⊆ M , each rule α → β of the grammar G is replaced by the rule f (α) →
f (β). This preserves the grammar type – unless it is a non-contracting or
context-sensitive grammar. A rule of the form αqβ → αγβ, where γ ≠ 1, may
map to a rule of the form f (α) qf (β) → f (α) f (γ) f (β), where f (γ) = 1, thus
destroying context-sensitivity. A similar problem occurs with non-contracting
grammars.
Third, the substitution property applies to general grammars. However, to
avoid the need for the property [αβ] = [α] [β], which we used in the composition
lemma, the grammar Gx over the monoid N must be modified to a grammar
over a copy Nx of N (using Substitution Invariance to rename the non-terminals,
if need be). By finite generativity A7 , we may assume that N is generated by a
finite set Y ⊆ N , similarly Nx by Yx ⊆ Nx . We must then add rules nx → n to
map each copy nx ∈ Nx to its original n ∈ N . 28 Thus, we establish the result
Corollary 6.9. T is a natural family.

28 As in the proof of theorem 4.18, this is where we require that the grammars be contextual.

For S, in the composition lemma, we also need to prove that the grammar
G′ is context-sensitive. However, a similar problem may occur as happens with
homomorphisms. Erasing rules in the component grammars Gx cannot generally
be eliminated and the resulting grammar need not be context-sensitive at all.
Therefore, we only obtain the following.29

Corollary 6.10. S satisfies A0 , A1 , A2 .


The root of the problem with context-sensitivity is that membership for type-
0 subsets is, in fact, solvable, provided enough hints are “written in invisible
ink”. For, it is a result of the classical theory that every type 0 language is a
homomorphic image of a type 1 language. This consideration continues to apply,
here, in our more general formalism to grammars over arbitrary monoids.30
Therefore, it will help to recount the classical argument in this more general
context.
In detail: if L is a type 0 language over a monoid M, then we may find a language L′ ⊆ a*bL, where a ≠ b and a, b ∉ M, that maps homomorphically onto L under the erasing homomorphism h(a) = 1 = h(b) and h(m) = m for m ∈ M. Pick an arbitrary generating subset X ⊆ M − 1. Each rule σ_{X,Q}(α) → σ_{X,Q}(β), where ln(α) ≥ ln(β) and α, β ∈ (X ∪ Q)*, is modified to the following: σ_{X,Q}(α) → â^{ln(α)−ln(β)} σ_{X,Q}(β). Here, σ_{X,Q} : (X ∪ Q)* → M[Q] is the canonical homomorphism to the free extension M[Q].
The start configuration is changed from S to b̂S, and new rules are added
of the form
xâ → âx, b̂â → ab̂, b̂ → b
for x ∈ X. The result is a grammar that is non-contracting with respect to the
generating set X ∪ {a, b} of the freely extended monoid M [a, b]. Yet, its image
under the erasing homomorphism is the original type 0 language L.
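The padding construction just described can be sketched as follows; the marker names "a^" and "b^" are hypothetical stand-ins for â and b̂, and rules are (left, right) tuples of symbol sequences.

```python
# Sketch of the "invisible ink" construction above.  Contracting rules
# alpha -> beta are padded with ln(alpha) - ln(beta) copies of the
# invisible letter "a^", the start configuration is changed to b^ S,
# and shuffle rules move the padding leftward past the generators (the
# erasing homomorphism h then deletes a and b from the result).

def non_contracting(rules, start, generators):
    A, B = "a^", "b^"                      # hatted markers, not in M
    new_rules = []
    for lhs, rhs in rules:
        pad = max(len(lhs) - len(rhs), 0)  # ln(alpha) - ln(beta)
        new_rules.append((tuple(lhs), (A,) * pad + tuple(rhs)))
    for x in generators:
        new_rules.append(((x, A), (A, x)))  # x a^ -> a^ x
    new_rules.append(((B, A), ("a", B)))    # b^ a^ -> a b^
    new_rules.append(((B,), ("b",)))        # b^ -> b
    return new_rules, (B, start)

rules, start = non_contracting(
    [(("X", "Y"), ("Z",)), (("Z",), ())], "X", ["X", "Y", "Z"])
```

Every rule of the result is length-preserving or length-increasing with respect to the extended generating set, as the construction requires.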
An approximation to A3 for the subset family M 7→ SM can be recovered
as follows. Define a non-erasing monoid homomorphism f : M → N as a
homomorphism which is non-erasing with respect to at least one generating
subset X ⊆ M; that is, where 1 ∉ f̃(X). A context-sensitive rule of the form
αqβ → αz1 . . . zn β maps homomorphically to the rule

f (α) qf (β) → f (α) f (z1 ) . . . f (zn ) f (β) .

If f(z1) ⋯ f(zn) ≠ 1, then we're okay. Otherwise, z1, …, zn ∈ M, and we need to replace each f(x) by a new symbol x̂, for each x ∈ X that occurs in the sequence z1 ⋯ zn, and then add a new rule x̂ → f(x). Since f(x) ≠ 1, the new rule is non-contracting. At the same time, the embedded sequence

29 It was erroneously asserted in [17] that S also satisfied A3. The proof breaks down for erasing monoid homomorphisms.
30 In fact, by theorem 4.18, we can generate a type 0 language by erasing homomorphisms from the “polycyclic” extension of a context-free language. Classically, this corresponds to the erasure of the intersection of the context-free language with an interleaved Dyck language.

f (z1 ) . . . f (zn ) is now modified to a form ẑ1 . . . ẑn that is no longer identically
equal to 1. The resulting rule is now non-contracting and context-sensitive.
Similarly, the argument used to prove the substitution property A5 may
still be applied in approximate form. The needed condition is that there be
a generating subset X ⊆ M , such that 1 ∈ / σ (x) for all x ∈ X. For such
substitutions, the language σ (x) = L (Gx ) ∈ SN no longer has any productions
of the form α → 1 in it. Therefore, we may proceed with the construction, as
before, and define the composition G′ of the grammar G, with the component
grammars Gx for σ (x), for each x ∈ X.
The condition inherited from this for the union property A4 is that corre-
sponding to the family Y ∈ SSM , there should be a finite subset S ⊆ SM of
languages, all free of erasing productions, such that Y ∈ SS ∗ .

7. Conclusion and Further Topics

We have shown how the classical notion of language hierarchy may be en-
capsulated and generalized in algebraic form as a hierarchy of algebras. At the
bottom of the hierarchy is the dioid, or idempotent semiring. Associated with
this is the functor F, which maps a given monoid M to its dioid of finite subsets.
Thus, the dioid may be regarded as an algebraization of the concept of the finite
language. At the top of the hierarchy is the unital quantale, which is associated
with the functor P that maps a monoid M to its quantale of subsets. Here, the
corresponding classical concept is the general language.
This hierarchy was complemented by a family of adjunctions with the prop-
erties that
• if A ≤ B then there exists an adjunction (Q^B_A, Q^A_B) with Q^B_A ∘ Q^A_B = Id_DB;

• if A ≤ D ≤ B then Q^B_D ∘ Q^D_A = Q^B_A and Q^A_D ∘ Q^D_B = Q^A_B; and

• if A ≤ B then Q^B_A ∘ A = B and Â ∘ Q^A_B = B̂.

The functor Q^B_A extends each A-dioid to its B-completion, and is complemented by the forgetful functor Q^A_B, which maps a B-dioid D to itself, where the least upper bound operator Σ : BD → D is restricted to the subfamily AD ⊆ BD.
The Chomsky Hierarchy lies at the foundation of both the theory of computation and linguistics. Our generalization of the classical notion of grammars to arbitrary monoids merges the hierarchies of language families and transduction families, while including representatives for all four members of the hierarchy. Thus, between the extremes of F and P, we find algebras that include the natural families of the Chomsky hierarchy: R ≤ C ≤ S ≤ T. Three of the four members of this family fit strictly within the framework just devised.
At the same time, the introduction of a monadic hierarchy goes beyond the
Chomsky hierarchy in two ways. First, we have an upwards extension beyond T
to oracle families. The second way, not mentioned in [16, 17], is the downward extension to proper subfamilies of F and their related algebras, together with a sideways extension to other families A that are neither above nor below F. The smallest

of the new families is the “singleton family” I; its corresponding category being the category of partially ordered monoids, itself. The related families of algebras between I and F comprise partially ordered monoids whose least upper bound operator is reduced to a partial function over distinguished families of finite subsets.
In the remaining sections, we discuss some of the issues that have only been
touched on briefly up to this point.

7.1. Natural Families for Semigroups


The axioms A0 to A4 apply analogously to semigroups, resulting in analogous properties A′0 to A′4. We can define a correspondence between these functors and the functors for natural dioid families as follows. First, given a semigroup functor A′ satisfying A′0 to A′4, define the restriction A′_M = A′|_Monoid to the subcategory of monoids. To verify that A′_M defines a natural family, the only non-trivial condition is A3, which requires that A′_M M actually be a monoid, as a precondition. This is true, since {1} is the identity of any sub-semigroup of PM that contains {1}.
For the other direction, let S^e denote the monoid extension of a semigroup S: the monoid obtained by adding a new element e ∉ S and imposing the relations e² = e and es = s = se for s ∈ S. Also, let φ^e : S^e → S′^e denote the extension of the semigroup homomorphism φ : S → S′ to a monoid homomorphism with φ^e(e) = e.
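The extension S ↦ S^e can be sketched concretely for a semigroup given by its multiplication map; the left-zero semigroup used in the example is an illustrative assumption, and any carrier with an associative multiplication would do.

```python
# Monoid extension S^e of a semigroup: adjoin a fresh identity e with
# e*e = e and e*s = s = s*e.  The semigroup here is the left-zero
# semigroup on {x, y} (s*t = s).

def monoid_extension(carrier, mul, e="e"):
    assert e not in carrier
    extended = set(carrier) | {e}
    def mul_e(s, t):
        if s == e:
            return t                       # e acts as the identity
        if t == e:
            return s
        return mul(s, t)
    return extended, mul_e

Se, op = monoid_extension({"x", "y"}, lambda s, t: s)
```

Associativity of the extended multiplication follows from associativity of the original, since e is neutral on both sides.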
Then, given a monoid functor A satisfying A0 to A4, define A_S S = AS^e ∩ PS for semigroups S and A_S f = f̃, for semigroup homomorphisms f : S → S′. The independence of this definition from the element e is assured by A4. The only non-trivial condition for A_S is A′3, which is verified as follows. Suppose S is a semigroup and Y ∈ A_S A_S S. Then, since A_S A_S S ⊆ PPS, it follows that ∪Y ∈ PS. To show that ∪Y ∈ A_S S, we first note that A_S A_S S ⊆ A(A_S S)^e. Since AS^{e′} is already a monoid (where we take advantage of the independence of A_S from e and affix S by another identity e′ ∉ S^e), we can map (A_S S)^e → AS^{e′} homomorphically by φ_{ee′}(e) = {e′} ∈ AS^{e′} and φ_{ee′}(U) = U, for U ∈ A_S S. Since Y ∈ A(A_S S)^e, then by A4, φ̃_{ee′}(Y) ∈ AAS^{e′}. Since Y ∈ PA_S S, then ∪Y = ∪φ̃_{ee′}(Y). Therefore, by A3, it follows that ∪Y ∈ AS^{e′} ∩ PS = A_S S.

In general, we only have (A′_M)_S ≥ A′, which may not reduce to equality; we can only show that (A′_M)_S S = A′S^e ∩ PS ⊇ A′S. However, the identity (A_S)_M = A may be verified as follows. For monoids M and monoid homomorphisms f : M → M′, we have (A_S)_M M = AM^e ∩ PM and (A_S)_M f = f̃ = Af. Since M^e can be mapped homomorphically onto M, surjectivity can be used to show that AM^e ∩ PM = AM.

7.2. Natural Semiring Families and Idempotent Power Series


The incorporation of idempotency, a + a = a, is the critical feature behind the occurrence of the partially ordered monoid structure. In contrast, in the formal power series approach [3, 9, 23], addition need no longer be idempotent. Therefore, a natural route of generalization is to extend the monad hierarchy from dioids to semirings.
Unlike the case for dioids, the product, sum and morphism operators need not be well-defined unless one makes restrictions on their domain, on the class of monoids under consideration, or on the semiring itself. Therefore, while M ↦ PM yields a natural family, the analogue P_S : M ↦ S^M need not be natural. The classification of the natural families A_S satisfying axioms A0S to A4S, for a given semiring S, is left open here. Because of the absence of infinite summability over S, there is no longer a clear-cut analogue to any portion of the hierarchy of natural families. Nor is it clear whether the hierarchy even extends beyond the family F_S of power series with finite support. In particular, the question remains unresolved as to what conditions are required to define analogues R_S, C_S, T_S (and even S_S).
In [6], the power series was employed as a means to measure ambiguity. If
we were to attempt to generalize power series to arbitrary monoids, we would
immediately run into the problems just described, unless further restrictions
are made. One restriction, which is in the spirit of the approach adopted in
this paper, is to replace numerical coefficients by coefficients in a quantale,
such as the Boolean lattice of subsets. Then, we may factor the “ambiguity
count” function into the composition of two operations: (1) the output set for a
given input under the action of a transduction and (2) the cardinality function
applied to output sets. The ill-definedness of power series is factored out with
(2), leaving behind (1) – the well-defined notion of an idempotent power series.
In general, for a given semiring with unit S, we may define the following partial (and not necessarily well-defined) operations over the function space S^M: for each m ∈ M, the unit is defined as η_M(m) = m̂, where m̂ : m′ ∈ M ↦ δ_{mm′}. For each s ∈ S, we also have s1̂ ∈ S^M, which we may denote by s. Thus, we can write out the decomposition φ = Σ_{m∈M} φ(m) m̂, which gives us the representation of each φ ∈ S^M as a power series. For φ, φ′ ∈ S^M, and Φ ∈ S^{S^M}, we define

(φφ′)(m) = Σ_{nn′=m} φ(n) φ′(n′),   (Σ Φ)(m) = Σ_{φ∈S^M} Φ(φ) φ(m),

where m ∈ M. The summation gives us the product μ_M(Φ) = Σ Φ. Finally, we define the lifting of each function f : M → M′ to the power series algebras by f̃(φ) = Σ_{m∈M} φ(m) \widehat{f(m)} ∈ S^{M′}, where φ ∈ S^M.
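For the special case S = 2 recovered below, these operations become total and easily sketched: a finite-support series over a free monoid is a dict from words to Boolean coefficients, and the convolution product reduces to an OR of ANDs. The function names are illustrative.

```python
# Idempotent power series with coefficients in the Boolean dioid
# S = 2 = {0, 1}: a finite-support series phi over a free monoid is a
# dict from words to True (absent words have coefficient 0).  The
# convolution (phi phi')(m) = sum over nn' = m of phi(n) phi'(n')
# becomes an OR of ANDs, recovering the product of subsets of M.

def unit(m):
    return {m: True}                       # eta(m) = m-hat

def convolve(phi, psi):
    out = {}
    for n, c1 in phi.items():
        for n2, c2 in psi.items():
            if c1 and c2:                  # 1 * 1 = 1 in the dioid 2
                out[n + n2] = True         # idempotent sum: OR
    return out
```

This makes 2^M synonymous with PM, as noted below: a series is just the characteristic function of a subset, and convolution is the subset product.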
The analogues of axioms A0 to A4 for the semiring S are: A0S: A_S M ⊆ S^M; A1S: F_S M ⊆ A_S M, where F_S M denotes the family of functions φ : M → S with finite support; A2S: φ, φ′ ∈ A_S M → φφ′ ∈ A_S M; A3S: Φ ∈ A_S A_S M → Σ Φ ∈ A_S M; and A4S: φ ∈ A_S M → f̃(φ) ∈ A_S M′, where f : M → M′ is a monoid homomorphism. The natural dioid families are recovered as a special case A = A_2 obtained with the finite dioid S = 2 = {0, 1}, by treating 2^M synonymously with PM. More generally, when we restrict S to quantales, the power series operations all become well-defined and the analogue P_S M = S^M to PM is defined.
The category DA_S is then defined analogously by A_S-additivity and A_S-distributivity:

Σ φ ∈ D,   Σ (φφ′) = (Σ φ)(Σ φ′),

respectively, where φ, φ′ ∈ A_S D. Equivalently, in place of the latter, we may adopt weak A_S-distributivity: Σ d̂ φ d̂′ = d (Σ φ) d′, where φ ∈ A_S D and d, d′ ∈ D. Finally, an A_S-morphism is defined by the condition that Σ f̃(φ) = f(Σ φ), where φ ∈ A_S D.
Here, unlike the situation with semiring power series, we can go further and
develop analogues RS , CS , TS and even SS by applying our generalized grammar
formalism to the direct product monoid S × M , which is contained in S M by
the inclusion (s, m) ∈ S × M 7→ sm̂ ∈ S M . Which of the results established in
the previous sections carry over to this formalism is left as an unresolved issue.

7.3. Context-Sensitive Languages and Non-Erasing Morphisms


The hierarchy along the segment S → T collapses, once we require closure
under monoid homomorphisms, because of the occurrence of erasing morphisms.
Strict inclusion of S requires us to generalize the framework in a way that
we have only begun to elaborate on here, but have not fully resolved. One
approach to generalizing non-erasing morphisms is to consider morphisms on
normed monoids.
A normed monoid is a monoid M with a non-negative measure m ∈ M ↦
|m| ≥ 0 satisfying the conditions |1| = 0, |m| > 0 if m ≠ 1 and |mm′| ≤
|m| + |m′|. If there is a factoring m = m1m2 with m1, m2 ≠ 1 such that
|m| = |m1| + |m2|, then m is composite. Any other m ≠ 1 may then be termed
atomic. The norm, itself, may be termed atomic if inf{|m| : m ∈ M − {1}} > 0.
The atomic elements form a generating subset of a monoid with an atomic norm.
Conversely, given a generating subset X ⊆ M − {1}, the generating norm may be
defined by |m|X = inf{n ≥ 0 : m ∈ X^n}, for m ∈ M. The minimal generating
norm is the flat norm, defined by |m| = 1 if m ≠ 1, which corresponds to the
generating subset X = M − {1}.
Thus, a possible generalization for non-erasing morphisms is atomic mor-
phisms: morphisms between monoids with atomic norms that map atomic
elements to atomic elements. A monoid homomorphism f : M → M′ that is
non-erasing with respect to a generating subset X ⊆ M − {1} can therefore be
treated as an atomic morphism by defining suitable norms on M and M′. Since
the identity morphism preserves the atomic elements of a normed monoid, and
atomic morphisms are closed under composition, the result is a subcategory
of normed monoids that may serve to “categorize” the notion of non-erasing
morphism.
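The generating norm |m|X admits a direct dynamic-programming computation when M is a free monoid and X is a finite set of generator words; a sketch (function name ours), returning None when m is not generated by X:

```python
def generating_norm(m, X):
    """|m|_X = inf{ n >= 0 : m in X^n } for a word m over a finite set X
    of nonempty generator words; None when m is not generated by X."""
    INF = float("inf")
    best = [INF] * (len(m) + 1)
    best[0] = 0                    # |1| = 0: the empty word needs no factors
    for i in range(1, len(m) + 1):
        for x in X:
            if i >= len(x) and m[i - len(x):i] == x:
                best[i] = min(best[i], best[i - len(x)] + 1)
    return None if best[-1] == INF else best[-1]
```

With X the alphabet itself, this recovers the length norm generated by the letters; the flat norm proper takes X = M − {1}, which is infinite and so lies outside this finite sketch.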

7.4. Matrix Closure


Matrix closure A9 is easily verified for the natural families F, ω, P and more
generally for Fk M and Pc M. The case for R is established in [21], where the
Kleene star is inductively defined by the following decomposition

    [ A  B ]*   [ E*       E*BD*          ]
    [ C  D ]  = [ D*CE*    D* + D*CE*BD*  ],    E = A + BD*C.
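The decomposition above can be turned directly into a recursive computation of U*; the following sketch instantiates it over the Boolean dioid 2 = {0, 1} (where a* = 1, so U* is reflexive-transitive closure), splitting off a 1 × 1 block A at each step. All function names are ours:

```python
def mat_mul(P, Q):
    n, k, m = len(P), len(Q), len(Q[0])
    return [[int(any(P[i][t] and Q[t][j] for t in range(k))) for j in range(m)]
            for i in range(n)]

def mat_add(P, Q):
    return [[int(P[i][j] or Q[i][j]) for j in range(len(P[0]))]
            for i in range(len(P))]

def mat_star(U):
    """Kleene star of an n x n Boolean matrix via the block decomposition."""
    n = len(U)
    if n == 1:
        return [[1]]                            # a* = 1 in the Boolean dioid
    A = [[U[0][0]]]
    B = [U[0][1:]]
    C = [[U[i][0]] for i in range(1, n)]
    D = [row[1:] for row in U[1:]]
    Ds = mat_star(D)
    E = mat_add(A, mat_mul(B, mat_mul(Ds, C)))  # E = A + B D* C
    Es = mat_star(E)
    top = [Es[0] + mat_mul(Es, mat_mul(B, Ds))[0]]
    DsCEs = mat_mul(Ds, mat_mul(C, Es))         # D* C E*
    bottom_right = mat_add(Ds, mat_mul(DsCEs, mat_mul(B, Ds)))
    bottom = [DsCEs[i] + bottom_right[i] for i in range(n - 1)]
    return top + bottom
```

Any split into blocks would serve; pivoting on the first row and column keeps the recursion simple.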

The issue of proofs of A9 for C and T, however, has been left unresolved here.
Matrix closure can be derived from the stronger version of the submonoid
ordering property (section 3.3): if M, M′ are monoids and M ⊆ M′ then AM =
AM′ ∩ PM. Then, from U ∈ AD^(n×n), we already have Uij I ∈ AD^(n×n). Since
DI = {dI : d ∈ D} ⊆ D^(n×n) is a submonoid, then under this stronger form
of submonoid ordering, we have Uij I ∈ A(DI). From DI ≅ D, we conclude
Uij ∈ AD. The converse is posed as a conjecture.
Conjecture 7.1. If A satisfies matrix closure A9 then A satisfies the strong
submonoid ordering property.

7.5. Composition and Transduction Closure


Define the composition of U ⊆ M″ × M and T ⊆ M × M′ by

    U ◦ T ≡ {(m″, m′) ∈ M″ × M′ : (∃m ∈ M) (m″, m) ∈ U, (m, m′) ∈ T}.

Is there a closure property for composition of the following form (possibly


with restrictions placed on M )?

Conjecture 7.2. For each U ∈ A(M″ × M) and T ∈ A(M × M′),

    U ◦ T ∈ A(M″ × M′).

If so, then a consequence of this property is closure under transduction:

Conjecture 7.3. For each U ∈ AM and T ∈ A(M × M′),

    T(U) ≡ {m′ ∈ M′ : (∃m ∈ U) (m, m′) ∈ T} ∈ AM′.

In turn, a consequence of transduction closure is that each transduction
T ∈ A(M × M′) reduces to an idempotent power series T = Σm∈M m Tm over
the monoid M with coefficients Tm = T({m}) in the idempotent semiring AM′.
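For finite relations, both the composition U ◦ T and the action T(U) of Conjecture 7.3 are directly computable; a small sketch (function names ours), with relations as Python sets of pairs:

```python
def compose(U, T):
    """U o T for finite relations U in M'' x M and T in M x M'."""
    return {(m2, m1) for (m2, m) in U for (n, m1) in T if m == n}

def transduce(T, U):
    """T(U) = { m' : (m, m') in T for some m in U }."""
    return {m1 for (m, m1) in T if m in U}
```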

7.6. Transducers and Automata


To generalize the classical theorems linking automata families to language
families requires a generalized treatment of automata. For instance, with such
an expanded treatment available, we would be able to generalize the classical
theory to include results asserting the equivalence of the families R(M × M′),
C(M × M′) and T(M × M′), respectively, to the rational/finite, push-down
and Turing transductions between M and M′; and to assert the equivalence of
these families, in turn, to the rational, context-free and Turing subsets of the
product monoid M × M′.

The inability to prove more comprehensive results regarding the nature of
transductions over general monoids is directly tied to the absence of any devel-
opment here for automata, analogous to the treatment of grammars in section
2. Without such a parallel treatment, the analogues of the classical theorems
relating automata classes to language and transduction families cannot be
established. One approach, already alluded to in section 2.8, is to define a
transducer over a product monoid M × M′ as a structure A = (Q, I, F, H)
containing a set Q of states, subsets I, F ⊆ Q, respectively, of initial and final
states, and a relation H ⊆ Q × M × M′ × Q for one-step transitions. The
grammar GW = (Q, S, HW), where

    HW = {(S, i) : i ∈ I} ∪ {(f, 1) : f ∈ F} ∪ {(q, (m, m′)q′) : (q, m, m′, q′) ∈ H},

can then be used as a simple means of defining the language L(A) ≡ L(GW).
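For a finite one-step relation H over free monoids, the language L(A) of such a transducer can also be enumerated breadth-first, without detouring through GW; a bounded sketch (function name and the depth bound are ours):

```python
def transduction_pairs(Q, I, F, H, max_steps=5):
    """Enumerate pairs (m, m') generated by a transducer A = (Q, I, F, H),
    where H contains one-step transitions (q, m, m', q'); the monoids are
    assumed free, with words as Python strings."""
    pairs = set()
    frontier = {(q, "", "") for q in I}        # start as S -> i, for i in I
    for _ in range(max_steps + 1):
        # harvest: a final state corresponds to the rule f -> 1
        pairs |= {(u, v) for (q, u, v) in frontier if q in F}
        frontier = {(q2, u + m, v + m2)
                    for (q, u, v) in frontier
                    for (q1, m, m2, q2) in H if q1 == q}
    return pairs
```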
The duality between the different “directionalities” of transducers, also al-
luded to in section 2.8, generalizes to give us an equivalence between transducers
over M1 × (M2 × M3) and transducers over (M1 × M2) × M3, based on the fol-
lowing correspondence between their one-step transitions:

    q m1 → (m2, m3) q′  ⇔  q (m1, m2) → m3 q′.

This leads directly to a 3-way classification of automata over M × M′ as
recognizers (if M′ = {1}), generators (if M = {1}) and transducers (the general
case). Underlying the representation of recognizers and generators as transduc-
ers are the other two isomorphisms defining a monoidal category, M × {1} ≅ M
and {1} × M′ ≅ M′, respectively. The conversion between recognizer, trans-
ducer and generator for a monoid M × M′ then corresponds to the isomorphisms
(M × M′) × {1} ≅ M × M′ ≅ {1} × (M × M′).
Different classes of automata may then be distinguished based on the prop-
erties of the individual elements of the structure A = (Q, I, F, H). The simplest
family are the finite-state transducers, where Q, I, F and H are all finite. From
there, one may consider the analogue of push-down transducers, where the state
set factors as Q = Q0 × B* into a finite state set Q0 and the configuration states
B* for a push-down store over an alphabet B. To recover the classical defini-
tions of push-down transducers then requires symmetry conditions be imposed
on I, F and H. In a similar way, the Turing transducer may be recovered as a
transducer whose state space factors as Q0 × B^ω into a finite state set Q0 and a
configuration set limited to “queues” that “trail off in blanks” (i.e., the subclass
of functions in B^ω that have finite support).
This process is illustrated for push-down automata. Classically defined as a
structure [36, section 5.2, p. 259] M = (Q, X, B, H, I, F) with I = {s} and

    H ⊆ ((Q × X × B) × (Q × B*)) ∪ ((Q × X × {⊥}) × (Q × ⊥B*)),

configurations comprise words in ⊥B*QX*, with →H generated by closure under
C, R and T (see section 2.2) from the one-step relations

    b q x → β q′,      ((q, x, b), (q′, β)) ∈ H,
    ⊥ q x → ⊥ β q′,    ((q, x, ⊥), (q′, ⊥β)) ∈ H.

The language is defined by L(M) = {m ∈ X* : ⊥sm → ⊥f, f ∈ F}.
When we rewrite the state set as Q × B*, the corresponding automaton is
(Q × B*, I × {1}, F × {1}, H′) over the free monoid X*. In H′, while the one-
step transition ⊥qx → ⊥βq′ is replaced by (q, 1)x →H′ (q′, β), each one-step
transition bqx → βq′ must be replaced by an entire set of relations: (q, β″b)x →H′
(q′, β″β) for all β″ ∈ B*. The result is an automaton with the following sym-
metry condition: if β ≠ 1, then (q, β)x →H′ (q′, β′) implies (q, β″β)x →H′
(q′, β″β′) for all β″ ∈ B*.
The conversion of F to F × {1} imposes the empty-stack condition. If we
were to relax this condition, then we would replace F by F × B*, instead. Then
a second symmetry condition would arise, stating that if (f, β) is a final state
then (f, β′) is a final state, for all β′ ∈ B*.31
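The (Q × B*)-state formulation can be exercised directly: the following sketch (all names ours) simulates such an automaton breadth-first with empty-stack acceptance, instantiated for the balanced-parentheses language. Note that the step function consults only the top of the stack, so a configuration (q, β″β) behaves like (q, β), which is exactly the symmetry condition above:

```python
def pda_accepts(word, start, finals, step, max_cfgs=10_000):
    """Breadth-first simulation of a push-down automaton in the (Q x B*)-state
    form: configurations are pairs (q, stack-word), final states are F x {1}
    (empty stack), and `step` maps a configuration and an input symbol to the
    set of successor configurations.  Illustrative only."""
    cfgs = {start}
    for x in word:
        cfgs = {c2 for c in cfgs for c2 in step(c, x)}
        if len(cfgs) > max_cfgs:
            raise RuntimeError("state explosion")
    return any(q in finals and stack == "" for (q, stack) in cfgs)

# Balanced parentheses: push on '(', pop on ')'.
def step(cfg, x):
    q, stack = cfg
    if x == "(":
        return {(q, stack + "(")}
    if x == ")" and stack.endswith("("):
        return {(q, stack[:-1])}
    return set()
```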

Acknowledgments.
The author would like to thank Dexter Kozen and Bernhard Möller for
their assistance, Bruce Litow for his enthusiastic support, Derick Wood for
inspiring research in the area of algebraizing formal language and automata
theory, Hans Leiss for our enlightening conversations during RelMiCS 10/AKA
5 and afterwards, Noam Chomsky for his words of encouragement, and the
UW-Milwaukee Golda Meir Library for providing the resources needed for the
research. Thanks also go to members of my family, who’ve provided moral
support and encouragement.

Appendix A. Adjunctions, Monads and Co-Monads


The conventions we use throughout the paper are to denote the class of
objects of a category C as |C|, and the morphisms between A, B ∈ |C| as
tagged arrows f : A → B. In addition, throughout the paper we use the
notation 1A : A → A for the identity function on A.

Appendix A.1. Adjunctions


A pair of functors, E : M → D and U : D → M, between two categories M
and D is adjoint when there is a natural bijection between morphisms f : M →
UD and F : EM → D. If the morphisms f ⇔ F correspond in this way, we
write F = f* and F∗ = f. Note the identities (f*)∗ = f and (F∗)* = F. For the
bijection to be natural means that there is consistency with composition in
the following sense: (UF ◦ g ◦ h)* = F ◦ g* ◦ Eh, or equivalently, (F ◦ G ◦ Eh)∗ =
UF ◦ G∗ ◦ h, where F : D → D′, g : M → UD, G : EM → D and h : M′ → M.
Under these conditions, E is left-adjoint to U, which is right-adjoint to E; the
pair is denoted (E, U) and is referred to as an adjunction pair.

31 So, perhaps lending partial fulfillment to von Neumann’s attempt [35] at expanding his
foundational work in physics to automata theory, we may think of H as the analogue of
a Hamiltonian for a dynamics, the restrictions on the forms of one-step rules as analogues of
selection rules, and the symmetry conditions as analogues of symmetries imposed on the
dynamics of a system and its states.

For a given adjunction, the unit ηM ≡ (E1M)∗ : M → TM and co-unit
εD ≡ (U1D)* : LD → D respectively yield natural transformations η : 1M →
T ≡ U ◦ E and ε : L ≡ E ◦ U → 1D, by virtue of the identities

    ηM′ ◦ f = (Ef)∗ = Tf ◦ ηM,    εD′ ◦ LF = (UF)* = F ◦ εD,

where f : M → M′ and F : D → D′. Conversely, the natural correspondence
between f : M → UD and F : EM → D can be recovered from the unit
and co-unit by the identities f* = εD ◦ Ef and F∗ = UF ◦ ηM. For this, one
needs the following conditions, known as the unit-co-unit equations or zig-zag
equations: εEM ◦ EηM = E1M and UεD ◦ ηUD = U1D. The resulting adjunction
is equivalently specified by the quadruple (E, U, η, ε).
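The prototypical instance of such an adjunction pair is the free-monoid functor E = (−)* against the forgetful functor U on monoids; the raising f ↦ f* can be sketched in a few lines of Python (names ours), with words as tuples and a monoid given by its operation and unit:

```python
# E sends a set X to the free monoid X* (words as tuples); U takes the
# underlying set of a monoid.  A map f : X -> U D into a monoid D extends
# uniquely to a homomorphism f* : X* -> D with f* o eta = f, where
# eta_X(x) is the one-letter word.

def eta(x):
    return (x,)

def raise_(f, op, unit):
    """f* : X* -> D, the unique homomorphism extending f."""
    def f_star(word):
        d = unit
        for x in word:
            d = op(d, f(x))
        return d
    return f_star
```

For instance, with D = (ℤ, +, 0) and f = len, the raised map f* sends a word to the total length of its letters.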

Appendix A.2. Monads and Co-Monads


For the category M, a self-contained structure (T, µ, η), called a monad,
can be defined for the endofunctor T : M → M, with the introduction of the
product µM ≡ U(T1M)* = UεEM. By virtue of the identity µM′ ◦ TTf =
U(Tf)* = Tf ◦ µM (for f : M → M′), this yields a natural transformation
µ : T ◦ T → T which also satisfies the two coherence conditions that play the
respective analogues of associativity and identity:

    µM ◦ µTM = µM ◦ TµM [= U(µM)*],    µM ◦ ηTM = T1M = µM ◦ TηM.

Similarly, for the category D, the co-monad (L, δ, ε) is introduced with the
co-product δD ≡ E(L1D)∗ = EηUD, which also yields a natural transformation
δ : L → L ◦ L, since δD′ ◦ LF = E(LF)∗ = LLF ◦ δD (for F : D → D′). The
corresponding coherence conditions are

    δLD ◦ δD = LδD ◦ δD [= E(δD)∗],    εLD ◦ δD = L1D = LεD ◦ δD.

For both the monad and co-monad, information from the original adjunction
is lost. In particular, the two equations enclosed in square brackets are no longer
present. We can partially recover the unit and co-unit by UδD = TηUD and
EµM = LεEM , and attempt to approximate the adjunction by trying to define
natural transformations Σ : T → 1M and ∆ : 1D → L such that UεD = ΣUD
and EηM = ∆EM . However, these natural transformations are only defined,
respectively, over the subcategories U (D) ⊆ M and E (M) ⊆ D within each
category that reflect the other. This observation serves as the basis for con-
structing a minimal adjunction from either the monad or co-monad.

Appendix A.3. Adjunctions from Monads: T-Algebras


A T-Algebra is constructed from the monad (T, µ, η) as a category M^T
whose objects are the morphisms ΣM : TM → M that satisfy ΣM ◦ µM =
ΣM ◦ TΣM and ΣM ◦ ηM = 1M. The morphisms F : ΣM → ΣM′ are the
subclass of morphisms F : M → M′ for which ΣM′ ◦ TF = F ◦ ΣM. This
is enough information to define the functor E by EM = µM and Ef = Tf,
where f : M → M′. In addition, the functor U may be defined by UΣM = M
and UF = F, where F : ΣM → ΣM′. The natural operations are given by
f* = ΣM′ ◦ Tf, where f : M → M′, and F∗ = F ◦ ηM, where F : µM → ΣM′.
These definitions suffice to recover an adjunction, the identity T = U ◦ E,
and a co-monad structure given by δΣM = TηM and εΣM = ΣM. The category
M^T is referred to as the Eilenberg-Moore category of the monad (T, µ, η), and its
members as Eilenberg-Moore algebras, or just T-Algebras. The full subcategory
of free algebras µM : TTM → TM yields a category M_T called the Kleisli
category of the monad (T, µ, η). An adjunction that (up to equivalence) arises
from a T-algebra construction is referred to as monadic.
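The two object equations can be checked concretely for the (nonempty) finite-powerset monad, whose T-algebras are sup-semilattices; here M is a set of integers with ΣM = max, a hedged sketch with our names:

```python
# The nonempty finite-powerset monad T: eta(m) = {m}, mu(Y) = union of Y,
# with functor action by direct image.  A T-algebra Sigma_M : T M -> M for
# the sup-semilattice (Z, max) is just Sigma = max.

def eta(m):
    return frozenset({m})

def mu(Y):
    return frozenset().union(*Y)

def T(f, U):
    """Functor action on morphisms: direct image of the set U under f."""
    return frozenset(f(u) for u in U)

sigma = max      # Sigma_M for the semilattice of integers under max
```

The first assertion below is ΣM ◦ ηM = 1M; the second is ΣM ◦ µM = ΣM ◦ TΣM on a sample family of sets.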

Appendix A.4. Adjunctions from Co-Monads: L-Algebras


The analogous construction for a co-monad (L, δ, ε) is a category DL whose
objects are the morphisms ∆D : D → LD satisfying δD ◦ ∆D = L∆D ◦ ∆D and
εD ◦ ∆D = 1D. Its morphisms f : ∆D → ∆D′ are the morphisms f : D → D′
for which ∆D′ ◦ f = Lf ◦ ∆D. The functors are defined by UD = δD, UF = LF
when F : D → D′, E∆D = D and Ef = f, when f : ∆D → ∆D′. The
natural correspondences are given by f* = εD′ ◦ f, where f : ∆D → δD′, and
F∗ = LF ◦ ∆D, where F : D → D′. Finally, the corresponding monad operators
are µ∆D = LεD and η∆D = ∆D, and we recover the factoring L = E ◦ U.

Appendix A.5. Natural Dioids and Natural Semirings


In section 3, we laid out the construction for a natural dioid family A as an
adjunction between M = Monoid and D = DA, and actually as a T-algebra
for M with functors E = T = A and U = Â. For the category M this led to
the following definitions: EM = TM = AM, Ef = Tf = f̃, f*(U) = Σf̃(U),
µM(Y) = ∪Y and ηM(m) = {m}. For the category D the corresponding
data were given by UD = LD = D, UF = LF = F, F∗(m) = F({m}),
δD(U) = {{u} : u ∈ U} and εD(U) = ΣU.
Properties A0 , A2 and A4 gave us the functor T, A1 gave us the unit η and
A3 the product µ. As we mentioned at the end of section 5, one needs only
the weaker property A1W to be able to define the unit morphisms. With this
extension, the result is a larger hierarchy that goes beyond the additively closed
structure of dioids and includes the partially ordered monoids.
This construction generalizes to arbitrary semirings S, yielding a hierarchy
of monads AS. The dioid hierarchy corresponds to the special case A = A2
constructed from the dioid 2. In place of property A0, we assume A0S:
AS M ⊆ S^M. The remaining properties A1WS, A1S, A2S, A3S and A4S are
then defined by analogy.
One instance of a natural semiring family is already given to us: the class
FS M of finite-support functions in S^M. However, since a semiring S need not
generally be closed under arbitrary sums, with infinite associativity and
distributivity, the maximal element PS M = S^M may not be defined. The nature
of the hierarchy – and whether it even includes functors other than FS – remains
unresolved here. In particular, the question of whether analogous functors RS,
CS and TS (for Chomsky types 3, 2 and 0, respectively) can be defined remains
unresolved. The exceptions that stand out are quantales – thus leading to the
idea of idempotent power series that we discussed in the concluding sections of
the paper.

Appendix B. Monoidal Categories and Tensor Products


Appendix B.1. Monoidal Categories
A monoidal (or tensor) category D is a category with a tensor product bi-
functor ⊗ : D × D → D and identity I, such that (A ⊗ B) ⊗ C ≅ A ⊗ (B ⊗ C)
and A ⊗ I ≅ A ≅ I ⊗ A are coherent natural isomorphisms. The category is
braided if A ⊗ B ≅ B ⊗ A is also a coherent natural isomorphism.
These requirements yield a morphism f ⊗ g : A ⊗ B → A′ ⊗ B′ for each f : A →
A′, g : B → B′ such that 1D ⊗ 1D′ = 1D⊗D′ and (f′ ⊗ g′) ◦ (f ⊗ g) = (f′ ◦ f) ⊗
(g′ ◦ g), where f′ : A′ → A″ and g′ : B′ → B″. The natural isomorphisms
are the associator αABC : (A ⊗ B) ⊗ C → A ⊗ (B ⊗ C), the left unitor λA :
I ⊗ A → A, the right unitor ρA : A ⊗ I → A and (for braided categories) the
commutor γAB : A ⊗ B → B ⊗ A; each having inverses, e.g., λA ◦ λA⁻¹ = 1A and
λA⁻¹ ◦ λA = 1I⊗A. The naturality condition means they each possess invariance
under morphisms of the affected objects; e.g., f ◦ λA = λA′ ◦ (1I ⊗ f), for
morphisms f : A → A′.
Finally, coherence means that circular chains of natural isomorphisms reduce
to the identity. The two fundamental loops are

    A ⊗ B ≅ (A ⊗ I) ⊗ B ≅ A ⊗ (I ⊗ B) ≅ A ⊗ B

and

    A ⊗ (B ⊗ (C ⊗ D)) ≅ A ⊗ ((B ⊗ C) ⊗ D) ≅ (A ⊗ (B ⊗ C)) ⊗ D
    ≅ ((A ⊗ B) ⊗ C) ⊗ D ≅ (A ⊗ B) ⊗ (C ⊗ D) ≅ A ⊗ (B ⊗ (C ⊗ D)).

These lead, respectively, to the unit identity

    (1A ⊗ λB) ◦ αAIB = ρA ⊗ 1B

and pentagon identity

    (1A ⊗ αBCD) ◦ αA(B⊗C)D ◦ (αABC ⊗ 1D) = αAB(C⊗D) ◦ α(A⊗B)CD,
together referred to as the coherence conditions. In addition, for braided cate-
gories, we also have the loop

    (A ⊗ B) ⊗ C ≅ A ⊗ (B ⊗ C) ≅ (B ⊗ C) ⊗ A ≅ B ⊗ (C ⊗ A)
    ≅ B ⊗ (A ⊗ C) ≅ (B ⊗ A) ⊗ C ≅ (A ⊗ B) ⊗ C,

which leads to the hexagon identities

    αBCA ◦ γA(B⊗C) ◦ αABC = (1B ⊗ γAC) ◦ αBAC ◦ (γAB ⊗ 1C),
    αBCA ◦ γ(B⊗C)A⁻¹ ◦ αABC = (1B ⊗ γCA⁻¹) ◦ αBAC ◦ (γBA⁻¹ ⊗ 1C).

A braided category is symmetric if A ⊗ B → B ⊗ A → A ⊗ B also reduces to
the identity. This leads to the symmetry γBA ◦ γAB = 1A⊗B, i.e., γAB⁻¹ = γBA. Both
hexagon identities then become equivalent.
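In the symmetric monoidal category of sets under cartesian product (with a one-element unit set), the structural maps act pointwise on nested pairs, so the unit and pentagon identities can be checked elementwise; a sketch (all names ours):

```python
# Structural isomorphisms of (Set, x) acting on nested pairs.

I0 = "*"                 # the sole element of the unit object I

def alpha(p):            # associator: (A x B) x C -> A x (B x C)
    ((a, b), c) = p
    return (a, (b, c))

def lam(p):              # left unitor: I x A -> A
    (i, a) = p
    return a

def rho(p):              # right unitor: A x I -> A
    (a, i) = p
    return a

def gamma(p):            # commutor: A x B -> B x A
    (a, b) = p
    return (b, a)

def tensor(f, g):        # f (x) g acting componentwise
    return lambda p: (f(p[0]), g(p[1]))

def ident(a):
    return a
```

The assertions below verify, on sample elements, the unit identity (1 ⊗ λ) ◦ α = ρ ⊗ 1, the pentagon identity, and the symmetry γ ◦ γ = 1.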

Appendix B.2. Application to the Tensor Product Algebras
The tensor product construction we outlined in sections 2.8 and 4.6 involves
two morphisms LAB : A → A ⊗ B, RAB : B → A ⊗ B that mutually commute,
and a construction of ⟨f, g⟩ : A ⊗ B → C from any two morphisms f : A → C
and g : B → C that mutually commute. They are subject to the identities
⟨f, g⟩ ◦ LAB = f, ⟨f, g⟩ ◦ RAB = g, ⟨LAB, RAB⟩ = 1A⊗B and h ◦ ⟨f, g⟩ =
⟨h ◦ f, h ◦ g⟩, where h : C → D is another morphism.
The initial object, described in sections 2.8 and 4.5, involves the morphisms
&A : I → A subject to the identities &I = 1I and f ◦ &A = &B, where
f : A → B. The uniqueness of initial morphisms also entails the following
relations: &I⊗A = LIA and &A⊗I = RAI.
Together, as the following theorem shows, this is enough to define a sym-
metric monoidal category.
Theorem 1. A category with a tensor product and initial object is a symmetric
monoidal category with the correspondence f ⊗ g = ⟨LA′B′ ◦ f, RA′B′ ◦ g⟩, where
f : A → A′ and g : B → B′.
Proof. The unitors are λA = ⟨&A, 1A⟩ and ρA = ⟨1A, &A⟩, with inverses λA⁻¹ =
RIA and ρA⁻¹ = LAI; the commutors are γAB = ⟨RBA, LBA⟩ = γBA⁻¹; and the
associator and its inverse are

    αABC = ⟨⟨LA(B⊗C), RA(B⊗C) ◦ LBC⟩, RA(B⊗C) ◦ RBC⟩,
    αABC⁻¹ = ⟨L(A⊗B)C ◦ LAB, ⟨L(A⊗B)C ◦ RAB, R(A⊗B)C⟩⟩.

The verifications of the inverse relations and of the naturality and coherence
conditions are all routine, but lengthy, calculations. For coherence, the key
observation is that since all the natural isomorphisms are built out of the
ingredients used in the universal properties for tensor products and initial
objects, the universal property forces circular chains of isomorphisms to all
reduce to the identity. For instance, the unit identity is converted to a loop by
moving everything to the right, to obtain (1A ⊗ λB) ◦ αAIB ◦ (ρA⁻¹ ⊗ 1B) = 1A⊗B.
Direct computation, omitting subscripts, then yields the following verification
of the relation:

    (1 ⊗ λ) ◦ α ◦ (ρ⁻¹ ⊗ 1)
    = ⟨L ◦ 1, R ◦ ⟨&, 1⟩⟩ ◦ ⟨⟨L, R ◦ L⟩, R ◦ R⟩ ◦ ⟨L ◦ L, R ◦ 1⟩
    = ⟨L, ⟨R ◦ &, R ◦ 1⟩⟩ ◦ ⟨⟨L, R ◦ L⟩ ◦ L, R ◦ R ◦ 1⟩
    = ⟨L, ⟨&, R⟩⟩ ◦ ⟨L, R ◦ R⟩ = ⟨L, ⟨&, R⟩ ◦ R⟩ = ⟨L, R⟩ = 1.

Other calculations are similar. Note, in particular, the need for the identities
&I⊗A = LIA and &A⊗I = RAI to ensure the inverse property for the unitors,
e.g.,

    λA⁻¹ ◦ λA = RIA ◦ ⟨&A, 1A⟩ = ⟨RIA ◦ &A, RIA ◦ 1A⟩
    = ⟨&I⊗A, RIA⟩ = ⟨LIA, RIA⟩ = 1I⊗A.

Appendix B.3. Tensor Functors and Adjunctions
Consistency of the tensor product across the adjunctions we define is assured
if the adjunctions involve monoidal functors. Let M and D be two categories
with initial objects 1 ∈ |M| and 2 ∈ |D| and tensor products M, N ∈ |M| ↦
M × N ∈ |M| and D, E ∈ |D| ↦ D ⊗ E ∈ |D|.
Given a functor U : D → M, a weak form of invariance is ensured by the
existence of two coherence maps ψ : 1 → U2 and φDE : UD × UE → U(D ⊗ E)
satisfying the naturality condition φD′E′ ◦ (UF × UG) = U(F ⊗ G) ◦ φDE, where
F : D → D′ and G : E → E′. The coherence conditions involve the four
fundamental chains. Omitting subscripts for brevity, the respective identities
are:

    φ ◦ (1 × φ) ◦ α = Uα ◦ φ ◦ (φ × 1),
    λ = Uλ ◦ φ ◦ (ψ × 1),
    ρ = Uρ ◦ φ ◦ (1 × ψ),
    φ ◦ γ = Uγ ◦ φ.

Going the other way, given a functor E : M → D, another weak form
of invariance arises from the duals of the coherence maps, ψ̄ : E1 → 2 and
φ̄MN : E(M × N) → EM ⊗ EN, subject to the naturality condition
(Ef ⊗ Eg) ◦ φ̄MN = φ̄M′N′ ◦ E(f × g), where f : M → M′ and g : N → N′.
Similar fundamental chains are involved, but with the directions of the arrows
reversed. Omitting subscripts, the corresponding identities are:

    (1 ⊗ φ̄) ◦ φ̄ ◦ Eα = α ◦ (φ̄ ⊗ 1) ◦ φ̄,
    Eλ = λ ◦ (ψ̄ ⊗ 1) ◦ φ̄,
    Eρ = ρ ◦ (1 ⊗ ψ̄) ◦ φ̄,
    φ̄ ◦ Eγ = γ ◦ φ̄.

Finally, an adjunction pair (E, U) is monoidal if the coherence maps are
related by φ̄ = (φ ◦ (η × η))* and ψ̄ = ψ*. Equivalently, these conditions may
be expressed by φ = ((ε ⊗ ε) ◦ φ̄)∗ and ψ = ψ̄∗.

Appendix B.4. Application to the Adjunctions between Monoid-Dioid Algebras


Assume that we have two categories M and D, each possessing their re-
spective initial objects 1 and 2 and tensor products M, N ↦ M × N and
D, E ↦ D ⊗ E satisfying their respective universal properties for initial objects
and tensor products. This is the situation we have in sections 3.3 and 5.2, where
we have functors E : M → D and U : D → M that form an adjunction pair
(E, U). As a consequence of the universal property for initial objects, we then
obtain the additional identity &UE1 = η1.
As the following theorem shows, these conditions suffice to give us a monoidal
adjunction in which the coherence maps ψ̄ and φ̄M N have inverses.

Theorem 2. An adjunction (E, U) between categories M and D with ten-
sor products × and ⊗ respectively and initial objects 1 ∈ |M| and 2 ∈ |D| is
monoidal, with natural isomorphisms E1 ≅ 2 and E(M × N) ≅ EM ⊗ EN.
Proof. The correspondences are ψ = &U2 and ψ̄ = &U2* with inverse ψ̄⁻¹ =
&E1. In addition, we have φDE = ⟨ULDE, URDE⟩ and

    φ̄MN⁻¹ = ⟨ELMN, ERMN⟩,    φ̄MN = ⟨(L(EM)(EN))∗, (R(EM)(EN))∗⟩*.

Again, verification of the conditions comes down to routine, but lengthy, calcula-
tions. The key points of interest are the inverse relations ψ̄ ◦ ψ̄⁻¹ = &U2* ◦ &E1 =
&2 = 12 and

    ψ̄⁻¹ ◦ ψ̄ = &E1 ◦ &U2* = (U&E1 ◦ &U2)* = &UE1* = η1* = 1E1.

Omitting subscripts, the calculation for one of the naturality conditions is
demonstrated here, where F : D → D′ and G : E → E′:

    U(F ⊗ G) ◦ φ = U⟨L ◦ F, R ◦ G⟩ ◦ ⟨UL, UR⟩
    = ⟨U(⟨L ◦ F, R ◦ G⟩ ◦ L), U(⟨L ◦ F, R ◦ G⟩ ◦ R)⟩
    = ⟨U(L ◦ F), U(R ◦ G)⟩,
    φ ◦ (UF × UG) = ⟨UL, UR⟩ ◦ ⟨L ◦ UF, R ◦ UG⟩
    = ⟨⟨UL, UR⟩ ◦ L ◦ UF, ⟨UL, UR⟩ ◦ R ◦ UG⟩
    = ⟨U(L ◦ F), U(R ◦ G)⟩.

References

[1] Abramsky, S., Vickers, S.: Quantales, observational logic and process se-
mantics. Mathematical Structures in Computer Science 3, 161–227 (1993).
[2] Baccelli, F., Mairesse, J.: Ergodic theorems for stochastic operators and
discrete event systems. In: [13].
[3] Berstel, J., Reutenauer, C.: Les Séries Rationelles et Leurs Langages. Mas-
son (1984). English edition: Rational Series and Their Languages. Springer-
Verlag (1988).
[4] Birkhoff, G.: Lattice Theory. American Mathematical Society (1967).
[5] Chomsky, N.: Context-Free Grammars and Pushdown Storage. Quarterly
Progress Report No. 65, MIT Research Laboratory of Electronics, pp. 187–194,
Cambridge (1962).
[6] Chomsky, N., Schützenberger, M.P.: The Algebraic Theory of Context-Free
Languages. In: Braffort, P., Hirschberg, D. (eds.), Computer Programming
and Formal Systems, pp. 118–161, North Holland (1963).

[7] Conway, J.H.: Regular Algebra and Finite Machines. Chapman and Hall,
London (1971).
[8] Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cam-
bridge University Press (1990).

[9] Ésik, Z., Kuich, W.: Rationally Additive Semirings. Journal of Universal
Computer Science 8, 173–183 (2002).
[10] Ésik, Z., Leiss, H.: Algebraically complete semirings and Greibach normal
form. Ann. Pure. Appl. Logic. 133, 173–203 (2005).
[11] Golan, J.S.: Semirings and their applications. Kluwer Academic Publishers,
Dordrecht (1999).
[12] Gruska, J.: A Characterization of Context-Free Languages. Journal of
Computer and System Sciences 5, 353–364 (1971).
[13] Gunawardena, J. (ed.): Idempotency. Publications of the Newton Institute,
Cambridge University Press (1998).
[14] Hoeft, H.: A normal form for some semigroups generated by idempotents.
Fund. Math. 84, 75–78 (1974).
[15] Hopkins, M.W., Kozen, D.: Parikh’s Theorem in Commutative Kleene Al-
gebra. LICS ’99, pp. 394–401 (1999).

[16] Hopkins, M.W.: The Algebraic Approach I: The Algebraization of the


Chomsky Hierarchy. RelMiCS/AKA 2008, LNCS 4988, 155–172 (2008).
[17] Hopkins, M.W.: The Algebraic Approach II: Dioids, Quantales and Mon-
ads, RelMiCS/AKA 2008, LNCS 4988, 173–190 (2008).

[18] Huet, G.: Logical Foundations of Functional Programming. Addison-


Wesley (1990).
[19] Kozen, D.: On Kleene Algebras and Closed Semirings. In: Rovan, B. (ed.),
Lecture Notes in Computer Science 452, pp. 26–47, Springer (1990).

[20] Kozen, D.: The Design and Analysis of Algorithms. Springer-Verlag (1992).
[21] Kozen, D.: A Completeness Theorem for Kleene Algebras and the Algebra
of Regular Events. Information and Computation 110, 366–390 (1994).
[22] Kozen, D.: Automata and Computability. Springer-Verlag (1997).

[23] Kuich, W., Salomaa, A.: Semirings, Automata and Languages. Springer-
Verlag, Berlin (1986).
[24] Kuroda, S.Y.: Classes of languages and linear bounded automata. Infor-
mation and Control 7, 203–223 (1964).

[25] MacLane, S.: Natural associativity and commutativity. Rice Univ. Studies
49, 28–46 (1963).
[26] MacLane, S.: Categories for the Working Mathematician. Springer-Verlag
(1971).

[27] Maslov, V.P., Samborskii, S.N. (eds.): Advances in Soviet Mathematics 13


(1992).
[28] McWhirter, I.P.: Substitution Expressions. Journal of Computer and Sys-
tem Sciences 5, 629–637 (1971).
[29] Müger, M.: Tensor categories: A selective guided tour. arXiv:0804.3587v3
[math.CT] (18 Jun 2010).
[30] Mulvey, C.J.: Quantales. Springer Encyclopaedia of Mathematics (2001).
[31] Paseka, J., Rosicky, J.: Quantales. In: Coecke B., Moore D., Wilce A.
(eds.), Current Research in Operational Quantum Logic: Algebras, Cat-
egories and Languages. Fund. Theories Phys., vol. 111, Kluwer Academic
Publishers, pp. 245–262 (2000).
[32] Salomaa, A., Soittola, M.: Automata-Theoretic Aspects of Formal Power
Series. Texts and Monographs in Computer Science. Springer-Verlag (1979).
[33] Spencer-Brown, G.: Laws of Form. Julian Press and Bantam, New York
(1972).
[34] Vickers, S.: Topology via Logic. Cambridge Tracts in Theoretical Computer
Science, vol. 5 Cambridge University Press (1989).
[35] Von Neumann, J.: The General and Logical Theory of Automata. In: New-
man, J.R. (ed.), The World of Mathematics, Volume 4, Simon and Schuster,
New York, pp. 2070–2098 (1956).
[36] Wood, D.: The Theory of Computation. Harper and Row (1987).
[37] Yetter, D.N.: Quantales and (Noncommutative) Linear Logic. J. of Sym-
bolic Logic 55, 41–64 (1990).

[38] Yntema, M.K.: Cap Expressions for Context-Free Languages. Information
and Control 18, 311–318 (1971).

