A Codatatype of Formal Languages Dmitriy Traytel September 19, 2015
1
Introduction
We define formal languages as a codataype of infinite trees branching over the alphabet 0a. Each node in such a tree indicates whether the path to this node constitutes a word inside or outside of the language. codatatype 0a language = Lang (o: bool ) (d: 0a ⇒ 0a language)
This codatatype is isormorphic to the set of lists representation of languages, but caters for definitions by corecursion and proofs by coinduction. Regular operations on languages are then defined by primitive corecursion. A difficulty arises here, since the standard definitions of concatenation and iteration from the coalgebraic literature are not primitively corecursive—they require guardedness up-to union/concatenation. Without support for up-to corecursion, these operation must be defined as a composition of primitive ones (and proved being equal to the standard definitions). As an exercise in coinduction we also prove the axioms of Kleene algebra for the defined regular operations. Furthermore, a language for context-free grammars given by productions in Greibach normal form and an initial nonterminal is constructed by primitive corecursion, yielding an executable decision procedure for the word problem without further ado.
2
Regular Languages
primcorec Zero :: 0a language where o Zero = False | d Zero = (λ . Zero) primcorec One :: 0a language where o One = True | d One = (λ . Zero) primcorec Atom :: 0a ⇒ 0a language where o (Atom a) = False | d (Atom a) = (λb. if a = b then One else Zero) primcorec Plus :: 0a language ⇒ 0a language ⇒ 0a language where o (Plus r s) = (o r ∨ o s) | d (Plus r s) = (λa. Plus (d r a) (d s a)) theorem Plus ZeroL[simp]: Plus Zero r = r by (coinduction arbitrary: r ) simp theorem Plus ZeroR[simp]: Plus r Zero = r by (coinduction arbitrary: r ) simp
1
theorem Plus assoc: Plus (Plus r s) t = Plus r (Plus s t) by (coinduction arbitrary: r s t) auto theorem Plus comm: Plus r s = Plus s r by (coinduction arbitrary: r s) auto lemma Plus rotate: Plus r (Plus s t) = Plus s (Plus r t) using Plus assoc Plus comm by metis theorem Plus idem: Plus r r = r by (coinduction arbitrary: r ) auto lemma Plus idem assoc: Plus r (Plus r s) = Plus r s by (metis Plus assoc Plus idem) lemmas Plus ACI [simp] = Plus rotate Plus comm Plus assoc Plus idem assoc Plus idem lemma Plus OneL[simp]: o r =⇒ Plus One r = r by (coinduction arbitrary: r ) auto lemma Plus OneR[simp]: o r =⇒ Plus r One = r by (coinduction arbitrary: r ) auto
Concatenation is not primitively corecursive—the corecursive call of its derivative is guarded by Plus. However, it can be defined as a composition of two primitively corecursive functions. primcorec TimesLR :: 0a language ⇒ 0a language ⇒ ( 0a × bool ) language where o (TimesLR r s) = (o r ∧ o s) | d (TimesLR r s) = (λ(a, b). if b then TimesLR (d r a) s else if o r then TimesLR (d s a) One else Zero) primcorec Times Plus :: ( 0a × bool ) language ⇒ 0a language where o (Times Plus r ) = o r | d (Times Plus r ) = (λa. Times Plus (Plus (d r (a, True)) (d r (a, False)))) lemma TimesLR ZeroL[simp]: TimesLR Zero r = Zero by (coinduction arbitrary: r ) auto lemma TimesLR ZeroR[simp]: TimesLR r Zero = Zero by (coinduction arbitrary: r ) (auto intro: exI [of Zero]) lemma TimesLR PlusL[simp]: TimesLR (Plus r s) t = Plus (TimesLR r t) (TimesLR s t) by (coinduction arbitrary: r s t) auto lemma TimesLR PlusR[simp]: TimesLR r (Plus s t) = Plus (TimesLR r s) (TimesLR r t) by (coinduction arbitrary: r s t) auto lemma Times Plus Zero[simp]: Times Plus Zero = Zero by coinduction simp lemma Times Plus Plus[simp]: Times Plus (Plus r s) = Plus (Times Plus r ) (Times Plus s) proof (coinduction arbitrary: r s) case (Lang r s) then show ?case unfolding Times Plus.sel Plus.sel by (intro conjI [OF refl ]) (metis Plus comm Plus rotate) qed lemma Times Plus TimesLR One[simp]: Times Plus (TimesLR r One) = r by (coinduction arbitrary: r ) simp
2
lemma Times Plus TimesLR PlusL[simp]: Times Plus (TimesLR (Plus r s) t) = Plus (Times Plus (TimesLR r t)) (Times Plus (TimesLR s t)) by (coinduction arbitrary: r s t) auto lemma Times Plus TimesLR PlusR[simp]: Times Plus (TimesLR r (Plus s t)) = Plus (Times Plus (TimesLR r s)) (Times Plus (TimesLR r t)) by (coinduction arbitrary: r s t) auto definition Times :: 0a language ⇒ 0a language ⇒ 0a language where Times r s = Times Plus (TimesLR r s) lemma o Times[simp]: o (Times r s) = (o r ∧ o s) unfolding Times def by simp lemma d Times[simp]: d (Times r s) = (λa. if o r then Plus (Times (d r a) s) (d s a) else Times (d r a) s) unfolding Times def by (rule ext, coinduction arbitrary: r s) auto theorem Times ZeroL[simp]: Times Zero r = Zero by coinduction simp theorem Times ZeroR[simp]: Times r Zero = Zero by (coinduction arbitrary: r ) auto theorem Times OneL[simp]: Times One r = r by (coinduction arbitrary: r rule: language.coinduct strong) (simp add : rel fun def ) theorem Times OneR[simp]: Times r One = r by (coinduction arbitrary: r ) simp
Coinduction up-to Plus–congruence relaxes the coinduction hypothesis by requiring membership in the congruence closure of the bisimulation rather than in the bisimulation itself. inductive Plus cong where Refl [intro]: x = y =⇒ Plus cong R x y | Base[intro]: R x y =⇒ Plus cong R x y | Sym: Plus cong R x y =⇒ Plus cong R y x | Trans[intro]: Plus cong R x y =⇒ Plus cong R y z =⇒ Plus cong R x z | Plus[intro]: [[Plus cong R x y; Plus cong R x 0 y 0]] =⇒ Plus cong R (Plus x x 0) (Plus y y 0) lemma language coinduct upto Plus[unfolded rel fun def , simplified , case names Lang, consumes 1 ]: assumes R: R L K and hyp: V ( L K . R L K =⇒ o L = o K ∧ rel fun op = (Plus cong R) (d L) (d K )) shows L = K proof (coinduct rule: language.coinduct[of Plus cong R]) fix L K assume Plus cong R L K then show o L = o K ∧ rel fun op = (Plus cong R) (d L) (d K ) using hyp by (induct rule: Plus cong.induct) (auto simp: rel fun def intro: Sym) qed (intro Base R) theorem Times PlusL[simp]: Times (Plus r s) t = Plus (Times r t) (Times s t) by (coinduction arbitrary: r s rule: language coinduct upto Plus) auto theorem Times PlusR[simp]: Times r (Plus s t) = Plus (Times r s) (Times r t) by (coinduction arbitrary: r s rule: language coinduct upto Plus) fastforce theorem Times assoc[simp]: Times (Times r s) t = Times r (Times s t)
3
by (coinduction arbitrary: r s t rule: language coinduct upto Plus) fastforce
Similarly to Times, iteration is not primitively corecursive (guardedness by Times is required). We apply a similar trick to obtain its definition. primcorec StarLR :: 0a language ⇒ 0a language ⇒ 0a language where o (StarLR r s) = o r | d (StarLR r s) = (λa. StarLR (d (Times r (Plus One s)) a) s) lemma StarLR Zero[simp]: StarLR Zero r = Zero by coinduction auto lemma StarLR Plus[simp]: StarLR (Plus r s) t = Plus (StarLR r t) (StarLR s t) by (coinduction arbitrary: r s) (auto simp del : Plus ACI Times PlusR) lemma StarLR Times Plus One[simp]: StarLR (Times r (Plus One s)) s = StarLR r s proof (coinduction arbitrary: r s) case Lang { fix a def L ≡ Plus (d r a) (Plus (Times (d r a) s) (d s a)) and R ≡ Times (Plus (d r a) (Plus (Times (d r a) s) (d s a))) s have Plus L (Plus R (d s a)) = Plus (Plus L (d s a)) R by (metis Plus assoc Plus comm) also have Plus L (d s a) = L unfolding L def by simp finally have Plus L (Plus R (d s a)) = Plus L R . } then show ?case by (auto simp del : StarLR Plus Plus assoc Times PlusL) qed lemma StarLR Times: StarLR (Times r s) t = Times r (StarLR s t) by (coinduction arbitrary: r s t rule: language coinduct upto Plus) (fastforce simp del : Plus ACI Times PlusR) definition Star :: 0a language ⇒ 0a language where Star r = StarLR One r lemma o Star [simp]: o (Star r ) unfolding Star def by simp lemma d Star [simp]: d (Star r ) = (λa. Times (d r a) (Star r )) unfolding Star def by (rule ext, coinduction arbitrary: r ) (auto simp add : Star def StarLR Times[symmetric]) lemma Star Zero[simp]: Star Zero = One by coinduction auto lemma Star One[simp]: Star One = One by coinduction auto lemma Star unfoldL: Star r = Plus One (Times r (Star r )) by coinduction auto primcorec Inter :: 0a language ⇒ 0a language ⇒ 0a language where o (Inter r s) = (o r ∧ o s) | d (Inter r s) = (λa. Inter (d r a) (d s a)) primcorec Not :: 0a language ⇒ 0a language where o (Not r ) = (¬ o r ) | d (Not r ) = (λa. Not (d r a))
4
primcorec Full :: 0a language (Σ∗ ) where o Full = True | d Full = (λ . Full )
Shuffle product is not primitively corecursive—the corecursive call of its derivative is guarded by Plus. However, it can be defined as a composition of two primitively corecursive functions. primcorec ShuffleLR :: 0a language ⇒ 0a language ⇒ ( 0a × bool ) language where o (ShuffleLR r s) = (o r ∧ o s) | d (ShuffleLR r s) = (λ(a, b). if b then ShuffleLR (d r a) s else ShuffleLR r (d s a)) primcorec Shuffle Plus :: ( 0a × bool ) language ⇒ 0a language where o (Shuffle Plus r ) = o r | d (Shuffle Plus r ) = (λa. Shuffle Plus (Plus (d r (a, True)) (d r (a, False)))) lemma ShuffleLR ZeroL[simp]: ShuffleLR Zero r = Zero by (coinduction arbitrary: r ) auto lemma ShuffleLR ZeroR[simp]: ShuffleLR r Zero = Zero by (coinduction arbitrary: r ) (auto intro: exI [of Zero]) lemma ShuffleLR PlusL[simp]: ShuffleLR (Plus r s) t = Plus (ShuffleLR r t) (ShuffleLR s t) by (coinduction arbitrary: r s t) auto lemma ShuffleLR PlusR[simp]: ShuffleLR r (Plus s t) = Plus (ShuffleLR r s) (ShuffleLR r t) by (coinduction arbitrary: r s t) auto lemma Shuffle Plus Zero[simp]: Shuffle Plus Zero = Zero by coinduction simp lemma Shuffle Plus Plus[simp]: Shuffle Plus (Plus r s) = Plus (Shuffle Plus r ) (Shuffle Plus s) proof (coinduction arbitrary: r s) case (Lang r s) then show ?case unfolding Shuffle Plus.sel Plus.sel by (intro conjI [OF refl ]) (metis Plus comm Plus rotate) qed lemma Shuffle Plus ShuffleLR One[simp]: Shuffle Plus (ShuffleLR r One) = r by (coinduction arbitrary: r ) simp lemma Shuffle Plus ShuffleLR PlusL[simp]: Shuffle Plus (ShuffleLR (Plus r s) t) = Plus (Shuffle Plus (ShuffleLR r t)) (Shuffle Plus (ShuffleLR s t)) by (coinduction arbitrary: r s t) auto lemma Shuffle Plus ShuffleLR PlusR[simp]: Shuffle Plus (ShuffleLR r (Plus s t)) = Plus (Shuffle Plus (ShuffleLR r s)) (Shuffle Plus (ShuffleLR r t)) by (coinduction arbitrary: r s t) auto definition Shuffle :: 0a language ⇒ 0a language ⇒ 0a language where Shuffle r s = Shuffle Plus (ShuffleLR r s) lemma o Shuffle[simp]: o (Shuffle r s) = (o r ∧ o s) unfolding Shuffle def by simp lemma d Shuffle[simp]: d (Shuffle r s) = (λa. Plus (Shuffle (d r a) s) (Shuffle r (d s a)))
5
unfolding Shuffle def by (rule ext, coinduction arbitrary: r s) auto theorem Shuffle ZeroL[simp]: Shuffle Zero r = Zero by (coinduction arbitrary: r rule: language coinduct upto Plus) (auto 0 4 ) theorem Shuffle ZeroR[simp]: Shuffle r Zero = Zero by (coinduction arbitrary: r rule: language coinduct upto Plus) (auto 0 4 ) theorem Shuffle OneL[simp]: Shuffle One r = r by (coinduction arbitrary: r ) simp theorem Shuffle OneR[simp]: Shuffle r One = r by (coinduction arbitrary: r ) simp theorem Shuffle PlusL[simp]: Shuffle (Plus r s) t = Plus (Shuffle r t) (Shuffle s t) by (coinduction arbitrary: r s t rule: language coinduct upto Plus) (force intro!: Trans[OF Plus[OF Base Base] Refl ]) theorem Shuffle PlusR[simp]: Shuffle r (Plus s t) = Plus (Shuffle r s) (Shuffle r t) by (coinduction arbitrary: r s t rule: language coinduct upto Plus) (force intro!: Trans[OF Plus[OF Base Base] Refl ]) theorem Shuffle assoc[simp]: Shuffle (Shuffle r s) t = Shuffle r (Shuffle s t) by (coinduction arbitrary: r s t rule: language coinduct upto Plus) fastforce
We generalize coinduction up-to Plus to coinduction up-to all previously defined concepts. inductive regular cong where Refl [intro]: x = y =⇒ regular cong R x y | Sym[intro]: regular cong R x y =⇒ regular cong R y x | Trans[intro]: [[regular cong R x y; regular cong R y z ]] =⇒ regular cong R x z | Base[intro]: R x y =⇒ regular cong R x y | Plus[intro]: [[regular cong R x y; regular cong R x 0 y 0]] =⇒ regular cong R (Plus x x 0) (Plus y y 0) | Times[intro]: [[regular cong R x y; regular cong R x 0 y 0]] =⇒ regular cong R (Times x x 0) (Times y y 0) | Star [intro]: [[regular cong R x y]] =⇒ regular cong R (Star x ) (Star y) | Inter [intro]: [[regular cong R x y; regular cong R x 0 y 0]] =⇒ regular cong R (Inter x x 0) (Inter y y 0) | Not[intro]: [[regular cong R x y]] =⇒ regular cong R (Not x ) (Not y) | Shuffle[intro]: [[regular cong R x y; regular cong R x 0 y 0]] =⇒ regular cong R (Shuffle x x 0) (Shuffle y y 0) lemma language coinduct upto regular [unfolded rel fun def , simplified , case names Lang, consumes 1 ]: assumes R: R L K and hyp: V ( L K . R L K =⇒ o L = o K ∧ rel fun op = (regular cong R) (d L) (d K )) shows L = K proof (coinduct rule: language.coinduct[of regular cong R]) fix L K assume regular cong R L K then show o L = o K ∧ rel fun op = (regular cong R) (d L) (d K ) using hyp by (induct rule: regular cong.induct) (auto simp: rel fun def Plus Times Shuffle) qed (intro Base R) lemma Star unfoldR: Star r = Plus One (Times (Star r ) r ) proof (coinduction arbitrary: r rule: language coinduct upto regular ) case Lang { fix a have Plus (Times (d r a) (Times (Star r ) r )) (d r a) =
6
Times (d r a) (Plus One (Times (Star r ) r )) by simp } then show ?case by (auto simp del : Times PlusR) qed lemma Star Star [simp]: Star (Star r ) = Star r by (subst Star unfoldL, coinduction arbitrary: r rule: language coinduct upto regular ) auto lemma Times Star [simp]: Times (Star r ) (Star r ) = Star r proof (coinduction arbitrary: r rule: language coinduct upto regular ) case Lang V have ∗: r s. Plus (Times r s) r = Times r (Plus s One) by simp show ?case by (auto simp del : Times PlusR Plus ACI simp: Times PlusR[symmetric] ∗) qed instantiation language :: (type) {semiring 1 , order } begin lemma Zero One[simp]: Zero 6= One by (metis One.simps(1 ) Zero.simps(1 )) definition definition definition definition
zero language = Zero one language = One plus language = Plus times language = Times
definition less eq language r s = (Plus r s = s) definition less language r s = (Plus r s = s ∧ r 6= s) lemmas language defs = zero language def one language def plus language def times language def less eq language def less language def instance proof intro classes fix x y z :: 0a language assume x ≤ y y ≤ z then show x ≤ z unfolding language defs by (metis Plus assoc) next fix x y z :: 0a language show x + y + z = x + (y + z ) unfolding language defs by (rule Plus assoc) qed (auto simp: language defs) end
We prove the missing axioms of Kleene Algebras about Star, as well as monotonicity properties and three standard interesting rules: bisimulation, sliding, and denesting. theorem le StarL: Plus One (Times r (Star r )) ≤ Star r by (rule order eq refl [OF Star unfoldL[symmetric]]) theorem le StarR: Plus One (Times (Star r ) r ) ≤ Star r by (rule order eq refl [OF Star unfoldR[symmetric]]) theorem ardenL: Plus r (Times s x ) ≤ x =⇒ Times (Star s) r ≤ x unfolding language defs proof (coinduction arbitrary: r s x rule: language coinduct upto regular ) case Lang hence o r =⇒ o x by (metis Plus.sel (1 )) moreover { fix a let ?R = (λL K . ∃ r s. L = Plus (Times (Star s) r ) K ∧ Plus r (Plus (Times s K ) K ) = K )
7
have regular cong ?R (Plus x (Times (Star s) r )) x using Lang[unfolded Plus assoc] by (auto simp only: Plus comm) hence regular cong ?R (Plus (Times (d s a) (Plus x (Times (Star s) r ))) (Plus (d r a) (d x a))) (Plus (d r a) (Plus (Times (d s a) x ) (d x a))) by (auto simp del : Times PlusR) also have (Plus (Times (d s a) (Plus x (Times (Star s) r ))) (Plus (d r a) (d x a))) = (Plus (Times (d s a) (Times (Star s) r )) (Plus (d r a) (d x a))) by (subst (3 ) Lang[symmetric]) auto finally have regular cong ?R (Plus (Times (d s a) (Times (Star s) r )) (Plus (d r a) (d x a))) (Plus (d r a) (Plus (Times (d s a) x ) (d x a))) . } ultimately show ?case by (subst (4 ) Lang[symmetric]) auto qed theorem ardenR: Plus r (Times x s) ≤ x =⇒ Times r (Star s) ≤ x unfolding language defs proof (coinduction arbitrary: r s x rule: language coinduct upto regular ) case Lang let ?RV= (λL K . ∃ r s. L = Plus (Times r (Star s)) K ∧ Plus r (Plus (Times K s) K ) = K ) have a. o x =⇒ Plus (d s a) (d x a) = d x a by (subst (1 V 2 ) Lang[symmetric]) auto then have ∗: a. ?R (Plus (Times (d r a) (Star s)) (d x a)) (d x a) by (subst Lang[symmetric]) (auto simp del : Plus comm) moreover from Lang have o r =⇒ o x by (metis Plus.sel (1 )) moreover { fix a assume o x have regular cong ?R (Plus (Times (d s a) (Star s)) (d x a)) (d x a) proof (rule Base exI conjI [OF refl ])+ from ‘ o x‘ show Plus (d s a) (Plus (Times (d x a) s) (d x a)) = d x a by (subst (1 3 ) Lang[symmetric]) auto qed from Plus[OF Base[of ?R, OF ∗[of a]] this] have regular cong ?R (Plus (Times (d r a) (Star s)) (Plus (Times (d s a) (Star s)) (d x a))) (d x a) by auto } ultimately show ?case by auto qed lemma ge One[simp]: One ≤ r ←→ o r unfolding less eq language def by (metis One.sel (1 ) Plus.sel (1 ) Plus OneL) lemma Plus mono[intro]: [[r1 ≤ s1 ; r2 ≤ s2 ]] =⇒ Plus r1 r2 ≤ Plus s1 s2 unfolding less eq language def by (metis Plus assoc Plus comm) lemma Plus upper : [[r1 ≤ s; r2 ≤ s]] =⇒ Plus r1 r2 ≤ s by (metis Plus mono Plus idem) lemma le PlusL: r ≤ Plus r s by (metis Plus idem assoc less eq language def ) lemma le PlusR: s ≤ Plus r s by (metis Plus comm Plus idem assoc less eq language def ) lemma Times mono[intro]: [[r1 ≤ s1 ; r2 ≤ s2 ]] =⇒ Times r1 r2 ≤ Times s1 s2 proof (unfold less eq language def ) assume s1 [symmetric]: Plus r1 s1 = s1 and s2 [symmetric]: Plus r2 s2 = s2
8
have Plus (Times r1 r2 ) (Times s1 s2 ) = Plus (Times r1 r2 ) (Plus (Times r1 r2 ) (Plus (Times s1 r2 ) (Plus (Times r1 s2 ) (Times s1 s2 )))) by (subst s1 , subst s2 ) auto also have . . . = Plus (Times r1 r2 ) (Plus (Times s1 r2 ) (Plus (Times r1 s2 ) (Times s1 s2 ))) by (metis Plus idem Plus assoc) also have . . . = Times s1 s2 by (subst s1 , subst s2 ) auto finally show Plus (Times r1 r2 ) (Times s1 s2 ) = Times s1 s2 . qed lemma le TimesL: o s =⇒ r ≤ Times r s by (metis Plus OneL Times OneR Times mono le PlusL order refl ) lemma le TimesR: o r =⇒ s ≤ Times r s by (metis Plus OneR Times OneL Times mono le PlusR order refl ) lemma le Star : s ≤ Star s by (subst Star unfoldL, subst Star unfoldL) (auto intro: order trans[OF le PlusL le PlusR]) lemma Star mono: assumes rs: r ≤ s shows Star r ≤ Star s proof − have Star r = Plus One (Times (Star r ) r ) by (rule Star unfoldR) also have . . . ≤ Plus One (Times (Star r ) s) by (blast intro: rs) also have Times (Star r ) s ≤ Star s proof (rule ardenL[OF Plus upper [OF le Star ]]) have Times r (Star s) ≤ Times s (Star s) by (blast intro: rs) also have Times s (Star s) ≤ Plus One (Times s (Star s)) by (rule le PlusR) finally show Times r (Star s) ≤ Star s by (subst (2 ) Star unfoldL) qed finally show ?thesis by auto qed lemma Inter mono: [[r1 ≤ s1 ; r2 ≤ s2 ]] =⇒ Inter r1 r2 ≤ Inter s1 s2 unfolding less eq language def proof (coinduction arbitrary: r1 r2 s1 s2 ) case Lang then have o (Plus r1 s1 ) = o s1 o (Plus r2 s2 ) = o s2 ∀ a. d (Plus r1 s1 ) a = d s1 a ∀ a. d (Plus r2 s2 ) a = d s2 a by simp all then show ?case by fastforce qed lemma Not antimono: r ≤ s =⇒ Not s ≤ Not r unfolding less eq language def proof (coinduction arbitrary: r s) case Lang then have o (Plus r s) = o s ∀ a. d (Plus r s) a = d s a by simp all then show ?case by auto qed lemma Not Plus[simp]: Not (Plus r s) = Inter (Not r ) (Not s) by (coinduction arbitrary: r s) auto lemma Not Inter [simp]: Not (Inter r s) = Plus (Not r ) (Not s) by (coinduction arbitrary: r s) auto lemma Inter assoc[simp]: Inter (Inter r s) t = Inter r (Inter s t) by (coinduction arbitrary: r s t) auto lemma Inter comm: Inter r s = Inter s r
9
by (coinduction arbitrary: r s) auto lemma Inter idem[simp]: Inter r r = r by (coinduction arbitrary: r ) auto lemma Inter ZeroL[simp]: Inter Zero r = Zero by (coinduction arbitrary: r ) auto lemma Inter ZeroR[simp]: Inter r Zero = Zero by (coinduction arbitrary: r ) auto lemma Inter FullL[simp]: Inter Full r = r by (coinduction arbitrary: r ) auto lemma Inter FullR[simp]: Inter r Full = r by (coinduction arbitrary: r ) auto lemma Plus FullL[simp]: Plus Full r = Full by (coinduction arbitrary: r ) auto lemma Plus FullR[simp]: Plus r Full = Full by (coinduction arbitrary: r ) auto lemma Not Not[simp]: Not (Not r ) = r by (coinduction arbitrary: r ) auto lemma Not Zero[simp]: Not Zero = Full by coinduction simp lemma Not Full [simp]: Not Full = Zero by coinduction simp lemma bisimulation: assumes Times r s = Times s t shows Times (Star r ) s = Times s (Star t) proof (rule antisym[OF ardenL[OF Plus upper [OF le TimesL]] ardenR[OF Plus upper [OF le TimesR]]]) have Times r (Times s (Star t)) = Times s (Times t (Star t)) (is ?L = ) by (simp only: assms Times assoc[symmetric]) also have . . . ≤ Times s (Star t) (is ≤ ?R) by (rule Times mono[OF order refl ord le eq trans[OF le PlusR Star unfoldL[symmetric]]]) finally show ?L ≤ ?R . next have Times (Times (Star r ) s) t = Times (Times (Star r ) r ) s (is ?L = ) by (simp only: assms Times assoc) also have . . . ≤ Times (Star r ) s (is ≤ ?R) by (rule Times mono[OF ord le eq trans[OF le PlusR Star unfoldR[symmetric]] order refl ]) finally show ?L ≤ ?R . qed simp all lemma sliding: Times (Star (Times r s)) r = Times r (Star (Times s r )) proof (rule antisym[OF ardenL[OF Plus upper [OF le TimesL]] ardenR[OF Plus upper [OF le TimesR]]]) have Times (Times r s) (Times r (Star (Times s r ))) = Times r (Times (Times s r ) (Star (Times s r ))) (is ?L = ) by simp also have . . . ≤ Times r (Star (Times s r )) (is ≤ ?R) by (rule Times mono[OF order refl ord le eq trans[OF le PlusR Star unfoldL[symmetric]]]) finally show ?L ≤ ?R . next have Times (Times (Star (Times r s)) r ) (Times s r ) =
10
Times (Times (Star (Times r s)) (Times r s)) r (is ?L = ) by simp also have . . . ≤ Times (Star (Times r s)) r (is ≤ ?R) by (rule Times mono[OF ord le eq trans[OF le PlusR Star unfoldR[symmetric]] order refl ]) finally show ?L ≤ ?R . qed simp all lemma denesting: Star (Plus r s) = Times (Star r ) (Star (Times s (Star r ))) proof (rule antisym[OF ardenR[OF Plus upper [OF Star mono[OF le PlusL]]]]) have Star (Plus r s) = Times (Star (Plus r s)) One by simp also have . . . ≤ Times (Star r ) (Star (Times s (Star r ))) proof (rule ardenL[OF Plus upper ]) show Times (Plus r s) (Times (Star r ) (Star (Times s (Star r )))) ≤ Times (Star r ) (Star (Times s (Star r ))) (is Times ?L ≤ ?R) proof (subst Times PlusL, rule Plus upper ) show Times s ?L ≤ ?R by (subst (5 ) Star unfoldL, rule order trans[OF order trans[OF le PlusR] le TimesR]) auto qed (subst (4 ) Times Star [symmetric], auto simp del : Times Star intro: le Star ) qed simp finally show Star (Plus r s) ≤ Times (Star r ) (Star (Times s (Star r ))) . next have Times (Star (Plus r s)) (Times s (Star r )) ≤ Times (Star (Plus r s)) (Star (Plus r s)) by (subst (4 ) Star unfoldL, rule Times mono[OF order refl order trans[OF Times mono[OF le PlusR Star mono[OF le PlusL]] le PlusR]]) also have . . . = Star (Plus r s) by simp finally show Times (Star (Plus r s)) (Times s (Star r )) ≤ Star (Plus r s) . qed
It is useful to lift binary operators Plus and Times to n-ary operators (that take a list as input). definition PLUS :: 0a language list ⇒ 0a language where PLUS xs ≡ foldr Plus xs Zero lemma o foldr Plus: o (foldr Plus xs s) = (∃ x ∈set (s # xs). o x ) by (induct xs arbitrary: s) auto lemma d foldr Plus: d (foldr Plus xs s) a = foldr Plus (map (λr . d r a) xs) (d s a) by (induct xs arbitrary: s) simp all lemma o PLUS [simp]: o (PLUS xs) = (∃ x ∈set xs. o x ) unfolding PLUS def o foldr Plus by simp lemma d PLUS [simp]: d (PLUS xs) a = PLUS (map (λr . d r a) xs) unfolding PLUS def d foldr Plus by simp definition TIMES :: 0a language list ⇒ 0a language where TIMES xs ≡ foldr Times xs One lemma o foldr Times: o (foldr Times xs s) = (∀ x ∈set (s # xs). o x ) by (induct xs) (auto simp: PLUS def ) primrec tails where tails [] = [[]] | tails (x # xs) = (x # xs) # tails xs lemma tails snoc[simp]: tails (xs @ [x ]) = map (λys. ys @ [x ]) (tails xs) @ [[]] by (induct xs) auto lemma length tails[simp]: length (tails xs) = Suc (length xs) by (induct xs) auto
11
lemma d foldr Times: d (foldr Times xs s) a = (let n = length (takeWhile o xs) in PLUS (map (λzs. TIMES (d (hd zs) a # tl zs)) (take (Suc n) (tails (xs @ [s]))))) by (induct xs) (auto simp: TIMES def PLUS def Let def foldr map o def ) lemma o TIMES [simp]: o (TIMES xs) = (∀ x ∈set xs. o x ) unfolding TIMES def o foldr Times by simp lemma TIMES snoc One[simp]: TIMES (xs @ [One]) = TIMES xs by (induct xs) (auto simp: TIMES def ) lemma d TIMES [simp]: d (TIMES xs) a = (let n = length (takeWhile o xs) in PLUS (map (λzs. TIMES (d (hd zs) a # tl zs)) (take (Suc n) (tails (xs @ [One]))))) unfolding TIMES def d foldr Times by simp
3
Context Free Languages
A context-free grammar consists of a list of productions for every nonterminal and an initial nonterminal. The productions are required to be in weak Greibach normal form, i.e. each right hand side of a production must either be empty or start with a terminal. locale cfg = fixes init :: 0n::enum and prod :: 0n ⇒ ( 0t + 0n) list list assumes weakGreibach: ∀ N . ∀ rhs ∈ set (prod N ). case rhs of (Inr N # ) ⇒ False | context fixes init :: 0n::enum and prod :: 0n ⇒ ( 0t + 0n) list list begin private abbreviation on N ≡ ([] ∈ set (prod N )) private fun or where or [] = True | or (Inl # ) = False | or (Inr N # xs) = (on N ∧ or xs) private abbreviation oP P ≡ fold (op ∨) (map or P ) False private abbreviation dn N a ≡ List.map filter (λxs. case xs of Inl b # xs ⇒ if a = b then Some xs else None | ⇒ None) (prod N ) private fun dr where dr [] a = [] | dr (Inl b # xs) a = (if a = b then [xs] else []) | dr (Inr N # xs) a = map (λys. ys @ xs) (dn N a) @ (if on N then dr xs a else []) private abbreviation dP P a ≡ fold (op @) (map (λr . dr r a) P ) [] primcorec subst :: ( 0t + 0n) list list ⇒ 0t language where subst P = Lang (oP P ) (λx . subst (dP P x )) end abbreviation (in cfg) lang where lang ≡ subst prod (prod init)
12
⇒ True
4
Word-theoretic Semantics of Languages
We show our language codatatype being isomorphic to the standard language representation as a set of lists. primrec in language :: 0a language ⇒ 0a list ⇒ bool where in language L [] = o L | in language L (x # xs) = in language (d L x ) xs primcorec to language :: 0a list set ⇒ 0a language where o (to language L) = ([] ∈ L) | d (to language L) = (λa. to language {w . a # w ∈ L}) lemma in language to language[simp]: Collect (in language (to language L)) = L proof (rule set eqI , unfold mem Collect eq) fix w show in language (to language L) w = (w ∈ L) by (induct w arbitrary: L) auto qed lemma to language in language[simp]: to language (Collect (in language L)) = L by (coinduction arbitrary: L) auto lemma in language bij : bij (Collect o in language) proof (rule bijI 0, unfold o apply, safe) fix L R :: 0a language assume Collect (in language L) = Collect (in language R) then show L = R unfolding set eq iff mem Collect eq by (coinduction arbitrary: L R) (metis in language.simps) next fix L :: 0a list set have L = Collect (in language (to language L)) by simp then show ∃ K . L = Collect (in language K ) by blast qed lemma to language bij : bij to language by (rule o bij [of Collect o in language]) (simp all add : fun eq iff )
5
Coinductively Defined Operations Are Standard
lemma to language empty[simp]: to language {} = Zero by (coinduction) auto lemma in language Zero[simp]: ¬ in language Zero xs by (induct xs) auto lemma in language One[simp]: in language One xs =⇒ xs = [] by (cases xs) auto lemma in language Atom[simp]: in language (Atom a) xs =⇒ xs = [a] by (cases xs) (auto split: if splits) lemma to language eps[simp]: to language {[]} = One by (rule bij is inj [OF in language bij , THEN injD]) auto lemma to language singleton[simp]: to language {[a]} = (Atom a) by (rule bij is inj [OF in language bij , THEN injD]) auto lemma to language Un[simp]: to language (A ∪ B ) = Plus (to language A) (to language B )
13
by (coinduction arbitrary: A B ) (auto simp: Collect disj eq) lemma to language Int[simp]: to language (A ∩ B ) = Inter (to language A) (to language B ) by (coinduction arbitrary: A B ) (auto simp: Collect conj eq) lemma to language Neg[simp]: to language (− A) = Not (to language A) by (coinduction arbitrary: A) (auto simp: Collect neg eq) lemma to language Diff [simp]: to language (A − B ) = Inter (to language A) (Not (to language B )) by (auto simp: Diff eq) lemma to language conc[simp]: to language (A @@ B ) = Times (to language A) (to language B ) by (coinduction arbitrary: A B rule: language coinduct upto Plus) (auto simp: Deriv def [symmetric]) lemma to language star [simp]: to language (star A) = Star (to language A) by (coinduction arbitrary: A rule: language coinduct upto regular ) (auto simp: Deriv def [symmetric]) lemma to language shuffle[simp]: to language (A k B ) = Shuffle (to language A) (to language B ) by (coinduction arbitrary: A B rule: language coinduct upto Plus) (auto simp: Deriv def [symmetric])
6
Word Problem for Context-Free Grammars
The function in language decides the word problem for a given language. Since we can construct the language of a CFG using cfg.lang we obtain an executable (but not very efficient) decision procedure for CFGs for free. abbreviation a ≡ Inl True abbreviation b ≡ Inl False abbreviation S ≡ Inr () interpretation palindromes!: cfg () λ . [[], [a], [b], [a, S , a], [b, S , b]] by unfold locales auto lemma lemma lemma lemma lemma lemma lemma lemma lemma
in language palindromes.lang [] by normalization in language palindromes.lang [True] by normalization in language palindromes.lang [False] by normalization in language palindromes.lang [True, True] by normalization in language palindromes.lang [True, False, True] by normalization ¬ in language palindromes.lang [True, False] by normalization ¬ in language palindromes.lang [True, False, True, False] by normalization in language palindromes.lang [True, False, True, True, False, True] by normalization ¬ in language palindromes.lang [True, False, True, False, False, True] by normalization
interpretation Dyck !: cfg () λ . [[], [a, S , b, S ]] by unfold locales auto lemma in language Dyck .lang [] by normalization lemma ¬ in language Dyck .lang [True] by normalization lemma ¬ in language Dyck .lang [False] by normalization lemma in language Dyck .lang [True, False, True, False] by normalization lemma in language Dyck .lang [True, True, False, False] by normalization lemma in language Dyck .lang [True, False, True, False] by normalization lemma in language Dyck .lang [True, False, True, False, True, True, False, False] by normalization lemma ¬ in language Dyck .lang [True, False, True, True, False] by normalization lemma ¬ in language Dyck .lang [True, True, False, False, False, True] by normalization
14
interpretation abSSa!: cfg () λ . [[], [a, b, S , S , a]] by unfold locales auto lemma in language abSSa.lang [] by normalization lemma ¬ in language abSSa.lang [True] by normalization lemma ¬ in language abSSa.lang [False] by normalization lemma in language abSSa.lang [True, False, True] by normalization lemma in language abSSa.lang [True, False, True, False, True, True, False, True, True] by normalization lemma in language abSSa.lang [True, False, True, False, True, True] by normalization lemma ¬ in language abSSa.lang [True, False, True, True, False] by normalization lemma ¬ in language abSSa.lang [True, True, False, False, False, True] by normalization end
15