Blog de Frédéric


Thursday, April 4 2013

Suslin’s Problem

In a previous blog post, I mentioned the classical independence results regarding the Axiom of Choice and the Generalized Continuum Hypothesis. Here, I'm going to talk about a slightly lesser-known problem that is undecidable in ZFC. It is about a characterization of the set of reals $\mathbb{R}$, and its formulation involves neither cardinal arithmetic nor the Axiom of Choice, only properties of ordered sets.

First, the set of rationals $(\mathbb{Q},<)$ is well known to be countable. It is linearly ordered (for any distinct $x,y\in\mathbb{Q}$ either $x<y$ or $y<x$), unbounded (for any $x\in\mathbb{Q}$ there are $y_1,y_2\in\mathbb{Q}$ such that $x<y_1$ and $y_2<x$) and dense (for any $x,y\in\mathbb{Q}$, if $x<y$ we can find $z\in\mathbb{Q}$ such that $x<z<y$). It turns out that $\mathbb{Q}$ can be characterized by these order properties:

Lemma 0.1.

Let $(P,<)$ be a countable, dense, unbounded linearly ordered set. Then $(P,<)$ is isomorphic to $(\mathbb{Q},<)$.


Let $P=\{p_n:n\in\mathbb{N}\}$ and $\mathbb{Q}=\{q_n:n\in\mathbb{N}\}$ be enumerations of $P$ and $\mathbb{Q}$. We shall construct by induction a sequence $f_0\subseteq f_1\subseteq f_2\subseteq\dots$ of functions such that for all $n\in\mathbb{N}$, $\operatorname{dom}(f_n)\supseteq\{p_i:i<n\}$, $\operatorname{ran}(f_n)\supseteq\{q_i:i<n\}$ and $\forall x,y\in\operatorname{dom}(f_n),x<y\Leftrightarrow f_n(x)<f_n(y)$. Then $f=\bigcup_{n\in\mathbb{N}}f_n$ is a function: if $(x,y),(x,z)\in f$ then there is $n$ large enough such that $(x,y),(x,z)\in f_n$ and since $f_n$ is a function $y=z$. Moreover, $\operatorname{dom}(f)=\bigcup_{n\in\mathbb{N}}\operatorname{dom}(f_n)=P$ and $\operatorname{ran}(f)=\bigcup_{n\in\mathbb{N}}\operatorname{ran}(f_n)=\mathbb{Q}$. Finally, for any $x,y\in P$ there is $n$ large enough such that $x,y\in\operatorname{dom}(f_n)$ and so $x<y\Leftrightarrow f_n(x)<f_n(y)\Leftrightarrow f(x)<f(y)$. Hence $f$ is an isomorphism between $(P,<)$ and $(\mathbb{Q},<)$.

Thus let $f_0=\emptyset$. If $f_n$ is defined, we construct $f_{n+1}\supseteq f_n$ as follows. Suppose $p_n\notin\operatorname{dom}(f_n)$. If $\forall i<n,p_n>p_i$ then because $\mathbb{Q}$ is unbounded we can consider the least $n_0$ such that $\forall i<n,f_n(p_i)<q_{n_0}$ and set $f_{n+1}(p_n)=q_{n_0}$. Similarly if $\forall i<n,p_n<p_i$. Otherwise, let $i_1,i_2<n$ be such that $p_{i_1}<p_n<p_{i_2}$ with $p_{i_1},p_{i_2}$ respectively the largest and smallest possible. Because $\mathbb{Q}$ is dense we can consider the least $n_0$ such that $f_n(p_{i_1})<q_{n_0}<f_n(p_{i_2})$ and again set $f_{n+1}(p_n)=q_{n_0}$. Similarly, if $n\neq n_0$ we use the fact that $P$ is unbounded and dense to find $m_0\geq n+1$ that allows us to define $f_{n+1}(p_{m_0})=q_n$ and ensures that $f_{n+1}$ is order-preserving.
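To make the back-and-forth construction more concrete, here is a minimal Python sketch of it. It is only an illustration under my own assumptions: the role of $P$ is played by the rationals and the role of $\mathbb{Q}$ by the dyadic rationals (another countable, dense, unbounded linear order), and the enumerations `rationals` and `dyadics`, as well as the helper names `fits` and `back_and_forth`, are hypothetical choices that do not come from Jech's book.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals():
    """Enumerate the rationals without repetition (one possible enumeration)."""
    yield Fraction(0)
    for s in count(2):
        for q in range(1, s):
            p = s - q
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                yield Fraction(-p, q)


def dyadics():
    """Enumerate the dyadic rationals k / 2**m without repetition."""
    yield Fraction(0)
    for s in count(1):
        for m in range(s):
            k = s - m
            if m == 0 or k % 2 == 1:  # keep only fractions in lowest terms
                yield Fraction(k, 2 ** m)
                yield Fraction(-k, 2 ** m)


def fits(x, y, pairs):
    """True if adding (x, y) keeps the finite map strictly increasing and injective."""
    return all(a != x and b != y and (a < x) == (b < y) for a, b in pairs)


def back_and_forth(enum_p, enum_q, steps):
    """Return a finite order-preserving map as a list of pairs (p, f(p)).

    After `steps` rounds the first `steps` elements of each enumeration
    belong to the domain and to the range respectively.
    """
    pairs = []
    for n in range(steps):
        # "Forth": make sure p_n is in the domain.
        p_n = next(islice(enum_p(), n, None))
        if all(a != p_n for a, _ in pairs):
            q = next(y for y in enum_q() if fits(p_n, y, pairs))
            pairs.append((p_n, q))
        # "Back": make sure q_n is in the range.
        q_n = next(islice(enum_q(), n, None))
        if all(b != q_n for _, b in pairs):
            p = next(x for x in enum_p() if fits(x, q_n, pairs))
            pairs.append((p, q_n))
    return pairs


pairs = back_and_forth(rationals, dyadics, 12)
for p, q in sorted(pairs):
    print(p, "->", q)
```

Each search for a suitable partner terminates precisely because both orders are dense and unbounded, which is where the hypotheses of the lemma are used. The sketch restarts the enumerations at every step, so it is quadratic at best and only meant for small values of `steps`.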

We now notice that $\mathbb{R}$ is linearly ordered, unbounded, dense and has the least upper-bound property (that is, any nonempty subset of $\mathbb{R}$ that is bounded above has a least upper bound). Moreover, the subset $\mathbb{Q}$ is countable and dense in $\mathbb{R}$ (that is, for any $x,y\in\mathbb{R}$ such that $x<y$ we can find $z\in\mathbb{Q}$ such that $x<z<y$). Using the previous lemma, we deduce again that these order properties give a characterization of the set of reals:

Theorem 0.1.

Let $(R,<)$ be an unbounded, dense and linearly ordered set with the least upper-bound property. Suppose that $R$ has a dense countable subset $P$. Then $(R,<)$ is isomorphic to $(\mathbb{R},<)$.


$P$ is countable by assumption and since $P\subseteq R$ it is also linearly ordered. If $x,y\in P\subseteq R$ and $x<y$ then by density of $P$ in $R$ there is $z\in P$ such that $x<z<y$. So $P$ is actually dense. Similarly, if $x\in P\subseteq R$ then since $R$ is unbounded there are $y_1,y_2\in R$ such that $y_2<x<y_1$ and again by density of $P$ in $R$ we can find $z_1,z_2\in P$ such that $y_2<z_2<x<z_1<y_1$. So $P$ is unbounded. By the previous lemma, there is an isomorphism of ordered sets $f:P\rightarrow\mathbb{Q}$.

We define for all $x\in R$, $f_\star(x)=\sup_{y\in P;\,y<x}f(y)$. Because $P$ is dense in $R$ and $\mathbb{R}$ has the least upper bound property this is well-defined: the set is nonempty and bounded above by $f(p)$ for any $p\in P$ such that $p>x$. If $x\in P$ then for all $y<x$ such that $y\in P$ we have $f(y)<f(x)$ and so $f_\star(x)\leq f(x)$. If $f_\star(x)<f(x)$ we could find (by density of $\mathbb{Q}$ in $\mathbb{R}$) an element $q\in\mathbb{Q}$ such that $f_\star(x)<q<f(x)$. For $p=f^{-1}(q)$ we get $p<x$ and so $q=f(p)\leq f_\star(x)$. A contradiction. So ${f_\star}_{|P}=f$.

Let $x,y\in R$ such that $x<y$. Because $f$ is increasing we get $f_\star(x)\leq f_\star(y)$. By density of $P$ in $R$ we can find $p_1,p_2\in P$ such that $x<p_1<p_2<y$. Again, we get $f_\star(x)\leq f_\star(p_1)=f(p_1)$ and $f(p_2)=f_\star(p_2)\leq f_\star(y)$. Since $f(p_1)<f(p_2)$, we actually have $f_\star(x)<f_\star(y)$. In particular, $f_\star$ is one-to-one.

We shall prove that $f_\star$ is surjective. Of course $f_\star(P)=f(P)=\mathbb{Q}$ so let's consider $r\in\mathbb{R}\setminus\mathbb{Q}$. Define $x=\sup_{q<r:\,q\in\mathbb{Q}}f^{-1}(q)\in R$. This is well-defined because $\mathbb{Q}$ is dense in $\mathbb{R}$ and $R$ has the least upper bound property. Then for all $q<r$ such that $q\in\mathbb{Q}$ we have $f^{-1}(q)\leq x$ by definition. By density of $\mathbb{Q}$ in $\mathbb{R}$ we can actually find $q'\in\mathbb{Q}$ such that $q<q'<r$ and so $f^{-1}(q)<f^{-1}(q')\leq x$. We have $x>f^{-1}(q)\in P$ and so $f_\star(x)\geq f(f^{-1}(q))=q$. Hence $f_\star(x)\geq r$. Suppose that $r<f_\star(x)$ and consider (by density of $\mathbb{Q}$ in $\mathbb{R}$) some $q\in\mathbb{Q}$ such that $r<q<f_\star(x)$ and let $p=f^{-1}(q)$. If $p<x$ then there is $q'\in\mathbb{Q}$, $q'<r$, such that $p\leq f^{-1}(q')$. Then $r<q=f(p)\leq q'<r$, a contradiction. If instead $p\geq x$ then $f_\star(x)\leq f_\star(p)=q<f_\star(x)$, which is again a contradiction.

Finally, let $x,y\in R$ such that $f_\star(x)<f_\star(y)$. Because $R$ is totally ordered either $x<y$ or $y<x$. But the latter is impossible since we saw above that it implies $f_\star(y)<f_\star(x)$. Hence $f_\star$ is an isomorphism between $(R,<)$ and $(\mathbb{R},<)$.

Now consider a totally ordered set $(R,<)$. For any $a,b$ we define the open interval $(a,b)=\{x\in R:a<x<b\}$. Suppose $P=\{p_n:n\in\mathbb{N}\}$ is a dense subset of $R$. If ${\left((a_i,b_i)\right)}_{i\in I}$ is a family of pairwise disjoint nonempty open intervals then we can associate to any $(a_i,b_i)$ the least $n_i\in\mathbb{N}$ such that $p_{n_i}\in(a_i,b_i)$ (by density of $P$ in $R$). Since the family is disjoint the function $i\mapsto n_i$ obtained is one-to-one and so the family is at most countable. One naturally wonders what happens if we replace in theorem 0.1 the existence of a countable dense subset by this weaker property on open intervals:

Problem 0.1 (Suslin’s Problem).

Let $(R,<)$ be an unbounded, dense and linearly ordered set with the least upper-bound property. Suppose that any family of disjoint open intervals in $R$ is at most countable. Is $(R,<)$ isomorphic to $(\mathbb{R},<)$?
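The counting argument that motivates the problem (mapping each interval of a disjoint family to the least index of a point of the countable dense set lying in it) can be illustrated by a small Python sketch. This is only a toy example under my own assumptions: the dense set is the rationals with one arbitrary enumeration, and the names `rationals`, `least_index_in` and the sample `intervals` are hypothetical.

```python
from fractions import Fraction
from itertools import count
from math import gcd


def rationals():
    """Enumerate the rationals without repetition (one possible enumeration)."""
    yield Fraction(0)
    for s in count(2):
        for q in range(1, s):
            p = s - q
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                yield Fraction(-p, q)


def least_index_in(a, b):
    """Least n such that the n-th rational lies in the open interval (a, b)."""
    for n, q in enumerate(rationals()):
        if a < q < b:
            return n


# A finite family of pairwise disjoint open intervals (k/2, (k+1)/2).
intervals = [(Fraction(k, 2), Fraction(k + 1, 2)) for k in range(-4, 4)]
indices = [least_index_in(a, b) for a, b in intervals]

# Disjoint intervals cannot share a point, so the indices are pairwise distinct.
print(indices)
```

The map from intervals to indices is one-to-one for any disjoint family, finite or not, which is exactly why such a family injects into $\mathbb{N}$.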

We have seen how this problem arises from a natural generalization of a characterization of $\mathbb{R}$. We note that we did not use the Axiom of Choice in the above analysis and that the problem can be expressed using only definitions on ordered sets. However, in order to answer Suslin's Problem we will need to introduce more concepts. We will assume familiarity with basic notions of Set Theory like ordinals, cardinals or the Axiom of Choice. The first five chapters of Thomas Jech's book "Set Theory" should be enough. In addition, we will rely on the axiom $\mathrm{MA}_{\aleph_1}$ and on the Diamond Principle $\Diamond$, which we define here:

Definition 0.1 (Martin's Axiom $\aleph_1$, Diamond Principle).

$\mathrm{MA}_{\aleph_1}$ and $\Diamond$ are defined as follows:

  • Let $(P,<)$ be a partially ordered set. Two elements $x,y$ are compatible if there is $z$ such that $z\leq x$ and $z\leq y$. Note that comparable implies compatible.

  • $D$ is dense in $P$ if for any $p\in P$ there is $d\in D$ such that $d\leq p$.

  • $G\subseteq P$ is a filter if:

    • $G\neq\emptyset$

    • For any $p,q\in P$ such that $p\leq q$ and $p\in G$ we have $q\in G$

    • For any $p,q\in G$ there is $r\in G$ such that $r\leq p,q$.

  • $\mathrm{MA}_{\aleph_1}$: Let $(P,<)$ be a partially ordered set such that any subset of pairwise incompatible elements of $P$ is at most countable. Then for any family of at most $\aleph_1$ dense subsets of $P$ there is a filter $G\subseteq P$ that has a nonempty intersection with each element of this family.

  • A set $C\subseteq\omega_1$ is closed unbounded if

    • It is unbounded in the sense that for any $\alpha<\omega_1$ there is $\beta\in C$ such that $\alpha<\beta$

    • It is closed for the order topology, or equivalently if for any $\gamma<\omega_1$ and any $\gamma$-sequence $\alpha_0<\alpha_1<\dots<\alpha_\xi<\dots$ of elements of $C$, $\lim_{\xi\rightarrow\gamma}\alpha_\xi$ is in $C$.

  • A set $S\subseteq\omega_1$ is stationary if for any $C$ closed unbounded, $S\cap C\neq\emptyset$.

  • $\Diamond$: There is an $\omega_1$-sequence of sets $S_\alpha\subseteq\alpha$ such that for every $X\subseteq\omega_1$ the set $\{\alpha<\omega_1:X\cap\alpha=S_\alpha\}$ is stationary.
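To get a feel for the notions of compatible elements, dense subsets and filters, here is a small Python sketch that checks them by brute force. It is a toy example under my own assumptions: the partially ordered set is the finite set of binary strings of length at most 3, where $s\leq t$ means that $s$ extends $t$, and the names `P`, `leq`, `compatible`, `is_dense` and `is_filter` are hypothetical. Martin's Axiom itself says nothing interesting about a finite poset, so this only illustrates the definitions.

```python
from itertools import product

# Hypothetical finite poset: binary strings of length <= 3.
P = ["".join(bits) for n in range(4) for bits in product("01", repeat=n)]


def leq(s, t):
    """s <= t means that s extends t (s carries more information than t)."""
    return s.startswith(t)


def compatible(x, y):
    """x and y are compatible if some z in P satisfies z <= x and z <= y."""
    return any(leq(z, x) and leq(z, y) for z in P)


def is_dense(D):
    """D is dense in P if every p in P has some d in D with d <= p."""
    return all(any(leq(d, p) for d in D) for p in P)


def is_filter(G):
    """Check the three filter conditions of the definition."""
    G = set(G)
    nonempty = bool(G)
    upward_closed = all(q in G for p in G for q in P if leq(p, q))
    directed = all(
        any(leq(r, p) and leq(r, q) for r in G) for p in G for q in G
    )
    return nonempty and upward_closed and directed


print(compatible("0", "01"))                 # comparable, hence compatible
print(compatible("0", "1"))                  # no common extension
print(is_dense([s for s in P if len(s) == 3]))
print(is_filter({"", "0", "01"}))            # the initial segments of "01"
```

In this poset a filter is a set of initial segments of a single string, and meeting a dense set amounts to deciding more and more bits of that string.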

First we show that if $\mathrm{MA}_{\aleph_1}$ holds, then we get a positive answer to Suslin's problem:

Theorem 0.2.

Assume Martin's axiom $\mathrm{MA}_{\aleph_1}$ holds. Let $(R,<)$ be an unbounded, dense and linearly ordered set with the least upper-bound property. Suppose that any disjoint family of open intervals in $R$ is at most countable. Then $(R,<)$ is isomorphic to $(\mathbb{R},<)$.


Suppose $(R,<)$ is not isomorphic to $(\mathbb{R},<)$ and in particular does not have any countable dense subset (otherwise we could apply theorem 0.1). We define closed intervals $I_\alpha\subseteq R$ by induction on $\alpha<\omega_1$. If $I_\beta=[a_\beta,b_\beta]$ is defined for all $\beta<\alpha$ then the set $C=\{a_\beta:\beta<\alpha\}\cup\{b_\beta:\beta<\alpha\}$ is countable and thus is not dense in $R$. Then there is $a_\alpha<b_\alpha$ such that $I_\alpha=[a_\alpha,b_\alpha]$ is disjoint from $C$. We define the set $S=\{I_\alpha,\alpha<\omega_1\}$. Clearly, $|S|=\aleph_1$ and $(S,\subsetneq)$ is partially ordered. We note that if $\beta<\alpha<\omega_1$ then by construction $a_\beta,b_\beta\notin I_\alpha$ and so either $I_\alpha\cap I_\beta=\emptyset$ or $I_\alpha\subsetneq I_\beta$. In particular, comparable is the same as compatible in $S$ and any family of pairwise incomparable/incompatible elements of $S$ is a family of pairwise disjoint intervals of $R$ so at most countable.

Let $\alpha<\omega_1$ and define $P_{I_\alpha}=\{I\in S:I\supsetneq I_\alpha\}$. Let $X\subseteq P_{I_\alpha}$ be nonempty. Let $\beta$ be the least ordinal such that $I_\beta\in X$. If $I_\gamma\in X$ then $\beta\leq\gamma$ and $I_\gamma\cap I_\beta\supseteq I_\alpha\neq\emptyset$. Hence by the previous remark $I_\beta\supseteq I_\gamma$. So $P_{I_\alpha}$ is well-ordered by $\supseteq$ and we define $o(I_\alpha)$ as the order-type of $P_{I_\alpha}$. We note that the set $P_{I_\alpha}$ can be enumerated by $I_{\alpha_1}\supsetneq I_{\alpha_2}\supsetneq\dots\supsetneq I_\alpha$ for some $\alpha_\xi\leq\alpha$ and so $o(I_\alpha)<\omega_1$. Moreover for any $\alpha,\beta<\omega_1$, if $I_\alpha\supsetneq I_\beta$ then $P_{I_\alpha}$ is an initial segment of $P_{I_\beta}$ and so $o(I_\alpha)<o(I_\beta)$. Hence for each $\alpha<\omega_1$ the set $L_\alpha=\{I\in S:o(I)=\alpha\}$ has pairwise incomparable elements and so is at most countable.

For any $I\in S$, define $S_I=\{J\in S:J\subseteq I\}$ and let $T=\{I\in S:|S_I|=\aleph_1\}$. Let $\alpha<\omega_1$ and suppose that $L_\alpha\cap T=\emptyset$. Then $S=\bigcup_{\beta<\alpha}L_\beta\cup\bigcup_{I\in L_\alpha}S_I$ and since the $L_\beta$ are at most countable for $\beta<\omega_1$ and the $S_I$ are at most countable for $I\in L_\alpha$ we would have $S$ at most countable, a contradiction. So for each $\alpha<\omega_1$, there is $I\in T\cap L_\alpha$ and in particular $|T|=\aleph_1$. We note that if $I\in T$ and $J\supseteq I$ then $S_I\subseteq S_J$ and so $J\in T$. In particular, $P_I=\{J\in T:J\supsetneq I\}$ and thus without loss of generality we may assume that $S=T$.

For any $\alpha<\omega_1$ let $D_\alpha=\{I\in S:o(I)>\alpha\}$. For any $I\in S$, $|S_I|=\aleph_1$ and $S_I=\left(\bigcup_{\beta\leq\alpha}S_I\cap L_\beta\right)\cup\left(S_I\cap D_\alpha\right)$. The first term is at most countable and so the second is uncountable and a fortiori nonempty. So $D_\alpha$ is a dense subset of $S$.

Using $\mathrm{MA}_{\aleph_1}$, we find $G\subseteq S$ a filter that intersects each $D_\alpha$. By definition, elements of a filter are pairwise compatible and so pairwise comparable. Let us construct by induction on $\alpha<\omega_1$, some sets $J_\alpha\in G$. If $J_\beta$ is constructed for any $\beta<\alpha$ then $\gamma=\sup_{\beta<\alpha}o(J_\beta)<\omega_1$ and we can pick $J_\alpha\in G\cap D_\gamma$. We obtain a decreasing $\omega_1$-sequence of intervals $J_0\supsetneq J_1\supsetneq\dots\supsetneq J_\alpha\supsetneq\dots$. If $J_\alpha=[x_\alpha,y_\alpha]$ then this gives an increasing sequence $x_0<x_1<x_2<\dots<x_\alpha<\dots$. The sets $(x_\alpha,x_{\alpha+1})$ form an uncountable family of disjoint open intervals. A contradiction.

Finally, we show that the $\Diamond$ principle provides a negative answer to Suslin's problem:

Theorem 0.3.

Assume the $\Diamond$ principle holds. Then there is a linearly ordered set $(R,<)$ not isomorphic to $(\mathbb{R},<)$, unbounded, dense, that has the least upper-bound property and such that any family of disjoint open intervals is at most countable.


Let ${(S_\alpha)}_{\alpha<\omega_1}$ be a $\Diamond$-sequence. We first construct a partial ordering $\prec$ of $T=\omega_1$. We define for all $1\leq\alpha<\omega_1$ an ordering $(T_\alpha,\prec)$ on initial segments $T_\alpha\subseteq\omega_1$ and obtain the ordering $\prec$ on $\omega_1=T=\bigcup_{\alpha<\omega_1}T_\alpha$.

Each $(T_\alpha,\prec)$ will be a tree i.e. for any $x$ in the tree the set $P_x=\{y:y\prec x\}$ is well-ordered. As in the proof of theorem 0.2 we can define $o(x)$ as the order-type of $P_x$. The level $\alpha$ is the set of elements such that $o(x)=\alpha$. The height of a tree is defined as the supremum of the $o(x)+1$. In a tree, a branch is a maximal linearly ordered subset and an antichain is a subset of pairwise incomparable elements. A branch is also well-ordered and so we can define its length as its order-type. $T_\alpha$ is constructed such that its height is $\alpha$ and for each $x\in T_\alpha$ there is some $y\succ x$ at each higher level less than $\alpha$.

We let $T_1=\{0\}$. If $\alpha$ is a limit ordinal then $(T_\alpha,\prec)$ is the union of $(T_\beta,\prec)$ for $\beta<\alpha$. If $\alpha=\beta+1$ is a successor ordinal, then the highest level of $T_\alpha$ is $L_\beta=\{x\in T_\alpha:o(x)=\beta\}$. $T_{\alpha+1}$ is obtained by adding $\aleph_0$ immediate successors to each element of $L_\beta$. These successors are taken from $\omega_1$ in a way that $T_{\alpha+1}$ is an initial segment of $\omega_1$.

Let $\alpha$ be a limit ordinal. Let $A=S_\alpha$ if it is a maximal antichain in $(T_\alpha,\prec)$ and take $A$ an arbitrary maximal antichain of $T_\alpha$ otherwise. Then for each $t\in T_\alpha$ there is $a\in A$ such that either $a\preceq t$ or $t\preceq a$. Let $b_t$ be a branch that contains $a,t$. We construct $(T_{\alpha+1},\prec)$ by adding for each branch $b_t$ some $x_{b_t}$ that is greater than all the elements of $b_t$. We can choose $b_t$ in a way that it contains an element of each level less than $\alpha$ and so $o(x_{b_t})=\alpha$ and the height of $T_{\alpha+1}$ is $\alpha+1$. We note that each $t\in T_{\alpha+1}$ is either in $T_\alpha$ (so comparable with some element of $A$) or greater than (a fortiori comparable with) one element of $A$. So $A$ is still a maximal antichain in $(T_{\alpha+1},\prec)$.

Now consider a maximal antichain $A\subseteq T=\omega_1$ and $C$ the set of ordinals $\alpha<\omega_1$ such that "$A\cap T_\alpha$ is a maximal antichain in $T_\alpha$ and $T_\alpha=\alpha$". Let $\alpha_0<\alpha_1<\dots<\alpha_\xi<\dots$ ($\xi<\gamma<\omega_1$) be a sequence of elements in $C$ and consider the limit ordinal $\lambda=\lim_{\xi<\gamma}\alpha_\xi$. By construction, $T_\lambda=\bigcup_{\alpha<\lambda}T_\alpha=\bigcup_{\xi<\gamma}T_{\alpha_\xi}=\bigcup_{\xi<\gamma}\alpha_\xi=\lambda$. If $x\in T_\lambda$, there is $\xi<\gamma$ such that $x\in T_{\alpha_\xi}$ and so $x$ is comparable with some $y\in A\cap T_{\alpha_\xi}\subseteq A\cap T_\lambda$. So $A\cap T_\lambda$ is a maximal antichain in $T_\lambda$. Finally $\lambda\in C$ and $C$ is closed.

We note that $T_1=\{0\}\geq 1$. If $T_\alpha\geq\alpha$ then $T_{\alpha+1}$ is obtained by adding at least one element at the end of the initial segment $T_\alpha$ and so $T_{\alpha+1}\geq\alpha+1$. Finally if $\lambda>0$ is limit and $T_\alpha\geq\alpha$ for each $\alpha<\lambda$ then $T_\lambda=\bigcup_{\alpha<\lambda}T_\alpha\geq\sup_{\alpha<\lambda}\alpha=\lambda$. Moreover by definition, each $T_\alpha$ is at most countable. Let's come back to the closed set $C$ above. Let $\alpha_0<\omega_1$ be arbitrary. For each $n<\omega$, we let $\alpha_{2n+1}$ be the limit of the sequence $\alpha_{2n}\leq T_{\alpha_{2n}}\leq T_{T_{\alpha_{2n}}}\leq T_{T_{T_{\alpha_{2n}}}}\leq\dots$. By definition, $T_{\alpha_{2n+1}}=\bigcup_{\xi<\alpha_{2n+1}}T_\xi=\alpha_{2n}\cup T_{\alpha_{2n}}\cup T_{T_{\alpha_{2n}}}\cup\dots=\alpha_{2n+1}$. Because $A$ is a maximal antichain in $T$, for each $x\in T_{\alpha_{2n+1}}$ we can find some $\alpha_x\geq\alpha_{2n+1}$ and $a_x\in A\cap T_{\alpha_x}$ that is comparable with $x$. Because $T_{\alpha_{2n+1}}$ is countable we can define $\alpha_{2n+2}=\sup_{x\in T_{\alpha_{2n+1}}}\alpha_x<\omega_1$. Then any $x\in T_{\alpha_{2n+1}}$ is comparable with some element $a_x\in A\cap T_{\alpha_{2n+2}}$. Let $\lambda=\lim_{n<\omega}\alpha_n$. With the same method as in the proof that $C$ is closed, the equality $\lambda=\lim_{n<\omega}\alpha_{2n+1}$ shows that $T_\lambda=\lambda$ while the equality $\lambda=\lim_{n<\omega}\alpha_{2n+2}$ shows that $A\cap T_\lambda$ is a maximal antichain in $T_\lambda$. So $\lambda\in C$ with $\lambda\geq\alpha_0$, and $C$ is closed unbounded.

Using the Diamond principle, $\{\alpha<\omega_1:A\cap\alpha=S_\alpha\}\cap C\neq\emptyset$. If $\alpha$ is in the intersection, then $S_\alpha=A\cap T_\alpha$ is a maximal antichain in $T_\alpha$. By construction, it is also a maximal antichain in $T_{\alpha+1}$. Each element $a\in A\cap T_{\alpha+1}$ is at level $o(a)\leq\alpha$. Any $t\in T_{\alpha+1}$ is comparable with some element of $A\cap T_{\alpha+1}$. Moreover, by construction any $t'\in T\setminus T_{\alpha+1}$ has some predecessor $t\in T_{\alpha+1}$ at level $\alpha$ and there is $a\in A\cap T_{\alpha+1}$ that is comparable with $t$. Necessarily, $o(a)<o(t)=\alpha$ and so $a\prec t\preceq t'$. Thus $A\cap T_{\alpha+1}$ is maximal in $T$ and $A=A\cap T_{\alpha+1}\subseteq T_{\alpha+1}$ is at most countable.

Let $B$ be a branch in $T$. It is clear that $B$ is nonempty. Actually it is infinite: otherwise there is some limit ordinal $\alpha>0$ such that $B\subseteq T_\alpha$ and by construction we can find some $y\succ\max B$ contradicting the maximality of $B$. By construction, each $x\in T$ has infinitely many successors at the next level and so these successors are pairwise incomparable. Hence if $x\in B$ then we can pick one $z_x\notin B$ among these successors. Let $y\prec x$ be two elements of $B$. Then $o(z_x)=o(x)+1>o(y)+1=o(z_y)$ so $z_x\preceq z_y$ is impossible. Suppose $z_y\prec z_x$. We also have $x\prec z_x$ by definition, so $x$ and $z_y$ are comparable. If $z_y\preceq x$ then we would have $z_y\in B$ because $B$ is a branch. If $x\prec z_y$ we would get $o(x)<o(z_y)=o(y)+1\leq o(x)$. Hence $z_x,z_y$ are incomparable (in particular distinct) and the set $\{z_x:x\in B\}$ is an antichain in $T$ and so $B$ is countable.

Finally, we define $S$ as the set of all branches of $T$. By construction, each $x\in T$ has countably many immediate successors and we order them as $\mathbb{Q}$. Let $B_1,B_2$ be two distinct branches and $\alpha$ the least level where they differ. The level 0 is $T_1=\{0\}$ so $\alpha>0$. If $\alpha$ is limit then the restriction of the branches $B_1,B_2$ to $T_\alpha$ is the same branch $b$ and $T_{\alpha+1}$ has been constructed in a way that there is only one possible element at level $\alpha$ to extend this branch and this is a contradiction. So $\alpha$ is actually a successor ordinal $\beta+1$. Hence if $B_1(\alpha)\in B_1$ and $B_2(\alpha)\in B_2$ are the immediate successors of the point in $B_1\cap B_2$ at level $\beta$, we order $B_1,B_2$ according to whether $B_1(\alpha)<B_2(\alpha)$ or $B_1(\alpha)>B_2(\alpha)$, using the $\mathbb{Q}$-isomorphic order we just defined. Clearly, this gives a linear ordering $<_S$. It is also unbounded: for any $B\in S$, if $B(1)\in B$ is the element at level 1, pick $x$ greater (or smaller) than $B(1)$ for the $\mathbb{Q}$-isomorphic order on successors of $B(0)=0$ and consider a branch extending $0\prec x$.

Now consider two branches $B_1<_SB_2$. Let $\alpha=\beta+1$ be as above. We can find $x$ an immediate successor of $B_1(\beta)$ such that $B_1(\alpha)<x<B_2(\alpha)$ for the $\mathbb{Q}$-isomorphic order on immediate successors of $B_1(\beta)$. Let $I_x=\{B\in S:x\in B\}$. Any $B\in I_x$ contains $\{y\in T:y\prec x\}=\{y\in T:y\preceq B_1(\beta)\}=\{y\in T:y\preceq B_2(\beta)\}$. Moreover $x=B(\alpha)$ and so $B_1<_SB<_SB_2$. $I_x$ is nonempty (we can extend $\{y\in T:y\preceq x\}$ to a maximal branch) and so $S$ is dense. If $I_x\cap I_y=\emptyset$ then $x,y$ are incomparable. So from any collection of disjoint open intervals ${(B_{1i},B_{2i})}_{i\in I}$ we get an antichain $\{x_i:i\in I\}$ and so $I$ is at most countable.

Let $C$ be a countable set of branches in $T$. Since these branches are countable, we can find $\alpha<\omega_1$ larger than the length of any branch in $C$. If $x\in T$ is at level greater than $\alpha$ then for all $B\in C$, $x\notin B$ so $B\notin I_x$. Finally $C\cap I_x=\emptyset$ and $C$ is not dense.

Now let $R$ be the Dedekind-MacNeille completion of $S$. It is unbounded, linearly ordered, has the least upper-bound property and $S$ is dense in $R$. Using the fact that $S$ is dense in $R$ we deduce that $R$ is dense. Similarly, if ${(a_i,b_i)}_{i\in I}$ is any collection of disjoint open intervals in $R$ we can find $c_i,d_i\in S$ such that $a_i<c_i<d_i<b_i$. Then ${(c_i,d_i)}_{i\in I}$ is a collection of disjoint open intervals in $S$ and so $I$ is at most countable.

Finally, it remains to prove that $R$ is not isomorphic to $\mathbb{R}$ and it suffices to show that $R$ does not have any countable dense subset $C$. If $C$ is a countable dense subset of $R$, then for any elements $a<b$ in $C$ we pick $c\in S$ such that $a<c<b$. This gives a countable subset $C'$ of $S$. If $a<b$ are elements of $S$, we can find $c,d\in C$ such that $a<c<d<b$ and so $e\in C'$ such that $a<c<e<d<b$. Thus $C'$ would be a countable dense subset of $S$. A contradiction.

The $\Diamond$ principle holds in the model $L$ of constructible sets. Using iterated forcing, we can construct a model of ZFC in which $\mathrm{MA}_{\aleph_1}$ holds. Using theorems 0.2 and 0.3 we deduce that both the positive and negative answers to Suslin's problem are consistent with ZFC and so Suslin's problem is undecidable in ZFC. It's remarkable that a problem that only involves linearly ordered sets can be settled using sophisticated methods from Set Theory. The above proofs follow chapters 4, 9, 15 and 16 from Thomas Jech's book "Set Theory". In particular:

  • Lemma 0.1 and theorem 0.1 are based on the sketch given in theorem 4.3.

  • Theorem 0.2 is based on the proofs of theorem 16.16 and lemma 9.14 (a).

  • Theorem 0.3 is based on the proofs of lemma 15.24, lemma 15.25, theorem 15.26, lemma 15.27 and lemma 9.14 (b).

  • Theorem 0.3 implicitly uses theorem 4.4 about the Dedekind-MacNeille completion of a dense unbounded linearly ordered set.

  • In addition, theorem 0.3 also contains the proof of lemma 8.2 (the intersection of two closed unbounded sets is closed unbounded), the solutions to exercise 2.7 (any normal sequence has arbitrarily large fixed points) and exercise 9.7 (if all the antichains of a normal $\omega_1$-tree are at most countable then so are its branches).

  • Finally, $\Diamond$ and $\mathrm{MA}_{\aleph_1}$ are proved to be consistent with ZFC in theorems 13.21 and 16.13.

Wednesday, March 20 2013

Exercises in Set Theory: Classical Independence Results

Here are new solutions to exercises from Thomas Jech's book "Set Theory":

Doing the exercises from these chapters gave me the opportunity to come back to the "classical" results about the independence of the Axiom of Choice and the (Generalized) Continuum Hypothesis by Kurt Gödel and Paul Cohen. It's funny to note that it's easier to prove that AC holds in $L$ (essentially, the definition by ordinal induction provides the well-ordering of the class of constructible sets) than to prove that GCH holds in $L$ (you rely on AC in $L$ and on the technical condensation lemma). Actually, I believe Gödel found his proof for AC one or two years after the one for GCH. On the other hand, it is easy to make GCH fail (just add $\aleph_2$ Cohen reals by Forcing) but more difficult to make AC fail (e.g. AC is preserved by Forcing). This can be interpreted as AC being more "natural" than GCH.

After reading the chapters again and analyzing the claims in detail, I'm now convinced of the correctness of the proofs. There are only two points about the Forcing method that I didn't verify precisely (namely that all axioms of predicate calculus and rules of inference are compatible with the Forcing method; that the Forcing/Generic Model theorems can be transported from the Boolean Algebra case to the general case) but these do not seem too difficult. Here are some notes about claims that were not obvious to me at first reading. As usual, I hope they might be useful to the readers of this blog:

  1. In the first page of chapter 13, it is claimed that for any set $M$, $M\in\mathrm{def}(M)$ and $M\subseteq\mathrm{def}(M)\subseteq\mathcal{P}(M)$. The first statement is always true because $M=\{x\in M:(M,\in)\models x=x\}\in\mathrm{def}(M)$ and ${(x=x)}^{(M,\in)}$ is $x=x$ by definition. However, the second statement can only be true if $M$ is transitive (since it implies $M\subseteq\mathcal{P}(M)$). Indeed, if $M$ is transitive then for all $a\in M$ we have $a\subseteq M$ and since $x\in a$ is $\Delta_0$ we get $a=\{x\in M:(M,\in)\models x\in a\}\in\mathrm{def}(M)$. If moreover we consider $x\in X\in\mathrm{def}(M)$ then $x\in X\subseteq M$ so $x\in M\subseteq\mathrm{def}(M)$ and $\mathrm{def}(M)$ is also transitive. Hence the transitivity of the $L_\alpha$ can still be shown by ordinal induction.

  2. The proof of lemma 13.7 cannot be done exactly by induction on the complexity of $G$, as suggested. For example to prove (ii) for $G=G_2=\cdot\times\cdot$, we would consider $\exists u\in F(\dots)\times H(\dots)\,\varphi(u)\Leftrightarrow\exists a\in F(\dots),\exists b\in H(\dots),\varphi((a,b))$ and would like to say that $\varphi((a,b))$ is $\Delta_0$. Nevertheless, we cannot deduce that from the induction hypothesis. Hence the right thing to do is to prove the lemma for $G_1=\{\cdot,\cdot\}$ first and deduce the lemma for $G=(\cdot,\cdot)$ (and $G'=(\cdot,\cdot,\cdot)$). Then we can proceed by induction.

  3. In the proof of theorem 13.18, it is mentioned that the assumptions

    (i) $x<_\alpha y$ implies $x<_\beta y$

    (ii) $x\in L_\alpha$ and $y\in L_\beta\setminus L_\alpha$ implies $x<_\beta y$

    imply that if $x\in y\in L_\alpha$ then $x<_\alpha y$. To show that, we consider $\beta\leq\alpha$ the least ordinal such that $y\in L_\beta$. In particular, $\beta$ is not limit ($L_0=\emptyset$ and if $y\in L_\beta$ for some limit $\beta>0$ then there is $\gamma<\beta$ such that $y\in L_\gamma$) and we can write it $\beta=\gamma+1$. We have $y\in L_\beta=L_{\gamma+1}$ so there is a formula $\varphi$ and elements $a_1,\dots,a_n\in L_\gamma$ such that $x\in y=\{z\in L_\gamma:(L_\gamma,\in)\models\varphi(z,a_1,\dots,a_n)\}$. Hence $x\in L_\gamma$. Moreover by minimality of $\beta$, $y\in L_\beta\setminus L_\gamma$ so by (ii) we have $x<_\beta y$ and by (i) $x<_\alpha y$.

  4. In lemma 14.18, we have expressions that seem ill-defined, for example $a_u(t)$ where $t\notin\operatorname{dom}(a_u)$. This happens in other places, like lemma 14.17 or definition 14.27. The trick is to understand that the functions are extended by 0. Indeed, for any $x,y\in V^B$ if $x\subseteq y$ and $\forall t\in\operatorname{dom}(y)\setminus\operatorname{dom}(x),y(t)=0$ then

    $$\begin{aligned}
    \|y\subseteq x\| &=\prod_{t\in\operatorname{dom}(y)}\left(-y(t)+\|t\in x\|\right)\\
    &=\prod_{t\in\operatorname{dom}(x)}\left(-x(t)+\|t\in x\|\right)\\
    &=\|x\subseteq x\|=1
    \end{aligned}$$

    and similarly we get $\|x=y\|=1$. Then we can use the inequality page 207 ($\|\varphi(x)\|=\|x=y\|\cdot\|\varphi(x)\|\leq\|\varphi(y)\|=\|x=y\|\cdot\|\varphi(y)\|\leq\|\varphi(x)\|$) to replace $x$ by its extension $y$.

  5. In lemma 14.23, the inequality

    $$\|x\text{ is an ordinal}\|\leq\|x\in\check{\alpha}\|+\|x=\check{\alpha}\|+\|\check{\alpha}\in x\|$$

    seems obvious but I don’t believe that it can be proved so easily at that point. For example the proof from chapter 2 requires at least the Separation axiom and the $\Delta_{0}$ formulation from chapter 10 is based on the Axiom of Regularity. To solve that issue, it seems to me that the lemma should be moved after the proof that axioms of ZFC are valid in $V^{B}$. This is not an issue since lemma 14.23 is only used much later in lemma 14.31.

  6. Many details could be added to the proof of theorem 14.24, but let’s just mention Powerset. For any $u\in V^{B}$, some $u^{\prime}\in\operatorname{dom}(Y)$ is defined and satisfies $\|u\subseteq X\|\leq\|u=u^{\prime}\|$ (this follows from the definitions, using the Boolean inequality $-a+b\leq-a+b\cdot a$ to conclude). Since moreover $\forall t\in\operatorname{dom}(Y),Y(t)=1$ we get

    $$\begin{aligned}
    \|u\subseteq X\|\implies\|u\in Y\| &\geq-\|u=u^{\prime}\|+\sum_{t\in\operatorname{dom}(Y)}\left(\|u=t\|\cdot Y(t)\right)\\
    &=\sum_{t\in\operatorname{dom}(Y)}-\left(\|u\neq t\|\cdot\|u=u^{\prime}\|\right)
    \end{aligned}$$

  7. In theorem 14.34, we prove that any $\kappa$ regular in $V$ remains regular in $V[G]$ (the hard case is really $\kappa$ uncountable, and this assumption is implicitly used later to say that $\bigcup_{\alpha<\lambda}A_{\alpha}$ is bounded). It may not be obvious why this is enough. First recall that for any ordinal $\alpha$, $\operatorname{cf}^{V[G]}(\alpha)\leq\operatorname{cf}^{V}(\alpha)$, $|\alpha|^{V[G]}\leq|\alpha|^{V}$, and any (regular) cardinal in $V[G]$ is a (regular) cardinal in $V$. Next we have,

    $$\begin{aligned}
    \exists\alpha\in\mathrm{Ord},\operatorname{cf}^{V[G]}(\alpha)<\operatorname{cf}^{V}(\alpha) &\implies\exists\alpha\in\mathrm{Ord},\operatorname{cf}^{V[G]}(\operatorname{cf}^{V}(\alpha))\leq\operatorname{cf}^{V[G]}(\alpha)<\operatorname{cf}^{V}(\alpha)\\
    &\implies\exists\beta\textrm{ regular cardinal in }V,\textrm{ not regular cardinal in }V[G]\\
    &\implies\exists\beta\in\mathrm{Ord},\operatorname{cf}^{V[G]}(\beta)<\beta=\operatorname{cf}^{V}(\beta)
    \end{aligned}$$

    Since the last statement is a special case of the first, all of them are equivalent; that is, $\forall\alpha\in\mathrm{Ord},\operatorname{cf}^{V[G]}(\alpha)=\operatorname{cf}^{V}(\alpha)$ is equivalent to “$V$ and $V[G]$ have the same regular cardinals”. Similarly, we can prove that $\forall\alpha\in\mathrm{Ord},|\alpha|^{V[G]}=|\alpha|^{V}$ is equivalent to “$V$ and $V[G]$ have the same cardinals”.

    The proof of theorem 14.34 shows that “$V$ and $V[G]$ have the same regular cardinals” and so to complete the proof, it is enough to show that $\forall\alpha,\operatorname{cf}^{V[G]}(\alpha)=\operatorname{cf}^{V}(\alpha)$ implies $\forall\alpha,|\alpha|^{V[G]}=|\alpha|^{V}$. So suppose $\forall\alpha,\operatorname{cf}^{V[G]}(\alpha)=\operatorname{cf}^{V}(\alpha)$ and assume that there is $\alpha$ such that $|\alpha|^{V[G]}<|\alpha|^{V}$. Consider the least such $\alpha$. If $\beta=|\alpha|^{V}$ then $\beta\leq\alpha$ so $|\beta|^{V[G]}\leq|\alpha|^{V[G]}<|\alpha|^{V}=\beta$. By minimality of $\alpha$, $\beta=\alpha$ and so $\alpha$ is a cardinal in $V$. In fact $\alpha$ is regular in $V$. Otherwise, suppose $\operatorname{cf}^{V}(\alpha)<\alpha$ and let $\alpha=\bigcup_{\beta<\operatorname{cf}^{V}(\alpha)}X_{\beta}$ such that $|X_{\beta}|^{V}<|\alpha|^{V}$. By minimality of $\alpha$, we have $|\operatorname{cf}^{V}(\alpha)|^{V[G]}=|\operatorname{cf}^{V}(\alpha)|^{V}$ and $|X_{\beta}|^{V[G]}=|X_{\beta}|^{V}$. Then $|\alpha|^{V[G]}=|\operatorname{cf}^{V}(\alpha)|^{V[G]}\sup_{\beta<\operatorname{cf}^{V}(\alpha)}|X_{\beta}|^{V[G]}=|\operatorname{cf}^{V}(\alpha)|^{V}\sup_{\beta<\operatorname{cf}^{V}(\alpha)}|X_{\beta}|^{V}=|\alpha|^{V}$, a contradiction. Finally, we get $\operatorname{cf}^{V}(\alpha)=\alpha=|\alpha|^{V}>|\alpha|^{V[G]}\geq\operatorname{cf}^{V[G]}(\alpha)$. This is again a contradiction and so $\forall\alpha,|\alpha|^{V[G]}=|\alpha|^{V}$.

Saturday, March 2 2013

MathML Acid Tests

There has recently been discussion in the Mozilla community about Opera's switch from Presto to WebKit and the need to preserve browser competition and diversity of rendering engines, especially on mobile devices. Some people outside the community seem a bit skeptical about that argument. Perhaps a striking example to convince them is the case of MathML, where basically only Gecko has a decent native implementation. The situation at the recent eBooks workshop illustrates that very well: MathML support is very important for some publishers (e.g. for science or education) but the main eBook readers rely exclusively on the WebKit engine and its rudimentary MathML implementation. Unfortunately, because there are currently essentially no alternatives on mobile platforms, developers of eBook readers have no choice but to offer partial EPUB support or rely on polyfills...

After Google's announcement that MathML would be removed from Chrome 25, someone quipped on Twitter that an Acid test for MathML should be written, since that seems to motivate them more than community feedback. I do not think that MathML support is considered important from the point of view of browser competition, but I took this idea and started writing MathML versions of the famous Acid2 and Acid3 tests. The current source of these MathML Acid tests is available on GitHub. Of course, I believe that native MathML implementation is very important and I expect at least that these tests could help the MathML community, both users and implementers.

Here is the result of the MathML Acid2 test with the stable Gecko release. To pass the test we only need to implement negative spacing, or at least integrate the patch I submitted when I was still active in Gecko development (bug 717546).

MathML Acid2 test ; Gecko

And here is the score of the MathML Acid 3 test with the stable Gecko release. The failure of test 18 was not supposed to happen but I discovered it when I wrote the test. That will be fixed by James Kitchener's refactoring in bug 827713. Obviously, reaching the score of 100/100 will be much more difficult to achieve by our volunteer developers, but the current score is not too bad compared to other rendering engines...

MathML Acid 3 ; Gecko

Friday, January 25 2013

Exercises in Set Theory

I'm finally done with the first part "Basic Set Theory" :-) The two last chapters:

Tuesday, January 22 2013

Analysis of Lithium's algorithm

I’ve recently been working on automated testcase reduction tools for the MathJax project and thus I had the chance to study Jesse Ruderman’s Lithium tool, itself inspired by the ddmin algorithm. The ddmin paper contains good ideas, for example the fact that the reduction could be improved if we rely on the testcase structure, like XML nodes or grammar tokens, instead of just characters/lines (that’s why I’ve started to write a version of Lithium that works with abstract data structures). However, the authors of the ddmin paper do not analyse the complexity of the algorithm precisely, except for the best and worst cases, and there is a large gap between the two. Jesse's analysis is much better and in particular introduces the concepts of monotonic testcase and clustered reduction, where the algorithm performs best and which intuitively seem to be the usual testcases that we meet in practice. However, the monotonic+clustered case complexity is only “guessed” and the bound $O(M\log_{2}(N))$ for a monotonic testcase (of size $N$ with final reduction of size $M$) is not optimal. For example, if the final reduction is relatively small compared to $N$, say $M=\frac{N}{\log_{2}(N)}=o(N)$, then $M\log_{2}(N)=N=\Omega(N)$ and we can’t say that the number of verifications is small compared to $N$. In particular, Jesse cannot deduce from his bound that Lithium’s algorithm is better than an approach based on $M$ binary search executions! In this blog post, I shall give the optimal bound for the monotonic case and formalize that in some sense the clustered reduction is near the best case. I’ll also compare Lithium’s algorithm with the binary search approach and with the ddmin algorithm. I shall explain that Lithium is the best in the monotonic case (or actually matches ddmin in that case).

Thus suppose that we are given a large testcase exhibiting an unwanted behavior. We want to find a smaller testcase exhibiting the same behavior, and one way is to isolate subtestcases that cannot be reduced any further. A testcase can be quite general, so here are basic definitions to formalize the problem a bit:

  • A testcase $S$ is a nonempty finite set of elements (lines, characters, tree nodes, user actions) exhibiting an “interesting” behavior (crash, hang and other bugs…)

  • A reduction $T$ of $S$ is a testcase $T\subseteq S$ with the same “interesting” behavior as $S$.

  • A testcase $S$ is minimal if $\forall T\subsetneq S,T\text{ is not a reduction of }S$.

Note that by definition, $S$ is a reduction of itself and $\emptyset$ is not a reduction of $S$. Also the relation “is a reduction of” is transitive, that is, a reduction of a reduction of $S$ is a reduction of $S$.
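
The definitions above can be made concrete with a small sketch. It assumes the “interesting” behavior is given as a black-box predicate on subsets; the names `make_oracle`, `is_reduction` and `is_minimal` are hypothetical and only serve the illustration. The oracle used here is the simplest possible one (a subset is interesting exactly when it contains a fixed set of required elements), which is an assumption of the example and not a property of real testcases.

```python
from itertools import combinations


def make_oracle(required):
    """Hypothetical oracle: a subset is interesting iff it contains `required`."""
    required = frozenset(required)
    return lambda subset: required <= frozenset(subset)


def is_reduction(t, s, interesting):
    """T is a reduction of S: a nonempty subset of S that is still interesting."""
    t, s = frozenset(t), frozenset(s)
    return bool(t) and t <= s and interesting(t)


def is_minimal(s, interesting):
    """Brute force: no nonempty proper subset of S is interesting.

    This enumerates up to 2^|S| - 2 subsets, so it is only usable on tiny sets.
    """
    elements = sorted(s)
    for size in range(1, len(elements)):
        for t in combinations(elements, size):
            if interesting(frozenset(t)):
                return False
    return True


oracle = make_oracle({2, 5})
print(is_reduction({2, 5, 7}, range(1, 9), oracle))  # {2, 5, 7} still contains {2, 5}
print(is_minimal({2, 5, 7}, oracle))                 # 7 can still be removed
print(is_minimal({2, 5}, oracle))
```

The brute-force check is there to show why testing minimality directly is too expensive: each call to `interesting` stands for one (slow) verification.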

We assume that verifying one subset to check if it has the “interesting” behavior is what takes the most time (think e.g. of testing a hang or user actions), so we want to minimize the number of testcases verified. Moreover, the original testcase $S$ is large, so a fast reduction algorithm should have a complexity in $o(|S|)$. Of course, we also expect to find a small reduction $T\subseteq S$, that is $|T|=o(|S|)$.

Without information on the structure of a given testcase $S$ or on the properties of the reduction $T$, we must consider the $2^{|S|}-2$ subsets $\emptyset\neq T\subsetneq S$ to find a minimal reduction. And we only know how to do that in $O\left(2^{|S|}\right)$ operations (or $O\left(2^{|S|/2}\right)$ with Grover’s algorithm ;-)). Similarly, even determining whether $T\subseteq S$ is minimal would require testing $2^{|T|}-2$ subsets, which is not necessarily $o(|S|)$ (e.g. $|T|=\log_{2}(|S|)=o(|S|)$). Hence we consider the following definitions:

  • For any integer $n\geq 1$, $S$ is $n$-minimal if $\forall T\subsetneq S,|S\setminus T|\leq n\implies T\text{ is not a reduction of }S$.

  • In particular, $S$ is $1$-minimal if $\forall x\in S,S\setminus\{x\}\text{ is not a reduction of }S$.

  • $S$ is monotonic if $\forall T_{1}\subseteq T_{2}\subseteq S,\,T_{1}\text{ is a reduction of }S\implies T_{2}\text{ is a reduction of }S$.

Finding an $n$-minimal reduction gives a testcase that is no longer interesting if we remove any portion of size at most $n$. Clearly, $S$ is minimal if it is $n$-minimal for all $n$. Moreover, for any $n\geq|S|$, $S$ is $n$-minimal if and only if it is minimal. We still need to test exponentially many subsets to find an $n$-minimal reduction. To decide whether $T\subseteq S$ is $n$-minimal, we need to consider the subsets obtained by removing portions of size $1,2,...,\min(n,|T|-1)$, that is $\sum_{k=1}^{\min(n,|T|-1)}\binom{|T|}{k}$ subsets. In particular, deciding whether $T$ is $1$-minimal takes $O(|T|)$ tests and so $o(|S|)$ if $|T|=o(|S|)$. If $S$ is monotonic then so is any reduction $T$ of $S$. Moreover, if $S$ is monotonic, $T\subsetneq S$ is a reduction of $S$ and $x\in S\setminus T$, then $T\subseteq S\setminus\{x\}\subsetneq S$ and so $S\setminus\{x\}$ is a reduction of $S$. Hence when $S$ is monotonic, $S$ is $1$-minimal if and only if it is minimal. We will target $1$-minimal reductions in what follows.
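
As an illustration of $n$-minimality, here is a minimal sketch of the check and of the number of subsets it may have to test. The function names and the toy oracle are hypothetical. The oracle is deliberately not monotonic (only $\{1,2,3\}$ and $\{1\}$ are interesting), to show a testcase that is $1$-minimal without being minimal.

```python
from itertools import combinations
from math import comb


def is_n_minimal(t, n, interesting):
    """Check that removing any 1..min(n, |T|-1) elements loses the behavior."""
    elements = frozenset(t)
    for removed_size in range(1, min(n, len(elements) - 1) + 1):
        for removed in combinations(sorted(elements), removed_size):
            if interesting(elements - frozenset(removed)):
                return False
    return True


def n_minimality_tests(size, n):
    """Number of subsets to test: sum of binomial(size, k) for k = 1..min(n, size-1)."""
    return sum(comb(size, k) for k in range(1, min(n, size - 1) + 1))


def toy_oracle(subset):
    """Hypothetical non-monotonic oracle: only {1, 2, 3} and {1} are interesting."""
    return frozenset(subset) in (frozenset({1, 2, 3}), frozenset({1}))


print(is_n_minimal({1, 2, 3}, 1, toy_oracle))  # no single element can be removed
print(is_n_minimal({1, 2, 3}, 2, toy_oracle))  # but removing {2, 3} leaves {1}
print(n_minimality_tests(5, 1), n_minimality_tests(5, 2))
```

With a monotonic oracle the two checks would agree, which is why targeting $1$-minimality costs only $|T|$ tests in that case.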

Let’s consider Lithium’s algorithm. We assume that $S$ is ordered and so can be identified with the interval $[1,|S|]$ (think for example of line numbers). For simplicity, let’s first assume that the size of the original testcase is a power of two, that is $|S|=N=2^{n}$. Lithium starts with $n-1$ steps $k=1,2,...,n-1$. At step $k$, we consider the chunks among the intervals $[1+j2^{n-k},(j+1)2^{n-k}]\ (0\leq j<2^{k})$ of size $2^{n-k}$. Lithium verifies whether removing each chunk provides a reduction. If so, it permanently removes that chunk and tries another chunk. Because $\emptyset$ is not a reduction of $S$, we immediately increment $k$ if only one chunk remains. The $n$-th step is the same, with chunks of size 1, but we stop only when we are sure that the current testcase $T$ is $1$-minimal, that is when after $|T|$ attempts we have not reduced $T$ any further. If $N$ is not a power of 2 then $2^{n-1}<N<2^{n}$ where $n=\lceil\log_{2}(N)\rceil$. In that case, we apply the same algorithm as for size $2^{n}$ (i.e. as if there were $2^{n}-N$ dummy elements at the end) except that we don’t need to remove the chunks that are entirely in that additional part. This saves at most $n$ tests (those that would be obtained by removing the dummy chunks at the end of sizes $2^{n-1},2^{n-2},...,1$). Hence in general if $C_{N}$ is the number of subsets of $S$ tried by Lithium, we have $C_{2^{n}}-n\leq C_{N}\leq C_{2^{n}}$. Let $M$ be the size of the $1$-minimal testcase found by Lithium and $m=\lceil\log_{2}(M)\rceil$.
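
The steps described above can be sketched as follows. This is a simplified model written for this analysis, not the code of the actual Lithium tool: `lithium_reduce` and `interesting` are hypothetical names, the testcase is identified with $\{1,...,N\}$, and the final round is modelled as a cyclic scan that stops after $|T|$ consecutive failed removals. The sketch also never tests the empty set, so its test count can differ by a small constant from the counts used in the text.

```python
from math import ceil, log2


def lithium_reduce(size, interesting):
    """Reduce the testcase {1, ..., size}; return (sorted reduction, number of tests)."""
    kept = set(range(1, size + 1))
    tests = 0
    n = ceil(log2(size)) if size > 1 else 0

    # Steps k = 1, ..., n-1: one pass over the aligned chunks of width 2^(n-k).
    for k in range(1, n):
        width = 2 ** (n - k)
        for start in range(1, size + 1, width):
            chunk = kept & set(range(start, start + width))
            if not chunk or chunk == kept:
                continue  # chunk already removed, or removing it would leave nothing
            tests += 1
            if interesting(kept - chunk):
                kept -= chunk

    # Step n: chunks of size 1, cycling until |T| consecutive removals have failed.
    order = sorted(kept)
    failures, i = 0, 0
    while len(kept) > 1 and failures < len(kept):
        x = order[i % len(order)]
        i += 1
        if x not in kept:
            continue
        tests += 1
        if interesting(kept - {x}):
            kept.remove(x)
            failures = 0
        else:
            failures += 1
    return sorted(kept), tests


def contains_123(subset):
    """Monotonic toy oracle: interesting iff the subset contains {1, 2, 3}."""
    return {1, 2, 3} <= set(subset)


reduction, tests = lithium_reduce(16, contains_123)
print(reduction, tests)  # the reduction is [1, 2, 3]
```

Counting calls to `interesting` rather than wall-clock time matches the cost model of the post, where one verification dominates everything else.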

Lithium will always perform the $n-1$ initial steps above and check at least one subset at each step. At the end, it needs to do $M$ operations to be sure that the testcase is $1$-minimal. So $C_{N}\geq\lceil\log_{2}(N)\rceil+M-1=\Omega(\log_{2}(N)+M)$. Now, consider the case where $S$ is monotonic and has one minimal reduction $T=[1,M]$. Then $T$ is included in the chunk $[1,2^{m}]$ from step $k=n-m$. Because $S$ is monotonic, this means that at step $k=1$, we do two verifications and the second chunk is removed because it does not contain $T$ (and the third one too if $N$ is not a power of two), at step $k=2$ two chunks remain, we do two verifications and the second chunk is removed, etc. until $k=n-m$. For $k>n-m$, the number of chunks can grow again: 2, 4, 8… that is, we handle at most $2^{1+k-(n-m)}$ chunks from step $n-m+1$ to $n-1$. At step $k=n$, a first round of at most $2^{m}$ verifications ensures that the testcase is of size $M$ and a second round of $M$ verifications ensures that it is $1$-minimal. So $C_{N}\leq 1+\left(\sum_{k=1}^{n-m}2\right)+\left(\sum_{k=n-m+1}^{n}2^{1+k-(n-m)}\right)+2^{m}+M=1+2(n-m)+(2^{m+2}-4)+2^{m}+M$ and, since $2^{m}<2M$, after simplification $C_{N}=O(\log_{2}(N)+M)$. Hence the lower bound $\Omega(\log_{2}(N)+M)$ is optimal.

The previous example suggests the following generalization: a testcase $T$ is $C$-clustered if it can be written as the union of $C$ nonempty closed intervals $T=I_{1}\cup I_{2}\cup...\cup I_{C}$. If the minimal testcase found by Lithium is $C$-clustered, each $I_{j}$ is of length at most $M\leq 2^{m}$ and so $I_{j}$ intersects at most 2 chunks of length $2^{m}$ from the step $k=n-m$. So $T$ intersects at most $2C$ chunks from the step $k=n-m$ and a fortiori from all the steps $k\leq n-m$. Suppose that $S$ is monotonic. Then if $c$ is a chunk that does not contain any element of $T$, the current testcase without $c$ still contains $T$, so it is a reduction and Lithium will remove the chunk $c$. Hence at each step $k\leq n-m$, at most $2C$ chunks survive and so there are at most $4C$ chunks at the next step. A computation similar to what we have done for $T=[1,M]$ shows that $C_{N}=O(C(\log_{2}(N)+M))$ if the final testcase found by Lithium is $C$-clustered. Note that we always have $M=o(N)$ and $\log_{2}(N)=o(N)$. So if $C=O(1)$ then $C_{N}=O(\log_{2}(N)+M)=o(N)$ is small as wanted. Also, the final testcase is always $M$-clustered (union of intervals that are singletons!) so we found that the monotonic case is $O(M(\log_{2}(N)+M))$. We shall give a better bound below.
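
To make the notion of a $C$-clustered testcase concrete, the following sketch counts the maximal intervals of a set of positions and the number of aligned chunks of a given width that it intersects. The names `count_clusters` and `chunks_hit` are hypothetical; the example only checks the claim that a $C$-clustered set of size $M$ meets at most $2C$ chunks of width $2^{m}$ on two hand-picked sets.

```python
from math import ceil, log2


def count_clusters(testcase):
    """Smallest C such that the set is a union of C intervals of consecutive integers."""
    elements = sorted(set(testcase))
    return sum(
        1 for i, x in enumerate(elements) if i == 0 or x != elements[i - 1] + 1
    )


def chunks_hit(testcase, width):
    """Number of aligned chunks [1 + j*width, (j+1)*width] containing an element."""
    return len({(x - 1) // width for x in testcase})


for t in ({3, 4, 5, 20, 21}, {7, 8, 9, 10}):
    m = ceil(log2(len(t)))
    print(sorted(t), count_clusters(t), chunks_hit(t, 2 ** m))
```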

Now consider the general case, without assuming monotonicity. For each step $k=1,2,...,n-1$, Lithium splits the testcase in at most $2^{k}$ chunks and tries to remove each chunk. Then, at the last step, it does at most $N$ verifications before stopping or removing one element (so the testcase becomes of size at most $N-1$), then it does at most $N-1$ verifications before stopping or removing one more element (so the testcase becomes of size at most $N-2$), …, then it does at most $M+1$ verifications before stopping or removing one more element (so the testcase becomes of size at most $M$). Then the testcase is exactly of size $M$ and Lithium does at most $M$ additional verifications. This gives $C_{N}\leq\sum_{k=1}^{n-1}2^{k}+\sum_{k=M}^{N}k=2^{n}-2+\frac{N(N+1)-M(M-1)}{2}=O(N^{2})$ verifications. This bound is optimal if $1\leq M\leq 2^{n-1}$ (this is asymptotically true since we assume $M=o(N)$): consider the case where the proper reductions of $S$ are exactly the segments $[1,k]\ (2^{n-1}+1\leq k\leq 2^{n}-1)$ and $[k,2^{n-1}+1]\ (2\leq k\leq 2^{n-1}-M+2)$. The testcase will be preserved during the first phase. Then we will keep browsing at least the first half to remove elements at position $2^{n-1}+2\leq k\leq 2^{n}-1$. So $C_{N}\geq\sum_{k=2^{n-1}+2}^{2^{n}-1}2^{n-1}=2^{n-1}\left(2^{n-1}-2\right)=\Omega(N^{2})$.

We now come back to the case where $S$ is monotonic. We will prove that the worst case is $C_{N}=\Theta\left(M\log_{2}(\frac{N}{M})\right)$ and so our assumption $M=o(N)$ gives $M\log_{2}(\frac{N}{M})=\left(-\frac{M}{N}\log_{2}(\frac{M}{N})\right)N=o(N)$ as we expected. During the steps $1\leq k\leq m$, we test at most $2^{k}$ chunks. When $k=m$, there are $2^{m}\geq M$ chunks but at most $M$ distinct chunks contain an element from the final reduction. By monotonicity, at most $M$ chunks will survive and there are at most $2M$ chunks at step $m+1$. Again, at most $M$ chunks will survive, so there are at most $2M$ chunks at step $m+2$, and so on until $k=n-1$. At the final step, at most $2M$ elements remain. Again by monotonicity, a first round of $2M$ tests will make $M$ elements survive and we finally need $M$ additional tests to ensure that the testcase is minimal. Hence $C_{N}\leq\sum_{k=1}^{m}2^{k}+\sum_{k=m+1}^{n}2M+M=2^{m+1}-2+M(2(n-m)+1)=O(M\log_{2}(\frac{N}{M}))$. This bound is optimal: if $M=2^{m}$, consider the case where $T=\{j2^{n-m}+1:0\leq j<2^{m}\}$ is the only minimal testcase (and $S$ monotonic); if $M$ is not a power of two, consider the same $T$ with $2^{m}-M$ points removed at odd positions. Then for each step $1\leq k\leq m-1$, no chunks inside $[1,2^{n}]$ are removed. Then some chunks in $[1,2^{n}]$ are removed (none if $M$ is a power of two) at step $m$ and $M$ chunks remain. Then for steps $m+1\leq k\leq n-1$ there are always exactly $2M$ chunks to handle. So $C_{N}\geq\sum_{m+1\leq k\leq n-1}2M=2M(n-m-1)\geq 2M(\log_{2}(\frac{N}{M})-2)=\Omega(M\log_{2}(\frac{N}{M}))$.
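
A quick numerical illustration of why the refined bound matters, using the example $M=\frac{N}{\log_{2}(N)}$ from the introduction. The helper names are hypothetical and the constants hidden in the $O(\cdot)$ notation are ignored, so only the trend of the ratios is meaningful.

```python
from math import log2


def m_log_n(n_items, m_items):
    """The earlier bound M * log2(N), constants ignored."""
    return m_items * log2(n_items)


def m_log_n_over_m(n_items, m_items):
    """The refined bound M * log2(N / M), constants ignored."""
    return m_items * log2(n_items / m_items)


for exponent in (10, 20, 40):
    size = 2 ** exponent
    reduced = size / log2(size)  # M = N / log2(N), which is o(N)
    # First ratio stays at 1; second equals log2(log2(N)) / log2(N) and tends to 0.
    print(exponent, m_log_n(size, reduced) / size, m_log_n_over_m(size, reduced) / size)
```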

We note that we have used two different methods to bound the number of verifications in the general monotonic case, or when the testcase is $C$-clustered. One naturally wonders what happens when we combine the two techniques. So let $c=\lceil\log_{2}C\rceil\leq m$. From step $1$ to $c$, the best bound we found was $O(2^{k})$; from step $c$ to $m$, it was $O(C)$; from step $m$ to $n-m$ it was $O(C)$ again; from step $n-m$ to $n-c$, it was $O(2^{1+k-(n-m)}C)$ and finally from step $n-c$ to $n$, including final verifications, it was $O(M)$. Taking the sum, we get $C_{N}=O(2^{c}+((n-m)-c)C+2^{(n-c-(n-m))}C+(n-(n-c))M)=O\left(C\left(1+\log_{2}\left(\frac{N}{MC}\right)\right)+M(1+\log_{2}C)\right)$. Because $C=O(M)$, this becomes $C_{N}=O\left(C\log_{2}\left(\frac{N}{MC}\right)+M(1+\log_{2}C)\right)$. If $C=O(1)$, then we get $C_{N}=O(\log_{2}(N)+M)$. At the opposite extreme, if $C=\Theta(M)$, we recover $C_{N}=O\left(M\log_{2}\left(\frac{N}{M}\right)\right)$. If $C$ is not $O(1)$ but $C=o(M)$ then $1=o(\log_{2}C)$ and $C\log_{2}(C)=o(M\log_{2}C)$ and so the expression can be simplified to $C_{N}=O\left(C\log_{2}\left(\frac{N}{M}\right)+M\log_{2}C\right)$. Hence we have obtained an intermediate result between the worst and best monotonic cases and shown the role played by the number of clusters: the fewer clusters the final testcase has, the faster Lithium finds it. The results are summarized in the following table:

| Case | Number of tests |
| --- | --- |
| Best case | $\Theta(\log_{2}(N)+M)$ |
| $S$ is monotonic ; $T$ is $O(1)$-clustered | $O(\log_{2}(N)+M)$ |
| $S$ is monotonic ; $T$ is $C$-clustered ($C=o(M)$ and unbounded) | $O\left(C\log_{2}\left(\frac{N}{M}\right)+M\log_{2}C\right)$ |
| $S$ is monotonic | $O\left(M\log_{2}\left(\frac{N}{M}\right)\right)$ ; $o(N)$ |
| Worst case | $\Theta(N^{2})$ |

Figure 0.1: Performance of Lithium’s algorithm for some initial testcase $S$ of size $N$ and final reduction $T$ of size $M=o(N)$. $T$ is $C$-clustered if it is the union of $C$ intervals.

In the ddmin algorithm, at each step we add a preliminary round where we try to immediately reduce to a single chunk (or equivalently to remove the complement of a chunk). Actually, the ddmin algorithm only does this preliminary round at steps where there are more than 2 chunks, for otherwise it would do the same work twice. For each step $k>1$, if one chunk $c_{1}$ is a reduction of $S$ then $c_{1}\subseteq c_{2}$ for some chunk $c_{2}$ at the previous step $k-1$. Now if $S$ is monotonic then, at level $k-1$, removing all but the chunk $c_{2}$ gives a subset that contains $c_{1}$ and so a reduction of $S$ by monotonicity. Hence only one chunk survives at level $k-1$, there are exactly 2 chunks at level $k$, and so the ddmin algorithm is exactly Lithium’s algorithm when $S$ is monotonic. The ddmin algorithm keeps in memory the subsets that we didn’t find interesting in order to avoid repeating them. However, if we only reduce to the complement of a chunk, then we can never repeat the same subset and so this additional work is useless. That’s the case if $S$ is monotonic.

Finally, if $S$ is monotonic Jesse proposes a simpler approach based on a binary search. Suppose first that there is only one minimal testcase $T$. If $k\geq\min T$ then $[1,k]$ intersects $T$ and so $U_{k}=S\setminus[1,k]\not\supseteq T$. Then $U_{k}$ is not a reduction of $S$, for otherwise a minimal reduction of $U_{k}$ would be a minimal reduction of $S$ distinct from $T$, which we exclude by hypothesis. If instead $k<\min T$ then $[1,k]$ does not intersect $T$ and $U_{k}=S\setminus[1,k]\supseteq T\setminus[1,k]=T$ is a reduction of $S$ because $S$ is monotonic. So we can use a binary search to find $\min T$ by testing at most $\log_{2}(N)$ testcases (modulo some constant). Then we try with intervals $[1+\min T,k]$ to find the second least element of $T$ in at most $\log_{2}(N)$ tests. We continue until we find the $M$-th element of $T$. Clearly, this gives $M\log_{2}(N)$ verifications, which sounds equivalent to Jesse’s bound with even a better constant factor. Note that the algorithm still works if we remove the assumption that there is only one minimal testcase $T$. We start with $S=S_{1}$ and find $x_{1}=\max\{\min T:T\text{ is a minimal reduction of }S_{1}\}$: if $k<x_{1}$ then $S\setminus[1,k]$ contains at least one minimal reduction with least element $x_{1}$ and so is a reduction because $S$ is monotonic. If $k\geq x_{1}$ then $S\setminus[1,k]$ is not a reduction of $S$, or a minimal reduction of $S\setminus[1,k]$ would be a minimal reduction of $S$ whose least element is greater than $x_{1}$. So $S_{2}=S_{1}\setminus[1,x_{1}-1]$ is a reduction of $S_{1}$. The algorithm continues to find $x_{2}=\max\{\min(T\setminus\{x_{1}\}):T\text{ is a minimal reduction of }S_{2},\min(T)=x_{1}\}$ etc. and finally returns a minimal reduction of $S$. However, it is not clear that this approach can work if $S$ is not monotonic, while we can hope that Lithium is still efficient if $S$ is “almost” monotonic. We remark that when there is only one minimal testcase $T=[1,M]$, the binary search approach would require something like $\sum_{k=0}^{M-1}\log_{2}(N-k)=M\log_{2}(N)+\sum_{k=0}^{M-1}\log_{2}\left(1-\frac{k}{N}\right)\geq M\log_{2}(N)+M\log_{2}\left(1-\frac{M}{N}\right)=M\log_{2}(N)+o(M)$. So that would be the worst case of the binary search approach whereas Lithium handles this case very nicely in $\Theta(\log_{2}(N)+M)$! In general, if there is only one minimal testcase $T$ of size $M\leq\frac{N}{2}$ then $\max(T)$ can be anywhere in $[M,N]$ and if $T$ is placed at random, $\max(T)\leq\frac{3}{4}N$ with probability at least $\frac{1}{2}$. So the average complexity of the binary search approach in that case will be at least $\frac{1}{2}\sum_{k=0}^{M-1}\log_{2}(\frac{1}{4}N)=\frac{1}{2}M\log_{2}(N)-M$, which is still not as good as Lithium’s optimal worst case of $O(M\log_{2}(\frac{N}{M}))$.
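
For comparison, here is a minimal sketch of the binary search approach in the simplest setting discussed above: $S=\{1,...,N\}$ is monotonic and has a single minimal testcase. The function name and the stopping rule (testing whether the elements found so far already form a reduction) are assumptions made for the sketch; they are not taken from Jesse's description.

```python
def binary_search_reduce(size, interesting):
    """Find the elements of the minimal testcase one by one; return (elements, tests).

    Assumes {1, ..., size} is interesting and monotonic with one minimal testcase.
    """
    found = []
    last = 0  # invariant: found + (last, size] is interesting
    tests = 0
    while True:
        lo, hi = last, size
        # Largest k in [last, size] such that dropping (last, k] stays interesting.
        while lo < hi:
            mid = (lo + hi + 1) // 2
            tests += 1
            if interesting(set(found) | set(range(mid + 1, size + 1))):
                lo = mid
            else:
                hi = mid - 1
        if lo == size:
            return found, tests  # everything after `last` was removable
        last = lo + 1  # lo + 1 cannot be removed: it belongs to the minimal testcase
        found.append(last)


elements, tests = binary_search_reduce(8, lambda subset: {3, 7} <= set(subset))
print(elements, tests)  # the elements found are [3, 7]
```

Each element of the minimal testcase costs one binary search over the remaining positions, which is where the $M\log_{2}(N)$ count comes from.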
