On the word topology, and beyond

Given a quasi-ordered set (X, ≤), the set of finite words over X is built as X^* ≜ ⋃_{0 ≤ k < ∞} X^k. (We will use ≜ for equality by definition.)

This set can be equipped with many different quasi-orders, but it appears that often the right notion is the one relying on subwords, that is, a word w is smaller than a word w′ if there exists a map h: |w|→|w′| such that ∀1 ≤ k ≤ |w|,w_k ≤ w_h(k)′. This order is called the subword ordering.

Allow us to recall that in a quasi-ordered set (X, ≤) an infinite sequence (x_i)_i ∈ I is good, whenever there exists i < j such that x_i ≤ x_j. A set (X, ≤) is a well quasi-order whenever every infinite sequence is good. Instances of well quasi-orders occur naturally in computer science and mathematics [6] as a natural generalization of well-founded orderings.

The first lemma we are interested in this post is due to Higman [5] and states that if (X, ≤) is a well quasi-order, then (X^*, ≤^*) is, too.

Now, if you have been reading this blog long enough, or if you have just read the title, you might wonder where topology is going to strike in. A good first guess would be to notice that a quasi-ordered space (X, ≤) always provides us with many topologies, among them we can find the upper topology, the Alexandroff topology or even the Scott topology. The specialization preordering of a topology τ, written ≤_τ, is obtained as x≤_τy ⇔ ∀U ∈ τ, x ∈ U ⟹ y ∈ U. The one thing that we expect when going from a preorder ≤ to a topology τ in order to be called a ‘generalization’, is that ≤_τ = ≤.

One can generalize many results from the theory of well quasi-orderings to the topological setting, see [1,2]. And the topological notion that corresponds to well quasi-orders is that of a Noetherian space, that is a space in which all subsets are compact.

We now have enough background to ask the real question: How can we generalize Higman’s Lemma? That is, how can we build a Noetherian space (X^*, τ^*) from a Noetherian space (X, τ)?

This intriguing question has already been answered in [1]… and, as a consequence, we lied when we told you that this would be our goal this month.

The real objective of this post is to explain why there might be very different answers, and why the other constructions are wrong. We will first study the relatively simple case of finite words, then we will move on to infinite words. This will lead us to a conjecture, and to a very brief comment on transfinite words.

The word topology

Let us see what the current proposition for a topology over X^* is, which would allow us to generalize Higman’s Lemma. We will observe a few key properties and reconstruct this topology using a perhaps more natural inductive construction. From this point on, every definition is going to be taken from the book [3].

Definition [3, def 9.7.26]. For every topological space X, the word topology on X^* is the coarsest one containing the subsets X^*U₁X^*U₂X^*…X^*U_nX^* as open subsets, where n ∈ ℕ, and U₁, U₂, …, U_n are open subsets of X. We write X^* for the space of all finite words over X with this topology.

Now, the adaptation of Higman’s Lemma works as you expect: when X is a Noetherian space, X^* is Noetherian as well [3, thm 9.7.33].

In order to better understand this topology, we will introduce a notation for the basic sets that generate the word topology. We write [U₁,…,U_n] for the set X^*U₁X^*…X^*U_nX^*, and therefore the word topology on X^* is generated by the set {[U] ∣ U ∈ (τ)^*}. Notice that we write (τ)^* to denote sequences of elements of τ, and not open subsets of X^*.

Using this notation, the resemblance between the construction on the points and the construction on the topology is uncanny: we are basically doing the same thing except that we have to intersperse X^* between the letters in the case of the topology. However, let us now instill doubt in your mind that this may be a completely arbitrary choice and others could work.

If we only want an equivalent of Higman’s Lemma, we could choose that τ^* is always defined as the indiscrete topology, which is Noetherian.
If we also want that the specialisation preorder ≤_τ^* of τ^* to coincide with (≤_τ)^*, we also have a lot of choices. For instance, we can consider the Alexandroff topology over (≤_τ)^*, although this one may fail to be Noetherian (it will be if and only if ≤_τ is a well quasi-ordering).

One might argue that given two topologies that are Noetherian and have the same specialization preordering, their join is also Noetherian (oh yes! see the Appendix), and possesses the same specialization preordering (this you can certainly prove for yourselves, right?). Therefore the family of topologies of interest to us is directed. However, there might not be a single finest topology that is both Noetherian and possesses the right specialization preordering, leaving us with no canonical choice of topology.

Constructors will not do

A good way of finding a canonical topology on a set is to insist that it is the coarsest (or the finest) one that satisfies a few properties.

Now finite words over X are simply finite lists of letters in X, and lists over X are a data type (in computer science parlance), with two constructors:

creating a list with a single element ι : X → X^*,
and concatenating two lists • : X^* × X^* → X^*.

By the way, those two maps are continuous if you equip X^* with the word topology: see Exercise 9.7.27 in the book [3], where they are called i and cat respectively.

But the word topology is certainly not the coarsest one on X^* such that ι and • are continuous: instead, this is the indiscrete topology, which trivially makes those two operations continuous.

It is not the finest one that makes ι and • continuous either. For example, if the topology of X is discrete, then the discrete topology on X^* makes both ι and • continuous. But the word topology is not the discrete topology, even in that case. We elucidate what it is as follows. The discrete topology on X is also the Alexandroff topology on the equality ordering =. Using Exercise 9.7.30 of the book [3], the word topology on X^* is then the Alexandroff topology of the subword ordering (=)^*: explicitly, u (=)^* v if and only if one can obtain the word v from u by inserting arbitrarily many letters from X at arbitrary positions in u. Note that (=)^* is certainly not equality on words. In fact, if X is finite, (=)^* is even a well quasi-ordering on X^*, by Higman’s Lemma, while equality on X^* definitely is not.

In general, we do not even know whether there is a finest topology on X^* that makes ι and • continuous. That suggests that the word topology on X^* is perhaps somewhat arbitrary… or that characterizing the word topology on X^* through constructors is a bad idea.

Canonicality through destructors

Instead, we look at destructors. While constructors allow you to build new elements of type X^*, destructors give you ways of examining the structure of elements of X^*, and of decomposing them into smaller parts.

Assume that (X, θ) is a topological space. The two destructors we consider are the following functions:

supp : X^* → P(X), where P(X) is the set of subsets of X, endowed with the lower Vietoris topology; supp computes the support of a word w, namely the set of letters that occur in w;
and split : X^* → P(X^* × X^*), where P(X^* × X^*) is endowed with the lower Vietoris topology over the product topology; split computes the set of all the decompositions of a word w: split(w) ≜ {(u, v) ∈ X^* × X^* ∣ uv = w}.

The us recall that the lower Vietoris topology on P(X) has a subbase of open sets of the form ◇U ≜ {F ∈ P(X) ∣ F ∩ U ≠ ∅}, where U ranges over the open subsets of X.

Proposition. The word topology θ^* is the coarsest topology on X^* that make supp and split continuous.

Proof. Let us first show that supp and split are continuous with respect to the word topology. We consider an arbitrary subbasic open subset ◇U of P(X), where U is open in X.

For supp, it suffices to notice that supp⁻¹(◇U) = {w ∈ X^* ∣ some letter of w is in U} = [U], and that is in θ^*.

Similarly, we consider an arbitrary subbasic open subset ◇(U × V) of P(X^* × X^*), with U, V ∈ θ^* themselves basic open sets, namely U=[U₁,…,U_m] and V=[V₁,…,V_n]. Then split⁻¹(◇(U × V)) = {w ∈ X^* ∣ ∃(u, v) ∈ X^* × X^*, uv = w and u ∈ U and v ∈ V}=[U₁,…,U_m,V₁,…,V_n] (an open set that we might simply write as UV), and this is in θ^*.

Therefore, both supp and split are continuous with respect to θ^*.

In the converse direction, let us assume that τ is a topology on X^* such that supp and split are continuous. We claim that θ^* ⊆ τ.

Let us first show that [U] ∈ τ for every U ∈ θ. Because supp is continuous, supp⁻¹(◇U) is an open set in τ, and we have calculated that supp⁻¹(◇U)=[U]. Therefore, every set [U] with U ∈ θ is in τ.

Because split is continuous, for all U ∈ τ and V ∈ τ, the set UV=split⁻¹(◇(U × V)) is also in τ. This implies that every set of the form [U₁][U₂]…[U_n] is also in τ, for every non-empty sequence (n≥1) of open subsets U₁, U₂, …, U_n of X. [U₁][U₂]…[U_n] is just the open set [U₁, U₂, …, U_n] when n≥1. When n=0, [U₁, U₂, …, U_n] is simply the whole space X^*, which must be in τ as well. Hence τ contains all the subbasic open sets of θ^*, proving that θ^* ⊆ τ. ☐

This gives a partial answer as to why this topology is of particular interest: not only does it provide us with an endfunctor X ↦ X^*on Top that preserves Noetherian spaces, but the topology can be constructed in a uniform fashion through the use of supp and split .

There is still a number of arbitrary decisions that are involved here: why have we chosen those particular destructors? why have we chose the lower Vietoris topology on sets of subsets? why the product topology on X^* × X^*? While the lower Vietoris and product topologies are fairly off-the-shelf choices, the choice of supp and split may seem arbitrary. However, we will see that similar choices extend well to the cases of infinite words.

The case of infinite words

Let us call infinite word over X any sequence w of elements w₀, w₁, …, w_i, … of X indexed by natural numbers i. By contrast, a finite word, namely an element of X^*, is a sequence of elements of X indexed by elements of a bounded interval [1,…,n] of natural numbers.

Let X^ω be the set of all infinite words over X, and X^≤ω = X^ω ∪ X^* be the set of finite-or-infinite words over X. In a paper submitted but not yet published [4], one of us suggested to use the following topology on those sets.

Definition. [4, def 11]. Given a topological space (X, τ) the asymptotic subword topology τ^≤ω over X^≤ω is generated by the sets:

⟨U₁, …, U_n ∣ U_∞⟩ ≜ X^*U₁X^*U₂X^*⋯X^*U_n(X^*U_∞X^*)^ω, and
[U₁, …, U_n] ≜ X^*U₁X^*U₂X^*⋯X^*U_nX^≤ω,

where n ∈ ℕ and the sets U₁, …, U_n, U_∞ are all open sets of X.

The sets of the form [U₁, …, U_n] are written as in the case of finite words, but also contain infinite words: [U₁, …, U_n] is the sets of finite or infinite words that contain a letter from U₁, followed by arbitrary many (but finitely many) letters, and then eventually a letter from U₂, then some arbitrary finite number of letters, …, then eventually a letter from U_n, and finally all remaining letters (finitely or infinitely many of them). The words in ⟨U₁, …, U_n ∣ U_∞⟩ are defined similarly, except that they must all be infinite, and that we will find infinitely many letters from U_∞ in them. Note that ⟨U₁, …, U_n ∣ X⟩ is different from [U₁, …, U_n], since the former only contain infinite words.

One of the theorems of [4] is that, for every Noetherian space (X, τ), the space (X^≤ω, τ^≤ω) is also Noetherian. Let us not give the proof of that here (sorry!). As in the case of finite words, we will see that the asymptotic subword topology is also canonically described by destructors.

First, we introduce the natural generalization of the functions supp and split to infinite words. We define the support of an infinite word as supp : w ↦ {w_i ∣ i ∈ ℕ} and the splitting as split : w ↦ {(u, v) ∈ X^* × X^≤ω ∣ uv = w}.

Those two destructors will not be enough to recover the whole topology τ^≤ω. We need another one, which will detect recurring behaviors in infinite words.

The idea is as follows. In order to convey the intuition more clearly, let us start with the case where X is a finite set, with the discrete topology. We say that a letter a is recurring in an infinite word w ∈ X^ω if and only if it occurs infinitely often in w. In formulae, that can be expressed by saying that for every i ∈ ℕ, there is a j≥i such that a=w_j. Yet another, equivalent, form is that a is in ∩_{i ∈ ℕ} supp(w_≥i), where w_≥i is the suffix of w starting at index i.

Note that, since we have assumed X to be finite (for now), ∩_{i ∈ ℕ} supp(w_≥i) is an intersection of finite sets. It is even an intersection of a non-increasing chain of finite sets, so this chain stabilizes: there is an index i₀ such that ∩_{i ∈ ℕ} supp(w_≥i) is equal to supp(w_≥i) for any i≥i₀.

A similar thing happens in the more general case where X is Noetherian. Let us define recur(w) as ∩_{i ∈ ℕ} cl(supp(w_≥i)), where cl() denotes closure in X. We have a similar property: recur(w) is the intersection of a non-increasing chain of closed subsets of X. Hence, since X is Noetherian, the chain stabilizes: there is an index i₀ such that ∩_{i ∈ ℕ} cl(supp(w_≥i)) is equal to cl(supp(w_≥i)) for any i≥i₀.

This turns out to be important in the study of the asymptotic subword topology on X^≤ω [4]. However, let us return to the main point of this post: just like with finite words, we have the following characterization of the infinite subword topology in terms of destructors.

Proposition. The asymptotic subword topology τ^≤ω is the coarsest topology on X^≤ω that makes supp : X^≤ω → P(X), split : X^≤ω → P(X^* × X^≤ω) and recur : X^≤ω → P(X) continuous.

Proof. As in the case of finite words, supp⁻¹(◇U) = {w ∈ X^≤ω ∣ some letter of w is in U} = [U], and that is in τ^≤ω. Given an arbitrary subbasic open subset ◇(U × V) of P(X^* × X^≤ω), with U, V ∈ θ^* themselves basic open sets, namely U=[U₁,…,U_m] and V=[V₁,…,V_n] (resp., or V=⟨V₁, …, V_n ∣ V_∞⟩). Then split⁻¹(◇(U × V)) = {w ∈ X^* ∣ ∃(u, v) ∈ X^* × X^*, uv = w and u ∈ U and v ∈ V}=[U₁,…,U_m,V₁,…,V_n], resp. ⟨U₁,…,U_m,V₁,…,V_n ∣ V_∞⟩ (an open set that we will simply write as UV), and this is in τ^≤ω. Hence both supp and split are continuous with respect to the asymptotic infinite subword topology.

In order to show that recur is continuous, then, we observe that recur⁻¹(◇U) is the set of finite or infinite words w over X such that, for every i ∈ ℕ, cl(supp(w_≥i)) intersects U; the latter is equivalent to requiring that supp(w_≥i) intersects U, namely to w_≥i ∈ supp⁻¹(◇U) = [U]; simplifying further, the latter means that w contains a letter in U at some position ≥i. Since i is arbitrary, w is in recur⁻¹(◇U) if and only if infinitely many letters of w are in U. Therefore recur⁻¹(◇U) = ⟨ ∣ U⟩, which is in τ^≤ω.

In the converse direction, let us assume that τ is a topology on X^≤ω such that supp, split and recur are continuous. We claim that τ^≤ω ⊆ τ. Since supp is continuous, supp⁻¹(◇U)=[U] is an open set in τ for every open subset U of X, as in the case of finite words. Now we recall the formula split⁻¹(◇(U × V)) = UV. Relying on the fact that split is continuous, for every finite sequence of open subsets U₁,…,U_m of X (with m≥1), we successively show that [U_m] is in τ (as we have just seen), that [U_m_–1, U_m] = split⁻¹(◇([U_m_–1] × [U_m])) is also in τ, …, and finally that [U₁,…,U_m] = split⁻¹(◇([U₁] × [U₂,…,U_m])) is in τ. This also works when m=0, since in that case [U₁,…,U_m] = X^≤ω must also be in τ.

We repeat the same argument using recur instead of supp. Since recur is continuous, recur⁻¹(◇U) = ⟨ ∣ U⟩ is in τ for every open subset U of X. Then, using the continuity of split as above, for every finite sequence of open subsets U₁,…,U_m of X, ⟨U_m ∣ U⟩ = split⁻¹(◇([U_m_–1] × ⟨ ∣ U⟩)), …, ⟨U₁, …, U_n ∣ U⟩ = split⁻¹(◇([U₁] × ⟨U₂, …, U_n ∣ U⟩)) are all in τ. ☐

Some food for thought

We see a pattern emerge here. In each case, we have a space Y, either X^* or X^≤ω, and its topology is the coarsest that makes a certain, finite family of maps f_i : Y → F_i(X,Y) continuous (1≤i≤n), where F_i(X,_) is an endofunctor on Top that restricts to an endofunctor on the full subcategory Noeth of Noetherian spaces.

Explicitly, in the case of finite and infinite words, we had n=3 such functions:

f₁ is supp, and the functor F₁(X,_) maps any space Y to P(X). In other words, F₁(X,Y)=P(X) (a space that does not depend on Y, only on X; in particular, F₁(X,_) maps every morphism f in Top to the identity map on P(X)). This is an endofunctor on Noeth as soon as X is Noetherian. This is something remarkable: if X is Noetherian, then so is P(X) in its lower Vietoris topology [1, Prop. 7.3]. (See also Exercise 9.7.14 in [3], and realize that P(X) and the Hoare powerspace of X have isomorphic lattices of open sets. Or see the Appendix!)
f₂ is split, and the functor F₂(X,_) maps any space Y to P(X^* × Y). Note that F₂ is also an endofunctor on Top that restricts to one on Noeth, as soon as X is Noetherian. Indeed, X^* is then Noetherian by the Topological Higman Lemma (Theorem 9.7.33 in [3]), binary products preserve Noetherianness (Proposition 9.7.18, item~(vii) of [3]), and P preserves Noetherianness as well, as we have just recalled.
f₃ is recur, and the functor F₃(X,_) maps any space Y to P(X).

The functors F_i(X,_) have an additional property: there is a subbase of the topology of F_i(X,Y) whose elements are defined by expressions involving finite combinations of open subsets (of X and) of Y. For example, in the case of P(X^* × Y), that subbase consists of sets of the form ♢([U₁,…,U_m] × V), which involves only finitely many open sets U₁,…,U_m and V. The correct generalization of this property to an abstract setting is probably some form of continuity requirement of the functors F_i(X,_). Let us suggest that the correct notion may be the following slightly weaker form of continuity: if τ is the join of a directed family of topologies τ_j on the same set Y (when j varies), then the topology of F_i(X,(Y,τ)) is the join of the topologies of F_i(X,(Y,τ_j)) when j varies—this, for every i. Let me call that fiberwise continuity.

That leads to the following conjecture.

Conjecture. Let F₁(X,_), …, F_n(X,_) be finitely many endofunctors on Top that restrict to fiberwise continuous endofunctors on Noeth, provided that X is Noetherian. Let f₁, …, f_n be maps from a set Y to F₁(X,Y), …, F_n(X,Y), respectively. With the coarsest topology that makes the latter maps continuous, Y is Noetherian.

If the conjecture is wrong, then one should find conditions under which it holds. The technology of minimal bad sequences of basic open subsets used in the proof of the Topological Kruskal Theorem (Theorem 9.7.46 in [3]) seems to adapt well to this case. In fact, the notion of fiberwise continuity was invented in order to mimic the latter proof. (This actually works, under an additional assumption that unfortunately seems less natural to us, at the moment.)

Going even further, with Simon Halfon, we have looked at similar Noetherian topologies for transfinite words on X, namely sequences of elements of X indexed by ordinals. (We are waiting for [4] to be accepted before we submit this one. We really look at the set of transfinite words of length ≤α, for some indecomposable ordinal α.) Without going into details, if we are to play the same game of defining the intended topology through destructors, it turns out that in addition to a natural generalization of split, we need infinitely many functions supp_β, one for each ordinal β<α that is either indecomposable or successor of an indecomposable, and which maps every transfinite word w to the intersection of the closed sets cl(supp(w_≥γ)), where γ ranges over the ordinals < β: when β=1, this yields the function supp, and when β=ω, this yields the function recur, for words of length ≤ω (our so-called infinite words). For words of larger transfinite length, we will also have functions supp_ω+1, supp_ω.2, supp_ω.2+1, etc.

That is leading us too far. Let us just say that this case of transfinite words is already out of reach of the proposed conjecture: for α large enough, the topology will be defined by infinitely many destructors.

Appendix

We have mentioned the following in passing.

Lemma. The join of finitely many Noetherian topologies on a set X is Noetherian.

Proof. There is a neat, quick proof. Let τ₁, …, τ_n be finitely many Noetherian topologies on X, and τ be their join. The finite product of the spaces (X, τ_i), 1≤i≤n, is Noetherian, because finite products of Noetherian spaces are Noetherian (Proposition 9.7.18, item~(vii) of [3]). Now it is easy to see that the diagonal map x ↦ (x, …, x) is a topological embedding from (X, τ) into Π_i=1ⁿ (X, τ_i), so (X, τ) is homeomorphic to a subspace of a Noetherian space, and is therefore also Noetherian (by Proposition 9.7.18, item~(iii) of [3]). ☐

… And also the following.

Lemma. For every Noetherian space X, P(X) is Noetherian in its lower Vietoris topology.

Proof. There are many proofs of this, but let us just give one that assumes as little knowledge about Noetherian spaces as possible. Let A be any subset of P(X). We claim that A is compact in the lower Vietoris topology. In order to show this, let us use Alexander’s subbase Lemma (Theorem 4.4.29 in [3]), and let us consider any covering of A by subbasic open sets ♢U_i, i ∈ I. Every subset E of X that is a member of A must therefore intersect some U_i. Let us pick some element x_E of E that is in some U_i, and let us form the set D ≜ {x_E | E ∈ A}. Since X is Noetherian, D is compact. By construction, D is included in ∪_i_∈_I U_i. We extract a finite subcover from the latter: there is a finite subset J of I such that D is included in ∪_j_∈_J U_j. In particular, for every E in A, x_E is in U_j for some j in J. Since x_E is also in E, E intersects U_j. In other words, E is in ♢U_j. Since E is arbitrary in A, the sets ♢U_j with j in J form a finite subcover of our original cover. Hence A is compact. ☐

Jean Goubault-Larrecq. On Noetherian Spaces. In 22nd Annual IEEE Symposium on Logic in Computer Science (LICS 2007), 2007, pages 453–62. IEEE.
———. Noetherian Spaces in Verification. In International Colloquium on Automata, Languages, and Programming, 2010, pages 2–21. Springer. doi:10.1007/978-3-642-14162-1_2.
———. Non-Hausdorff Topology and Domain Theory: Selected Topics in Point-Set Topology, 2013. Vol. 22. Cambridge University Press.
———. Infinitary Noetherian Constructions I. Infinite Words. Submitted to Colloquium Mathematicum, 2019.
Graham Higman. Ordering by Divisibility in Abstract Algebras. Proceedings of the London Mathematical Society s3–2(1), 1952, pages 326–336. doi:10.1112/plms/s3-2.1.326.
Joseph Bernard Kruskal. The Theory of Well-Quasi-Ordering: A Frequently Discovered Concept. Journal of Combinatorics A 13(3), 1972, pages 297–305. doi:10.1016/0097-3165(72)90063-5.
Maurice Pouzet. Un bel ordre d’abritement et ses rapports avec les bornes d’une multirelation. Comptes rendus de l’académie des sciences de Paris, série A, 274, 1972, pages 1677–1680.

— Aliaume Lopez and Jean Goubault-Larrecq (September 20th, 2020)

The word topology

Constructors will not do

Canonicality through destructors

The case of infinite words

Some food for thought

Appendix

Recent Posts

Recent Comments

Archives

Meta