Bariz - Problem

daniel73

12.07.2012 10:00

Solution:

We may assume wlog that $N\leq2^{n+1}$, since the following solution shows that if $m$ is the largest integer such that $2^{m+1}\geq N>2^m$, then there are at least $m$ ubiquitous words. Note that if a word $U$ of length $\ell$ appears more than $2^k$ times in a period, then by the pigeonhole principle, either $Ua$ or $Ub$ of length $\ell+1$ appear more than $2^{k-1}$ times. If a word $U$ of length $\ell\geq N-k$ appears $t\geq2^k$ times, after trivial downward induction on $k$, there is a word of length $\geq N$ that appears more than $2^0=1$ times, or indices $1\leq i<j\leq N$ exist such that $x_i+d=x_j+d$ for $d=0,1,2,\dots,N-1$, and $W$ would have period $j-i<N$, contradiction. Hence a word that appears more than $2^k$ times has length at most $N-k-1$. We say that a word is a $k$-champion if it appears more than $2^k$ times, it has length $\ell$, and any word that appears more than $2^k$ times has length at most $\ell$. Clearly either $a$ or $b$ appear, by the pigeonhole principle, more than $2^{n-1}$ times in a period, or the set of words that appear more than $2^k$ times exists for each $k=0,1,\dots,n-1$. Given the set of words $U$ that appear more than $2^k$ times and have length $\ell$, either $Ua,Ub,aU,bU$ each appear at most $2^k$ times, and the given set would be the set of $k$-champions, or at least one of the words $Ua,Ub,aU,bU$ appears more than $2^k$ times, and we can start again, this time with the words that appear more than $2^k$ times and have length $\ell+1$. The process cannot continue indefinitely because the words that appear more than $2^k$ times have length at most $N-k-1$, or $k$-champions indeed exist for $k=0,1,2,\dots,n-1$. Assume that $U$ is a $k$-champion. If it appears more than $2^{k+1}$ times, then by the pigeonhole at least one of $Ua,Ub$ would appear more than $2^k$ times, in contradiction with $U$ being a $k$-champion. So a $k$-champion cannot appear more than $2^{k+1}$ times, and it cannot be a $k'$-champion for any $k'>k$. Conversely, no $k$-champion can be a $k'$-champion for any $k'<k$, otherwise a $k'$-champion would be a $k$-champion for $k>k'$, in contradiction with the previous result. It follows that there are at least $n$ distinct words that are $k$-champions, each one of them for $k=1,2,\dots,n-1$. Finally, each $k$-champion is ubiquitous; indeed, if $aU$ does not appear in a period of $W$ for a given $k$-champion $U$ of length $\ell$, then $bU$ appears as many times as $U$, or $bU$ appears more than $2^k$ times and has length $\ell+1$, in contradiction with $U$ being a $k$-champion. So $aU$ must appear at least once, and similarly $bU,Ua,Ub$ must each appear at least once. Since there are at least $n$ distinct $k$-champions (one for each $k=0,1,\dots,n-1$), then there are at least $n$ distinct ubiquitous words. The conclusion follows.

AnonymousBunny

22.07.2014 19:31

Incorrect solution, please ignore.

theflowerking

02.07.2015 05:19

Here is my solution! (good=ubiquitous). First notice that no two words of length $\ge N$ at a distance not multiple of $N$ can be equal. This is easy. For two equal blocks of letters that aren't at a distance that is a multiple of $N$, consider the following operation. Look to the left of these blocks. If the letters are equal, incorporate them into the blocks and continue. The process will eventually stop by the first observation. Now repeat the process to the right. We are left with two equal blocks $U$ of length at least the original length. To the left of one block is an $a$ and to the left of another block is a $b$. To the right, same thing. So $U$ is good. To this process we will call "expanding the blocks" (the blocks I talked about at the beginning of this paragraph). Let $ U_1, U_2, ... U_{\ell} $ denote the good words (with lengths $ M_1 \ge M_2 \ge ... M_{\ell} $). LEMMA: $U_k$ occurs at most $2^k$ times in a period. PROOF: By induction. $U_1$ occurs at most $2$ times, otherwise, by Pigeonhole, we could expand two of the three $U_1$'s and create a longer good word. Now, assume it works for $1,...,k$ and I prove it works for $k+1$. Otherwise, we have $2^{k+1}+1$ occurrences of $U_{k+1}$. At least half have the same letter to the left, by Pigeonhole. So we have $2^k+1$ words of $U_{k+1}$ like this, and we can expand all these block until we obtain a new good word, and by the induction it occurs at most $2^k$ times, contradiction since the new word measures more then $U_{k+1}$, so it is $U_k$ or $U_i$ with $i<k$. Now consider a period and consider the $N$ blocks of $n$ letters formed. Since $N>2^n$, two of those are the same, by Pigeonhole. If we expand them we get a good word of length $ \ge n$. Thus $M_1 \ge n$. I prove by induction that $M_i \ge n+1-i$. This was the base case. Now assume it works for $1,..,k-1$, I prove it works for $k$. Now consider the $N$ blocks of $n+1-k$ letters formed. By Pigeonhole, $2^{k-1}+1$ are the same. Assume this block $B$ is not a good word. Then there do not exist two occurrences of this block that are of the form $aBa,bBb$ nor $aBb, bBa$. This easily implies that they all have the same letter to the right or the same letter to the right. Expand $B$ like this and we get a new $B'$ of lenght at least $n+1-k+1$. If this $B'$ is not good, we repeat the process, until eventually it is good (this moment will come, by the first observation). Since it occurs $2^{k-1}+1$ different times, we have that it cannot be $U_1,...,U_{k-1}$. And so this implies $U_k$ exists and that $M_k \ge n+1-k$. So $U_1,...,U_n$ exist, so we are done! Notice this also proof that there's at least $m$ good words of length at least $n+1-m$, for $m \le n$.

solver6

22.11.2016 21:02

My proof : Consider minimal period $N$ and period word $a_0a_1\ldots a_{N-1}$, so we have that $N>2^n$. Let $I_1=\{0\leq i\leq N-1| a_i = a\}$. Without loss of generality we can assume that $|I_1| = |\{0\leq i\leq N-1| a_i = a\}|\geq |\{0\leq i\leq N-1| a_i = b\}|$. So we have that $|I_1|> 2^{n-1}$. Next we will construct iteration process. First consider maximal continuation of letters $\{a_i\}_{i\in I_1}$ in left and right sides, such that resulting words $U_i^{(1)}, i\in I_1$ are same ($|U_i^{(1)}|< \infty$ because $|a_i-a_j|<N, \forall i\not= j\in I_1$ and $N$ is period). Easy to see that word $U^{(1)}:= U_i^{(1)}$ is ubiquitous. From pigeonhole principle we get that there exists at least $|I_1|/2$ words $U_i^{(1)}$ where $i\in I_1$, such that one neighbor right letter for any such word are same. Name these words as $U_i^{(1)'}$ and set of their indexes as $I_2\subseteq I_1$. So we know that $|I_2|>|I_1|/2>2^{n-2}$. Next like the same consider maximal continuations of words $U_i^{(1)'}$ in left and right sides, such that resulting words $U_i^{(2)}, i\in I_2$ are same. So we get that word $U^{(2)}:= U_i^{(2)}$ is ubiquitous. Iterating this process we will get sequences $\ldots I_3\subseteq I_2\subseteq I_1$ and words $\{U_i^{(k)}\}_{i\in I_k}$, where $2^{n-k}<|I_k|$ and words $U^{(k)}:=U_i^{(k)}$ are all ubiquitous. So last ubiquitous word which we can get by such method is $U^{(n)}$. And so we have algorithm to construct $n$ ubiquitous words $U^{(1)}, U^{(2)}, \ldots, U^{(n)}$. $\Box$

solver6

23.11.2016 00:50

For example consider infinite word $\ldots ababbaabaaabbbbaa .ababbaabaaabbbbaa.ababbaabaaabbbbaa\ldots$ With period $ababbaabaaabbbbaa$ of size $2^4+1$. Then algorithm acts as follows : $\ldots (a)b(a)bb(a)(a)b(a)(a)(a)bbbb(a)(a).ababbaabaaabbbbaa.ababbaabaaabbbbaa\ldots$ Word $a$ is ubiquitous. $\ldots ababb(aa)b[ a(a]a)bbbb[a(a].a)babbaabaaabbbbaa.ababbaabaaabbbbaa\ldots$ Word $aa$ is ubiquitous. $\ldots ababb(aab)a(aab)bbba(a.ab)abbaabaaabbbbaa.ababbaabaaabbbbaa\ldots$ Word $aab$ is ubiquitous. $\ldots ababb(aaba)aabbbba(a.aba)bbaabaaabbbbaa.ababbaabaaabbbbaa\ldots$ Word $aaba$ is ubiquitous. Finish.

MNJ2357

21.10.2018 15:58

Can someone explain solver6's algorithm? Thanks

william122

17.02.2020 00:05

WLOG $W$ has more $a$s than $b$s. Consider the following sequence of strings: $s_0=a$, and $s_i$ is the string which appears most often every $N$ characters out of the four strings formed by $s_{i-1}$ with one letter annexed at the beginning or end (if two strings satisfy this criterion, pick one arbitrarily.) We terminate this sequence once $s_i$ only appears only once every $N$ characters in $W$, and suppose this last term is $s_k$. $s_k$ has to exist as if we have two identical sufficiently large strings separated by less than $N$ characters, then we can find a period less than $N$. I claim that there exist $n$ ubiquitous words among the $\{s_i\}$. For convenience, denote the number of times $s_i$ appears every $N$ characters be $n_i$. Note that $n_0>2^{n-1}$ and $n_k=1$. Claim 1: If $s_i$ is not ubiquitous, then $n_i=n_{i+1}$. Proof: Consider one appearance of $s_i$, and suppose the characters directly to the left and right of it are $l$ and $r$. Then, I claim that all appearances of $s_i$ have $l$ on the left, or all appearances have $r$ on the right. If not, then we can find an appearance which does not have $l$ on the left, and another which does not have $r$ on the right. However, this means that $as_i$, $bs_i$, $s_ia$, $s_ib$ all appear, and $s_i$ is ubiquitous, which is a contradiction. Now, WLOG $s_i$s have $r$ on the right, then we can set $s_{i+1}=s_ir$, and we get $n_i=n_{i+1}$ as desired. Claim 2: If $s_i$ is ubiquitous, then $n_{i+1}\ge n_i/2$ Proof: Consider the characters to the right of every appearence of $s_i$. If there are more $a$s than $b$s, we set $s_{i+1}=s_ia$. Otherwise, we set $s_{i+1}=s_ib$. As $s_{i+1}$ can be constructed from the majority of the $s_i$s, we have $n_{i+1}\ge n_i/2$ as desired. Combining these two claims, note that in $\{s_i\}$, we must have at least $\left\lceil\log_2\left(\frac{n_0}{n_k}\right)\right\rceil\ge n$ ubiquitous words, as desired.

CANBANKAN

10.04.2021 08:38

Here is one important observation. Let $T$ be the period of $W$. Suppose $X_iX_{i+1}\cdots X_{i+d} = X_jX_{j+1}\cdots X_{j+d}$, and this word is NOT ubitquitous. Say it always has a prefix a, then we add $a$ to it, and consider that string instead. Eventually, we can form a ubiquitous string. Now, the $2^n$ suggests a divide and conquer algorithm, so here is a solution. WLOG, there are more than $2^{n-1}$ a's in the period. Call this number $A$. Obviously $a$ is an ubiquitous string. We consider this algorithm to construct $c_i$ from $c_{i-1}$ where $c_i$ are strings with $c_1=1$. Append a prefix to our current string. Case 1: Preceding all occurences of $c_i$, there is all a (or all b). Then we simply add this letter to the left of our current string $c_i$. Case 2: Preceding occurences of $c_i$, some start with $a$, and some start with $b$. Take the one with more letters. By pigeonhole, there are more than $2^{n-i}$ of them. If suffixes turn out to be a problem, i.e. they all end with the same letter, I can append that letter until that is no longer a problem. If this goes on long enough, we can also conclude our string is periodic. This indeed produces $\lceil \log_2(A) \rceil$ ubiquitous substrings. Example: N=aabaabab Our string is aabaabaabab|aabaabab... First of all, a work. Let $c_1=a$. There are 3 occurences of $ba$ and 2 occurrences of $aa$ (we can see $X_1=X_2=X_4=X_5=X_7=a, X_3=X_6=X_8=b$ here, $X_{8+i}=X_i$, and this counts $1\le i\le 8$ as our leftmost character of the string), so we take ba. There are 3 occurrences of $aba$. Do nothing There are 2 occurences of $aaba$ and one occurence of $baba$, so $aba$ is ubiquitous. If there happens to all stop in case 1 before there is one occurence of something, then $X_i=X_j \rightarrow X_{i-1}=X_{j-1} \rightarrow \cdots$, we can easily see that there is a smaller period.

SnowPanda

30.01.2022 18:56

Number1048576

08.04.2022 18:09

First, note that by the pigeonhole principle since there are $N > 2^n$ strings of length $k$ in $W$ up to periodicity, since there are $2^k$ different possible strings of length $k$, by the pigeonhole principle there exists a string of length $k$ appearing in $W$ at least $2^{n-k} + 1$ times. Now, we claim that if a string appears twice (not in different periods), there exists a superset of that string which is ubiquitous. Note that if a string $K$ appears twice and is not ubiquitous, then either $aK$, $Ka$, $bK$ or $Kb$ appears twice. Therefore, we have two equal strings of a length one larger. Assuming that there is no superset of $K$ which is ubiquitous, we can keep doing this process until we get two equal strings of length $N$. Note that these will overlap, so say that the string which they overlap on is $Y$, and the remaining string is $X$. Then we need that $XY = YX$. We claim that there exists a period $V$ of $Y$ such that $X$ is periodic in $V$, implying that $XY$ is periodic in $V$, a contradiction; we will prove this by induction. $Base$: $Y$ has length $1$. This forces $Y = a$ or $Y = b$, WLOG $Y = a$. Then note that $aX = Xa$, so $X$ must both start and end in $a$. If $X$ has length smaller than or equal to $2$, then we are done, as this forces that $X$ contains only $a$, and so is periodic in $a$. Otherwise, let $X = aZa$. Then we need that $aaZa = aZaa$. This implies that $aZ = Za$, so if we keep doing this we will eventually arrive at something with length $2$ or less, so $X$ is made up of only $a$s, as required. $Step$: $Y$ has length $y$ (let $X$ have length $x$) Note that we need that the first $y$ letters of $X$ are $Y$, as otherwise $XY$ wouldn't start with $Y$ but $YX$ would, which is impossible. Likewise, the last $y$ letters of $X$ must be $Y$. Now, if $x \leq 2y$, we have that either $X = YY$, so both $X$ and $Y$ are periodic in $Y$, or the two $Y$s overlap. If we let the overlap be $Z$, and the non-overlap be $W$, then we need that $ZW = WZ$. Therefore, by the inductive hypothesis (since $ZW = Y$ both $Z$ and $W$ are shorter than $Y$) there exists a $w$ such that $W$ is periodic in $w$ and $Z$ is periodic in $w$. Therefore, $Y$ is periodic in $w$, and $X$ is periodic in $w$, as required. On the other hand, if $x > 2y$, then we can let $X = YZY$ as before, and then we need that $YYZY = YZYY \implies YZ = ZY$. We can keep doing this until eventually we get that $X = Y...YCY...Y$, where $C$ has length shorter than $2y$. Hence, there exists a $w$ such that $Y$ and $C$ are periodic in $w$, so $X$ and $Y$ are periodic in $w$, as required, completing the induction. Now, let $K$ have length $k$, and assume that $E$ is the smallest superset of $K$ which is ubiquitous, and that $K$ appears at least $2^{n-k}+1$ times. Then, note that if $E$ has length larger than $k+1$, then for all $K_i$ appearing in $W$ (up to periodicity), either $aK_i$, $K_ia$, $bK_i$ or $K_ib$ appears in $W$, say that $aK_i$ appears in $W$ for all $i$. Then $aK$ appears in $W$ $2^{n-k}+1$ times. Therefore, we can keep doing this to get that if $E$ has length $e$, then there are $2^{n-k}+1$ equal subsets of $E$ with $e-1$ letters appearing in $W$ (say that this subset is $F$). Therefore, either there are $2^{n-k-1} + 1$ appearances of $aF$ in $W$ or $2^{n-k-1} + 1$ appearences of $bF$ in $W$. Then, we can do the same thing to get a new ubiquitous word $G$ with $2^{n-k-1}+1$ equal subsets $H$ with one less letter than $H$, and then keep doing this method until we get only two equal strings, in which case we do the method one last time to get one more ubiquitous word. If we start this at $k=0$, then this method will generate $n$ ubiquitous words, as required (we go up to $k = n-1$, and get a new ubiquitous word for each $k$, but there are $n$ $k$s in total, as required).

squareman

22.06.2022 15:42

If we have a word $w$ that works and occurs more than $k$ times in any minimal period, we can have a word $w'$ that works and occurs more than $\frac{k}{2}$ times in any minimal period, for any $k > 2.$ By Pigeonhole more than $\frac{k}{2}$ of the $w$-words are preceded by a common letter, say an $a.$ Append this to $w.$ Extend this new word in both directions until we find differing characters in the preceding and following positions. This exists since there occur multiple instances of our word within each period, but there is not a smaller period. One of $a$ or $b$ occurs more than $2^{n-1}$ times in each minimal period and works, so we're done.

L567

27.06.2022 14:07

Let $x_1x_2 \cdots x_p$ be the repeating part of the word $W$, with $p > 2^n$. Define the value of a word $X = w_1w_2 \cdots w_k$ (denoted by $f(X)$) as the number of $i$ with $1 \le i \le p$ such that $w_i w_{i+1} \cdots w_{i+k-1}$ coincides with $X$. Note that since this is succeeded by an $a$ or a $b$ in all occurences, we have $f(Xa) + f(Xb) = f(X)$. Suppose $U$ is a ubiquitous word with $f(U) = m \geqslant 2$. Construct two binary trees from $U$, one going forward, where $X$ points to $Xa$ and $Xb$ and one going backward, with $X$ pointing backward to $aX$ and $bX$. Move forward on the tree until we have a word $X$ such that $f(Xa), f(Xb) > 0$. Similarly move backward until we have a word $Y$ such that $f(aY), f(bY) > 0$. Such a point must exist since otherwise we get a word of length $> p$ which has value at least $2$, which would contradict the fact that $p$ was its minimal period. Now, the new word $YUX$ is ubiquitous too. Note that if each time, we go along the branch with higher value, we can ensure that he new word satisfies that $f(YUX) \geqslant \frac{m}{2}$. To finish, see that one of $a,b$ whichever appears more times, must be ubiquitous. Since such a word must appear $> 2^{n-1}$ times and each time, $f(U)$ at most halves, we can ensure that by this procedure, we obtain at least $n$ ubiquitous finite nonempty words, as desired. $\blacksquare$