Some language has only three letters - $A, B$ and $C$. A sequence of letters is called a word iff it contains exactly 100 letters such that exactly 40 of them are consonants and other 60 letters are all $A$. What is the maximum numbers of words one can pick such that any two picked words have at least one position where they both have consonants, but different consonants?
Problem
Source: All-Russian 2021/11.3
Tags: combinatorics
23.04.2021 14:41
Great problem! Sketch of the official solution: Let $M$ be the set of the set of the chosen words, and suppose it works (every two words have a position, where the one has $B$, and the other has $C$) We construct function $f$ from $M$ to $S$, where $S$ is the set of the words, which have only $B$ and $C$ and have length $100$ (obviously they are $2^{100}$). For each word in $M$, replace each $A$ with $B$ or $C$ and we assign it the $2^{60}$ new words. We now have assigned $2^{60}.|M|$ words of only $B$ and $C$ to the words in $M$, and note that these new words are distinct (if there are two identical, there is no position in which the one has $B$, and the other has $C$). So $2^{60}.|M|<=2^{100}$, which proves the bound $|M|<=2^{40}$. The equality case is to take each of the chosen words to have its $60$ last positions to be $A$, and the first $40$ are only $B$ and $C$ and the first parts (the first $40$ positions) are all distinct (thus we have chosen exactly $2^{40}$).
28.06.2021 21:32
Our answer is $2^{40}$. We can get a set of such words by setting the first $60$ letters to $A$ and varying the remaining letters between $B$ and $C$. Any two such words must differ in at least one of the last $40$ positions else they are the same word, and if they do differ, they must consist of $\{B, C\}$, which are two consonants. Next we show that the set of words picked, call it $S$, has size $|S| \leq 2^{40}$. Indeed, consider the multiset $S'$ formed by: for each word $s$ in $S$, consider each of the $2^{60}$ words formed by replacing each of the $60$ instances of $A$ in $s$ with either $B$ or $C$, and slap them into $S'$. Clearly, $|S'| = 2^{60}|S|$. Note that in fact, no two strings in $S'$ can be the same $-$ if some two $s_1, s_2 \in S'$ were identical, then no matter how we replace some $60$ letters in each of $s_1, s_2$ with $A$, there still cannot be a position where $s_1$ and $s_2$ have letters $B$ and $C$ (in some order). Thus, $|S'| \leq 2^{100}$, as there are $2^{100}$ total strings whose letters are either $B$ or $C$. This proves our bound\[2^{60}|S| = |S'| \leq 2^{100} \implies |S| \leq 2^{40}\]and we are done.
29.06.2021 06:13
Solved with Elliott Liu and Ryan Yang. The answer is \(2^{40}\), achieved by taking all words whose first 60 characters are \(A\) and whose last 40 characters are either \(B\) and \(C\). Now we show this is optimal. Suppose we select a set \(S\) of words such that any two words have differing consonants in at least one position. For each string \(s\in S\), consider the \(2^{60}\) strings obtained by replacing each \(A\) in \(s\) with either \(B\) or \(C\). It can be seen that all these strings are distinct, and there are at most \(2^{100}\) strings total, so \(S\) has size at most \(2^{40}\).
31.08.2022 12:26
We claim, that the answer is $2^{40}.$ Example: pick all words with $A$ on first $60$ positions and either $B$ or $C$ on each other. Estimation. Consider any set $\mathcal{P}$ of words, which satisfy the condition, and for it's every word consider $2^{60}$ sequences of letters, obtained by replacing all $A$ with either $B$ or $C$; it's obvious that all such sequences are different, and let $\mathcal{Q}$ denotes set of them: $$|\mathcal{P}|=|\mathcal{Q}|:2^{60}\leq 2^{100}:2^{60}=2^{40}.$$
31.08.2022 12:37
Amazing problerm
06.03.2023 18:42
Beautiful The answer is $2^{40}$. A construction is to pick $2^{40}$ distinct sequences of $B$ and $C$ which are $40$ letters long and then add $60$ copies of $A$ to the end. We now prove the bound where $60$ is replaced by $k \geq 0$ by induction on $k$, with the $k=0$ case being obvious. Now consider the scenario with $k+1$ letters which are $A$. Partition them into sets $A$, $B$, and $C$ based on their ending letter. Let $B_B$ be the set formed by switching the rightmost $A$ of every word in $B$ to $B$ and $B_C$ formed by switching that letter to $C$, and define $C_B$ and $C_C$ similarly. Then chop off the rightmost letter of these four sets as well as the set $A$, forming $A'$, etc. It is easy to see that if some consonants differ in some position between $A$ and $B$ then they still differ between $A'$, $B_B'$, and $B_C'$, and the same is true for $A'$, $C_B'$, and $C_C'$. Thus every pair of (distinct) elements in the multiset $|A' \cup B_B' \cup B_C'|$ differ at some consonant, so by hypothesis $$|A'|+|B_B'|+|B_C'|=|A|+2|B| \leq 2^{40}$$and likewise $|A|+2|C| \leq 2^{40}$. Summing and dividing by $2$, we get $|A \cup B \cup C|=|A|+|B|+|C| \leq 2^{40}$, as desired. $\blacksquare$ Remark: The induction is not necessary at all but I approached the problem from that angle, and it doesn't make the problem harder either (this has happened with a few other combinatorics problems where my solution is the same as everyone else's but I have induction for some reason). The inductive approach was motivated by the following consideration: after playing around a bit it seems like the answer is $2^{40}$ regardless of how many letters are $A$, and the bound is obvious if none of the letters are $A$. It seems slightly easier to work with the case, for instance, where $1$ of the letters is $A$, and after playing around a bit more we find the idea of performing the partition and then switching the letter $A$ in the sets $B$ and $C$ to the letters $B$ and $C$ (personally, at the very first I only had the idea to swap the letter $A$ in set $B$ to letter $B$, but from there it's a quick adaptation), and this can then be made to work for a general inductive step.
18.03.2023 07:19
Here is a solution using the Probabilistic Method (motivated by the fact that a very similar problem was encountered in 18.226): We reformulate the problem as follows: reformulation wrote: Consider some subsets $A_1, B_1, \ldots A_k, B_k$ of $[100]$, where for all $i$, $|A_i| + |B_i| = 40$, $A_i \cap B_i = \emptyset$, and for every $i \neq j$, at least one of $A_i \cap B_j$ and $A_j \cap B_i$ is nonempty. Find the maximum value of $k$. The proof now proceeds as follows. Choose independently and uniformly at random a real from $[0, 2]$ for each element in $[100]$. Call $i$ nice if the labels of all the elements in $A_i$ are $< 1$ and the labels of all the elements in $B_i$ are $\geq 1$. Clearly every $i$ has a probability of $2^{-40}$ of being nice, so the expected number of nice $i$s is $k2^{-40}$. But the condition that at least one of $A_i \cap B_j$, $A_j \cap B_i$ is nonempty guarantees that at most one $i$ can be nice, so thus $k2^{-40} \leq 1$, and therefore $k \leq 2^{40}$. Equality is given when all the words have $A$ as their last $60$ letters.
27.03.2023 09:18