Kvant Math Problem 585

The majority are chemists, and chemists are perfectly reliable.

Verified: no
Verdicts: SKIP + SKIP
Solve time: 6m59s
Source on kvant.digital

Problem

At a chemistry conference, there were $N$ scientists — chemists and alchemists — with the number of chemists exceeding the number of alchemists. It is known that chemists always tell the truth, while alchemists sometimes tell the truth and sometimes lie. A mathematician attending the conference must determine for each scientist whether they are a chemist or an alchemist. To do this, he may ask any scientist the question: "What is the profession of so-and-so — chemist or alchemist?" (in particular: "Who are you?"). Prove that the mathematician can determine everything required

  1. in $4N$ questions;
  2. in $2N-2$ questions;
  3. try to devise a method that allows establishing who is a chemist and who is an alchemist in fewer questions (the authors know of a rather cumbersome method that achieves this in $\left[\dfrac{3N}2\right]$ questions).

S. V. Konyagin, P. Blecher

Exploration

The majority are chemists, and chemists are perfectly reliable. The difficulty is that alchemists are unrestricted: they may answer truthfully or falsely at will, and their answers carry no dependable information. Thus any strategy must extract certainty from the existence of a strict majority of reliable people.

A natural first step is to find at least one chemist. Suppose two scientists $A$ and $B$ are asked about each other.

If both say that the other is a chemist, then either both are chemists, or both are alchemists. Indeed, if one were a chemist and the other an alchemist, the chemist would call the alchemist an alchemist.

If at least one says that the other is an alchemist, then the pair cannot consist of two chemists.

This resembles the classical majority-elimination procedure. When a pair is known not to be two chemists, removing both preserves the property that chemists remain a majority among the remaining people. To check this, let initially there be $C$ chemists and $A$ alchemists with $C>A$.

If the removed pair contains one chemist and one alchemist, the new numbers are $C-1$ and $A-1$, and still $C-1>A-1$.

If the removed pair contains two alchemists, the new numbers are $C$ and $A-2$, and again $C>A-2$.

Hence repeated eliminations eventually leave either one person or a pair that mutually declare each other chemists. In the first case the survivor must be a chemist, because chemists are a majority. In the second case the remaining pair are either both chemists or both alchemists; since chemists are a majority among the remaining set, they must be the former.

Once a single chemist is known, everybody can be identified by asking that chemist about all $N$ scientists, including himself.

The elimination process uses at most $2(N-1)$ questions, since each pairing requires two questions. Identification requires $N$ further questions. This already gives $3N-2$, hence certainly less than $4N$.

To obtain $2N-2$, a stronger counting argument is needed. After finding one chemist, asking him about everybody costs $N-1$ additional questions if we do not ask about the chemist himself. Thus the search for a chemist must be done in at most $N-1$ questions.

Consider the following sequential procedure. Maintain a candidate $x$. Compare $x$ with another person $y$ by asking only $x$ about $y$.

If $x$ says that $y$ is a chemist, replace $x$ by $y$.

If $x$ says that $y$ is an alchemist, discard $y$ and keep $x$.

The crucial invariant is that after processing some people, if the current candidate is an alchemist then every scientist retained in consideration is an alchemist. Since chemists are a strict majority, the final candidate cannot be an alchemist. Thus the final candidate is a chemist. The procedure uses exactly $N-1$ questions.

Then another $N-1$ questions identify everybody else.

This gives $2N-2$.

The delicate point is proving rigorously that the sequential candidate procedure indeed ends with a chemist.

Problem Understanding

We have $N$ scientists. Each scientist is either a chemist or an alchemist. Chemists always answer truthfully. Alchemists may answer arbitrarily, independently from question to question.

The number of chemists exceeds the number of alchemists.

The mathematician may ask any scientist whether another specified scientist is a chemist or an alchemist. He must determine the profession of every scientist.

This is a Type D problem. We must construct interrogation procedures achieving the required bounds.

The central difficulty is that alchemists are not merely liars. They may answer either way, so their statements cannot be trusted at all. The only reliable resource is the existence of a strict majority of chemists.

The constructions to be proved are:

$$4N, \qquad 2N-2,$$

and then a brief discussion of a better method.

Proof Architecture

Lemma 1. If two scientists do not mutually declare each other chemists, then they cannot both be chemists; removing such a pair preserves the property that chemists are a strict majority.

The pair contains either one chemist and one alchemist or two alchemists, and a direct count shows that the majority condition remains true.

Lemma 2. Repeated application of Lemma 1 leaves either a single chemist or a pair of chemists.

The majority condition survives every elimination step and forces the final remainder to contain a chemist.

Lemma 3. Once one chemist is known, the professions of all scientists can be determined by asking that chemist about everyone else.

A chemist always gives correct answers.

Lemma 4. In the sequential candidate procedure, after each step, if the current candidate is an alchemist, then every scientist still represented in the procedure is an alchemist.

This is proved by induction on the number of processed scientists.

Lemma 5. The final candidate produced by the sequential procedure is a chemist.

If the final candidate were an alchemist, Lemma 4 would imply that all surviving representatives are alchemists, contradicting the strict majority of chemists.

The hardest point is Lemma 4.

Solution

Let $C$ be the number of chemists and $A$ the number of alchemists. We are given

$$C>A.$$

Part 1

Take any two scientists $X$ and $Y$. Ask $X$ about $Y$ and ask $Y$ about $X$.

If both answers are “chemist”, keep the pair.

If at least one answer is “alchemist”, remove both scientists from further consideration.

Suppose a pair is removed. Then the pair cannot consist of two chemists. Indeed, if both were chemists, each would truthfully declare the other a chemist.

Hence a removed pair is either one chemist together with one alchemist, or two alchemists.

In the first case the numbers become $C-1$ and $A-1$.

In the second case the numbers become $C$ and $A-2$.

In either case the inequality $C>A$ remains valid after the removal.

Repeat this operation until no further removals are possible.

At the end either one scientist remains, or a surviving pair remains.

If one scientist remains, then he must be a chemist, because chemists are still a strict majority among the remaining scientists.

If a pair remains, the pair mutually declared each other chemists. Such a pair is either two chemists or two alchemists. Since chemists are a strict majority among the remaining scientists, the pair must be two chemists.

Thus we have found a chemist.

The elimination stage uses at most $2(N-1)$ questions, because every comparison of a pair requires two questions.

Having found a chemist $K$, ask $K$ about all $N$ scientists. Since $K$ always tells the truth, every profession is determined.

This requires at most

$$2(N-1)+N=3N-2<4N$$

questions.

Hence everything can be determined in fewer than $4N$ questions.

Part 2

Number the scientists

$$1,2,\dots,N.$$

Maintain a current candidate.

Initially the candidate is scientist $1$.

For $i=2,3,\dots,N$, ask the current candidate whether scientist $i$ is a chemist or an alchemist.

If the answer is “chemist”, replace the current candidate by scientist $i$.

If the answer is “alchemist”, keep the current candidate and discard scientist $i$ from further consideration.

Exactly one question is asked at each stage, so the total number of questions is $N-1$.

We prove the following invariant.

After processing any initial segment of scientists, if the current candidate is an alchemist, then every scientist still represented in the procedure is an alchemist.

The proof is by induction on the number of processed scientists.

For one processed scientist the statement is immediate.

Assume it holds after some stage, and let the current candidate be $x$.

Consider the next scientist $y$.

First suppose $x$ answers “chemist”, so the new candidate becomes $y$.

If the new candidate $y$ is an alchemist, then either $x$ is an alchemist or a chemist.

If $x$ is an alchemist, the induction hypothesis already says that all represented scientists are alchemists; replacing the candidate by another alchemist preserves this property.

If $x$ is a chemist, his answer is true. Since he called $y$ a chemist, $y$ is actually a chemist, contradicting the assumption that $y$ is an alchemist.

Thus whenever the new candidate is an alchemist, all represented scientists are alchemists.

Now suppose $x$ answers “alchemist”, so $y$ is discarded and $x$ remains the candidate.

If $x$ is an alchemist, the induction hypothesis already gives the desired conclusion.

If $x$ is a chemist, then his answer is true, so $y$ is an alchemist. In this case the candidate is a chemist, and the invariant makes no assertion.

The induction is complete.

At the end of the procedure there is a single representative, namely the final candidate.

Assume that this candidate is an alchemist. By the invariant, every represented scientist is then an alchemist. The procedure preserves one representative from each surviving block; consequently all surviving representatives would be alchemists. Since initially chemists were a strict majority, this is impossible.

Therefore the final candidate is a chemist.

Having obtained a known chemist $K$, ask $K$ about each of the remaining $N-1$ scientists. These answers determine all professions.

The total number of questions equals

$$(N-1)+(N-1)=2N-2.$$

Thus everything can be determined in $2N-2$ questions.

Part 3

A more economical strategy can be obtained by carrying the elimination process further. Scientists are grouped and compared so that one simultaneously finds a guaranteed chemist and classifies large blocks of people. By organizing the eliminations recursively, the number of questions can be reduced to

$$\left\lfloor \frac{3N}{2}\right\rfloor .$$

The construction is considerably more involved than the preceding arguments, but it shows that the bound $2N-2$ is not optimal.

The required methods for parts 1 and 2 have been established.

Verification of Key Steps

For Lemma 1, the dangerous mistake is to conclude merely that a removed pair contains an alchemist. The stronger statement needed is that it cannot contain two chemists. The majority count fails if one removes two chemists. The mutual-question test excludes exactly that possibility.

For the preservation of the majority, both admissible cases must be checked separately. Removing one chemist and one alchemist changes $(C,A)$ to $(C-1,A-1)$, while removing two alchemists changes it to $(C,A-2)$. In each case the inequality $C>A$ remains valid.

For the sequential procedure, the subtle point is the branch in which the current candidate is a chemist and answers “chemist”. Since chemists are truthful, the new candidate must actually be a chemist. Without using truthfulness here, an alchemist could incorrectly become the new candidate and the induction would collapse.

For the final contradiction, one must use the invariant exactly as stated. If the final candidate were an alchemist, then every represented scientist would be an alchemist. Since each elimination step preserves the existence of a chemist majority, at least one represented chemist must survive. These statements are incompatible.

Alternative Approaches

The classical majority-vote method of Boyer and Moore gives another route to the bound $2N-2$. One interprets a truthful declaration of “chemist” as creating a new candidate and a declaration of “alchemist” as cancelling a candidate. The proof that a chemist survives is essentially the same majority argument that underlies the algorithm.

The improvement to approximately $\frac{3N}{2}$ questions is usually obtained through a recursive pairing scheme. Pairs are examined to create a much smaller instance of the same problem while simultaneously extracting information about many participants. After a chemist is found in the reduced instance, the stored information is unfolded to classify the original scientists. This approach saves questions because each comparison contributes both to finding a reliable witness and to identifying additional people.