# LeetCode 578: Get Highest Answer Rate Question

## Problem Restatement

We are given a table called `SurveyLog`.

Each row records one user action on one question. The action can be:

```sql
'show'
'answer'
'skip'
```

The answer rate of a question is:

```text
number of answer actions / number of show actions
```

We need to return the `question_id` with the highest answer rate. If multiple questions have the same highest answer rate, return the smallest `question_id`. The returned column should be named `survey_log`.

## Table

### SurveyLog

| Column | Type | Meaning |
|---|---|---|
| `id` | int | User ID |
| `action` | varchar | One of `show`, `answer`, or `skip` |
| `question_id` | int | Question ID |
| `answer_id` | int | Answer ID, not null only when action is `answer` |
| `q_num` | int | Order of the question in the current session |
| `timestamp` | int | Time of the action |

## Example

Input:

| id | action | question_id | answer_id | q_num | timestamp |
|---|---|---|---|---|---|
| 5 | show | 285 | null | 1 | 123 |
| 5 | answer | 285 | 124124 | 1 | 124 |
| 5 | show | 369 | null | 2 | 125 |
| 5 | skip | 369 | null | 2 | 126 |

For question `285`:

```text
answer count = 1
show count = 1
answer rate = 1 / 1 = 1.0
```

For question `369`:

```text
answer count = 0
show count = 1
answer rate = 0 / 1 = 0.0
```

Output:

| survey_log |
|---|
| 285 |

## First Thought: Count Answers Only

A first attempt might be to count only answer rows:

```sql
SELECT question_id
FROM SurveyLog
WHERE action = 'answer'
GROUP BY question_id
ORDER BY COUNT(*) DESC
LIMIT 1;
```

This is not enough.

The problem asks for answer rate, not answer count.

A question shown `100` times and answered `50` times has rate `0.5`.

A question shown `2` times and answered `2` times has rate `1.0`.

The second question should rank higher even though it has fewer total answers.

## Key Insight

For each question, we need two counts:

```text
answer_count = number of rows where action = 'answer'
show_count = number of rows where action = 'show'
```

Then compute:

```text
answer_rate = answer_count / show_count
```

In SQL, we can compute conditional counts using `SUM(CASE WHEN ... THEN 1 ELSE 0 END)`.

## Algorithm

1. Group rows by `question_id`.
2. Count how many times each question was answered.
3. Count how many times each question was shown.
4. Sort by answer rate from highest to lowest.
5. If rates tie, sort by `question_id` from smallest to largest.
6. Return the first row.

## Correctness

The query groups all rows with the same `question_id`, so every aggregate is computed per question.

For each group, the answer count includes exactly the rows whose action is `answer`. The show count includes exactly the rows whose action is `show`. Therefore, their ratio is exactly the answer rate defined by the problem.

Sorting by this ratio in descending order places the question with the highest answer rate first.

If multiple questions have the same answer rate, sorting by `question_id` in ascending order places the smallest `question_id` first.

Therefore, selecting the first row returns exactly the required question.

## Complexity

Let:

```text
N = number of rows in SurveyLog
Q = number of distinct questions
```

| Metric | Value | Why |
|---|---|---|
| Time | `O(N + Q log Q)` | Scan rows, group by question, then sort grouped results |
| Space | `O(Q)` | Store one aggregate row per question |

The exact execution plan depends on the database engine.

## Implementation

```sql
SELECT
    question_id AS survey_log
FROM SurveyLog
GROUP BY question_id
ORDER BY
    SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END)
    / SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END) DESC,
    question_id ASC
LIMIT 1;
```

## Code Explanation

This groups all actions for the same question:

```sql
GROUP BY question_id
```

This counts answer actions:

```sql
SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END)
```

This counts show actions:

```sql
SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END)
```

The ratio is used for sorting:

```sql
ORDER BY
    answer_count / show_count DESC
```

The query writes the expression directly inside `ORDER BY`.

The tie-breaker is:

```sql
question_id ASC
```

So if two questions have the same answer rate, the smaller question ID is returned.

The output column is renamed here:

```sql
question_id AS survey_log
```

## MySQL Integer Division Note

In MySQL, division with `/` returns a decimal result, so the implementation above works.

A more explicit version is:

```sql
SELECT
    question_id AS survey_log
FROM SurveyLog
GROUP BY question_id
ORDER BY
    SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END) * 1.0
    / SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END) DESC,
    question_id ASC
LIMIT 1;
```

Multiplying by `1.0` makes the ratio clearly numeric.

## Testing

Sample data:

```sql
CREATE TABLE SurveyLog (
    id INT,
    action VARCHAR(20),
    question_id INT,
    answer_id INT,
    q_num INT,
    timestamp INT
);

INSERT INTO SurveyLog (id, action, question_id, answer_id, q_num, timestamp) VALUES
(5, 'show', 285, NULL, 1, 123),
(5, 'answer', 285, 124124, 1, 124),
(5, 'show', 369, NULL, 2, 125),
(5, 'skip', 369, NULL, 2, 126);
```

Query:

```sql
SELECT
    question_id AS survey_log
FROM SurveyLog
GROUP BY question_id
ORDER BY
    SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END) * 1.0
    / SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END) DESC,
    question_id ASC
LIMIT 1;
```

Expected result:

| survey_log |
|---|
| 285 |

Additional test cases:

| Case | Expected behavior |
|---|---|
| One question has rate `1.0` | That question should rank first |
| One question has many answers but lower rate | It should not win by count alone |
| Two questions have the same rate | Smaller `question_id` should win |
| Question has shows but no answers | Its answer rate is `0` |
| Several users see the same question | All rows for that question are grouped together |

