# LeetCode 585: Investments in 2016

## Problem Restatement

We are given an `Insurance` table.

Each row represents one policyholder and includes:

1. Their policy ID.
2. Their total investment value in 2015.
3. Their total investment value in 2016.
4. Their location as latitude and longitude.

We need to report the sum of `tiv_2016` for policyholders who satisfy both conditions:

1. Their `tiv_2015` value is shared by at least one other policyholder.
2. Their `(lat, lon)` location is unique in the table.

The final sum must be rounded to two decimal places.

## Table

### Insurance

| Column | Type | Meaning |
|---|---|---|
| `pid` | int | Policy ID |
| `tiv_2015` | float | Total investment value in 2015 |
| `tiv_2016` | float | Total investment value in 2016 |
| `lat` | float | Latitude of the policyholder's city |
| `lon` | float | Longitude of the policyholder's city |

`pid` is the primary key.

## Example

Input:

| pid | tiv_2015 | tiv_2016 | lat | lon |
|---:|---:|---:|---:|---:|
| 1 | 10 | 5 | 10 | 10 |
| 2 | 20 | 20 | 20 | 20 |
| 3 | 10 | 30 | 20 | 20 |
| 4 | 10 | 40 | 40 | 40 |

Output:

| tiv_2016 |
|---:|
| 45.00 |

Policy `1` has `tiv_2015 = 10`, which is shared by policies `3` and `4`. Its location `(10, 10)` is unique, so it qualifies.

Policy `2` has `tiv_2015 = 20`, which is not shared by another policyholder, so it does not qualify.

Policy `3` has `tiv_2015 = 10`, but its location `(20, 20)` is shared with policy `2`, so it does not qualify.

Policy `4` has `tiv_2015 = 10` and unique location `(40, 40)`, so it qualifies.

The sum is:

```text
5 + 40 = 45
```

So the result is:

```text
45.00
```

## First Thought: Sum All 2016 Investments

A first attempt might be:

```sql
SELECT ROUND(SUM(tiv_2016), 2) AS tiv_2016
FROM Insurance;
```

This sums every policyholder's 2016 investment.

But the problem asks for only policyholders that pass both filters:

1. Repeated `tiv_2015`.
2. Unique `(lat, lon)`.

So we need to filter rows before summing.

## Key Insight

The row qualifies based on two group-level facts.

First, the `tiv_2015` value must appear more than once:

```sql
GROUP BY tiv_2015
HAVING COUNT(*) > 1
```

Second, the `(lat, lon)` pair must appear exactly once:

```sql
GROUP BY lat, lon
HAVING COUNT(*) = 1
```

Then the main query keeps only rows whose values belong to those qualifying groups.

## Algorithm

1. Find all `tiv_2015` values that appear more than once.
2. Find all `(lat, lon)` pairs that appear exactly once.
3. Select rows from `Insurance` that satisfy both conditions.
4. Sum their `tiv_2016` values.
5. Round the result to two decimal places.

## Correctness

The first subquery groups rows by `tiv_2015` and keeps only groups with count greater than `1`. Therefore, a row passes this condition exactly when its 2015 investment value is shared by at least one other policyholder.

The second subquery groups rows by `(lat, lon)` and keeps only groups with count equal to `1`. Therefore, a row passes this condition exactly when no other policyholder has the same location.

The main query applies both filters with `AND`, so it keeps exactly the policyholders that satisfy both requirements. It then sums `tiv_2016` over exactly those rows and rounds the final result to two decimal places.

Therefore, the query returns the required value.

## Complexity

Let:

```text
n = number of rows in Insurance
```

| Metric | Value | Why |
|---|---|---|
| Time | `O(n)` to `O(n log n)` | Depends on how the database implements grouping |
| Space | `O(n)` | Grouping may store distinct `tiv_2015` values and distinct locations |

The exact execution plan depends on indexes and the SQL engine.

## Implementation

```sql
SELECT
    ROUND(SUM(tiv_2016), 2) AS tiv_2016
FROM Insurance
WHERE tiv_2015 IN (
    SELECT tiv_2015
    FROM Insurance
    GROUP BY tiv_2015
    HAVING COUNT(*) > 1
)
AND (lat, lon) IN (
    SELECT lat, lon
    FROM Insurance
    GROUP BY lat, lon
    HAVING COUNT(*) = 1
);
```

## Code Explanation

This subquery finds repeated 2015 investment values:

```sql
SELECT tiv_2015
FROM Insurance
GROUP BY tiv_2015
HAVING COUNT(*) > 1
```

A `tiv_2015` value qualifies only if at least two policyholders have it.

This subquery finds unique locations:

```sql
SELECT lat, lon
FROM Insurance
GROUP BY lat, lon
HAVING COUNT(*) = 1
```

A `(lat, lon)` pair qualifies only if exactly one policyholder has that location.

The main `WHERE` clause combines both requirements:

```sql
WHERE tiv_2015 IN (...)
AND (lat, lon) IN (...)
```

Then the query sums the qualifying 2016 investment values:

```sql
SUM(tiv_2016)
```

Finally, the result is rounded:

```sql
ROUND(SUM(tiv_2016), 2) AS tiv_2016
```

## Window Function Alternative

We can also compute the two counts with window functions.

```sql
WITH counted AS (
    SELECT
        tiv_2016,
        COUNT(*) OVER (PARTITION BY tiv_2015) AS same_tiv_2015_count,
        COUNT(*) OVER (PARTITION BY lat, lon) AS same_location_count
    FROM Insurance
)
SELECT
    ROUND(SUM(tiv_2016), 2) AS tiv_2016
FROM counted
WHERE same_tiv_2015_count > 1
  AND same_location_count = 1;
```

This version often reads more directly:

1. Count how many rows share the same `tiv_2015`.
2. Count how many rows share the same location.
3. Keep rows where the first count is greater than `1` and the second count is `1`.

## Testing

Sample data:

```sql
CREATE TABLE Insurance (
    pid INT PRIMARY KEY,
    tiv_2015 FLOAT,
    tiv_2016 FLOAT,
    lat FLOAT,
    lon FLOAT
);

INSERT INTO Insurance (pid, tiv_2015, tiv_2016, lat, lon) VALUES
(1, 10, 5, 10, 10),
(2, 20, 20, 20, 20),
(3, 10, 30, 20, 20),
(4, 10, 40, 40, 40);
```

Query:

```sql
SELECT
    ROUND(SUM(tiv_2016), 2) AS tiv_2016
FROM Insurance
WHERE tiv_2015 IN (
    SELECT tiv_2015
    FROM Insurance
    GROUP BY tiv_2015
    HAVING COUNT(*) > 1
)
AND (lat, lon) IN (
    SELECT lat, lon
    FROM Insurance
    GROUP BY lat, lon
    HAVING COUNT(*) = 1
);
```

Expected result:

| tiv_2016 |
|---:|
| 45.00 |

Additional test cases:

| Case | Expected behavior |
|---|---|
| `tiv_2015` appears once | Row is excluded |
| `tiv_2015` appears multiple times | Row passes the first condition |
| `(lat, lon)` appears once | Row passes the second condition |
| `(lat, lon)` appears multiple times | Row is excluded |
| Row passes both conditions | Its `tiv_2016` is included in the sum |

