# LeetCode 193: Valid Phone Numbers

## Problem Restatement

We are given a text file named `file.txt`.

Each line contains one phone number candidate.

We need to print only the valid phone numbers.

A valid phone number must match one of these two formats:

```text
xxx-xxx-xxxx
```

or:

```text
(xxx) xxx-xxxx
```

Here, each `x` is a digit.

For example:

```text
987-123-4567
123 456 7890
(123) 456-7890
```

The valid lines are:

```text
987-123-4567
(123) 456-7890
```

## Input and Output

| Item | Meaning |
|---|---|
| Input | A file named `file.txt` |
| Output | All valid phone number lines |
| Format 1 | `xxx-xxx-xxxx` |
| Format 2 | `(xxx) xxx-xxxx` |
| Language | Bash shell script |

## Examples

Input file:

```text
987-123-4567
123 456 7890
(123) 456-7890
```

Output:

```text
987-123-4567
(123) 456-7890
```

The first line is valid because it has three digits, a dash, three digits, a dash, then four digits.

The second line is invalid because it uses spaces instead of the required dash format.

The third line is valid because it has the area code inside parentheses, followed by a space, then three digits, a dash, and four digits.

## First Thought: Match the Valid Pattern

This is a pattern matching problem.

We do not need to transform the file. We only need to print lines that match one of two exact regular expressions.

The two valid formats are:

```text
[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]
```

and:

```text
([0-9][0-9][0-9]) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]
```

In regular expression form, we can write repeated digits more compactly as:

```text
[0-9]{3}-[0-9]{3}-[0-9]{4}
```

and:

```text
\([0-9]{3}\) [0-9]{3}-[0-9]{4}
```

The parentheses must be escaped because parentheses have special meaning in extended regular expressions.

## Key Insight

We must match the whole line, not just part of the line.

That is why we use:

```text
^
```

for the beginning of the line, and:

```text
$
```

for the end of the line.

Without these anchors, a line like this could be incorrectly accepted:

```text
abc 987-123-4567 xyz
```

It contains a valid-looking phone number inside the line, but the whole line is not a valid phone number.

## Algorithm

Use `grep` with extended regular expressions.

The full pattern is:

```text
^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$
```

This means:

| Part | Meaning |
|---|---|
| `^` | Start of the line |
| `(` | Start of a regex group |
| `$[0-9]{3}$ ` | Match `(xxx) ` |
| `|` | Or |
| `[0-9]{3}-` | Match `xxx-` |
| `)` | End of the regex group |
| `[0-9]{3}` | Match the next three digits |
| `-` | Match a dash |
| `[0-9]{4}` | Match the final four digits |
| `$` | End of the line |

So the first group chooses between the two allowed prefixes:

```text
(xxx) 
```

or:

```text
xxx-
```

Then the rest is shared:

```text
xxx-xxxx
```

## Correctness

The regular expression starts with `^`, so matching must begin at the first character of the line.

The first group matches exactly one of the two valid prefixes: either an area code in parentheses followed by one space, or three digits followed by a dash.

After that, the expression requires three digits, one dash, and four digits.

The expression ends with `$`, so no extra characters may appear after the final digit.

Therefore, a line is printed if and only if the whole line has one of the two valid phone number formats.

## Complexity

Let `n` be the number of lines in the file, and let `m` be the maximum line length.

| Metric | Value | Why |
|---|---|---|
| Time | `O(nm)` | `grep` checks each line against the pattern |
| Space | `O(1)` | The command streams the file line by line |

## Implementation

```bash
# Read from the file file.txt and output all valid phone numbers to stdout.
grep -E '^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$' file.txt
```

## Code Explanation

The command uses `grep -E`:

```bash
grep -E
```

The `-E` option enables extended regular expressions. This lets us use `{3}`, `{4}`, grouping, and `|` more cleanly.

The first valid prefix is:

```text
\([0-9]{3}\) 
```

This matches an opening parenthesis, three digits, a closing parenthesis, and one space.

The second valid prefix is:

```text
[0-9]{3}-
```

This matches three digits followed by a dash.

The shared suffix is:

```text
[0-9]{3}-[0-9]{4}
```

This matches three digits, a dash, and four digits.

The anchors:

```text
^
```

and:

```text
$
```

make sure the whole line matches the phone number format.

## Testing

Create a sample file:

```bash
cat > file.txt << 'EOF'
987-123-4567
123 456 7890
(123) 456-7890
EOF
```

Run the solution:

```bash
grep -E '^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$' file.txt
```

Expected output:

```text
987-123-4567
(123) 456-7890
```

Test invalid extra characters:

```bash
cat > file.txt << 'EOF'
abc 987-123-4567
987-123-4567 xyz
987-123-4567
EOF
```

Expected output:

```text
987-123-4567
```

Test both valid formats:

```bash
cat > file.txt << 'EOF'
111-222-3333
(111) 222-3333
111 222 3333
(111)-222-3333
EOF
```

Expected output:

```text
111-222-3333
(111) 222-3333
```

