Skip to content

1.2 Input and Output Models

An algorithm does not operate on an abstract idea of data.

An algorithm does not operate on an abstract idea of data. It operates on a chosen model of data. The same problem can look easy or hard depending on how the input is represented and how the output must be produced. For this reason, every serious algorithm description should state not only what the input means, but also how the input is presented to the algorithm.

The input model defines the operations the algorithm may perform on the data. An array model allows constant time access by index. A linked list model allows cheap insertion after a known node but does not allow constant time random access. A graph represented by adjacency lists supports efficient traversal of outgoing edges, while a graph represented by an adjacency matrix supports constant time edge queries but uses more space. These choices are not implementation details. They are part of the algorithmic contract.

The output model is equally important. Some algorithms return a value, such as a length, a count, or a minimum cost. Others return a witness, such as a path, matching, schedule, or subset. A witness output carries more information and often requires extra bookkeeping. For example, computing the length of a shortest path is simpler than reconstructing the path itself.

Problem

You have a problem statement, but the representation of the input and output is unclear. You need to choose a model that supports both a clean algorithm and a correct complexity analysis.

Solution

Write the input and output model explicitly before designing the algorithm.

Use this template:

Input model:
  Data:
  Access operations:
  Size parameters:
  Guarantees:

Output model:
  Result type:
  Acceptance condition:
  Witness requirements:
  Ordering or tie-breaking:

For an array problem:

Input model:
  Data: array A[0..n-1] of integers
  Access operations: read A[i] in O(1)
  Size parameters: n
  Guarantees: n >= 1

Output model:
  Result type: integer
  Acceptance condition: equals the maximum value among all valid subarray sums
  Witness requirements: none
  Ordering or tie-breaking: not applicable

For a graph problem:

Input model:
  Data: directed graph G = (V, E), stored as adjacency lists
  Access operations: iterate outgoing neighbors of v in O(outdegree(v))
  Size parameters: |V| = n, |E| = m
  Guarantees: vertices are labeled 0..n-1

Output model:
  Result type: list of vertices
  Acceptance condition: a shortest path from s to t
  Witness requirements: the list must describe consecutive edges in G
  Ordering or tie-breaking: any shortest path is acceptable

Discussion

The input model controls what counts as a primitive step. If an algorithm says “look up the middle element”, that is constant time for an array and linear time for a linked list. If an algorithm says “test whether edge (u, v) exists”, that is constant time for an adjacency matrix, expected constant time for a hash set of edges, and linear in the degree of u for a plain adjacency list.

A careful output model prevents under-specification. Suppose a shortest path algorithm returns distance only. Its output type can be a number or infinity. If the same algorithm must return the actual path, it must store predecessor information during the search. The complexity may remain the same asymptotically, but the implementation changes.

Tie-breaking should be stated whenever many outputs may be valid. If any shortest path is allowed, the algorithm can stop as soon as it proves optimality. If the lexicographically smallest shortest path is required, the traversal order and data structures become part of correctness.

Example: Maximum Element

The problem “find the maximum element” depends on the model.

For an unsorted array:

Input:
  array A[0..n-1]

Output:
  value x such that x >= A[i] for every i

A correct algorithm must inspect every element in the worst case. Otherwise, an unseen element could be larger than all inspected elements.

max_value(A):
  best = A[0]
  for i = 1 to n - 1:
    if A[i] > best:
      best = A[i]
  return best

The time cost is (O(n)), and the extra space cost is (O(1)).

For a sorted array in nondecreasing order:

Input:
  sorted array A[0..n-1]

Output:
  maximum value in A

The maximum is immediately available as (A[n-1]). The time cost is (O(1)), assuming constant time indexing.

The problem did not change mathematically. The input model changed.

Example: Graph Reachability

Consider the problem “determine whether t is reachable from s”.

With adjacency lists, breadth-first search or depth-first search runs in:

[ O(n + m) ]

where (n) is the number of vertices and (m) is the number of edges.

With an adjacency matrix, scanning all possible outgoing edges from each visited vertex costs:

[ O(n^2) ]

even when the graph has few edges.

The algorithmic idea is the same. The representation changes the cost model.

Common Pitfalls

Do not hide expensive operations inside words such as “find”, “lookup”, or “choose”. State the data structure that makes the operation cheap, or count its cost explicitly.

Do not return a witness unless the algorithm stores enough information to construct it. A distance table alone may prove the value of a shortest path, but it may not identify the path.

Do not assume random access when the input is streamed. Streaming input can usually be read once in order. Many algorithms that are simple on arrays become impossible or require more memory in a streaming model.

Do not ignore output size. An algorithm that prints all pairs, all subsets, or all paths may require large time simply because the output is large.

Takeaway

An input and output model fixes the meaning of primitive operations, the form of the result, and the cost of producing that result. Before choosing an algorithm, state the representation of the data and the required shape of the answer.