Data Structures

Data structures organize values so that operations on them are cheap. Picking the right structure usually beats micro-optimizing the wrong one.

Overview

Trade-offs are universal: structures fast at lookup are often slow at insertion; ordered structures cost more to maintain than unordered; cache-friendly contiguous layouts beat “asymptotically better” ones at small n.

Linear Structures

Array — O(1) index, O(n) insert/delete in middle. Cache friendly.
Dynamic array (vector / list) — amortized O(1) append.
Linked list — O(1) insert/delete given node; O(n) lookup. Poor cache behavior.
Stack — LIFO; push/pop O(1).
Queue / Deque — FIFO; ring buffer or linked list.
String — usually a specialized array; immutable in many languages.

Trees

Binary tree — generic recursive structure.
BST — ordered; O(log n) avg, O(n) worst (degenerate).
Balanced BST — AVL, Red-Black; guaranteed O(log n).
B-tree / B+ tree — high fan-out; backbone of databases & filesystems.
Heap — priority queue; insert & extract O(log n).
Trie — prefix search on strings.
Segment / Fenwick tree — range queries.

Hash-Based

Hash map / set — expected O(1) lookup, insert, delete.
Collision strategies: chaining, open addressing (linear, quadratic, double).
Load factor target: ~0.5–0.75 for open addressing, higher for chaining.
Bloom filter — probabilistic set; no false negatives.
HyperLogLog — cardinality estimation in O(1) memory.

Graphs

Representation: adjacency list (sparse) or adjacency matrix (dense).
Directed / undirected; weighted / unweighted; cyclic / acyclic.
Union-Find (DSU) — near-O(1) connectivity queries.

Choosing One

Need ordered iteration → balanced BST or sorted array.
Need fastest lookup by key → hash map.
Need range queries → B-tree, segment tree.
Need FIFO scheduling → deque or priority queue.
Need duplicate detection at scale → Bloom filter + DB.