Data structures
- Continuously-allocated: array, matrix, heap, hash table
- Linked data structure: distinct chunks of memory connected by pointers. lists, trees, graph adjacent nodes.
Arrays
Fixed size, O(1) access, space efficient, dynamic array doubles when reaching limit. Java array max_len<=2^31-1 as researved
Linked List
Single/Doubly-linked list. O(1) insert/delete. extra memory for pointers. Not efficient for randomly access.
Stack and Queue
Stack: push, pop; LIFO.
Queue: enqueue, dequeue. FIFO. searches in graphs.
Dictionary structure:
BST:
L < M < R. Min: leftmost descedent; Max: rightmost descedent.
Searching: O(h) = ceil(O(logn)) if balanced. Insertion O(logn). Deletion O(logn) 0/1/2 children.
Use BST over hash table for:
- Range searches (closest element to some key)
- Traverse elements in sorted order
- Find predecessor/successor to element
Tree Traversal: O(n) All nodes.
DFS (depth first): Pre-order: root, left, right; In-order: left, root, right; Post-order: left, right, root
BFS (breadth first): Implemented using a queue, visit all nodes at current level, advance to next level (doesn't use recursion).
Java TreeMap, Red-Black Tree: root and all null leaves are black, both children of every red node are black, every simple path from a given node to any of its descendant leaves contains the same number of black nodes.
Splay Tree: recently accessed moved closer to top for quicker access again. Implement cache, GC algorithm. Worst h=n though.
AVL Tree: Re-balance if height differ >1. So insert/delete slower, but faster retrieval.
HashMap.
Heap is a Priority Queue
Fibonacci heap: deleteminO(logn). insert, findMin, decrease-key, merge all O(1)
insert: add to end of array, bubble up.
Graphs
m edges E and n vertices V.
Directed Graphs: inner city streets (one-way), program execution flows
Undirected graphs: family hierarchy, large roads between cities.
Traverse graph (BFS I DFS) use undiscovered, discovered, processed states.
Adjacency matrix: n * n matrix. [i,j] = 1 if is edge, o.w 0; fast update(insert/delete) and if 2 edge connects. too many spaces wasted though.
Adjacent lists: vertex, neighborLists.
Trie I prefix_tree: root: null string. each edge 1 char. each path -- string. for auto-completion I auto-correction.
Ternary search tree: Trie with nodes arranged as BST with up to 3 children, eq, lo, hi.
Radix Tree: more memory efficient (compressed) with parents with single child nodes merged.
Suffix Tree: fast find if S contains q.
B-Tree: balanced tree used by DBs that optimize disk IO.
Interval Tree:
DAWG: directed acyclic word graph representing subset of substrings of a string.
Bloom filter: Probabilistic data structure to tell if a member is in a set. False positives possible. Query: possible or sure not in!