Memcache
Key Point: Consistency Hashing
Dynamo
Configurable Consistency
- R = Minumum number of nodes that participate in a successful read
- W = Minumum number of nodes that participate in a successful write
- N = Replication factor
- If R + W > N, you can claim consistency
- But R + W < N means lower latency
CouchDB
BigTable
Other Google Systems
- HBase
- …
Pig
Data Model
- Atom: Integer, string, etc
- Tuple:
- Sequence of fields
- Each field of any type
- Bag:
- A collection of tuples
- Not necessarily the same type
- Duplicates allowed
- Map:
- String literal keys mapped to any type
Example
t = <1, {<2,3>,<4,6>,<5,7>}, [‘apache’:’search’]>
- f1: atom
- f2: bag
- f3: map
expression | result |
---|---|
$0 | 1 |
f2 | Bag{<2,3>,<4,6>,<5,7>} |
f2.$0 | {<2>,<4>,<5>} |
f3#'apache' | Atom "search" |
sum(f2.$0) | 2+4+5 |