NoSQL

Memcache

Key Point: Consistency Hashing

Dynamo

Configurable Consistency

  • R = Minumum number of nodes that participate in a successful read
  • W = Minumum number of nodes that participate in a successful write
  • N = Replication factor
  • If R + W > N, you can claim consistency
  • But R + W < N means lower latency

CouchDB

BigTable

Other Google Systems

  • HBase

Pig

Pig Intro

Data Model

  • Atom: Integer, string, etc
  • Tuple:
    • Sequence of fields
    • Each field of any type
  • Bag:
    • A collection of tuples
    • Not necessarily the same type
    • Duplicates allowed
  • Map:
    • String literal keys mapped to any type

Example

t = <1, {<2,3>,<4,6>,<5,7>}, [‘apache’:’search’]>

  • f1: atom
  • f2: bag
  • f3: map
expression result
$0 1
f2 Bag{<2,3>,<4,6>,<5,7>}
f2.$0 {<2>,<4>,<5>}
f3#'apache' Atom "search"
sum(f2.$0) 2+4+5

Pig Functions


上一篇     下一篇