Posts

OLTP/OLAP and Tuple Storage Schemes


Database Workloads

On-Line Transaction Processing (OLTP) is a type of workloads that DBMS tends to (1)focus on writing more than reading and/or (2)deal with small amount of data each time.

While On-Line Analytical Processing (OLAP) is another type that DBMS tends to (1)focus more on reading and/or (2)access amount of data each time.

There is another type named Hybrid Transactional/Analytical Processing (HTAP) that hybrids these two types of workloads togetger.

Tuple Storage Schemes

There are two common schemes of tuple storage: column-oriented storage and row-oriented storage.

Let’s say we have the following tuples (logically):

# (name, age, sex, department)
[
  ("Anna", 25, "F", "Comp. Sci."),
  ("Betty", 23, "F", "Software Eng."),
  # ...
  ("Candy", 24, "F", "Software Eng."),
  ("David", 24, "M", "Comp. Sci."),
  # ...
]

In row-oriented storage scheme, these tuples are stored in almost the same way:

# Block 1
[
  ("Anna", 25, "F", "Comp. Sci."),
  ("Betty", 23, "F", "Software Eng."),
  # ...
],
# Block 2
[
  ("Candy", 24, "F", "Software Eng."),
  ("David", 24, "M", "Comp. Sci."),
  # ...
]

While in column-oriented storage scheme, values of a single attribute are stored together:

# Block 1 (for attribute "name")
[
  "Anna", "Betty", # ...
  "Candy", "David", # ...
]
# Block 2 (for attribute "age")
[
  25, 23, # ...
  24, 24, # ...
]
# Block 3 (for attribute "sex")
[
  "F", "F", # ...
  "F", "M", # ...
]
# Block 4 (for attribute "department")
[
  "Comp. Sci.", "Software Eng.", # ...
  "Software Eng.", "Comp. Sci.", # ...
]
Row-OrientedColumn-Oriented
Pros1. Reduce IO to insert/update/delete tuple
2. Reduce multiple IO to access an entire tuple
1. Each block contains more values of an attribute (good for queries only access a subset of attrbutes)
2. Column level compression
Cons1. Waste IO when a few attributes are needed
2. Waste IO to scan large poritions of data
1. Multiple IO caused by insert/update/delete
2. Reduce CPU time to reconstruct tuples and/or decompress columns
Works ForOLTPOLAP