Sep 24, 2015 - Notes - Feral Concurrency Control


This post is a compilation of my notes on Peter Bailis et al’s SIGMOD’15 paper: Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity.

Background

Modern relational DBMSs offer a range of primitives to help the developer ensure application integrity, even in the presence of concurrency: built-in integrity constraints (e.g: UNIQUE, NOT NULL, FOREIGN KEY), triggers, stored procedures to implement application-specific integrity constraints, optimistic/pessimistic locking of rows, serializable transactions, etc. However, modern web application frameworks (e.g: Ruby on Rails), which promote ORM-based programming, refrain from using most of the primitives provided by the DB layer, instead choosing to reimplement them in the application layer.

Consider Ruby on Rails, for instance, which is an MVC web framework. Ruby on Rails lets application developers specify their data model (the M in MVC) as a set of classes that inherit from Rails' native ActiveRecord class. The data model can also contain application integrity constraints in the form of validations and associations. For example, presence and uniqueness are a couple of validations, and belongs_to and has_many are a couple of associations supported natively by Rails. Rails implements object-relational mapping (ORM) by automatically mapping the classes (resp. objects) that constitute the data model (resp. data) to relations (resp. tuples) in the underlying DB, provided that the application code adheres to some pre-defined naming conventions. However, validations and associations are not mapped to corresponding integrity constraints at the DB level. For instance, the uniqueness validation is not mapped to MySQL's UNIQUE primitive; instead, Rails implements it by performing a SELECT query and making sure that the record being inserted is not already present (all this is done within a transaction). Likewise, instead of mapping the belongs_to association to the database's native FOREIGN KEY constraint, Rails implements it by running a SELECT WHERE query to check that the key being referred to is indeed present.
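
For concreteness, here is a minimal sketch of what such a feral uniqueness check amounts to at the SQL level, written against Python's sqlite3 module purely for illustration (this shows the general pattern, not Rails' actual implementation):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def insert_if_unique(conn, email):
    # Feral uniqueness validation: SELECT inside a transaction, then INSERT
    # only if no matching record was found.
    with conn:  # the connection context manager wraps this in a transaction
        row = conn.execute(
            "SELECT 1 FROM users WHERE email = ?", (email,)).fetchone()
        if row is not None:
            return False        # validation failed: record already exists
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True

print(insert_if_unique(conn, "a@example.com"))   # True
print(insert_if_unique(conn, "a@example.com"))   # False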

Eschewing native database integrity and concurrency control solutions, Rails has developed a set of primitives for handling application integrity at the application level itself, thereby building, from the perspective of the underlying DB, a feral concurrency control system. The motivation behind this apparent irrationality is to enhance maintainability of the system, and to facilitate testing. Let me elaborate:

  • Maintainability: The set of available integrity and concurrency control mechanisms at the database layer depends on the data model employed by the underlying database, and also, in some cases, on the vendor. For instance, while the relational data model supports foreign key constraints as a means to implement referential integrity, the data models employed by weakly consistent data stores do not. Even among relational data stores, some (e.g: PostgreSQL) support foreign key constraints, while others (e.g: MySQL's MyISAM and NDB storage engines) do not. Likewise, the CHECK constraint used to enforce domain-specific integrity constraints, such as bal ≥ 0, is supported by PostgreSQL, but not by MySQL. Furthermore, besides standard validations and associations, such as uniqueness and foreign key constraints, Rails allows developers to define application-specific validations and associations, for which the corresponding native primitives may not exist, and even if they exist, it is not clear how to do the mapping. To make matters worse, DB systems usually silently ignore any constraint that they do not support. Therefore, an application that relies on the DB's native constraints to enforce integrity risks correctness, is not portable, and is consequently difficult to maintain.
  • Testing: The database's referential integrity constraints “get in the way of” testing by insisting that test data be inserted into (or deleted from) the database in a specific order, such that integrity is never violated. In many cases, developers want to simply dump sample data (known to be consistent) into the database and get on with testing their application. Requiring that such data still be inserted and deleted in an order that respects referential integrity might be overkill.

The aforementioned reasons, along with some personal convictions of its creators, have motivated Rails to eschew concurrency and integrity control mechanisms at the database layer, and to view the database simply as table storage. This approach has been hugely successful for Rails, as evident from its large-scale adoption.

Replacing database-native primitives with feral mechanisms may improve the maintainability of the application, but does it really work? Are feral invariants correctly enforced in Rails? Do they work in practice? The paper performs theoretical analysis and empirical studies to answer these questions.

How prevalent are feral mechanisms in practice?

They are quite prevalent. Analyzing a corpus of 67 popular open-source web applications, the authors found that applications, on average, used just 3.8 transactions, against 52.4 validations and 92.8 associations (the median might have been a better metric). The overwhelming number of associations relative to transactions indicates that Rails developers use associations to perform referential-integrity-preserving insertions into multiple tables, which would otherwise be performed inside a transaction.

Rails Concurrency Control Mechanisms

Implementing validations (e.g: uniqueness) and associations (e.g: belongs_to) in the application layer is trivially safe in the sequential setting. But, in the presence of concurrent client requests attempting to modify the database simultaneously, some concurrency control mechanism is needed in order to ensure that validations and associations are indeed enforced correctly. Otherwise, data integrity is jeopardized. Towards this end, Rails does the following:

  • When a model's state is being updated (e.g., during an insertion), Rails runs all the validations on the model sequentially, in a (database-backed) transaction. The rationale is to rely on the ACID guarantees of the transaction to ensure the correctness of the validation. For instance, the uniqueness validation (which issues a SELECT query to the database) is performed within a transaction so as to preempt concurrent insertions that may result in duplicates.
  • Associations are validated in a similar way: by enclosing the corresponding checks inside a transaction.

If transactions are serializable as expected, then validations are indeed safe. However, databases do not offer serializable transactions by default (the default isolation level in PostgreSQL is RC; in MySQL (InnoDB), it is RR), and, in some cases, they do not offer serializability at all. Given the possibility of concurrent transactions under weaker isolation levels (e.g., RC ensures that visible transactions are entirely visible, but it doesn't guarantee a total order among transactions), Rails validations may not really ensure the validity of data. Examples:

  • Concurrent uniqueness validations can independently succeed, thus allowing concurrent duplicate insertions.
  • Association (e.g: foreign key) checking in one transaction, and record deletion in a concurrent transaction can independently succeed, thus compromising referential integrity.

Rails acknowledges these anomalies. The documentation warns that the uniqueness validation can still occasionally admit duplicates, and that association validation can occasionally result in violations of referential integrity.
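
The first anomaly is easy to reproduce outside of any database or framework. Below is a toy Python illustration (mine, not Rails code) of the check-then-insert race; the sleep merely widens the window in which a concurrent request can sneak in:

import threading, time

users = []                      # stands in for the users table
def feral_insert(email):
    if email not in users:      # the "SELECT ... WHERE email = ?" check
        time.sleep(0.01)        # a concurrent request can run in this gap
        users.append(email)     # the INSERT

t1 = threading.Thread(target=feral_insert, args=("a@example.com",))
t2 = threading.Thread(target=feral_insert, args=("a@example.com",))
t1.start(); t2.start(); t1.join(); t2.join()

print(users)                    # ['a@example.com', 'a@example.com']

Under a weak isolation level, two concurrent transactions can interleave their checks and inserts in exactly this way.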

How likely are these anomalies in practice?

Not very likely. For instance, in a LinkBench workload capturing Facebook's MySQL record access patterns, with 64 concurrent clients issuing 100 insert requests per second on 1000 different keys, an average of 10 duplicate records was observed with the uniqueness validation turned on. In less adversarial production workloads, we may see far fewer duplicates, or perhaps none at all. Therefore, it may be argued that validations are “good enough” for many web applications, where correctness is not a top priority.

However, concurrency is not the only reason for incorrect application behaviour. Quite often, anomalies may also result from an incorrect implementation of a validation/association in the Rails framework, or from non-trivial interactions between various application-level validations that Rails developers haven't foreseen. For instance, in Rails, the delete_all operation, unlike the delete operation, doesn't create objects for the rows it is deleting, and thus fails to perform validations on the data being deleted (note: validations are performed only when a data model object is updated). This can result in violations of referential integrity, which in turn surface as HTTP error messages shown to users, as in the case of thoughtbot, a well-known Rails consultancy. Thoughtbot has since started relying on PostgreSQL's foreign key constraints to enforce referential integrity. Similar experiences have prompted a few other Rails users to start a forum called Rails devs for data integrity (ref) that advocates strengthening Rails validations with database-backed checks.

What can be done about the anomalies?

Most databases natively support the UNIQUE constraint, which ensures the absence of duplicates at little to no extra cost. It is therefore a shame if one is forced to choose an incorrect implementation over a similarly performing correct one just for software engineering reasons. Can something be done about this?
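
Continuing the earlier sqlite3 sketch, pushing the invariant down into the database means the duplicate is rejected no matter how requests interleave (again, an illustration, not a prescription for any particular store):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

conn.execute("INSERT INTO users (email) VALUES (?)", ("a@example.com",))
try:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("a@example.com",))
except sqlite3.IntegrityError as e:
    print("duplicate rejected by the database:", e)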

One solution is to insist that all validations be performed inside serializable transactions (i.e., choose serializability instead of the default RC or RR isolation level at the database layer). This trivially ensures the correctness of all feral implementations. Unfortunately, serializability comes at the cost of availability, which is more important in the context of web applications. Moreover, the study finds that 75% of application invariants do not need serializability for correct enforcement. Imposing serializability by default is therefore unjustified.

The paper concludes that there is currently no database-backed solution that “respects and assists with application programmer desires for a clean, idiomatic means of expressing correctness criteria in domain logic”. The authors believe that “there is an opportunity and pressing need to build systems that provide all three criteria: performance, correctness, and programmability.” To domesticate feral mechanisms, the authors argue, application users and framework authors need a new database interface that will enable them to:

  1. Express correctness criteria in the language of their domain model, with minimal friction, while permitting their automatic enforcement. Any solution to domestication must respect ORM application patterns and programming style, including the ability to specify invariants in each framework’s native language. An ideal solution for domestication should provide universal support to applications’ feral invariants with no additional overhead for application developers.
  2. Only pay the price of coordination when necessary. As already discussed, serializable transactions should be used only when they are actually needed.
  3. Easily deploy the framework/application across multiple database backends. For example, the interface must allow Rails to talk to a relational store, as well as a key-value store.

Notes

“Rails can’t be trusted to maintain referential integrity, but you know what’s really good at doing that? Our relational database.” - thoughtbot

“Even when both systems are configured to one of the strict levels of transaction locking, the differences between the two implementations are subtle enough that which implementation will work better for a particular application is hard to state definitively.” - PostgreSQL wiki


Jul 31, 2015 - Atomicity vs Isolation

From the perspective of a transaction,

  • Isolation: How I should see the effects of other transactions.
  • Atomicity: How other transactions should see my effects.

Jul 25, 2015 - ML Type Inference

One of the most useful features of ML-family languages (OCaml, Standard ML, F#, etc.) is type inference. This post contains the notes I took when I was trying to understand the foundational principles of ML type inference.

Damas-Milner Algorithm for Type Inference

Let us take a look at the original type inference algorithm produced by Damas and Milner. Their language is the lambda calculus extended with a let expression:

e ::= x | λx.e | e e | let x=e in e 

The type language is composed of a set of type variables (α) and primitive types (ι):

τ = α | ι | τ → τ
σ = τ | ∀α.σ

A type substitution has the form [τ1/α1, τ2/α2, ... , τn/αn]. Note the distinction between α used as a type and α used as a type variable.

We define a relation (≥) that relates a more polymorphic type to a less polymorphic type. For eg, ∀α.α→α ≥ ∀β.β list → β list.

Typing judgment is Γ⊢e:σ.

The instantiation rule is also a subtyping rule, where subtyping is defined in terms of “more polymorphic”:

Γ⊢e:σ₁   σ₁≥σ₂
--------------
    Γ⊢e:σ₂

Since there is no explicit big-lambda expression, the generalization rule is important:

Γ⊢e:σ  α∈FV(σ)  α∉FV(Γ)
-----------------------
       Γ⊢e:∀α.σ

Basically, ∀α.σ is the closure of σ w.r.t. Γ (denoted \(\bar{\Gamma}(\sigma)\)).

Let Polymorphism

There is only one rule that introduces a polymorphic type into the context Γ:

Γ⊢e:σ       Γ,x:σ⊢e':τ 
----------------------
  Γ⊢(let x=e in e'):τ
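
As a worked example (mine, not from the paper), here is how these rules type let id = λx.x in id id:

Γ ⊢ λx.x : α→α                              (α fresh)
Γ ⊢ λx.x : ∀α.α→α                           (generalization, α ∉ FV(Γ))
Γ, id:∀α.α→α ⊢ id : (β→β)→(β→β)             (instantiation, α := β→β)
Γ, id:∀α.α→α ⊢ id : β→β                     (instantiation, α := β)
Γ, id:∀α.α→α ⊢ id id : β→β                  (application)
Γ ⊢ (let id=λx.x in id id) : β→β            (let rule)

The let rule puts the polymorphic type ∀α.α→α into the context, so the two occurrences of id can be instantiated at different types.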

May 10, 2015 - SAT solving puzzles

A few months ago, the Cheryl birthday puzzle was an internet phenomenon. If you found the puzzle tricky to solve, we are in the same boat. However, if you observe closely, the puzzle only requires us to apply simple logic; not number theory or complex arithmetic, just simple logic. What, then, makes the puzzle tricky?

I attribute its trickiness to the chain of reasoning involved - the repeated application of simple logic to existing knowledge to make new deductions until we arrive at the solution. Since the reasoning is quite involved, it would be great if we could enlist the services of a computer to help us in the process. But how can a computer assist us in making deductions? The answer lies in the well-known SAT problem.

Knights and Knaves Puzzle

Consider the following simple instance of the famous Knights and Knaves class of puzzles:

There is an island in which certain inhabitants, called knights
always tell the truth, and others, called knaves always lie. It is
assumed that every inhabitant of this island is either a knight or 
a knave. 

Suppose A says, “Either I am a knave or B is a knight.” 

What are A and B?

The puzzle is easy to solve mentally. Nonetheless, if you have a 2-SAT solver handy, you can also let the solver do the work. Here is how:

Let AisKnight and BisKnight be boolean variables, denoting the propositions “A is a Knight”, and “B is a Knight”, respectively. Since an inhabitant can either be Knight or Knave but not both, boolean formulas ¬ AisKnight and ¬ BisKnight denote the propositions that “A is a Knave”, and “B is a Knave”, respectively.

Now, consider the statement made by A: “Either I am a knave or B is a knight”. Symbolically, it is equivalent to: (¬ AisKnight) ∨ BisKnight. Since Knights always speak truth, and Knaves always lie, the statement is true if A is a Knight, and false if A is a Knave. In other words, the statement holds if and only if A is a Knight. Symbolically, it means:

AisKnight ⇔ (¬ AisKnight) ∨ BisKnight

Observe that the above boolean formula completely describes the constraints imposed by the puzzle, and does so quite succinctly. If the above formula is satisfiable, then the constraints imposed by the puzzle are also satisfiable, which means that the puzzle has a solution. Furthermore, the satisfying assignment to the literals in the formula (i.e., their truth values) gives us the solution to the puzzle. Finding such satisfying assignments is precisely what the 2-SAT solver is for. If we feed the above formula to a 2-SAT solver, we get AisKnight = true and BisKnight = true as the satisfying assignment, telling us that A and B are both knights.

Note that the satisfying assignment only tells us that “if A is a knight, then the puzzle is solved”. Maybe the puzzle is also solved if A is a knave. How do we check for this possibility? Again, using the solver. We explicitly encode the constraint that A is a Knave, and ask if our constraints are satisfiable. The boolean formula is shown below:

(AisKnight ⇔ (¬ AisKnight) ∨ BisKnight) ∧ (¬ AisKnight)

A 2-SAT solver cannot find any satisfying assignment to the above formula, indicating that our constraints are unsatisfiable. Hence, A can never be a Knave.
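
If you want to run this on a machine, here is a small sketch using the Z3 solver's Python bindings (any SAT solver would do; the z3-solver package and the variable names are just my choices):

from z3 import Bool, Solver, Or, Not

AisKnight, BisKnight = Bool("AisKnight"), Bool("BisKnight")
statement = Or(Not(AisKnight), BisKnight)  # "Either I am a knave or B is a knight"

# A's statement holds if and only if A is a knight.
s = Solver()
s.add(AisKnight == statement)
print(s.check())    # sat
print(s.model())    # AisKnight = True, BisKnight = True

# Can A be a knave?  Add that constraint and check again.
s.add(Not(AisKnight))
print(s.check())    # unsat: A can never be a knave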

Now, let's take a step back and summarize what we have done. We were given a puzzle that is not difficult, but not obvious either. Instead of doing the reasoning ourselves, we simply described the puzzle as a boolean formula, and let a SAT solver do the reasoning for us. In other words, we have outsourced the reasoning to a solver. This facility to outsource some or all of the reasoning to a solver can be quite handy when we are solving logical puzzles that require more complex reasoning, like Cheryl's birthday puzzle.


May 4, 2015 - Notes - Static Contract Checking for Haskell

I latex’d some notes while reading Dana N. Xu et al’s POPL’09 paper: Static contract checking for Haskell.



Jan 15, 2015 - Notes - A Data-Driven Approach for Algebraic Loop Invariants

This post contains my notes on Rahul Sharma et al’s ESOP’13 paper: A Data-Driven Approach for Algebraic Loop Invariants.

The paper proposes a neat approach of inferring algebraic loop invariants by observing concrete program states, and making use of techniques from linear algebra to gain insights.

The following is an example taken from the paper:

assume (x=0 && y=0);
writelog (x,y);
while(nondet()) do
  y := y+1;
  x := x+y;
  writelog (x, y);

For the above example, we are interested in finding the loop invariant of the while loop. Assume we already know that the invariant can be non-linear, but no more complex than degree 2. Let \(f(x,y)\) denote a polynomial of degree less than or equal to 2. The equation \(f(x,y)=0\) is a polynomial equation, or algebraic equation, in the variables x and y. Since x and y are the only two variables in scope at the head of the loop, we know that the loop invariant will be of the form:

\[ f_1(x,y)=0 \;\wedge\; f_2(x,y)=0 \;\wedge\; \dots \;\wedge\; f_k(x,y)=0 \]

where each \(f_i\) is a unique polynomial of degree at most 2. A polynomial of degree 2 is simply a linear combination of monomials of degree at most 2. In this case, we have 6 unique monomials: \(\{1,x,y,x^2,y^2,xy\}\). Therefore, each polynomial will be of the following form:

\[ f(x,y) = w_1 \cdot 1 + w_2 \cdot x + w_3 \cdot y + w_4 \cdot x^2 + w_5 \cdot y^2 + w_6 \cdot xy \]

The invariant inference problem, then, is to infer the rational constants \(w_1,\dots,w_6\) such that \(f(x,y)=0\).

The paper proposes a data-driven approach to solve this problem. Informally, a loop invariant over-approximates the set of all program states that are possible at the loop head. For the above example program, the program state consists of the values of x and y. We instrument the code to observe the program state (i.e., the values of x and y) at the beginning of every iteration of the loop, when the program is executed on test inputs (user-provided or random). We record these observations in a log file.

Next, for every observed program state, we calculate the values of all 6 monomials and record them in a matrix, where a row denotes a program state (i.e., fixed values for x and y), and a column denotes the values of a fixed monomial across program states. For instance, observing the first four states (x,y) = (0,0), (1,1), (3,2), (6,3) of the example loop yields the datamatrix:

    1   x   y   x^2  y^2  xy
    1   0   0   0    0    0
    1   1   1   1    1    1
    1   3   2   9    4    6
    1   6   3   36   9    18

Now, any valid solution for \(w_1,\dots,w_6\) (hence for the loop invariant problem) should explain the observed program states. Therefore, the following has to hold:

\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 \\
1 & 3 & 2 & 9 & 4 & 6 \\
1 & 6 & 3 & 36 & 9 & 18
\end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w_5 \\ w_6 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\]

Observe that any solution to the above equation gives us an invariant that holds on the observed program states. However, it may not hold on all possible program states. Therefore, solving the above equation gives us candidate invariants, which we can then verify for validity. But how do we solve the above equation? This is where techniques from linear algebra help.

Let us denote the datamatrix by \(A\), and the matrix of coefficients (\(w_1,\dots,w_6\)) by \(W\). The matrix \(W\) has only one column, and is referred to as a column vector, or simply a vector. The set of all possible solutions for \(W\) (each of which is a vector) is called the null space of \(A\). Note that NullSpace(A) is a set of vectors. The span of a set of vectors \(\{x_1,\dots,x_n\}\) is the set of all vectors that can be expressed as a linear combination of \(\{x_1,\dots,x_n\}\). In other words, the span of a set of vectors is the entire vector space represented by those vectors. Observe that the span of NullSpace(A) is NullSpace(A) itself, as every linear combination of vectors in NullSpace(A) results in a vector \(x\) which satisfies \(Ax=0\). Hence, NullSpace(A) is a vector space. A basis of a vector space is a set \(B\) of linearly independent vectors such that every other vector in the space can be expressed as a linear combination of vectors in \(B\). For the 3-D cartesian vector space, the set \(\{(3,0,0), (0,2,0), (0,0,1)\}\) can be considered a basis. The cardinality of the basis is the dimension of the vector space, as is evident from the basis set of 3-D cartesian space.

For the matrix \(A\), let \(B_w\) denote the basis of NullSpace(A). What does the set \(B_w\) represent? Each vector in the set is a unique solution for \(W\) that cannot be expressed as a linear combination of other solutions. Further, any solution for \(W\) can be expressed as a linear combination of vectors in \(B_w\). Effectively, this means that by calculating the basis set of NullSpace(A), we derive all unique algebraic invariants of degree at most 2 that hold on the observed program states. The invariants are unique in the sense that any other invariant can be expressed as a linear combination of the polynomials denoted by the basis set.
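
Here is a small sketch of this guess step using numpy (my own illustration of the idea, not the paper's implementation):

import numpy as np

# Run the example loop on a test input and log the state (x, y) at the
# loop head on every iteration (this stands in for the writelog calls).
states, x, y = [(0, 0)], 0, 0
for _ in range(7):
    y += 1
    x += y
    states.append((x, y))

# Datamatrix A: one row per observed state, one column per monomial of
# degree <= 2 over {x, y}: 1, x, y, x^2, y^2, x*y.
A = np.array([[1, x, y, x * x, y * y, x * y] for (x, y) in states], float)

# Candidate invariants form the null space of A.  The right singular
# vectors whose singular values are (numerically) zero are a basis for it.
_, s, vt = np.linalg.svd(A)
basis = vt[s < 1e-8]

print(basis)
# Up to scaling, the single basis vector is (0, -2, 1, 0, 1, 0), i.e. the
# candidate invariant -2x + y + y^2 = 0, equivalently 2x = y^2 + y.

The candidate is then handed to the verifier described next; if it fails, the counterexample is added to the observations and the null space is recomputed.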

Once candidate solutions are generated by calculating the basis of the null space of \(A\), they are verified via symbolic execution and SMT solvers. A verification failure results in a counterexample, which is simply a valuation of x and y for which the candidate invariant fails. In that case, the loop is rerun with these values of x and y and more observations are recorded, thereby enlarging the datamatrix \(A\). The invariants generated by calculating the basis of the null space of the larger datamatrix will now definitely account for the counterexample generated by the solver in the previous step. This counterexample-guided iterative refinement (which they call \({\sf Guess-And-Check}\)) is similar in flavour to CEGAR, but there is one fundamental difference. Recall that CEGAR starts with a formula that over-approximates program behaviours, and gradually refines the formula until it is valid, or a bug is found in the program. In contrast, \({\sf Guess-And-Check}\) starts with an under-approximation of program behaviours (represented by the datamatrix), and adds more behaviours until we have a sufficient representation of all possible program behaviours. In this sense, \({\sf Guess-And-Check}\) is more similar to CEGIS. The only difference is that, while CEGIS relies on an SMT oracle to generate positive samples, \({\sf Guess-And-Check}\) relies on tests.

In practice, the number of possible monomials at the loop head can be large. The paper uses a heuristic to discard monomials that are unlikely to be part of the final invariant. The heuristic is that monomials which grow at a rate much larger than the loop counter \(i\) (i.e., at a rate \(\omega(i^d)\)) are most often red herrings.


Nov 30, 2014 - Sequential Consistency and Datarace freedom in Weak Memory Models

A natural view of execution of a multi-threaded program is as follows:

exn = []
while (there is an unfinished thread) {
  t = select an unfinished thread;
  instr = first(t);
  exn := exn ++ [instr];
  t := rest(t);
}
return exn;

Observe that exn preserves program order of all threads. This is sequential consistency. However, there is still non-determinism on line 3. When there are shared variables, this is the source of data-races. For example, consider the following program (assume that initially x is 0):

[
  [STORE (x,3)],
  [STORE (x,4); PRINT (LOAD x)]
]

Since there is non-determinism, either of the STOREs can be executed first. Possible sequentially consistent (SC) executions are:

1. [STORE(x,3); STORE(x,4); PRINT(LOAD x)]
2. [STORE(x,4); STORE(x,3); PRINT(LOAD x)]
3. [STORE(x,4); PRINT(LOAD x); STORE(x,3)]

Notice that STORE(x,3) conflicts with STORE(x,4) and also with LOAD x, as each pair accesses the same memory location and the STORE writes to it. Further, notice that there exists at least one pair of SC executions where one execution is obtained from the other by just reordering an adjacent pair of conflicting operations. Hence the program has a data-race. The above SC executions print 4, 3 and 4, respectively. Therefore, the set of acceptable outputs of a sequentially consistent execution (even in the presence of data-races) is {3,4} (note: the set does not include 0).
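
As a quick sanity check, here is a small Python sketch (mine, not from any reference) that enumerates all SC executions of the two-thread program above and collects the values that PRINT(LOAD x) can output:

def interleavings(threads):
    # Yield every interleaving that preserves each thread's program order.
    if all(not t for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            rest = [u[1:] if j == i else u for j, u in enumerate(threads)]
            for tail in interleavings(rest):
                yield [t[0]] + tail

program = [
    [("STORE", 3)],
    [("STORE", 4), ("PRINT_LOAD",)],
]

outputs = set()
for exn in interleavings(program):
    x = 0                          # initially x is 0
    for instr in exn:
        if instr[0] == "STORE":
            x = instr[1]
        else:                      # PRINT (LOAD x)
            outputs.add(x)

print(outputs)                     # {3, 4} -- 0 is not possible under SC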

Let us assume we have locks in our language. Then, the execution of a multi-threaded program can be constructed as follows:

exn = []
while (there is an unfinished thread) {
  do {
    t = select an unfinished thread;
  } while (first(t) = LOCK(c) and 
           exn = exn'++ [LOCK(c)] ++ exn'' where
           UNLOCK(c) notin exn'')
  instr = first(t);
  exn := exn ++ [instr];
  t := rest(t);
}
return exn;

Locks help us control the amount of non-determinism in the program. For example, for the following program:

[
  [LOCK(c); STORE (x,3); UNLOCK(c)],
  [LOCK(c); STORE (x,4); PRINT (LOAD x); UNLOCK(c)]
]

Possible executions are:

1. [LOCK(c); STORE(x,3); UNLOCK(c); 
    LOCK(c); STORE(x,4); PRINT(LOAD x); UNLOCK(c)]
2. [LOCK(c); STORE(x,4); PRINT(LOAD x); UNLOCK(c); 
    LOCK(c); STORE(x,3); UNLOCK(c)]

The only acceptable output of the program now is 4. Further, there are no adjacent conflicting operations in any of the above SC executions. So, the program is DR-free.

A data-race-free model guarantees SC only for data-race-free programs. A program is DRF if and only if it is impossible to construct an SC execution with concurrent conflicting memory accesses, where one of them is a STORE. (When do we say two memory accesses are concurrent?)


Sep 23, 2014 - SC vs Linearizability

Sequential consistency requires that all data operations appear to have executed atomically in some sequential order that is consistent with the order seen at every individual process.

If instead of individual data operations, we apply sequential consistency to transactions, the resultant condition is called serializability in database theory.

Linearizability imposes more constraints on the order decided by sequential consistency - the order needs to make sense to an external observer. Suppose an external observer is running two threads A and B. Thread A performs action a0, then syncs with thread B, following which B performs b0. A sequentially consistent execution can order b0 before a0, as long as both A and B perceive the same order. On the other hand, such an execution is not valid under linearizability. Linearizability dictates that an operation should appear to take effect instantaneously at some point before the perceived end of the operation (i.e., between its invocation and its response).

Evidently, linearizability is a stronger condition than sequential consistency (or serializability). Nevertheless, it is said that linearizability is a local property (i.e., composable), while SC is not. (I don't understand this. If you know what this means, please drop a comment.)


Sep 15, 2014 - Notes - McCarthy's Lisp and Reynolds's Definitional Interpreters

This post is a compilation of my notes on two landmark papers:

  1. The Lisp paper: McCarthy J, Recursive Functions of Symbolic Expressions and Their Execution by Machine, CACM’60
  2. Interpreters paper: Reynolds J C, Definitional Interpreters for Higher-Order Programming Languages, ACM’72

1. The Lisp Paper

This paper by John McCarthy introduces the Lisp programming language and describes its implementation on IBM 704 machines. Lisp stands out among the programming languages of its day (eg: Algol 60) as the first high-level programming language that abstracts the underlying machine completely. Instead, Lisp allows symbolic computation by letting programmers define (recursive) functions over symbolic expressions (called S-expressions). An S-expression is either an atomic symbol (hence “symbolic”), or an ordered pair of S-expressions. Reserving an atomic symbol NIL to denote an empty list, we can construct a list of S-expressions:

[s1,s2,s3]

using ordered pairs as:

(s1,(s2,(s3,NIL)))

Observe that the above is also an S-expression. Hence, a list of S-expressions is also an S-expression. Since most functions operate on lists of symbols, programming such functions in Lisp is effectively LISt Processing (hence the name “Lisp”).

Lisp defines five primitive S-functions to construct and deconstruct lists, and to check equality between atomic symbols:

  1. atom
  2. eq
  3. car
  4. cdr
  5. cons

Lisp, as defined in the paper, includes a class of expressions called meta-expressions (written M-expressions) that let us define and apply S-functions over S-expressions. An M-expression is either

  1. A function application. For eg, f[2; EMPTY] is an M-expression that denotes the application of the function f to the S-expressions (atomic symbols) 2 and EMPTY.
  2. A function definition with grammar: fname[x0, …, xn] = expr, and
  3. A conditional expression: [expr1a -> expr1b; …; exprna -> exprnb ]

Using M-expressions and primitive S-functions, we can define a subst function that substitutes an S-expression (x) for an atomic symbol (y) in another expression (z) as:

subst[x;y;z] = [
  atom[z] -> [
    eq[z;y] -> x;
    T -> z
  ];
  T -> cons [subst[x; y; car[z]]; 
             subst[x; y; cdr[z]]]
]

This example demonstrates that while M-expressions are control structures (functions & conditional expressions), S-expressions represent data to be manipulated by M-expressions.

The definition (semantics) of Lisp is given as an interpreter program written in Lisp itself. McCarthy's Lisp interpreter, implemented as mutually recursive eval and apply functions, is often cited as an example of metacircular evaluation - defining the evaluation semantics of an embedded language in terms of the evaluation semantics of the host language, which happens to be the same as the embedded language. Interestingly, McCarthy defines Lisp as a dynamically scoped language (Paul Graham says so, but is it really? Reynolds's 2nd interpreter, which he says is similar to McCarthy's, is lexically scoped.).

In order to use the metacircular Lisp interpreter to interpret a Lisp program, the program first has to be converted to data. In other words, all M-expressions need to be translated to S-expressions (note: this means that M-expressions are only syntactic sugar). A simple algorithm to perform this translation is described in the paper. The algorithm converts all function names and their arguments in the user's Lisp program (which are meta-variables) to atomic symbols. The interpreter (the eval function, to be specific) maintains an environment which defines bindings for these atomic symbols. However, note that the original program may also contain atomic symbols, and these symbols have no attached interpretation. To distinguish between atomic symbols that are present in the user program and atomic symbols that are introduced while converting M-expressions to S-expressions, the former set of atomic symbols are quoted.

As mentioned previously, Lisp is the first language that abstracted the underlying machine completely. In order to be able to do that, the run-time should be able to manage scarce machine resources, such as memory. Towards this end, Lisp introduced automated memory reclamation, which is now commonly known as “Garbage Collection”.

2. Definitional Interpreters Paper

The Lisp paper defines the semantics of Lisp using a metacircular interpreter written in Lisp itself. The definitional interpreters paper by Reynolds exposes subtle aspects of such definitional interpreters. In particular, it shows how the semantics of the embedded language (called the defined language) can depend on the semantics of the host language (called the defining language), such as its treatment of higher-order functions and its order of function application. To demonstrate this point, the paper first constructs a metacircular interpreter for a higher-order (defined) language in another higher-order (defining) language, by simply translating an expression (eg: a function application) in the defined language into the corresponding expression in the defining language. The paper makes two observations about this interpreter:

  1. The nature of higher-order functions is not clear.
  2. The order of application is not explicit; the order of evaluating function applications in the defined language depends on whether the defining language is call-by-value or call-by-name.

Subsequently, Reynolds proposes two transformations to make the treatment of higher-order functions and the order of function application explicit in the semantics of the defined language:

  1. Defunctionalization: A function in the defined language is not a function in the defining language. Instead, functions and function closures are represented explicitly as data structures (for eg, as S-expressions in the Lisp interpreter), and the semantics of a function application is defined explicitly, as a transformation over data. Defunctionalization effectively lets us write a definitional interpreter for a higher-order language in a first-order language (a small sketch follows this list).
  2. CPS transformation: A continuation is a function that explicitly represents the “rest of the computation”. At every atomic step of evaluation in a definitional interpreter, if we explicitly describe what the “rest of the evaluation” is, then the semantics of the defined language no longer depends on the order of evaluation of the defining language; the order of evaluation is now explicit in the interpreter. Such an interpreter is said to be written in continuation-passing style (CPS).
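
Here is a hedged Python sketch (mine, not from the paper) of defunctionalization for a tiny defined language: closures are plain data, and the meaning of application is spelled out as a transformation over that data rather than delegated to the host language's functions.

# Defined-language terms: ("var", x) | ("lam", x, body) | ("app", f, arg)

def evaluate(term, env):
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "lam":
        # A closure is just a record; no host-language function is involved.
        return ("closure", term[1], term[2], env)
    if tag == "app":
        return apply_fn(evaluate(term[1], env), evaluate(term[2], env))

def apply_fn(clo, arg):
    # The semantics of application, made explicit as data manipulation.
    _, param, body, env = clo
    return evaluate(body, {**env, param: arg})

# (λx. x) applied to (λy. y) evaluates to the closure for λy. y.
identity = ("lam", "x", ("var", "x"))
print(evaluate(("app", identity, ("lam", "y", ("var", "y"))), {}))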

The paper also describes how imperative features, such as assignment, can be encoded explicitly in a definitional interpreter written in an applicative style.


Sep 9, 2014 - CAP Theorem and Related

My intention in writing this note is to understand the relation between the conventional model of distributed systems that is usually taught in a distributed systems course, and the distributed web services hosting replicated data types. Fault tolerance is a concern in the former, and it is studied separately from communication link failures. On the other hand, partition tolerance is a concern in the latter, and it appears that this means tolerance to communication failures. Consistency and availability are major concerns in distributed web services, whereas they don't figure anywhere in the formal study of distributed systems. What is the connection? How can we express the CAP theorem in conventional models?

The System Model

The conventional model of a distributed system is that of a replicated state machine (automaton), where the automaton models a local computation that reacts to external messages. This makes it clear that the focus in the conventional model is on how to perform a replicated computation rather than on how to maintain replicated data.

[Image: TRLifeCycle]

The above image is taken from the Reliable Distributed Systems textbook.

This view is also corroborated by Eric Brewer in his PODC 2000 talk, where he conjectured CAP. Here are some excerpts from his slides:

Inktomi builds distributed systems ... but very little use of
classic DS research.

"Distributed Systems" don't work

Persistent state is hard. Classic DS focuses on computation, not
data. This is WRONG; computation is the easy part.

The semantics of a distributed system that is intended to maintain replicated data is best captured in Burckhardt et al's POPL'14 paper. Since we are familiar with that model, I am not going to reproduce it here.

In the conventional model, the process is the unit of communication and the unit of failure. It is assumed that there is a one-to-one communication link between every pair of processes in the system. The conventional model allows for processes to fail. The widely used model of process failure is crash-stop, where a process crashes, stops, and therefore becomes inaccessible. A communication link may also fail, but this failure shows up as a process failure - the processes at both ends of the link perceive each other as failed processes. However, other processes may still view them as correct processes, and the conventional model allows for this disparity in processes' views of which other processes have failed.

Replicated data stores are usually geo-distributed, so one-to-one links between processes do not exist. Processes are organized into subnets, which are often connected through a single communication link. Therefore, in the distributed web services model, a communication link is considered a unit of failure. Failure of a link can cause the network to be partitioned into disconnected subnets, and web services are often required to tolerate these partitions. For most purposes, a process failure can also be modeled as network partitioning, by simply assuming that the failed process is in its own partition. In summary, in the distributed web services model, faults occur in communication links, thereby leading to network partitions, and fault tolerance effectively means partition tolerance.

Timing Assumptions

Both conventional and current models of a distributed system are asynchronous models - they do not make any assumptions about time bounds on communication delays or (relative) process speeds. In both the models, we rely on logical clocks (eg: vector clocks) and causal ordering to understand the behaviour of the system with respect to passage of time.

Consistency & Availability

Ideally, a distributed web-service has to be consistent in the sense that it should behave the same as if it is operating on an atomic data object. Any operation performed on the data object should see the effects of all previous operations. For example, consider a web-service for a simple read/write register. The operations allowed are write and read, which write a value to the register and read the current value in the register, respectively. Let us assume that the client performs two successful writes of values 20 and 40 to the register. Then a subsequent read should return 40, failing which the system is deemed inconsistent. This kind of consistency guarantee is called strong consistency or sequential consistency. Under this consistency guarantee, there must exist a total order on all operations such that each operation looks as if it were completed at a single instance.

Along with being consistent, a distributed web-service needs to be available - every request made to the web-service should be met with a response, given that the web-service remains accessible on the network. To be available, no non-failing node in the distributed system implementing the web-service should wait infinitely for an event to occur before responding to a client request.

CAP theorem says that it is impossible to guarantee (strong) consistency and availability in a system that needs to be partition tolerant. First, let us see if this makes sense intuitively. Consider the read/write register web-service described previously. Assume that it is being implemented by a distributed system with two geo-distributed nodes, each holding a replica of the register. Consider a client which makes following requests to the web-service in the order shown:

1. write 20
2. write 40
3. read

The order is called the session order. Let us tag each request with its serial number in session order. Assume that write 20 goes to the first node, which immediately writes 20 to the local register and simultaneously forwards the request to the second node. Now, should the first node wait for an acknowledgement from the second node before responding to the client? Given that we are operating in an asynchronous environment and there is no time bound on the delivery of the acknowledgement, the wait time for the first node could potentially be infinite. This leads to a violation of the availability guarantee. Therefore, the node should respond immediately to the client, and use timeouts and retransmissions to eventually propagate the client request to the second node.

Consider a scenario where the second write request (write 40) also goes to the first node, but the read request goes to the second node. Assume that by the time the read request is made to the second node, it has already received the write 20 request forwarded by the first node. Looking at the serial numbers (1 for write 20 and 3 for read), the second node knows that there is a request made by the client before the read that it has not yet received. This missing request could be a write (it is indeed a write in this case). Therefore, if the node has to respond with a correct value for the read request, it has to wait until it receives the missing request, in which case it might have to wait infinitely, as the network may get partitioned in the meantime. This violates availability. The other option is to respond with the current value of the register without waiting for the missing request to be delivered. In this case, the client reads 20 instead of the expected 40, which means that the system is no longer (strongly) consistent. Therefore, it is impossible for our service to be both available and consistent in the presence of network partitions.
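
A toy Python sketch of this scenario (my own illustration, not from any of the papers): two replicas of the register, asynchronous forwarding, and a read that arrives before a forwarded write has been delivered.

class Replica:
    def __init__(self):
        self.value = 0

    def write(self, value):
        self.value = value
        return "ok"          # respond immediately: stay available

    def read(self):
        return self.value    # answer from local state: may be stale

n1, n2 = Replica(), Replica()
pending = []                 # forwarded requests not yet delivered to n2

# Client session: write 20, write 40, read.
n1.write(20); pending.append(20)   # forwarded to n2 asynchronously
n1.write(40); pending.append(40)

# Only the first forwarded write has been delivered when the read arrives.
n2.write(pending.pop(0))

print(n2.read())   # prints 20: the service stayed available, but the read
                   # is stale, so the system is not strongly consistent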

Consistency & Availability in the Conventional Model

The conventional model accounts for network partitioning through process failures - a faulty process that stops communicating with the rest of the processes in the system effectively leads to network partitioning. Fault tolerance is the ability of a system to behave in a well-defined manner once faults occur. There has been extensive research on fault tolerance in conventional models, but does this research include a study of consistency and availability properties in the presence of faults?

The answer is yes. They were studied under broad categories of safety and liveness properties, respectively. A safety property of a system is usually expressed as a set of legal system configurations, commonly referred to as an invariant. To be safe, a system has to always remain in the set of safe states as defined by the safety property. Consistency is a safety property as it restricts the set of observable states of the system. For the read/write register example, consistency dictates that the observable state of the register after the first two writes is the value 40. Any other state is inconsistent or unsafe.

On the other hand, a liveness property claims that some good thing will eventually happen during system execution. Liveness properties are eventuality properties - a traffic signal should eventually allow every car waiting at an intersection to pass through. Availability is a liveness property, as it requires every request to be eventually met with a response. In literature on fault tolerance, availability property is more commonly referred to as guaranteed service.

Recall that fault tolerance is the ability of a system to behave in a well-defined manner once faults occur. In his landmark survey paper on fault tolerance in distributed systems, Gärtner identifies four forms of fault tolerance based on which among the safety (S) and liveness (L) properties hold when faults occur in the system. The four are listed below:

  1. Masking fault tolerance: when both S and L hold in presence of faults
  2. Fail-safe: when only S holds
  3. Non-masking: when only L holds
  4. None (or fault intolerant): when none of them hold.

Gärtner's paper was published in 1999, before CAP was conjectured, and before eventually consistent data stores proliferated. The paper reflects the kinds of applications that researchers had in mind when studying distributed systems. These applications are quite different from replicated data stores, which explains the disparity between conventional models and current systems. Below, I reproduce some relevant excerpts verbatim from the paper, which are suggestive of this gulf:

  fail-safe fault tolerance and is preferable to non-masking
  whenever safety is much more important than liveness. An example
  is the ground control system of the Ariane 5 space missile project
  [Dega 1996]. The system was masking fault tolerance for a single
  component failure, but was also designed to stop in a safe state
  whenever two successive component failures occurred [Dega 1996].
  For the latter type of faults, the launch of the missile
  (liveness) was less important than the protection of its
  precious cargo and launch site (safety).

  ....

  In effect (of non-masking fault tolerance), the user may
  experience a certain amount of incorrect system behavior (i.e.,
  failures). For example, a calculation result will be wrong or a
  replication variable may not be up to date

  ....

  Research has traditionally focused on forms of fault tolerance
  that continuously ensure safety. This can be attributed to the
  fact that in most fault-tolerance applications, safety is much
  more important than liveness.

  ...

  For a long time nonmasking fault tolerance has been the “ugly
  duckling” in the field, as application scenarios for this type of
  fault tolerance are not readily visible. However, the potential of
  non-masking fault tolerance lies in the fact that it is strictly
  weaker than masking fault tolerance, and can therefore be used in
  cases where masking fault tolerance is too costly to implement or
  even provably impossible.

  ...

  (Talking about self-stabilizing programs, which are pre-cursors of
  eventually consistent programs) Examples show that such programs
  are quite difficult to construct and verify [Theel and Gärtner
  1998]. Also, their nonmasking nature has inhibited them from yet
  becoming practically relevant.

  ...

In summary, the paper says that masking fault tolerance, where both safety and liveness are preserved in the presence of faults, is the “strictest and most costly” form of fault tolerance, and that ensuring such tolerance is a “major area of research”. Instead, fail-safe fault tolerance is preferable for most practical applications.

The CAP Theorem

Now that we are aware of Gartner’s categorization of fault tolerance, we can state the CAP theorem simply as:

It is impossible to have masking fault tolerance in an unreliable
distributed system

In the words of Gilbert and Lynch, who gave the first proof of the theorem:

The CAP Theorem, in this light, is simply one example of the
fundamental fact that you cannot achieve both safety
and liveness in an unreliable distributed system
      - From "Perspectives on the CAP Theorem"

It should therefore be noted that the oft-reproduced formulation of the CAP theorem as “pick any two among Consistency, Availability and Partition Tolerance” is misleading at best. A better formulation is:

A distributed system that is network partition tolerant cannot be
consistent and available at the same time.

So, it is more about picking one among the two rather than picking two among the three.

Eventual Consistency & Its Impact on Availability

Bailis et al's paper on Highly Available Transactions: Virtues and Limitations classifies operations on a replicated data store as unavailable, sticky-available, and available. An operation, such as a write on a read/write register, that has no consistency requirements is classified as available. This is expected, as a write can be applied to any replica without any need to wait for an event, and the client can be informed of the success/failure of the write
