We discuss the design of the Lasp runtime system -- implemented in Erlang -- aimed for high scalability and fault-tolerance. We challenge the reference architecture, and use cutting-edge techniques, with the ultimate goal of targeting clusters of 10k - 20k nodes
Talk objectives:
- Challenge assumptions of modern programming languages
Target audience:
- Experts, Language Designers
SlidesCloud-scale applications must be highly available and offer low latency responses while serving millions of users around the world. To meet this need, applications have to carefully choose a high performance distributed database. NoSQL-style data stores - providing high throughput and availability even under network partition - replaced in this setting traditional databases. They usually exhibit a low-level key-value API and expose data inconsistencies that arise due to asynchronous communication among the servers. It takes significant effort and expertise from programmers to deal with these inconsistencies and develop correct applications on top of these databases.
For example, consider that your application stores a counter which counts the number of ads displayed to a user. For scalability, the database replicates all data in different locations. What will be the value of the counter, when it is incremented at two locations at the same time? As an application programmer, you have to detect such concurrent updates and resolve conflicting modifications.
The Antidote datastore provides features that aid programmers to write correct applications, while providing high performance and horizontal scalability, from a single machine to geo-replicated deployments, with the added guarantees of causal Highly-Available Transactions, and provable absence of data corruption due to concurrency.
CRDT support:
Highly-Available Transactions:
For example, in a social networking application, a reply to some post should be visible to a user only after observing the post. Antidote maintains such relations by providing causal consistency across all replicas and atomic multi-object updates. Thus, programmers can program their application on top of Antidote without worrying about the inconsistencies arising due to concurrent updates in different replicas.
Geo-replication:
In this tutorial we will give a guided tour of Antidote’s API and will navigate you through its rich semantic features by hands-on demos.
Target audience:
When building applications that must operate in a distributed environment, application developers have to embrace the challenges of concurrency. One of the most difficult challenges of concurrency is maintaining consistency, when state can be accessed and modified by multiple actors.
Traditionally, developers have had two choices for dealing with the problem. The first approach is, under concurrent modification, one of the updates wins through arbitration: picking one of the concurrent values by some other metric, such as when the operation was performed. This approach is usually problematic because the arbitration mechanism is usually nondeterministic and leads to applications that are prone to race conditions or data loss. The second approach is storing multiple values and leaving “semantic resolution”, or choosing how to merge or reduce the values to a single value, to the application developer. Again, this approach is problematic because programming deterministic merge functions is non-trivial, ad-hoc, and error-prone.
Conflict-free Replicated Data Types (CRDTs) provide an alternative to this problem. CRDTs provide data structures that are equipped with deterministic merge functions that can be applied by the system automatically during concurrent modification. This mechanism alleviates the user from having to program complicated merge functions and allows the user to work with just single values: simplifying the task of concurrent programming.
During this tutorial, we introduce a open source reference implementation of CRDTs in Erlang, and walk through understanding where the concurrency issues arise and how CRDTs can be applied to
make your concurrent applications easier to program and understand.
Christopher Meiklejohn loves distributed systems and programming languages. Previously, Christopher worked at Basho Technologies, Inc. on the distributed key-value store, Riak. Christopher develops a programming language for distributed computation, called Lasp. Christopher is currently a Ph.D. student at the Université Catholique de Louvain in Belgium. GitHub:
cmeiklejohn
Twitter:
@cmeik