Provisioning Sharding for Smart Contracts: A Design for Zilliqa
Note: ➤ We’re on Slack! Join our community, ask us questions, and get updated on the latest (and hopefully the greatest!)
Since the start of the Zilliqa project, we have been talking about two very important topics that are core to Zilliqa’s blockchain infrastructure:
- Sharding, on which, we have published a series of three blogs posts, that explains how sharding works and how it provides linear scalability.
- Smart contracts, on which, we recently started another series of blog posts to explain the design rationale behind Scilla — a new smart contract language that we are developing for Zilliqa.
(Although this article is self-contained, members of the community who are new to Zilliqa are strongly encouraged to give those articles a quick read.)
Up until now, we have been discussing these topics (i.e., sharding and smart contracts) in isolation. The goal of this blog post is to put them together and explore exciting problems that appear when one attempts to run smart contracts on a sharded architecture. We also discuss some strawman approaches to solving the problems and their limitations. By improving upon those solutions, we then build a complete one that we employ in Zilliqa. We conclude by presenting the implementation status of our solution.
So, without any further build up, let us jump to the problem statement that this blog post seeks to address:
What are the challenges that a sharded architecture brings into picture with respect to (wrt) running smart contracts that a non-sharded architecture doesn’t? How can they be resolved?
This article assumes familiarity with the basics of smart contracts and how they run on an architecture such as that of Ethereum. Readers are however not expected to know about sharding, as the next section will cover the basic differences between a sharded and a non-sharded network.
Sharded vs non-sharded architectures
In a non-sharded architecture, each (full) node in the blockchain network processes and validates every transaction. Typical examples of non-sharded networks are Bitcoin and Ethereum. In these networks, whenever an end user creates a transaction, she first sends it to the network, whereupon each node checks the validity of the transaction, i.e., it is not an instance of double spend and does not display any other signs of malfeasance, then the node reaches consensus on it. Once the transaction is validated, all nodes store the corresponding block in their local copy of the blockchain.
In a sharded network, on the other hand, only a sufficiently large subset of the network will process and validate a given transaction. Each of these subsets of nodes is called a shard. For instance, in the case of Zilliqa, the size of the actual network can be of a magnitude in the order of tens of thousand of nodes; however, only 600 nodes (a single shard) will validate a specific transaction and run consensus on it. The advantage of sharding is that if the initial network can be divided into several smaller subsets of nodes, then, each subset can process non-conflicting transactions in parallel. Ability to process transactions in parallel yields linear scalability.
Sharding in the absence of smart contracts
To make sure that we are adequately warmed up for the rest of the content (which is rather long), let us first see how we may handle non-smart contract-related transactions on a sharded architecture. This will also help establish a very key idea — that is, the means by which the network deterministically decides which shard should handle a given transaction. In the rest of the blog, we refer to this process as transaction assignment.
In the absence of smart contracts, the network can only receive payment-like transactions, in which, say, a user Alice pays 5 ZIL to another user Bob. For the sake of simplicity, let us further assume that the network has only two shards, denoted S1 and S2. So, the question is whether the transaction from Alice should be processed in S1 or S2. It’s obvious that the transaction cannot be assigned to both the shards; otherwise, we lose the entire benefit of sharding.
Strawman solution. The simplest idea would be to compute a hash of the transaction and check the last bit of the hash. If the last bit of the digest is 0, then the transaction will be handled in S1, else if it is 1, then the transaction will be handled in S2. This transaction assignment strategy is very simple and can ensure that the load of transaction processing in S1 will be roughly the same as that in S2. However, there is an issue with this approach. What if Alice had only 5 ZIL in her account and she created an additional transaction to pay 5 ZIL to Charlie? With 50% chance, the hash of the second transaction will have a different last bit from the hash of the first transaction. Hence, they may end up being processed in different shards. If the two shards do not communicate directly or via an intermediary, then it becomes impossible to detect that Alice is trying to mount a double spend attack.
One simple and easy way to prevent double spend attacks is to ensure that transactions that lead to conflicting situations get handled in the same shard, hence eliminating the need to communicate with other shards. This can easily be done by changing the assignment strategy. Instead of deciding the assignment based on the transaction hash, one may assign transactions based on the last bit of the sender’s address. As a consequence, all transactions from a sender end up in the same shard, and therefore, double spends can be detected right within a shard.
To summarize, in order to assign payment transactions in a network with two shards S1 and S2, we first check the sender’s address. If the sender’s address ends with 0, then the transaction should be assigned to S1, else it should be assigned to S2.
Sharding in the presence of smart contracts
Now comes the juicy and the most interesting part of the blog, where we add smart contracts to the infrastructure. Let us continue with the transaction assignment strategy from the previous section, i.e., assign transactions based on the sender’s address to check whether everything works as expected. Spoiler alert: This assignment strategy won’t work with smart contracts.
To have an example in hand, we take the following simple contract written in Scilla. Note that the actual language in which you write the contract does not really matter. What matters is that the contract should have a state that can be modified. The contract given below has a state variable
x initialized to
0 and the contract provides an interface
set to change the value of
contract Test ()field x : Int32 = Int32 0transition set (y : Int32) x := yend
We assume that a copy of this contract is present in each shard and hence each shard is potentially capable of processing any transaction that invokes the smart contract.
Strawman solution I. Let us now apply our transaction assignment strategy (based on the sender’s address) for a transaction T1 that comes from Alice in which she calls
set with an input
1. This transaction will go to say shard S1 and the shard will change the value of
1. So far, so good.
Imagine another user Bob issuing another transaction T2 at the same time as Alice in which he calls
set with an input
2. Again with 50% chance, T2 may end up being processed in S2. So, T1 gets processed in S1, while T2 gets processed in S2. In this case, S1 will set
1, while, S2 will set
2. This leads to an inconsistent contract state.
Again, the two shards will have to communicate directly or via an intermediary to make sure that the two transactions get an ordering that will allow the contract’s state to change in a deterministic manner and not remain in a “dirty” state.
One may notice that the problem we are facing with smart contracts is similar to the one with regular payments when the transaction assignment strategy was based on transaction hashes. In the earlier case, the conflicting cases were on the sender’s state (i.e., balance), while with smart contracts, the conflicting cases are on the recipient’s state, i.e., the contract’s state.
Strawman solution II. So, how about we change the assignment strategy to assign all transactions going to smart contracts based on the recipient’s address and not the sender’s address. And as previously, we may continue to assign payment transactions based on the sender’s address.
So, what do you think? Does it work?
Well, the new assignment strategy works as long as it is possible to separate payments and smart contract executions. It does not work for contracts that accept payments and also change their state, e.g., an ICO contract that sells tokens (and hence changes the state) in exchange for money. Such transactions can neither be assigned solely based on the sender’s address nor on the recipient’s address. If assignment is based on sender’s address, then the contract’s state will become inconsistent. If instead, it is based on the recipient’s address, then it becomes possible to mount a double spend by participating in two ICOs simultaneously.
One may think of several fixes to the above problem. We encourage readers to come up with their own improvements and fixes. However, it is to be noted that up until now we have only considered simple calls to smart contracts. One must also consider the case where a contract calls other contracts in the form of a chain and conflicts may not necessarily occur wrt to the first contract. So, any assignment strategy must be generic so as to handle all contracts without incurring too much cross-shard communication.
A Complete Solution
With the experience earned so far, readers may have noticed that transaction assignment needs to depend on transaction types. For instance, it seems preferable to handle payment-like transactions in a different way compared to transactions that call smart contracts.
In light of this, we first dig a bit deeper to check whether we can categorize transactions so that we can have a separate assignment strategy for each category.
Transactions received by the network can be classified into the following categories depending on the type of accounts involved. Below, we call an account a user account (or a non-contract account) if it is controlled by a user and does not hold contract code. As an extension, an account that holds contract code will be referred to as a contract account.
- Category I [U1 -> U2]: A user account sending some money to another user account, i.e., regular payment transactions that do not involve smart contracts. An example transaction would be a user Alice sending some ZILs to another user Bob.
- Category II [U -> C]: A user calling a smart contract that does not call any other smart contract, neither does the contract transfer funds to another user. I.e., transactions that involve only a user account and a smart contract. An example transaction would be a user Alice donating 5 ZIL to a crowdfunding contract.
- Category III [U1 -> C1 -> … -> Cn [-> U2]]: Any other transaction. This category includes transactions originating from a user that can invoke a chain of contracts and potentially terminate with a user account. An example transaction will be Alice calling a travel agent contract (C1) that calls an airline contract (C2) that in turn calls an insurance contract (C3).
Despite the fact that Category III transactions represent the most generic of all, they in fact may not necessarily be the most popular ones. The reason being that most transactions do not make chain calls to others contracts. And even if they do, the call depth is low and in most cases, it is possible to rewrite contracts in a way that a transaction of Category III can be broken down into several Category II transactions.
Before you read any further, try to convince yourself that these are the only three possible types of transactions. This is important so as to ensure that any assignment strategy build on top of these categories will be generic enough to handle all types of possible transactions. Furthermore, we also want to ensure that we haven’t missed any transaction types that may create conflicting scenarios.
Are you convinced?
What if a user (employer) calls a transition (function) in a contract that sends out salaries to all the employees in a company? In which category this transaction should fall?
The answer is the third category. But, for the sake of simplicity, let us assume for now that transaction call graph is linear and does not have branches.
In the rest of the article, we assume without loss of generality that at any given point of time, there are k = 2^n shards in the network, where n is a natural number.
At any given time, each shard will only handle transactions of Category I and II, while a special shard will handle transactions of Category III only after transactions of Categories I and II have been validated by the rest of the shards. The special shard may also handle certain transactions of Category I or II. Do note that the special shard will not handle transactions in parallel (to the shards); otherwise, it will become possible to double spend.
The exact assignment strategy is as follows:
- For a network with k = 2^n shards, each shard will only process transactions of Category I and II in which both the sender’s and the recipient’s address have the same last n bits.
- Any other transaction (including of Category III)) will be processed by the special shard after the other shards have finished processing transactions meant for them.
An obvious question would be how do we select the special shard? Well, a simple answer is that the special shard could be the DS committee in Zilliqa. Recall that the DS committee is a special set of nodes elected using PoW that helps orchestrates the consensus protocol. In the rest of the discussion we assume that the DS committee is the special shard.
Remark: A client may simply create a transaction and may not be aware of whether or not the transaction should go to the DS. In case, the transaction is incorrectly routed to a wrong shard, the shard will end up processing the transaction and at runtime will detect that the transaction should actually be processed by the DS. It may then simply forward the transaction to the DS (and possibly charge some gas fee). However, a hint from the client may help here as this would eliminate processing of the same transaction twice, first, at the shard-level and then at the DS level. The protocol may require the client to pay a higher fee for transactions that need to be handled by the DS.
Example [A network with 2 shards S1 and S2, and a DS committee]: S1 will only process transactions in which both the sender’s and recipient’s addresses end with 0, while S2 will only process transactions in which both the sender’s and recipient’s addresses end with 1. Any other transaction will be processed by the DS committee.
The table below presents this example with four user accounts and four contract accounts. For the sake of simplicity, we do not consider Category III transactions in this example.
It is easy to check that double spends cannot occur and that there will be no conflict for smart contract-related transactions. To check whether double spends from a given user are properly handled, we need to look at the corresponding row. For instance, if we want to check whether U1 (0) can double spend, we look at the first row of the table and observe that transactions from U1 (0) either go to S1 or to the DS. So, conflicts can only occur between S1 and DS. However, as DS processes transactions only after shards have finished processing their transactions, conflicts cannot occur.
Similarly, to check whether conflicts wrt smart contracts states are handled correctly, one needs to check the column corresponding to smart contracts. For instance, if we want to check whether states for C3(1) are handled without conflicts, we observe that transactions touching it are either handled in S2 or by the DS. And again, since DS processes transactions after shards have finished processing, no conflict can occur.
Analysis: If we assume that we have k = 2^n shards and each shard has a fixed say m active user accounts and m active contract accounts, then the total number of possible transactions generated by the network will be: (m*k)*(2*m*k). This comes from the fact that the transaction assignment matrix will have (m*k) rows and 2*(m*k) columns. The number of transactions that will be processed by shards will be m*(2*m)*k as each user account in a shard can make 2*m transactions that can be handled within the shard itself. For reasons mentioned above, this computation does not consider Category III transactions.
Fraction of TX processed by shards = m*(2*m)*k / (m*k)*(2*m*k) = 1/k
This not-so-optimized version puts quite a bit of load on the DS committee. However, one may notice that regular payments from say U1 (0) to U3 (1) are also being handled by the DS committee. But, as we have seen earlier, we can safely handle such payment transactions in the sender’s shard. With this simple optimization, we modify the assignment strategy as described in the following section.
We apply the same assignment strategy as earlier except that we allow payment transactions of Category I to get assigned with respect to the sender’s address. So, overall the assignment strategy will be the following:
- For a network with 2^n shards, each shard will process transactions of Category II in which both the sender’s and the recipient’s address have the same last n bits.
- Each shard will also process transactions of Category I wrt the last n bits of the sender’s address.
- Any other transaction (including of Category III)) will be processed by the DS committee after shards have finished processing transactions of Category I and II.
Remark: This will require differentiating transactions of Category I and II. The differentiation could either be done by nodes themselves by first checking whether there is a code associated with the recipient’s account, or clients could provide a hint that nodes can use. A proper penalty mechanism may need to be built.
Continuing with our previous example, the transaction assignment table with the optimized strategy will look as follows.
Analysis: As earlier, the total number of possible transactions generated by the network will be: (m*k)*(2*m*k). The number of transactions that will be processed by shards will be m*(m*k+m)* k as each user account in a shard can make (m*k+m) transactions that can be handled within the shard itself. This computation does not consider Category III transactions.
Fraction of TX processed by shards = m*(m*k+m)*k / (m*k)*(2*m*k) = (1+k)/2k.
A comparison of the two assignment strategies is given in the graph above. Clearly, the optimized version does much better distribution that the unoptimized one.
For both the strategies, the more the shards are, the more transactions will need to be processed by the DS committee. From the looks of it, it may appear that even the optimized one does not assign transaction load in an even manner. However, this is mostly because of transactions in which the sender and the recipient “reside” in different shards. In order to reduce the burden on the DS committee, it suffices for users to create accounts in shards in which the smart contracts reside. This effectively will convert most of the Category III transactions into Category II transactions and hence lead to a much balanced load distribution.
The Zilliqa codebase (dating back to a month) only handled payment transactions. In parallel, we had a separate testnet that supported smart contracts on a non-sharded architecture. In the last few weeks, we have finished implementing all the core features to handle smart contracts on the sharded architecture. For instance, in the earlier versions, the DS committee did not process any transaction. But, now the DS can process all the required transactions and hence effectively implementing the optimized assignment strategy. As of writing of this blog, we are running tests to make sure that the implementation is correct and bug-free.
You may very soon run your Scilla contracts on a sharded architecture!
Here’s how you can follow our progress — we would love to have you join our community of technology, financial services, and crypto enthusiasts!
➤ Follow us on Twitter,
➤ Subscribe to our Blog,
➤ Ask us questions on Slack, Gitter or Reddit.
Icons used in this blog post are made by creators at https://www.flaticon.com/authors. We thank the creators for making these icons available.