Zilliqa is always happy to witness instances where our technology is in high use and demand. Recently, we experienced an all-time high in the number of smart contract transactions on our blockchain when Unstoppable Domains went live on July 9th, 2019, to overwhelming demand from the community.
Such an event also places heavy smart contract processing requirements on the Zilliqa mainnet. In this case, it resulted in a few incidents of instability. This technical update addresses these recent issues and outlines the fixes implemented, along with upcoming technical enhancements that will strengthen our mainnet. After all, as we’ve always said, constant enrichment of the platform is key to growth and development.
Understanding smart contract states
Every smart contract has its own state. For instance, consider a crowdfunding smart contract that records each donor’s contribution. A simple state for such a contract holds one entry per donor, recording how much that donor has contributed.
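As a rough illustration, the crowdfunding state can be pictured as a map from donor to amount. This is a sketch in Python rather than Scilla, and the field name `backers`, the helper `donate` and the donor names are all made up for the example.

```python
# Hypothetical crowdfunding state: one map entry per donor.
# Field name, donor names and amounts are invented for illustration.
state = {"backers": {}}


def donate(state, donor, amount):
    """Record a contribution; a new donor adds a new entry to the map."""
    backers = state["backers"]
    backers[donor] = backers.get(donor, 0) + amount
    return state


donate(state, "donor_a", 100)
donate(state, "donor_b", 250)
donate(state, "donor_a", 50)  # existing donor: entry is updated, not added
```

Every new donor adds one entry, so the state only ever grows while the campaign is open.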
As long as the crowdfunding contract is still accepting new donors’ contributions, we can expect its state to grow bigger with time.
Generally, we expect most smart contracts to perform reasonably fast on the Zilliqa mainnet. However, the state size of the Unstoppable Domains contract increased very rapidly, and processing its transactions stalled network availability on our blockchain.
In the Zilliqa blockchain, there are a few instances when smart contract processing is cut off:
- When the gas limit of a microblock is reached
- When the time limit for transaction processing for a microblock is reached
In this case…
- As the Unstoppable Domains smart contract state grew, we noticed a linear increase in the time needed to process each transaction. The processing time per transaction rose from negligible to 500ms, then 700ms, and finally exceeded 1s.
- Due to this heavy smart contract processing, each node hit its time limit, after which it could process no further transactions for that microblock.
A deep-dive into the incident
In an ideal world, if all nodes cut off smart contract processing at exactly the same transaction, there wouldn’t be any issues. However, in the decentralized blockchain world, nodes have differing machine specifications. This means that each node may process a different number of smart contract transactions per microblock, and this is where the performance of the Zilliqa mainnet can be affected.
As you may know, Zilliqa uses PBFT for consensus. As part of the consensus protocol (in this case for a microblock), the leader first proposes a microblock with all the smart contract transactions processed for that microblock. The backups validate what has been proposed. If more than ⅓ of the backups in the shard disagree with what the leader proposes, no consensus will be reached for that microblock.
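To make the difference between the two cutoff rules concrete, here is a minimal sketch in Python. It is not the Zilliqa node implementation: the function names, the `speed_factor` parameter and all numbers are assumptions chosen for illustration.

```python
def cutoff_by_time(tx_costs_ms, speed_factor, time_limit_ms):
    """Count the transactions a node finishes before its time limit.

    speed_factor is a hypothetical multiplier: 1 for a fast machine,
    2 for a machine that takes twice as long per transaction.
    """
    elapsed = 0
    count = 0
    for cost in tx_costs_ms:
        elapsed += cost * speed_factor
        if elapsed > time_limit_ms:
            break
        count += 1
    return count


def cutoff_by_gas(tx_gas, gas_limit):
    """Count the transactions that fit under the gas limit.

    Gas depends only on the transactions, not on the machine.
    """
    used = 0
    count = 0
    for gas in tx_gas:
        if used + gas > gas_limit:
            break
        used += gas
        count += 1
    return count


tx_costs_ms = [500] * 10  # ten transactions, 500 ms each on a fast node
fast = cutoff_by_time(tx_costs_ms, speed_factor=1, time_limit_ms=2000)
slow = cutoff_by_time(tx_costs_ms, speed_factor=2, time_limit_ms=2000)
by_gas = cutoff_by_gas([100] * 10, gas_limit=300)
```

With a time cutoff, the fast and slow nodes stop at different transactions. With a gas cutoff, every node stops at the same one, whatever its hardware.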
This is what we are currently observing.
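The agreement rule described above can be sketched as follows. This is a deliberate simplification in Python, not a full PBFT implementation: it encodes only the "more than ⅓ disagree" condition from the text, and it assumes a backup disagrees whenever it processed a different number of transactions than the leader. The function name is hypothetical.

```python
def microblock_accepted(leader_tx_count, backup_tx_counts):
    """Return True unless more than 1/3 of the backups disagree.

    Assumption: a backup disagrees when its own cutoff point differs
    from the leader's.
    """
    disagree = sum(1 for c in backup_tx_counts if c != leader_tx_count)
    # "more than 1/3 disagree" is 3 * disagree > number of backups
    return 3 * disagree <= len(backup_tx_counts)


# One slow backup out of six: consensus still succeeds.
ok = microblock_accepted(4, [4, 4, 4, 4, 4, 3])
# Half of the backups cut off earlier than the leader: consensus fails.
stalled = microblock_accepted(4, [4, 4, 4, 2, 2, 2])
```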
The impact of such incidents depends on the level at which they occur. If it occurs at the shard level, the microblock from that specific shard will not be formed for that final block, and the network will move on with a new shard leader at the next Tx Epoch. If it occurs at the DS committee, a view change will occur to elect a new leader. However, after the election of a new leader, a similar issue may reoccur, resulting in yet another view change.
What is the root cause?
There are two main reasons this happened, as detailed below:
- Gas limit set too high: Cutting off smart contract processing should be handled first by the gas limit. In this case, the gas limit we set was too high, which meant that the time limit would always be hit first. Had the gas limit been hit first, we would not have faced this issue, as all nodes would have processed exactly the same set of smart contract transactions instead of differing from each other and failing to reach consensus on that microblock.
- Inefficient state access: As the state of the Unstoppable Domains smart contract grows, the processing time increases linearly. This was due to an inefficient implementation in our Scilla interpreter’s backend. In this particular smart contract, the bulk of the data is stored inside a map. For each transaction, the node needs to load the entire map (which was growing in size), deserialize it and finally update it. This is the main reason why smart contract processing takes more time as the state size increases.
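The load, deserialize and update pattern in the second point can be sketched in a few lines of Python. This only illustrates the access pattern and is not the Scilla backend: the function name and the sample domain records are assumptions.

```python
import json


def update_whole_state(serialized_state, key, value):
    """Change one entry by round-tripping the entire map.

    Both json.loads and json.dumps walk every record, so the cost of a
    single-entry update grows with the total size of the state.
    """
    state = json.loads(serialized_state)  # deserialize everything
    state[key] = value                    # touch one record
    return json.dumps(state)              # serialize everything again


serialized = json.dumps({})
for i in range(1000):
    # Hypothetical records; each call reprocesses all earlier ones.
    serialized = update_whole_state(serialized, f"name{i}.zil", f"owner{i}")
```

Each update handles a slightly larger payload than the one before it, which matches the linear growth in per-transaction time described above.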
Implemented fixes and upcoming upgrades
1. Reduction of gas limit constants (since version v4.7.1)
We have significantly reduced the gas limit so that it will be hit before the time limit for smart contract execution. We will gradually increase it again once Scilla IPC and efficient state access (explained below) are done.
2. Automatic reduction of gas limit after view change (will be released in v4.8.0)
View change is an important component in the Zilliqa blockchain protocol. There is only one objective for view change — keep the network alive and running. View change involves the changing of a non-performing or malicious leader to a new leader. This allows the protocol to resume operations after view change has occurred.
However, in this particular corner case, after a view change occurred, the new leader together with the backups failed to agree on a new microblock. This happened numerous times. Again, this is due to the time limit of the smart contract processing being reached first (as explained in this blogpost).
In the upcoming v4.8.0 upgrade, we will be introducing the automatic reduction of gas limit via exponential back-off. Simply put, this means the gas limit will be reduced by half each time the view change occurs.
For instance, if there are five consecutive view changes without progress in the network, the gas limit will be halved five times, a total reduction by a factor of 2⁵ = 32. For that particular Tx Epoch, the new gas limit for the microblock is “the gas limit divided by 32”. By exponentially reducing the gas limit, we ensure that it will eventually be hit before the time limit, so the network will be able to reach consensus on the microblock and make progress.
After a successful consensus, the original gas limit will be restored so as to ensure network processing capability will not be affected at the next Tx epoch.
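The back-off arithmetic can be sketched as below. This is only an illustration of the halving rule described above, not the v4.8.0 code: the function name, the use of integer division and the base limit of 3200 are assumptions.

```python
def effective_gas_limit(base_gas_limit, consecutive_view_changes):
    """Halve the gas limit once per consecutive view change.

    With zero view changes (for example, right after a successful
    consensus) the original limit applies again.
    """
    return base_gas_limit // (2 ** consecutive_view_changes)


base = 3200  # hypothetical microblock gas limit
limits = [effective_gas_limit(base, n) for n in range(6)]
# limits == [3200, 1600, 800, 400, 200, 100]
```

After five view changes the limit is the base divided by 32, as in the example above.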
3. New efficient state access for Scilla (slated for release next month)
This will be one of the next big updates for the Scilla smart contract processing.
Recall that in the Unstoppable Domains smart contract, updating the state currently requires loading the whole JSON file, deserializing it and only then processing it. With this new enhancement, there is no need to load the entire state of the contract. Rather, only the record(s) that will change need to be read. This will greatly reduce the overhead of loading unnecessary data for smart contract processing.
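The idea of reading only the affected records can be sketched with a toy keyed store in Python. This does not represent the actual design of the upcoming Scilla state access: the class `KeyedStateStore`, its methods and its counter are hypothetical and exist only to show that an update touches one record regardless of state size.

```python
class KeyedStateStore:
    """Toy key-value store that counts the records each call touches."""

    def __init__(self):
        self._records = {}
        self.records_touched = 0

    def put(self, key, value):
        """Write a single record without reading any other record."""
        self._records[key] = value
        self.records_touched += 1

    def get(self, key):
        """Read a single record by key."""
        self.records_touched += 1
        return self._records.get(key)


store = KeyedStateStore()
for i in range(1000):
    store.put(f"name{i}.zil", f"owner{i}")

store.records_touched = 0            # reset before the measured update
store.put("name42.zil", "new_owner")
touched_for_one_update = store.records_touched
```

However large the store becomes, the measured update touches exactly one record.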
4. Scilla inter-process communication (IPC) protocol (slated for release next month)
This is another big update scheduled for smart contract processing. Currently, to run a Scilla smart contract, the node needs to invoke the Scilla interpreter binary to read in the existing state, execute the contract and output the new state. Invoking the binary, however, incurs some performance overhead. With the upcoming enhancement, we will move this interaction to inter-process communication, thereby reducing unnecessary overhead.
To sum up, while we hit speed-bumps on the path to decentralising the world, the technical team remains as committed as ever to delivering a robust and high-performance platform to support applications such as Unstoppable Domains. These incidents have given the technical team good insight into the kinds of issues that may surface, and we will continue to make enhancements with each version we release. Our work is never done. Thanks to our amazing team, we continue improving and enriching the Zilliqa blockchain, day after day.