Introducing the ZIL Cashflow Smart Contract Analyser
Smart contracts provide means to implement a business logic on a blockchain. As is the case with most businesses, smart contracts often handle money and very often a lot of money (typically in hundreds of millions of dollars); “money” being the native token of the platform, e.g., ZIL in Zilliqa, ETH in Ethereum, etc. For example, in a kickstarter contract, backers send money to the contract to help kickstart a project. Or, in an auction contract, bidders transfer money to the contract acting as the auctioneer, which eventually selects the highest bidder as the winner.
Money held by or transferred to a contract often undergoes numerous transformations and checks that stress the integrity of the underlying code that manages it. The money may also “flow” across the contract through internal variables and operators, and one must ensure that it happens in a secure and consistent way.
Naturally, an unchecked money flow can become a point of failure and may potentially lead to a loss of money. Moreover, as blockchains are immutable, bugs in a smart contract are not patchable, particularly ones that handle “money”. Therefore, it is crucial to get it right the first time. As it turns out, a majority of existing smart contracts indeed have holes that have led to money getting leaked or frozen in a contract.
Scilla, the new intermediate-level smart contract language that we designed for Zilliqa addresses some of these pitfalls. One of the benefits of Scilla is that it makes it easier to write static analysers that can identify potential bugs in the applications before they go live. Leveraging the language design, we have created the first ever Cashflow analyser for smart contracts written in Scilla. The cashflow analyser allows developers to track and tag the usage of money, i.e., ZILs, in a smart contract before the contract gets deployed. The goal of doing so is to identify code errors that could potentially leak, mishandle or misappropriate money in applications — all of which can lead to devastating results.
To fully appreciate the power of the cashflow analyser, how about we first try to manually find a bug in a contract that (mis)handles money? See if you can identify the bug. HINT: The bug fix literally requires changing a single character in the contract.
Note that the cashflow analyzer can automatically detect the issue with this contract.
In the remainder of this article, we get into the details of how cashflow analyser works behind the scenes. We will also see how the cashflow analyser reports the bug in the above contract.
To try out the analyser, you will have to use the
scilla-checker binary that can be built from the source available on the Scilla GitHub repo. We are planning to add support for it in the Savant IDE which will make it much easier for smart contract developers to try it out on their contracts.
If you have any questions or if you wish to discuss how we can help you develop your own smart contract application on Scilla, do reach out to us on our technical forum or in one of our social channels below.
Gitter: https://gitter.im/Zilliqa/ (Dev-related topics including the Ecosystem Grant)
A Quick Primer to Scilla Contracts
Before we present the details of the cashflow analyser, let us familiarize ourselves with the basic structure of a Scilla contract. Every Scilla contract has the following key parts:
- Immutable parameters: These are the variables that get initialized when the contract gets deployed. Once initialized, their values cannot be changed.
- Mutable state variables: These are state variables that represent the contract state at a given point of time. The value of these fields can be changed by invoking transitions.
- Transitions: These are functions that allow changing the value of the mutable fields.
- Local variables: These are variables declared within a transition and used in order to operate on immutable parameters or fields or perform mathematical computations.
- Implicit “money” variables: There is an implicitly defined state variable
_balancein Scilla that stores the current balance of the contract. Also, any incoming or outgoing money is handled using a special variable
- Implicit “non-money” variables: There are several implicitly defined “non-money” variables in a Scilla contract. These include:
_sender, the caller who called this contract (it represents an account address, and hence not money),
_this_address, the address of the contract,
_recipient, the address of a user account or a contract account that a contract wishes to communicate to, etc.
Diving into the Cashflow Analyser
Cashflow analyser is a static analysis tool which attempts to track the usage of ZILs within a given contract. The tool takes the smart contract code as an input and returns the result of its analysis in a static manner — static meaning that the tool does not execute the contract code at all.
The result of the analysis is a set of tags for different parts of the contract, whereby each tag indicates in what way this particular part of the contract handles ZILs. If the tag indicates inconsistent use, or if the tag does not correspond to what the developer intended, the contract probably contains a bug and should not be deployed in its current form.
The analyser employs foundational approach to do domain-specific static analyses (known as Abstract Interpretation — which has been employed for multiple other properties, such as, for instance, data race detection at scale by Facebook Infer.
How do tags work?
The assumption behind cashflow analysis is that if a local variable or a field, is used to represent an amount of ZILs in a certain way in one part of the contract, then that is how that variable should represent an amount of ZILs in all parts of the contract. The analyser therefore assigns an individual tag to each field, immutable parameters, and local variables in the contract.
Each tag represents the ZIL-related information that has been gathered about the object being tagged. The cashflow analyser uses some basic tags to represent this information. The 5 basic tags are as follows:
NoInfo: No ZIL-related information has been gathered about the object.
Money: The object is used to represent an amount of ZILs directly.
NotMoney: The object is used to represent something other than ZILs.
Inconsistent: The object is sometimes used to represent ZILs, and sometimes to represent something other than ZILs.
Map t: The co-domain of the Map is tagged with the tag
tcan be one of the above tags . It is assumed that Map keys represent something other than ZILs.
In the figure below, we take our example contract and manually attempt to tag the different parts of the contract using our intuitive understanding of what the contract does. Convince yourself that the intuitive tagging is correct. If the contract handles money correctly, then, we should expect the analyser to return the same tags that we have intuitively assigned to different parts of the code.
From Intuition to Automation
Clearly, moving from intuitive tagging to automated tagging is a big leap as the analyser does not have any contextual information to assign tags the way we had in the above example. Therefore, the analyser has to extract information from the code itself, and then use that information to assign tags to various parts of the code.
The analysis requires several passes of the contract code. At the very beginning of the analysis, nothing is known about any field, parameter or variable. Hence, everything is initially assigned a
NoInfo tag. The analysis then proceeds to analyze each variable in the contract in turn for how the tagged objects are used. If any part of the contract code uses a tagged object in a way that has not been seen before, the tag of that object is updated.
Once a full traversal of the contract code has been completed, the analysis checks whether any of the contract’s objects have had their tags updated. If that is the case, then the contract is traversed again using the updated tags as the starting point. This continues until a full traversal has been made without seeing any updates to the tags.
An object’s tags is updated according to the way the object is used. The two basic uses that give information about a variable is when reading to and writing to implicit “money” and “non-money” variables. Recall from the earlier section that these variables are language constructs for which we definitely know whether or not they handle money.
To illustrate how the analysis extracts information from outgoing messages, consider the example in the figure below. The example constructs a message with 4 fields,
_custom. As explained in the figure, local variables tied to implicit variables have the same tag as that of the implicit variables.
Similarly, information can be extracted when reading from implicit variables:
x <- _amount;
y <- _sender;
z <- _balance;
w <- _this_address;
The example reads from 4 different implicit variables. The first two:
_sender are part of the message that invoked the current transition. The
_amount field contains the amount of ZILs being transferred to the current contract, and the
_sender field contains the address of the sender of the message. Consequently,
x is used as
y is used as
The last two are implicit state variables that are present in all contracts. The
_balance field contains the amount of ZILs present in the contract’s account, and the
_this_address field contains the address of the contract. Consequently,
z is used as
w is used as
NotMoney in this example.
Other uses of variables give rise to more elaborate updates of tags. For example, consider the following statement which adds the values of the variables
z, and stores the result in the variable
x = builtin add y z;
If nothing is known about
z, then no information can be extracted from this statement.
However, suppose we know that
x is tagged with
x is used to represent ZILs somewhere else in the contract. The only way that we can reasonably add two numbers and get an amount of ZILs as a result is if the two numbers both also represent an amount of ZILs, so if
x is assigned the
Money tag, then so should
z. Similarly, if either
z has been assigned the
Money tag, then so should the other two variables because adding an amount of ZILs to a number unrelated to ZILs is unlikely to make sense, and the result of adding two amounts of ZILs is also an amount of ZILs.
A similar argument applies if one of the variables has been assigned the
NotMoney tag. The only way that the result of adding two numbers is not an amount of ZILs is if the two numbers are both also not amounts of ZILs, and the only meaningful way to add something that is not an amount of ZILs to another number is if the other number is also not an amount of ZILs.
Putting the Analyser to the Test
Now that we have an understanding of how the Cashflow analyser works behind the scenes, let us try to test it on the buggy contract from the earlier section and see if the analyser is able to detect the bug. Running the analyser is pretty straight-forward, we use the binary
scilla-checker with the flag
-cf (for cashflow analyser) and pass the contract file as an input. The
-libdir flag is required to inform the checker where the libraries imported in the contract are to be read from.
The analyser returns a JSON formatted result that lists the deduced tags for each state variable.
Bug analysis. The analyser tags the variable
Money and the variable
backers stores a mapping between account addresses and the amount of ZILs deposited. So, the tag for
backers matches the contextual knowledge that we have on the contract. However,
count is a simple counter to count the number of backers. Hence, it should not be tagged as
Money. This implies that there is a bug in the contract where
count is directly or indirectly being used to handle money.
Now, to zero-in on the bug, let us take another look at the contract. In the transition
buggy_withdraw(), we have the statement
c <- counter, where we read the current value of
c. Obviously, there is nothing wrong here. So far so good.
If we go a bit further in the code, we see that
c is used in constructing
msg. And this is where we have the following code:
_amount : c. As
_amount is an implicit money variable,
c gets the tag
Money which eventually flows up to
counter, tagging it as
It should be obvious at this point that
c represents the current value of
counter and hence does not correspond to the amount that a backer should be able to withdraw. Congratulations! We have found the bug.
As promised, the bug fix literally requires changing a single character in the code. In fact,
Some a branch represents money that the backer can withdraw. And hence, the code
_amount : c should be replaced by
_amount : a.
We make this change to the contract and then re-run the analyser on the fixed contract. With this change, the analyser now correctly tags
NoInfo as it could not obtain enough information to deduce whether or not
count represents money. Bingo!
This blog post introduces cashflow analyser that detects inconsistencies in a Scilla contract regarding how incoming and outgoing money are handled. The analyser is available if you wish to try and play around with it. One limitation that the analyser has (as explained in this post) is that it mainly handles the native tokens, i.e., ZILs. It does not handle other forms of money, e.g., fungible tokens.
However, it is possible to give hints to the analyser by telling it which variables can be considered money. With this added hint from the developer, the cashflow analysis can proceed as usual. We have recently added this feature to the analyser. Do check it out.
To discuss any aspect of this article (or any other technical topics about Zilliqa and Scilla) please visit the Zilliqa Technical Forum or the Zilliqa Gitter.