GhostVariables

Ghost variables are variables that do not appear in the program. They are created dynamically (i.e., during the analysis) by abstract domains to express complex properties which typically help to improve the expressiveness of the analysis. In Crab, memory and array domains create ghost variables. For instance, memory domains use ghost variables to represent the offset of a pointer or the size of an allocated object. Array domains can represent each array cell with a ghost variable that represents its contents. Ghost variables are also very useful when low-level properties are modeled involving precise bitwise reasoning (e.g., to represent the k least or most significant bits separately) although Crab does not provide such a domains at the moment.

However, the implementation of ghost variables in Crab has been tricky because there is a tradeoff between modularity and performance. Modularity is a very important property for an abstract domain. This is the idea that complex abstract domains can be composed of a stack and/or products of simpler abstract domains without making any assumptions apart from providing sound abstract operations. However, when one of the simpler domains operates on a ghost variable created by another domain, modularity can be broken. Whenever the domain that created a ghost variable renames it with another ghost variable (this can happen during binary operations such as join, meet, etc) the simpler domain that uses that ghost variable needs to be notified. One way of doing this is by heavily using the abstract rename operation which is implemented by all the Crab abstract domains. At each binary operation, the domain that creates ghost variables generates a new map from program variables to their ghost variables and communicate recursively to all its dependent domains by calling rename. Unfortunately, this operation can be very expensive. For instance, environment maps which are implemented by Patricia trees do not implement this operation efficiently and their sharing properties can be completely destroyed if too many rename operations take place.

How Crab manages ghost variables

In Crab, we can provide three different strategies to manage ghost variables that when uses with care it can ensure modularity without performance degradation:

Crab variable factory provides a method get that returns always the same variable for a given fixed pair of variable and string. With this method, we can create ghost variables from an existing variable v and a string that defines the role of the ghost variable. For instance, get(v, ".offset"), get(v, ".size"), etc create ghost variables to represent the offset of a pointer, the size of an allocated object, etc. However, this solution only works under two assumptions:

a. The roles can be efficiently named. For instance, this might not be the case if we use ghost variables to represent symbolically array slices and these slices are determined by complex linear constraints which cannot be efficiently extracted.

b. This solution is sound only if the variable v is a program variable or if v is another ghost variable then it cannot be changed during the analysis.

Currently, Crab memory domains use this approach because they never act as subdomains or reduced product with the other domains. Note that this might change in the future and modularity in that case should be revisited.

[NOT IMPLEMENTED] Avoid the use of ghost variables whenever possible. To help with that, a Crab variable has a normal type that represents the type of the variable as defined in the program CFG, and an abstract type that represents the type as seen by the abstract domain. Initially, the normal and abstract types coincide but an abstract domain can change the abstract type of a variable during the analysis.

For instance, the Array Smashing domain is a subdomain of the Array Adaptive domain which in turn, it is a subdomain of the Region domain. The Array Smashing domain only uses one (numerical) ghost variable to represent the content of a whole array. That ghost variable can be avoided if the Region domain changes the abstract type of a region to an array and the Array Adaptive domain changes the abstract type of an array to a scalar type.

Keeping an internal map from program variables to ghost variables and use abstract rename operation to communicate to other domains changes in the ghost variables. This should be used when 1 and 2 cannot be used.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GhostVariables

How Crab manages ghost variables

Other solutions

Clone this wiki locally