-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
[SMT] Add declare_const, assert, solver, and check operations #6806
Conversation
include/circt/Dialect/SMT/SMTOps.td
Outdated
Unlike the SMT-LIB version, this operation allows for a `fresh` attribute to | ||
indicate that a new unique value should be generated. The provided string | ||
will then be used as a prefix for the newly generated identifier. If this | ||
attribute is not present and there are two such operations | ||
`%a = smt.declare_const "a"` and `%b = smt.declare_const "a"` then `%a` and | ||
`%b` are equivalent, i.e., they refer to the same symbolic value. | ||
|
||
Note that this operation cannot be unconditionally marked as Pure since two | ||
operations with the `fresh` attribute and the same identifier string could | ||
then be CSEd, leading to incorrect behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to have two separate SSA values that refer to the same declared constant ever arise? Could it make sense to always create a fresh constant, and then mark this operation pure? I could imagine that the aliasing/reuse of same name constants is more of a SMT-LIB writing convenience, but if you're lowering to that from an SSA representation, you might as well enforce a single definition rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I agree that we should always create a fresh symbolic value. This non-fresh behavior was the initial implementation modeled too closely to an AST.
We cannot mark it 'pure' though since two symbolic values with the same prefix string will represent separate symbolic values and them being pure would allow CSE to merge them, i.e., change the behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now declare the declare_const
operation's result as a memory allocation, so it is DCEd but not CSEd.
bc8aa12
to
eb69911
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I'm just having a question about smt.solver
, but otherwise looks good to me!
// CHECK: %a = smt.declare_const "a" {smt.some_attr} : !smt.bool | ||
%a = smt.declare_const "a" {smt.some_attr} : !smt.bool | ||
// CHECK: smt.declare_const {smt.some_attr} : !smt.bv<32> | ||
%b = smt.declare_const {smt.some_attr} : !smt.bv<32> | ||
|
||
// CHECK: smt.assert %a {smt.some_attr} | ||
smt.assert %a {smt.some_attr} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, this supposedly has undefined semantics, right?
As there is no toplevel solver
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends. In this file, no entry point function is defined, so if you insert a main
function that has an smt.solver
that has a call to this function in its body, it would be well-defined. If you run this function directly, it's undefined (in practice, it will crash because no SMT context is initialized). I didn't focus on this here because it's just a regression test for the parser and printer, but I can add one if you want.
I was also thinking about requiring all smt operations being (transitive) children of the smt.solver
operation, but that would make things a lot more rigid (making it more difficult to reuse code, forcing inlining of everything, etc.). On the positive side, we could properly verify a context is there.
func.func @declare_const_cse(%in: i8) -> (!smt.bool, !smt.bool){ | ||
// CHECK: smt.declare_const "a" : !smt.bool | ||
%a = smt.declare_const "a" : !smt.bool | ||
// CHECK-NEXT: smt.declare_const "a" : !smt.bool | ||
%b = smt.declare_const "a" : !smt.bool | ||
// CHECK-NEXT: return | ||
%c = smt.declare_const "a" : !smt.bool | ||
|
||
return %a, %b : !smt.bool, !smt.bool | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, should this be put in a smt.solver
?
|
||
Example: | ||
```mlir | ||
%0:2 = smt.solver (%in) {smt.some_attr} : (i8) -> (i8, i32) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the input/output correspond to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The body of the smt.solver
is isolated from above. The reason is that SMT constants cannot be hoisted out of the solver and we have a bit more leverage to check that no SMT value from another solver is used in this solver. Both of which is a big problem in the LLVM lowering. The earlier because no smt context is initialized outside the solver's body and just attempting to build an expression node would lead to a crash, the latter because the nodes would have been built in different contexts with potentially other configs, i.e., are not compatible.
Still, it's useful to be able to pass in concrete values to influence the construction of the SMT expressions. And the output is useful to report the result of the solver (was is sat or unsat, a counter example, etc.). Since they are just passthrough values, they must match the block arguments and yield operands respectively.
}]; | ||
|
||
let arguments = (ins Variadic<AnyNonSMTType>:$inputs); | ||
let regions = (region SizedRegion<1>:$bodyRegion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to allow scf.if
in your smt.solver
, then you'll need to relax this and allow arbitrary CFGs, right?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if I want to already lower them to the CF dialect before lowering the SMT dialect, right? Not sure if we need that. On the other hand, I find it quite handy when we can just assume there is only one block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah that makes sense, I also got reminded of scf.execute_region
which can be used in case we really need that.
def SolverOp : SMTOp<"solver", [ | ||
IsolatedFromAbove, | ||
SingleBlockImplicitTerminator<"smt::YieldOp">, | ||
]> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it would make more sense to have a solver SSA Value instead, that is passed to the other operations (smt.check, or smt.assert for instance)? If I recall, this is the design you used to have in the draft PR?
I feel it makes the restriction "being called inside an smt.solver
" more easier to check, as this is just an SSA value that is passed.
Or does this design makes it easier to lower to LLVM? Or what else does it simplify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The smt.solver
does not only instantiate and limit the live-time of a solver that can be passed to the assert and check operations (like in the Z3 python API), but also the overall SMT context that stores all the config flags, nodes, reference counters, etc. Every operation in this dialect lowers to one (or multiple) calls to a Z3 C API function, all of which take this context as an argument.
Having the region defining the SMT context's lifespan is quite important because when lowering to LLVM, you need to know when to create and initialize a context instance and when to deallocate it again. We could have a solver value provided as a special block argument in this region. But we still need the region checking that no solver value can enter or exit the region, otherwise it would be possible to mix contexts (pass SMT expression nodes built with different contexts to a constructor call for a new node) which would also be UB.
As you saw in the draft branch, we had this solver type (which we can lower to a struct containing a pointer to the context and another one to the solver) that was passed to the assert and check operations, which could be used to support multiple solvers next to each other. However, to solve the issue of guaranteeing that a context is available, we need this pointer to the context present at all operations. So, we could also lower all the SMT values (bool, bitvector, int, etc.) to a pair of pointers (one for the AST node, one for the context) which means we don't need to add the solver value to all operations, but only to the constant operations (bool, bitvector, int constants, declare_const, etc.) which don't get the pointer from an operand in the first place.
The current design basically targets lowering the context and solver pointers to LLVM globals. At the start of the solver region, we initialize them, and at the end, we deallocate them. All operations just access those globals, and if they follow the specification, they get a legal context; otherwise, they get an invalid pointer, thus UB.
Do you think we should add a solver/context type, add an implicit block argument to the solver region with this type, and require it as an operand for all operations that currently don't have an SMT value operand (i.e., constants)? This means we need to funnel the new solver value through all function calls and other control flow ops, i.e., slightly more complicated lowering, and have to add additional operands to the constant operations, but we can get rid of the globals, and get stronger guarantees.
Or leave it as is and have a somewhat easier lowering, require globals, and get fewer guarantees?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, @math-fehr @fabianschuiki and I discussed this offline and came to the conclusion that the current design and implementation make the most sense.
e1f3a8d
to
b20ecdf
Compare
eb69911
to
f9a324f
Compare
Is it okay for you if I merge this as is, or do you want me to change something? |
f9a324f
to
a44a99b
Compare
The
declare_const
operation could be pure iffresh
is not specified. Iffresh
maybe an allocate side-effect could work? I think tablegen does not have the traits to specify this so fine-grained, we'd need to fall back to C++, so I'd leave it to another PR. Also to have it as the sole focus. There might be some corner cases to think about, e.g., what if there aredeclare_const
with the same identifier, but different sort/type. The lack of such traits and optimizations makes it behave like SMT-LIB, though.