Assurance Level #37

mkannwischer · 2024-05-15T01:54:51Z

According to the PQCP charter, the goal for projects within PQCP is to
build high-assurance production-ready software implementations of forthcoming post-quantum cryptography standards, starting with the ML-KEM algorithm

During the first PQCP TSC meeting it was discussed that we should be more explicit on what we mean by high-assurance production-ready. This differs between different sub-projects of PQCP and it would be good to come up with some assurance levels to unify between different project.
For the next TSC meeting (May 23), each sub-project should document the corresponding interpretation of high-assurance production-ready and document the intended level of assurance.

Let's discuss here.

hanno-becker · 2024-05-20T06:03:12Z

Ultimately, I would like to see at least:

C code is verified to not exhibit undefined behaviour (assuming to-be-documented preconditions)
ASM code is verified for functional correctness
Some argument is made why all code is constant-time

hanno-becker · 2024-05-21T03:36:24Z

@mkannwischer @cothan Thoughts?

mkannwischer · 2024-05-21T07:25:55Z

What do you mean by "ASM code is verified for functional correctness"? You mean by passing testvectors?
I think we need to distinguish between the ultimate goal and the initial goal for a first release.
Initially, I'd aim for no formal verification, i.e., likely the lowest assurance level of the PQCP
This would aim to

keygen/encaps/decaps tested against official testvectors and extended testvectors (supplied through a separate PQCP project)
Each function implemented in assembly is tested against testvectors
Make use of static and dynamic analysis to make sure there is no UB/out-of/bounds memory access and such
Automated tests for checking constant-time behavior (e.g., ct-grind or similar) - this may include scanning for instructions that have data-dependent timing
External code audits

All of the tests should be integrated into CI.
For the code audits there is going to be a process to go through in the PQCA. There is some budget for this in the PQCA and we need to talk to the TAC to get this started.

hanno-becker · 2024-05-21T08:16:22Z

What do you mean by "ASM code is verified for functional correctness"? You mean by passing testvectors?

No, I meant formal verification against a specification on the basis of some architecture model.

I think we need to distinguish between the ultimate goal and the initial goal for a first release.

Agreed. I described above what I would like to see as the mid-term assurance level for this repository, but not necessarily as the baseline upon first release. We should clarify at the next TSC meeting what we're talking about.

cothan · 2024-05-21T09:03:31Z

I don't know much about formal verification, at our first release we can use some model checker to verify the code.
I'm learning how to use CBMC, but I'm not sure if it can check SIMD instructions. I will try soon.

mkannwischer · 2024-05-21T12:13:30Z

CBMC would only be useful for the C sources, right?
I think there is still plenty of functions that are going to stay in C as they are not performance critical for which this is going to be useful. Those are likely the same functions that can be shared with other implementations within the PQCP.
In case we are considering to use CBMC, I think we should discuss this in the TSC as well to make sure everyone is on-board with that.

hanno-becker · 2024-05-21T12:46:22Z

CBMC will be a useful tool for verifying the absence of certain classes of undefined behaviour in the C sources (only). One can also specify custom assertions and have CBMC attempt to prove those, but I am unsure if this would we practical for the functional specifications encountered in MLKEM, esp. modular arithmetic. See e.g. how in LibMLKEM the correctness of modular multiplication has to be deferred to interactive provers

hanno-becker · 2024-05-21T12:49:32Z

In case we are considering to use CBMC, I think we should discuss this in the TSC as well to make sure everyone is on-board with that.

I don't object to bringing this up, but it's not urgent, and ultimately our choice, isn't it?

cothan · 2024-05-21T20:58:38Z

Honestly, I have no idea what can and what cannot work in formal verification for ML-KEM. So my opinion about high assurance tool set is naive. I can learn from your PRs.
Definitely, it CBMC isn't our ultimately choice, even the pure C implementation has to deal with modular arithmetic correctness so it may not be the only tool we use. But again, I have no idea and no background in formal. So @hanno-becker and @mkannwischer please take the lead.

hanno-becker · 2024-05-22T03:47:13Z

We don't have to sort the how at this point, only agree on what we want to achieve.

In the interest of progress, I think @mkannwischer points above make sense for an initial pre-release. At the same time, I don't think we qualify as "high assurance" and "production-ready" if we don't handle functional correctness of at least the assembly.

rod-chapman · 2024-05-22T05:28:27Z

I concur with Hanno. To expand a little, reasonable targets for formal (static) verification would be:

Absence of undefined behaviour.
Absence of all other memory- and type-safety defects, including arithmetic overflows.
Absence of dependence on unspecified behaviour (e.g. no dependence on expression evaluation order).

These are basic targets that must come first before other, more advanced proofs could be attempted.

(PS.. allow me to introduce myself: I am Rod, colleague of Hanno at AWS. I am responsible for the LibMLKEM SPARK/Ada implementation and its proof.)

mkannwischer · 2024-05-22T05:42:19Z

Thanks @rod-chapman!

Would you see these basic targets as required for any kind of deployment, or do you see value in having a fast implementations first and adding these (basic) proofs later?

rod-chapman · 2024-05-22T05:53:37Z

Basic proofs of all the C code would be a good starting point, and should come first. Additional proofs of optimized assembly code can come later as and when needed.

(There are some additional full correctness proofs in my SPARK code, for example ... the correctness of "*" (mod Q) on Zq for example.)

rod-chapman · 2024-05-23T15:17:54Z

One concern I have is that there appears to be repetition of C code across all our proposed repositories? Why? I see no need at all to have the same C code repeated and/or copied across several repo's. The non-time-critical stuff should identical for all platforms anyway.

hanno-becker · 2024-05-23T15:20:51Z

@rod-chapman I agree that a shared C code base is preferred, but it is TBD if it can be achieved. For example, the AArch64 repository will require changes to the C code when introducing the batched Keccak API, which the maintainers of mlkem-c-generic may or may not want to upstream.

rod-chapman · 2024-05-23T15:28:14Z

"Batched" or "not batched" is an implementation detail. Surely, the API should be the same regardless...

hanno-becker · 2024-05-23T15:32:49Z

I thought you're talking about the C code, which is part of the implementation.

mkannwischer · 2024-05-24T03:13:43Z

I agree that we want to share as much as possible between the C projects (embedded, generic, aarch64 for now). Maybe we even want to combine them into a single project at some point.

The reason for starting off with separate code bases was the different goals of the projects: In particular, the generic project wants to stay in sync with the Kyber teams reference implementation and they want to make as few changes as possible. That's why we started off our own copy and postponed the discussion of code sharing to a later point. (Since nothing has been merged into the generic project so far, I believe this was a good approach).

I'm happy to kick-off the discussion of code sharing between embedded and aarch64 projects now. Let's do that in a separate issue: #42

hanno-becker · 2024-06-17T04:18:38Z

Related: #63

mkannwischer · 2024-11-26T07:27:31Z

We've settled what we want to achieve in the long term:

All C code is verified to be memory and type safe using CBMC. This is already done for all C code in mlkem directory leaving only the FIPS202 code.
All assembly formally verified. Initial experiments are underway to verify AArch64 using HOL-Light.

This is reflected in the README and we'll update there if the status changes. Closing this now.

planetf1 mentioned this issue May 16, 2024

Adopt a definition of assurance levels pq-code-package/tsc#3

Open

hanno-becker added the documentation Improvements or additions to documentation label May 21, 2024

mkannwischer mentioned this issue May 21, 2024

Assurance Level pq-code-package/mlkem-c-embedded#37

Open

hanno-becker mentioned this issue May 22, 2024

Explore the use of CBMC #39

Closed

mkannwischer closed this as completed Nov 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Assurance Level #37

Assurance Level #37

mkannwischer commented May 15, 2024

hanno-becker commented May 20, 2024

hanno-becker commented May 21, 2024

mkannwischer commented May 21, 2024

hanno-becker commented May 21, 2024

cothan commented May 21, 2024

mkannwischer commented May 21, 2024

hanno-becker commented May 21, 2024 •

edited

Loading

hanno-becker commented May 21, 2024

cothan commented May 21, 2024

hanno-becker commented May 22, 2024 •

edited

Loading

rod-chapman commented May 22, 2024

mkannwischer commented May 22, 2024

rod-chapman commented May 22, 2024

rod-chapman commented May 23, 2024

hanno-becker commented May 23, 2024 •

edited

Loading

rod-chapman commented May 23, 2024

hanno-becker commented May 23, 2024 •

edited

Loading

mkannwischer commented May 24, 2024

hanno-becker commented Jun 17, 2024

mkannwischer commented Nov 26, 2024

Assurance Level #37

Assurance Level #37

Comments

mkannwischer commented May 15, 2024

hanno-becker commented May 20, 2024

hanno-becker commented May 21, 2024

mkannwischer commented May 21, 2024

hanno-becker commented May 21, 2024

cothan commented May 21, 2024

mkannwischer commented May 21, 2024

hanno-becker commented May 21, 2024 • edited Loading

hanno-becker commented May 21, 2024

cothan commented May 21, 2024

hanno-becker commented May 22, 2024 • edited Loading

rod-chapman commented May 22, 2024

mkannwischer commented May 22, 2024

rod-chapman commented May 22, 2024

rod-chapman commented May 23, 2024

hanno-becker commented May 23, 2024 • edited Loading

rod-chapman commented May 23, 2024

hanno-becker commented May 23, 2024 • edited Loading

mkannwischer commented May 24, 2024

hanno-becker commented Jun 17, 2024

mkannwischer commented Nov 26, 2024

hanno-becker commented May 21, 2024 •

edited

Loading

hanno-becker commented May 22, 2024 •

edited

Loading

hanno-becker commented May 23, 2024 •

edited

Loading

hanno-becker commented May 23, 2024 •

edited

Loading