Standard grpc mTLS #3909

Open
wants to merge 79 commits into
base: main

georgeliao
Contributor

@georgeliao georgeliao commented Jan 30, 2025

MULTI-1765
Public side of https://github.com/canonical/multipass-private/pull/719

A few things to note
The make_cert_key_pair function originally handled the generation of both server and client key-certificate pairs. The server_name parameter determined the use case: an empty server_name indicated the client case. In the standard gRPC TLS scheme, a root key-certificate pair is added.

The role of make_cert_key_pair is still to generate and hold server and client key-certificate pairs. However, the server certificate is now a signed certificate. As a result, the server-side branch must first generate the root key-certificate pair and use it to sign the newly created server certificate.

Previously, the X509Cert constructor only handled server and client certificate generation, distinguishing between them based on whether server_address was empty. Now, with the introduction of root, server and client certificates, the dispatch is driven by the CertType enumeration.
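
To illustrate the enumeration-based dispatch, here is a minimal sketch, not the code from this PR. The enumerators Root and Client and the "Multipass Root CA" name appear in the review discussion; the Server enumerator, the function name common_name_for and the client_id parameter (standing in for the generated UUID) are assumptions made for the example.

```cpp
#include <string>

// Stand-in for the PR's CertType enumeration (the Server enumerator is assumed).
enum class CertType
{
    Root,
    Client,
    Server
};

// Picks the certificate common name (CN) for each certificate type.
// `client_id` stands in for the UUID that the real code generates for clients.
std::string common_name_for(CertType type, const std::string& server_name, const std::string& client_id)
{
    switch (type)
    {
    case CertType::Root:
        return "Multipass Root CA";
    case CertType::Client:
        return client_id;
    case CertType::Server:
        return server_name;
    }
    return {}; // not reached for valid enumerators; avoids a missing-return warning
}
```

With an explicit enumeration, the caller states which certificate it wants instead of encoding that choice in an empty string.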

Additionally, the X509Cert constructor has been refined to generate certificates in a standard format. Key differences between before and after include adjustments to the serial number format and the inclusion of X509v3 extensions. To inspect a certificate, you can use the following command: openssl x509 -in <cert_path> -noout -text

The certificate paths are as follows:

  • Server certificate: /root/.local/share/multipassd/certificates/localhost.pem
  • Client certificate: /home/<user name>/.local/share/multipass-client-certificate/multipass_cert.pem
  • Root certificate: /usr/local/share/ca-certificates/multipass_root_cert.pem

Snap environment considerations
In the snap environment, the root certificate is stored at /var/snap/multipass/common/data/multipassd/certificates/multipass_root_cert.pem.
Unlike /root/, the /var/snap/ directory lets other users view its files, which makes this setup feasible.
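
The choice between the two root certificate locations can be sketched roughly as follows. This is an illustration under assumptions: the function name root_cert_path, the in_snap flag and both directory parameters are hypothetical, and the real code derives these values from platform utilities.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: the flag and both directory parameters are illustrative only.
fs::path root_cert_path(bool in_snap, const fs::path& snap_common_dir, const fs::path& system_ca_dir)
{
    constexpr auto* root_cert_file_name = "multipass_root_cert.pem";

    // operator/ inserts the directory separators, so no component can be
    // accidentally glued to its neighbour.
    return in_snap ? snap_common_dir / "data" / "multipassd" / "certificates" / root_cert_file_name
                   : system_ca_dir / root_cert_file_name;
}
```

For example, root_cert_path(true, "/var/snap/multipass/common", "/usr/local/share/ca-certificates") yields the snap path listed above.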

Certificate regeneration and migration
The root certificate and the server key-certificate pair are automatically regenerated if either is missing. This mechanism ensures a smooth migration of the server key-certificate pair when updating Multipass. Upon upgrading, the server startup process automatically generates a root key-certificate pair and uses it to sign a fresh server certificate. Consequently, the original server key-certificate pair is overwritten, enabling successful verification under the new standard gRPC TLS.

The Multipass upgrade process should be included in the functional testing as well; both the CLI and GUI clients should be tested.

Unit test adaptations
The unit tests have been modified to accommodate changes in the gRPC TLS verification process. The key adjustments include:

  1. Mocking get_root_cert_path to allow the use of string-based certificates.
  2. Merging the separate server and client key-certificate pairs into one, with MockCertProvider supplying the key-certificate pair on both the server and the client side. In the unit testing environment, the two can be the same.
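
The idea behind the second point can be shown with a hand-written stub. This is not the MockCertProvider from the test suite: the CertProvider interface below, its method names and the StubCertProvider class are assumptions made so the example is self-contained.

```cpp
#include <string>
#include <utility>

// Hypothetical interface; the real provider in the codebase may differ.
class CertProvider
{
public:
    virtual ~CertProvider() = default;
    virtual std::string PEM_certificate() const = 0;
    virtual std::string PEM_signing_key() const = 0;
};

// Hand-written stub that serves one fixed key-certificate pair.
class StubCertProvider : public CertProvider
{
public:
    StubCertProvider(std::string cert, std::string key) : cert{std::move(cert)}, key{std::move(key)}
    {
    }

    std::string PEM_certificate() const override
    {
        return cert;
    }

    std::string PEM_signing_key() const override
    {
        return key;
    }

private:
    std::string cert;
    std::string key;
};
```

A test can then construct one stub for the server side and one for the client side from the same two strings, which mirrors the merged pair described above.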

Following the commit history and messages is also a helpful way to understand changes that have been made.

@georgeliao georgeliao changed the title from "Standard grpc mTLs" to "Standard grpc mTLS" Jan 30, 2025
@georgeliao georgeliao marked this pull request as draft January 30, 2025 11:42

codecov bot commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 90.27778% with 14 lines in your changes missing coverage. Please review.

Project coverage is 89.13%. Comparing base (2424799) to head (bb0154b).

Files with missing lines          Patch %   Lines
src/cert/ssl_cert_provider.cpp    89.28%    12 Missing ⚠️
src/daemon/daemon_config.cpp      0.00%     1 Missing ⚠️
src/utils/utils.cpp               66.66%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3909      +/-   ##
==========================================
+ Coverage   89.11%   89.13%   +0.02%     
==========================================
  Files         255      255              
  Lines       14603    14663      +60     
==========================================
+ Hits        13013    13070      +57     
- Misses       1590     1593       +3     


@georgeliao georgeliao force-pushed the standard_grpc_mTLS branch 5 times, most recently from 5957e41 to c26c20c February 7, 2025 09:03
@georgeliao georgeliao marked this pull request as ready for review February 7, 2025 09:59
Contributor

@andrei-toterman andrei-toterman left a comment

Just a quick pass with some nitpicks, didn't test functionally yet.

@andrei-toterman andrei-toterman self-requested a review February 7, 2025 15:01
xmkg previously approved these changes Feb 11, 2025
Contributor

@xmkg xmkg left a comment

LGTM, except for one small nit.


std::filesystem::path mp::platform::Platform::get_root_cert_path() const
{
    constexpr auto* root_cert_file_name = "multipass_root_cert.pem";
Contributor

Is there a reason for using auto* instead of just plain auto? Both should result in const char*.

Contributor Author

@georgeliao georgeliao Feb 11, 2025

Yes, they are equivalent in this case. But I normally prefer appending * or & to make the data type a bit more expressive.
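
For reference, the equivalence can be checked at compile time. The variable names below are made up for the example; the string literal is the one from the snippet under review.

```cpp
#include <type_traits>

// Both declarations deduce a pointer to const char; constexpr adds the
// top-level const on the pointer itself.
constexpr auto* with_star = "multipass_root_cert.pem";
constexpr auto without_star = "multipass_root_cert.pem";

static_assert(std::is_same_v<decltype(with_star), const char* const>);
static_assert(std::is_same_v<decltype(without_star), const char* const>);
```

The practical difference is that auto* only compiles when the initializer is a pointer, so it doubles as a small check on the deduced type.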

Contributor

@andrei-toterman andrei-toterman left a comment

Hey, @georgeliao! So now, in the local environment, the GUI works fine both when upgrading from the old certificates and when starting with no certificates. The CLI also works when starting with no certificates, but if I upgrade from old certificates, I get the following error:

E0211 14:23:49.406977684   36809 ssl_transport_security.cc:1316]       Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.

And in the snap, both when starting fresh and when upgrading, I get the following:

[error] [client] Caught an unhandled exception: failed to open file '/var/snap/multipass/commondata/multipassd/certificates/multipass_root_cert.pem': No such file or directory(2)

@georgeliao
Contributor Author

E0211 14:23:49.406977684 36809 ssl_transport_security.cc:1316] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.

Regarding the root certificate being out of sync with the server certificate in the development environment: maybe we can add a check here that verifies not only that the certificates exist but also that the root certificate is the one that signed the server certificate. If not, everything would be re-created. This check can be done with the OpenSSL C API. However, I'm not sure the juice is worth the squeeze.

[error] [client] Caught an unhandled exception: failed to open file '/var/snap/multipass/commondata/multipassd/certificates/multipass_root_cert.pem': No such file or directory(2)

This is fixed in the latest version of the PR.

@andrei-toterman
Contributor

Yeah, for the dev environment, I don't think it's worth the effort. We already have to remove stuff manually, so this is just an extra thing to remove.

Contributor

@andrei-toterman andrei-toterman left a comment

Great, now both the CLI and the GUI work as expected, both locally and in the snap, both when upgrading and when installing fresh. Thanks, @georgeliao!

Now, one thing to discuss with everyone: the purpose of these changes was so that we could remove our grpc patches, and that goal is accomplished by this PR. And now the CLI verifies the server certificate, which couldn't be avoided without the patches. But the GUI can avoid verifying the server certificate, using the plain vanilla grpc dart library. My question is: should the GUI also verify the server certificate, to be in line with the CLI, or should we keep the existing, functioning behavior of not having the GUI verify the server certificate?

@ricab
Collaborator

ricab commented Feb 12, 2025

As discussed elsewhere, let's have the GUI also adhere to the new scheme, preferably in a separate PR.

Collaborator

@ricab ricab left a comment

Thanks @georgeliao, lots of good work in here!

I am only doing a tertiary review in this case, so I glossed over most of the scary low-level certificate stuff. @andrei-toterman and @xmkg, we're relying on you on that front 💪 If you could provide assurance that you've verified sanity of all the nitty-gritty, that would be much appreciated. All of this would be for nothing if security were somehow broken on the foundation...

Other than that, I have some proposals for path derivation and the initialization. Let me know what you think.

{
    if (server_name.empty())
        return mp::utils::make_uuid().toStdString();
Collaborator

Was this case only ever hit in clients, before?

Contributor Author

Are you talking about the original cn_name_from function? That one was used both by the server certificate and client certificate. server_name non-empty indicates the server case. In our new implementation, this dispatch is preserved by the line

        const auto cn = as_vector(cert_type == CertType::Root     ? "Multipass Root CA"
                                  : cert_type == CertType::Client ? mp::utils::make_uuid().toStdString()
                                                                  : server_name);

Collaborator

OK, it might be good to add a comment next to the signature of the SSLCertProvider constructor. Something like: // leave server_name empty for clients; choose a (non-empty) name for servers.

X509_gmtime_adj(X509_get_notAfter(x509.get()), 31536000L);
set_random_serial_number(cert.get());
X509_gmtime_adj(X509_get_notBefore(cert.get()), 0); // Start time: now
const long valid_duration_sec = cert_type == CertType::Root ? 3650L * 24L * 60L * 60L : 365L * 24L * 60L * 60L;
Contributor

We can get rid of the magic numbers here in favor of something more readable:

    constexpr std::chrono::seconds one_year = std::chrono::hours{24} * 365; 
    constexpr std::chrono::seconds ten_years = one_year * 10; 
    const auto valid_duration = cert_type == CertType::Root ? ten_years : one_year;
    X509_gmtime_adj(X509_get_notAfter(cert.get()), valid_duration.count());

We can also simplify this even further when we switch to C++20 with the help of "std::chrono::years".
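
As a quick sanity check of the suggestion, the chrono constants can be compared against the literal second counts from the diff. The validity_seconds helper and the local CertType enumeration are assumptions made for this sketch.

```cpp
#include <chrono>

// Stand-in for the PR's CertType enumeration (the Server enumerator is assumed).
enum class CertType
{
    Root,
    Client,
    Server
};

constexpr std::chrono::seconds one_year = std::chrono::hours{24} * 365;
constexpr std::chrono::seconds ten_years = one_year * 10;

// Validity period in seconds, in the form X509_gmtime_adj expects (a long offset).
constexpr long validity_seconds(CertType type)
{
    return static_cast<long>((type == CertType::Root ? ten_years : one_year).count());
}

// The values match the literals 365L * 24L * 60L * 60L and 3650L * 24L * 60L * 60L.
static_assert(validity_seconds(CertType::Client) == 31536000L);
static_assert(validity_seconds(CertType::Root) == 315360000L);
```

One caveat for the later C++20 switch: std::chrono::years is defined as 365.2425 days (31,556,952 seconds), so using it would lengthen the validity period slightly compared with the 365-day year used here.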

Contributor Author

Yep, it is definitely better with chrono time units and automatic unit conversion.

Contributor Author

It is a bit funny that hours is the biggest unit before C++20.

georgeliao and others added 26 commits February 17, 2025 11:05
Co-authored-by: Andrei Toterman <andrei.toterman@canonical.com>
Signed-off-by: George Liao <georgeliaojia@gmail.com>
… of mp::utils::snap_common_dir() to access the multipass data storage location.

It also dispatches for the user-defined storage location case.
This caused a slightly malformed certificate format that somehow passed the gRPC C++ client check. The gRPC Dart client check, on the other hand, rejected it. As a result, this change fixes the issue of the GUI not being able to connect to the server.
…dir utility function.

Cannot use mp::StandardPaths::AppDataLocation because the client and server processes resolve this variable to different paths.
… to the file.

This is a way to enable fopen with wb mode; the original hack only worked on Unix but not on Windows.
The "std::filesystem::create_directories(file_path_std.parent_path());" call in the WritableFile class constructor already took care of that.
The call std::filesystem::create_directories(file_path_std.parent_path()); in the WritableFile constructor already checks whether the directory exists and creates it if it is absent.
…orage location also gets the right path for root certificate.
…e initialization can be done in the initializer list.
…construction of EVPKey can be done in the initializer list.

constexpr auto* root_cert_file_name = "multipass_root_cert.pem";
return mp::utils::in_multipass_snap()
           ? multipass_final_storage_location() / "data" / "multipassd" / "certificates" / root_cert_file_name