Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Regression in session handling for PIN protected enclaves #9

Closed
sitano opened this issue Sep 4, 2018 · 10 comments
Closed

Regression in session handling for PIN protected enclaves #9

sitano opened this issue Sep 4, 2018 · 10 comments
Labels

Comments

@sitano
Copy link
Contributor

sitano commented Sep 4, 2018

  1. _, err := crypto11.Configure(&crypto11.PKCS11Config{Path: config.HSMTokenPath, TokenLabel: config.TokenLabel, Pin: config.HSMPin}) is doing fine
  2. Inside of the Configure, the
func setupSessions(ctx* pkcs11.Ctx, slot uint) error {
	return pool.PutIfAbsent(slot, pools.NewResourcePool(
		func() (pools.Resource, error) {
			return newSession(ctx, slot)
		},
		maxSessions,
		maxSessions,
		idleTimeout,
	))
}

sets an idle timeout of 30 seconds for the sessions to be wiped out off the pool.
3. Then Configure uses withSession to login for currently created session.
4. After 30 seconds pool closes a session OR after a 30 seconds of used sessions pool closes them.
5. After that user is unable to work with the library because for all further operations it returns either pkcs11: 0x82: CKR_OBJECT_HANDLE_INVALID or pkcs11: 0x101: CKR_USER_NOT_LOGGED_IN.

@sitano
Copy link
Contributor Author

sitano commented Sep 4, 2018

I think I can write a regression test for it and maybe even offer a patch.

@sitano
Copy link
Contributor Author

sitano commented Sep 5, 2018

details and tests are in #10

@cbroglie
Copy link
Contributor

This regression was my fault in #3, and I was just coming here now to fix, and saw this. My apologies for the trouble.

Closing idle sessions seems like a good idea, but is hard to get right with how the package is implemented currently (as can be seen with the scope of changes in #10). I was just going to change the idle timeout value to 0, which disables the closing of idle resources. But I'll give the changes in #10 a shot first and report back.

@cbroglie
Copy link
Contributor

I tested out #10, and it's a partial solution for us. Our code calls FindKeyPairOnSlot for each key pair, then just holds on to the return values and shares them across goroutines for Sign operations. So when idle sessions are closed, calling Sign results in CKR_OBJECT_HANDLE_INVALID errors. And the change made to withSession only retries CKR_USER_NOT_LOGGED_IN, which doesn't help here.

However, the change to proactively attempt a login when new sessions are created allows us to implement retries in our application logic, and this works for us: https://gist.github.com/cbroglie/e7e9fce242ad23c552df4c6e72755f5f.

It would be great if this library could eventually encapsulate all the retry behavior itself. Thanks again for cleaning up my mess.

@sitano
Copy link
Contributor Author

sitano commented Sep 12, 2018

I do think it is hard to provide a decent connection pool handling for everyone because usually when someone uses a pool, every aspect of the behavior is specific to the use case. From the other side, it's hard to manage context state and sessions state with pkcs#11 API as it does not expose a lot of necessary information (i.e. is the user logged in for the token) and this info must be always be marshaled first. Another point: the Login operations i.e. may be very expensive.

It's more of the library API design that it uses singleton instance, and so limited to the behavior configuration. Even the pool could be given to the user to be used by himself if required alone.

Timeouts may be exposed to the user configuration though.

Our code calls FindKeyPairOnSlot for each key pair, then just holds

We do the same. With the pool design which can eat idle connection getting CKR_OBJECT_HANDLE_INVALID is inevitable. Unfortunately, the key identity can't be retrieved after the handle got rot and thus can't be automatically retrieved without knowing associated identity information before. There is also no way to check for the handle health but calling for an Identity location via CGO proactively which is the same bad, as trying to Login for every new session. But IFIIN there is no other way to get a context state to determine if the token has been logged into, as the login state is shared among the sessions, but when the res pool closes all of them, the login state shuts down automatically.

That's why it is barely possible to handle Login state for new sessions in the pool, but not for the object handles.

In the end, it's up for the client to decide on the trade offs. It can have connection pool without idle timeouts, or he can have no cached handles. If to cache handles, the cache could proactively check handles health and restore them, or the client code could support resorting handles state handling known error. Proactive checks reduce performance, yet it depends on the use cases.

@cbroglie
Copy link
Contributor

Timeouts may be exposed to the user configuration though.

I think this is a good idea; should add it to PKCS11Config, like MaxSessions. I think it's probably best to just default it to the zero value as well, which would disable the idle timeouts by default. I think that behavior would suit most users well, and those with long running applications with special needs can take on the added complexity of handling retries.

cbroglie added a commit to cloudflare/gokeyless that referenced this issue Sep 12, 2018
cbroglie added a commit to cloudflare/gokeyless that referenced this issue Sep 12, 2018
@sitano
Copy link
Contributor Author

sitano commented Sep 13, 2018

Agree. I will add MaxSessions and IdleTimeout to the #10 then.

@sitano
Copy link
Contributor Author

sitano commented Sep 14, 2018

Personally, if I would making crypto11 the next version, I did vote for removing the singleton default session and resource pool necessity and give the user right to choose how to configure and handle session/s. That will make API clearer and easier and will shift the complexity of bringing universal solutions on the sessions handling to the domain where it can just pick and tune.

@cbroglie
Copy link
Contributor

I agree completely.

@nickrmc83 nickrmc83 added this to the v0.1.0 milestone Sep 24, 2018
@sitano sitano closed this as completed Sep 25, 2018
@sitano
Copy link
Contributor Author

sitano commented Sep 25, 2018

#10 merged

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants