Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Https server hang on close_fn #287

Closed
reinismu opened this issue Aug 22, 2023 · 3 comments
Closed

Https server hang on close_fn #287

reinismu opened this issue Aug 22, 2023 · 3 comments

Comments

@reinismu
Copy link
Contributor

reinismu commented Aug 22, 2023

Moving my investigation here espressif/esp-idf#11738

Summary: If there are multiple threads doing job when https server is running there is a chance it will hang. So far last thing that gets called is close_fn callback. It is reproduced if I put load on other thread when https call is made

It hangs at

for close_handler in close_handlers {

Note: Not sure why websocket close handler is even called when I open Https endpoint

I think I found where it locks, but not sure why it would lock.

pub struct Api {
    ...
    ws_listeners: Arc<Mutex<HashMap<i32, EspHttpWsDetachedSender>>>,
}

            ...
            .ws_handler::<_, WsError>("/ws", move |con| {
                log::info!("My ws handler is called");
                if con.is_new() {
                    let session_id = con.session();
                    log::info!("Ws connection open: {session_id}");
                    let detached_sender = con.create_detached_sender().unwrap();
                    ws_listeners
                        .lock()
                        .unwrap()
                        .insert(session_id, detached_sender);

                    return Ok(());
                }

                if con.is_closed() {
                    let session_id = con.session();
                    log::info!("Ws connection closed: {session_id}");
                    ws_listeners.lock().unwrap().remove(&session_id);
                    log::info!("Ws connection closed unlocked");
                    return Ok(());
                }
                Ok(())
            })?;
            
            ...
            
            let mut listeners = self.ws_listeners.lock().unwrap();
            for (_, sender) in listeners.iter_mut() {
                sender
                    .send(embedded_svc::ws::FrameType::Text(false), frame.as_bytes())
                    .map(|_| log::info!("send: {}", frame))
                    .map_err(|e| log::error!("Error sending ws frame: {:?}", e));
            }

Stuck at ws_listeners.lock().unwrap().remove(&session_id);. I'm not sure how it managed to deadlock tho

@reinismu
Copy link
Contributor Author

Has anyone used websockets where messages are sent out from another thread? Still not sure why it locks, but I wonder how I should change the logic for it to avoid locks. I could use try_lock, but then I risk of going into bad state

@github-project-automation github-project-automation bot moved this from Todo to Done in esp-rs Aug 23, 2023
@reinismu reinismu reopened this Aug 23, 2023
@github-project-automation github-project-automation bot moved this from Done to In Progress in esp-rs Aug 23, 2023
@ivmarkov
Copy link
Collaborator

ivmarkov commented Aug 23, 2023

Stuck at ws_listeners.lock().unwrap().remove(&session_id);. I'm not sure how it managed to deadlock tho

Perhaps because the ws_listeners mutex is held as you are trying to send via this or another sender in the meantime? However, the sending cannot happen until after you have exited the ws_handler code, because detached senders' send is processed only in-between handling the ws_handler callbacks and regular HTTP handler callbacks.

Detached senders are not really asynchronous. they seem asynchronous if you are sending very small packets through those. If the packet is bigger than a given size, they wait until they can push the whole packet.

How they are implemented is very weird: they send via a hidden socket between them and the HTTP server. The HTTP server listens both on its regular socket as well as on this hidden socket (called "commands" socket). And then processes either content from the main socket (that's how you end up in handler or ws_handler) or content from this hidden socket. But at any point in time, the HTTP server is processing one of these sockets, but not both, as it is single-threaded. (Or waits for content on either of these two sockets to pop up.)

So yeah, you should really be super careful when using mutexes which are locked both inside ws_handler (or handler) and also locked when you are trying to send stuff via WsDetachedSender, as you are trying to. You should just try to not to lock the mutex while you are sending.

The long term solution is to get away from the ESP IDF HTTP server, as it is a weird architecture - single threaded, yet blocking, and replace it with a small, embedded, pure-Rust async server (which can be trivially turned into a blocking one using thread pools). This is started in edge-net, but it needs bugfixing. And async TLS support from ESP IDF which just got initiated. Another story if it would be possible/easy to implement this async support in EspTls.

@ivmarkov ivmarkov changed the title Https server hand on close_fn Https server hang on close_fn Aug 23, 2023
@reinismu
Copy link
Contributor Author

Thanks for explanation!

Now I use try_lock for ws_listeners.lock().unwrap().remove(&session_id); it is mostly called from http call so it doesn't remove anything anyway. Hope this will be fine until a more mature async HTTP server comes along.

A while back I implemented whole async websocket stack

smol = "1.2"
async-tungstenite = { version = "0.19", features = ["async-tls", "async-std-runtime"] }
futures = "0.3"
tungstenite = { version = "0.18", features = ["rustls-tls-native-roots"] }
async-tls = "0.12.0"
rustls = "0.20.8"

It was tricky and I even had to make https://github.com/briansmith/ring compile for esp32s3. In the end I got async WS working, but rustls was just too resource-heavy for the chip, so I dropped the idea.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

2 participants