Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Refactoring of socket related functions to a separate socket_common.c #505

Merged
merged 56 commits into from
Dec 28, 2024

Conversation

Jakio815
Copy link
Collaborator

@Jakio815 Jakio815 commented Dec 16, 2024

This PR should be merged with lf-lang/lingua-franca#2449.

Motivation

The original codebase had socket-related functionality directly embedded in various parts of the application. This tight coupling led to code duplication and hindered modularity. In this refactor:

  • Socket-related code was separated into a dedicated file: socket_common.c and socket_common.h.
  • Code duplication was removed, making the codebase cleaner and easier to maintain.
  • The refactor enhances modularity, allowing for easier extension, especially for supporting additional protocols and future integration of security features.
- NOTE: The primary purpose of this PR is refactoring.
- There is almost no functionality change except for the minor logic change mentioned at the end of the document.

Overview of Approach

Three major functions were introduced to handle the socket-related operations:

  • create_TCP_server()

    • This function creates a TCP server socket, binds it to a port, and enables listening for incoming connections. It handles retries in case of failure and provides a robust way to manage the server’s socket.
  • accept_rti_socket() / accept_federate_socket()

    • This function waits for incoming connection requests on a given socket, accepting them when a valid connection is received. It ensures that temporary errors like EAGAIN or EWOULDBLOCK are handled properly and retries the connection attempt when necessary.
  • connect_to_socket()

    • This function attempts to establish a TCP or UDP connection to a remote server. It retries the connection if necessary, cycling through different port numbers and respecting a connection timeout.

These functions are used both in the RTI (Real-Time Infrastructure) server and the federate components, enhancing reusability and simplifying the network-related logic in these modules.

Detailed Approach

New integrated function: create_TCP_server() create_UDP_server()

void create_TCP_server(uint16_t port, int* final_socket, uint16_t* final_port);
void create_UDP_server(uint16_t port, int* final_socket, uint16_t* final_port);

Arguments

  • port: The port number to bind the server socket to.
    • If a non-zero value is specified, it attempts to bind to that port.
    • If 0 is specified, it starts binding at DEFAULT_PORT.
  • final_socket: A pointer to an integer where the server socket file descriptor will be stored upon success.
  • final_port: A pointer to a uint16_t where the successfully bound port number will be stored.
  • increment_port_on_retry: Boolean to retry port increment.

Return Values

  • The function modifies final_socket and final_port to reflect the created server socket and bound port.
  • On failure, it retries binding up to PORT_BIND_RETRY_LIMIT times. If all attempts fail, the function prints an error and exits.

Function Description
This function creates a TCP/UDP server socket and binds it to a port, enabling it to listen for incoming connections. The process includes:

  • Port Handling:
    • If a specific port is provided, the function attempts to bind to it.
    • If the specified port is 0 and increment_port_on_retry is true, it starts from DEFAULT_PORT (15045) and increments the port number up to MAX_NUM_PORT_ADDRESSES times, cycling back to DEFAULT_PORT if necessary.
    • If the specified port is 0 and increment_port_on_retry is false, it means the federate is trying to bind a port assigned by the OS.
  • Retries: Retries binding up to PORT_BIND_RETRY_LIMIT times with a PORT_BIND_RETRY_INTERVAL delay between attempts when cycling ports.

Where It’s Used

  • lf_create_server() in federate.c: Initializes a server socket for federates.
  • start_rti_server() in rti_remote.c: Sets up the RTI server socket for listening. Both create_TCP_server() and create_UDP_server() is called.

New integrated function: accept_rti_socket() accept_federate_socket()

int accept_rti_socket(int socket);
int accept_federate_socket(int socket, int rti_socket);

Arguments

  • socket: The server socket file descriptor that is already set up and listening for incoming connections.
  • rti_socket: The RTI socket descriptor is used to check if the RTI socket is still open/alive.

Return Values

  • Returns the file descriptor for the newly accepted connection socket (int) on success.
  • Returns -1 on failure, printing an appropriate error message and setting errno.

Function Description

This function blocks and waits for an incoming connection request on the given server socket:

  • Connection Acceptance: Calls accept() to accept an incoming client connection.
  • Error Handling:
    • If a temporary error (e.g., EAGAIN, EWOULDBLOCK, or EINTR) occurs, the function retries.
    • If a fatal error occurs (e.g., invalid socket, RTI closed), the function logs an error and exits.
  • RTI Socket Check: Verifies the RTI socket (rti_socket) to ensure it is still open during the process.

Where It’s Used

  • lf_connect_to_federates() in rti_remote.c: Waits for federates to connect to the RTI server.
  • respond_to_erroneous_connections() in rti_remote.c: Accepts connections who are attempting to join the wrong federation.
  • lf_handle_p2p_connections_from_federates() in federate.c: Accepts connections from federates that send messages directly to this federate (not through the RTI).

New integrated function: connect_to_socket()

int connect_to_socket(int sock, const char* hostname, int port)

Inputs

  • sock: The socket file descriptor created using socket().
  • hostname: The target hostname or IP address of the server to connect to.
  • port: The port number to connect to. If 0 is specified, it is trying to connect to the RTI. The function iterates through a range of default ports starting at DEFAULT_PORT.

Outputs

  • Returns 0 on success, indicating the connection was established.
  • Returns -1 on failure, printing an error message and setting errno.

Function Description

This function attempts to establish a TCP connection to the specified hostname and port:

  • Hostname Check: Uses getaddrinfo() to resolve the hostname to a valid network address.
  • Connection Attempts:
    • If the port is 0, the function iterates through a range of ports starting from DEFAULT_PORT.
    • Retries periodically until CONNECT_TIMEOUT is reached.
  • Error Handling: If a connection attempt fails, the function logs a warning and retries after a short delay CONNECT_RETRY_INTERVAL.

Where It’s Used

  • lf_connect_to_federate() in federate.c: Federates use this function to establish connections with other federates.
  • lf_connect_to_rti() in federate.c: Federates use this function to establish connections with the RTI server.

Minor Logic Changed
The original code for lf_connect_to_rti() and lf_connect_to_federate(), there was a large while loop retrying the connect() and also sending the MSG_TYPE_FED_IDS and MSG_TYPE_P2P_SENDING_FED_ID messages respectively. Below is a simplified version.

 instant_t start_connect = lf_time_physical();
  while (!CHECK_TIMEOUT(start_connect, CONNECT_TIMEOUT) && !_lf_termination_executed) {
    connect();
    // Send MSG_TYPE_FED_IDS message.
  }

The refactored version has two while loops, one inside the connect_to_socket() function itself and the original while loop. Below is the new simplified version.

 connect_to_socket(); // The while loop inside checks timeout.
 instant_t start_connect = lf_time_physical();
  while (!CHECK_TIMEOUT(start_connect, CONNECT_TIMEOUT) && !_lf_termination_executed) {
    connect();
    // Send MSG_TYPE_FED_IDS message.
  }

Follow Up

This PR yet did not handle the shutdown() and close() functions. This will be done in the next PR as soon as possible.
UPDATE: This is done in #506

@Jakio815 Jakio815 changed the title Draft:Refactor only comm type Draft: Refactor socket related functions to a separate socket_common.c Dec 17, 2024
Copy link
Contributor

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much better. I've added some minor suggestions, but otherwise, this is good to merge.

core/federated/RTI/rti_remote.h Show resolved Hide resolved
include/core/federated/network/socket_common.h Outdated Show resolved Hide resolved
include/core/federated/network/socket_common.h Outdated Show resolved Hide resolved
include/core/federated/network/socket_common.h Outdated Show resolved Hide resolved
include/core/federated/network/socket_common.h Outdated Show resolved Hide resolved
include/core/federated/network/socket_common.h Outdated Show resolved Hide resolved
Co-authored-by: Edward A. Lee <eal@eecs.berkeley.edu>
@Jakio815 Jakio815 enabled auto-merge December 23, 2024 21:32
@hokeun hokeun changed the title Refactor socket related functions to a separate socket_common.c Refactoring of socket related functions to a separate socket_common.c Dec 23, 2024
@hokeun hokeun requested a review from edwardalee December 26, 2024 23:38
@Jakio815 Jakio815 added this pull request to the merge queue Dec 27, 2024
Copy link
Contributor

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for a typo.

include/core/federated/network/socket_common.h Outdated Show resolved Hide resolved
@edwardalee edwardalee removed this pull request from the merge queue due to a manual request Dec 27, 2024
@edwardalee edwardalee enabled auto-merge December 27, 2024 16:46
Co-authored-by: Edward A. Lee <eal@eecs.berkeley.edu>
@edwardalee edwardalee added this pull request to the merge queue Dec 27, 2024
Merged via the queue into main with commit 03137a0 Dec 28, 2024
30 checks passed
hokeun added a commit to lf-lang/lingua-franca that referenced this pull request Dec 28, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants