Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

MDEV-19248 Implement MASTER_BIND for CHANGE MASTER #3883

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ParadoxV5
Copy link
Contributor

@ParadoxV5 ParadoxV5 commented Mar 10, 2025

Description & Release Notes

Add CHANGE MASTER TO & SHOW REPLICA STATUS field Master_Bind that configures the MYSQL_OPT_BIND of the embedded replication client

The mariadb option MYSQL_OPT_BIND (--bind-address with MDEV-21465) chooses the network interface (e.g., eternet port) that connects to the server (the Master in this case).

What problem is the patch trying to solve?

MySQL Cluster had it for even longer than that, it was originally based on a patch I submitted internally at MariaDB AB somwhere around 2006 or so, back then due to replication between my two laptops somehow preferring slow WLAN even when having them connected by cable, too.

It was already insane to see how long this rather simple feature took to get merged into the MySQL codebase, then how long it took from making it from ndbcluster releases to the main server, and now some 15+ years later we are still having this pending here ...

@hartmut-mariadb, https://jira.mariadb.org/browse/MDEV-19248?focusedCommentId=266042#comment-266042

Knowledge Base pages that need changing

How can this PR be tested?

good question.
TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.

PR quality check

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.
  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

Add CHANGE MASTER TO & SHOW REPLICA STATUS field `Master_Bind` that
configures the `MYSQL_OPT_BIND` of the embedded replication client

This is a port of MySQL 5.6.2’s WL#3127.
SELECT Connection_name, Master_Bind FROM information_schema.SLAVE_STATUS;
Connection_name Master_Bind
127.0.0.1
named localhost

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO master_bind is for network interface name. But here in the test we've IP address and hostname instead. Shouldn't we test with network interface name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Why did MySQL document as and test with IP addresses 😕?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any MySQL test which does MASTER_BIND testing. But the doc https://dev.mysql.com/doc/refman/8.0/en/change-master-to.html says MASTER_BIND = 'interface_name'.

Copy link
Contributor Author

@ParadoxV5 ParadoxV5 Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the pointers. MASTER_BIND is network interface IP address only. The doc should have MASTER_BIND = 'interface_ip_address' instead. But do we need MASTER_HOST if MASTER_BIND is set?

Copy link
Contributor Author

@ParadoxV5 ParadoxV5 Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • MASTER_BIND = address of this replica to connect with (optional)
  • MASTER_HOST = address of the primary to connect to (mandatory)

CHANGE MASTER 'named' TO master_bind='';
SELECT Connection_name, Master_Bind FROM information_schema.SLAVE_STATUS;
Connection_name Master_Bind

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a test with a non-existing network interface value. SHOW SLAVE STATUS should error out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be it interface or address, currently the field doesn’t check its string value and rely on the replicating client to check its options at START REPLICA.

And I think that’s fine: technically one might upgrade their hardware upgrade inbetween CHANGE MASTER nand START REPLICA.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Shall cover it at QA stage.

Copy link

@mariadb-SusilBehera mariadb-SusilBehera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Copy link
Contributor

@bnestere bnestere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the underlying code doesn't actually work. I set up a network namespace and a veth connection to it, started a mariadb instance (as a primary) in the namespace, and then tried to set up replication to the server using MASTER_BIND using my regular NIC and again using the veth NIC, and unfortunately it always picked the veth NIC, no matter the MASTER_BIND value. Then I tried using MySQL 8.0 and its MASTER_BIND, and the correct interface was always chosen.

Looking into our sources, it looks like MYSQL_OPT_BIND might be a dead option. We'll have to investigate further in the next feature development cycle.

@ParadoxV5 ParadoxV5 removed the request for review from knielsen March 17, 2025 20:36
Copy link
Member

@knielsen knielsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code generally looks good. But as Brandon points out, there is a problem
in libmariadb that prevents this from working currently.

You can see the problem in the function pvio_socket_connect() in
libmariadb/plugins/pvio/pvio_socket.c . The code actually tries to do so:

    /* if client has multiple interfaces, we will bind socket to given
     * bind_address */
    if (cinfo->mysql->options.bind_address)
    {
      wait_gai= 1;
      while ((gai_rc= getaddrinfo(cinfo->mysql->options.bind_address, 0,
                          &hints, &bind_res)) == EAI_AGAIN)
      {
        unsigned int timeout= mysql->options.connect_timeout ?
                              mysql->options.connect_timeout : DNS_TIMEOUT;
        if (time(NULL) - start_t > (time_t)timeout)
          break;
#ifndef _WIN32
        usleep(wait_gai);
#else
        Sleep(wait_gai);
#endif
        wait_gai*= 2;
      }
      if (gai_rc != 0 || !bind_res)
      {
        PVIO_SET_ERROR(cinfo->mysql, CR_BIND_ADDR_FAILED, SQLSTATE_UNKNOWN, 
                     CER(CR_BIND_ADDR_FAILED), cinfo->mysql->options.bind_address, gai_rc);
        goto error;
      }
    }

It looks up the specified address. But it looks like the author of the code
forgot to actually call bind(2) with the looked up results!

I think you should report this as a bug for libmariadb (CONC in Jira).

You might also try fixing the bug in the libmariadb and submitting a pull
request for that to Georg? It looks like it might simply be calling bind()
in the above code, passing the looked up address and address lenght.

And once libmariadb is fixed, we clearly need some testing done. Maybe
Brandon can help you with how he set up multiple interfaces. Basically it
needs to be tested that bind can be used and verified that traffic goes on
the specified interface. Maybe it can maybe be done simply by specifying
either the local (127.0.0.1) or the wifi/ethernet card of the machine, and
checking with tcpdump that the packets have the right client ip address.

I'm not sure an mtr testcase in the main suite is feasible, too many
assumptions needed about the environment (mtr tests are run by some users
too). Maybe with some --source include/have_multi_interface.inc, but it
might be tricky to get it to work.

I think it's ok to test manually, and describe briefly in commit comment or
pull request how it was tested.

@vuvova
Copy link
Member

vuvova commented Mar 18, 2025

I thought that replication code doesn't use libmariadb. And server implementation of the client library seems to ignore MYSQL_OPT_BIND

@knielsen
Copy link
Member

Hm, yes Serg, now that you mention it that's what I remember too. And looking closer at the libmariadb code, it does in fact seem to call bind() correctly with the bind_address.

So seems I was just confused, and it's the server-side protocol code that needs fixing to support client-side bind

@ParadoxV5
Copy link
Contributor Author

I thought that replication code doesn't use libmariadb. And server implementation of the client library seems to ignore MYSQL_OPT_BIND

Which is interesting as I could only find some functions I grepped in libmariadb. I thought Server includes libmariadb to reuse its Client code.

# for free to join this conversation on GitHub. Already have an account? # to comment
Development

Successfully merging this pull request may close these issues.

5 participants