Duplicated UDP responses from DNS confuse LookupClient #140

Closed
JamesKovacs opened this issue Jan 5, 2022 · 1 comment

@JamesKovacs
Contributor

Suppose you look up a TXT record followed by an SRV record using the same LookupClient. We have seen cases, both in the wild and in test environments, in which the DNS server responds twice to a single UDP query. (We have seen this behaviour with Kubernetes CoreDNS, Azure DNS, and other containerized DNS pods.) Rather than discarding the duplicate response, LookupClient treats it as the response to the next query.

var txtResponse = lookupClient.Query(service, QueryType.TXT, QueryClass.IN);
var srvResponse = lookupClient.Query("_mongodb._tcp." + service, QueryType.SRV, QueryClass.IN);

When the DNS server returns a single UDP response, txtResponse contains the TXT record and srvResponse contains the SRV record. But when the server returns two identical UDP packets for the TXT query, both txtResponse and srvResponse contain the TXT record. Duplicate UDP packets should be discarded rather than returned for the next query. We suspect that this is the root cause of issue #79 and that the header id mismatch check was correct to throw. (We would suggest reverting the header id mismatch change and making it an error again.)
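To illustrate what "discarding duplicates" could look like, here is a minimal sketch in Python. It is not DnsClient's implementation. It relies only on the fact that a DNS message starts with a 16-bit transaction ID and assumes the client uses a different ID for each query. The function names (dns_id, first_matching, recv_response) are hypothetical.

```python
import socket
import struct
import time


def dns_id(packet: bytes) -> int:
    """Return the 16-bit transaction ID stored in the first two bytes of a DNS message."""
    if len(packet) < 2:
        raise ValueError("packet too short to contain a DNS header ID")
    return struct.unpack("!H", packet[:2])[0]


def first_matching(packets, expected_id):
    """Return the first packet whose ID equals expected_id, skipping all others."""
    for packet in packets:
        if len(packet) >= 2 and dns_id(packet) == expected_id:
            return packet
    return None


def recv_response(sock: socket.socket, expected_id: int, timeout: float = 2.0) -> bytes:
    """Read datagrams until one carries expected_id or the overall timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("no response with a matching ID")
        sock.settimeout(remaining)
        packet, _ = sock.recvfrom(4096)  # raises socket.timeout if nothing arrives
        if first_matching([packet], expected_id) is not None:
            return packet
        # Otherwise this is a stale or duplicate datagram from an earlier
        # query: drop it and keep waiting for the real answer.
```

With this approach a leftover copy of the TXT response, which still carries the TXT query's ID, would be dropped while waiting for the SRV response instead of being returned for it.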

Full Reproduction

The following Python script listens on port 53535 and returns hard-coded DNS responses. 50% of the time it will send the response twice.

#!/usr/bin/env python

from random import random
import socket
import sys

# Create a UDP socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Bind the socket to the port
server_address = ('127.0.0.1', 53535)
s.bind(server_address)
while True:
    print("####### Server is listening #######")
    data, address = s.recvfrom(4096)
    print("\n Server received: ", data, "\n")
    if '_tcp' in data.decode('windows-1252'):
      ## Hard coded SRV lookup response for "dig SRV _mongodb._tcp.cluster0.dqrcc.mongodb.net"
      response = data[0].to_bytes(1,'big')+data[1].to_bytes(1,'big')+b"\x81\x80\x00\x01\x00\x03\x00\x00\x00\x01\x08\x5f\x6d\x6f\x6e\x67\x6f\x64\x62\x04\x5f\x74\x63\x70\x08\x63\x6c\x75\x73\x74\x65\x72\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x21\x00\x01\x08\x5f\x6d\x6f\x6e\x67\x6f\x64\x62\x04\x5f\x74\x63\x70\x08\x63\x6c\x75\x73\x74\x65\x72\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x21\x00\x01\x00\x00\x00\x1e\x00\x2e\x00\x00\x00\x00\x69\x89\x14\x63\x6c\x75\x73\x74\x65\x72\x30\x2d\x73\x68\x61\x72\x64\x2d\x30\x30\x2d\x30\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x08\x5f\x6d\x6f\x6e\x67\x6f\x64\x62\x04\x5f\x74\x63\x70\x08\x63\x6c\x75\x73\x74\x65\x72\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x21\x00\x01\x00\x00\x00\x1e\x00\x2e\x00\x00\x00\x00\x69\x89\x14\x63\x6c\x75\x73\x74\x65\x72\x30\x2d\x73\x68\x61\x72\x64\x2d\x30\x30\x2d\x30\x31\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x08\x5f\x6d\x6f\x6e\x67\x6f\x64\x62\x04\x5f\x74\x63\x70\x08\x63\x6c\x75\x73\x74\x65\x72\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x21\x00\x01\x00\x00\x00\x1e\x00\x2e\x00\x00\x00\x00\x69\x89\x14\x63\x6c\x75\x73\x74\x65\x72\x30\x2d\x73\x68\x61\x72\x64\x2d\x30\x30\x2d\x30\x32\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x00\x29\x10\x00\x00\x00\x00\x00\x00\x00"
    else:
      ## Hard coded TXT lookup response for "dig TXT cluster0.dqrcc.mongodb.net"
      response = data[0].to_bytes(1,'big')+data[1].to_bytes(1,'big')+b"\x81\x80\x00\x01\x00\x01\x00\x00\x00\x01\x08\x63\x6c\x75\x73\x74\x65\x72\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x10\x00\x01\x08\x63\x6c\x75\x73\x74\x65\x72\x30\x05\x64\x71\x72\x63\x63\x07\x6d\x6f\x6e\x67\x6f\x64\x62\x03\x6e\x65\x74\x00\x00\x10\x00\x01\x00\x00\x00\x1e\x00\x2d\x2c\x61\x75\x74\x68\x53\x6f\x75\x72\x63\x65\x3d\x61\x64\x6d\x69\x6e\x26\x72\x65\x70\x6c\x69\x63\x61\x53\x65\x74\x3d\x43\x6c\x75\x73\x74\x65\x72\x30\x2d\x73\x68\x61\x72\x64\x2d\x30\x00\x00\x29\x10\x00\x00\x00\x00\x00\x00\x00"
    print(response) ## sanity check the DNS label
    s.sendto(response, address)
    if random() > 0.5:
        # Send the same response a second time to simulate a duplicated UDP reply
        s.sendto(response, address)
        print("\n Duplicated DNS reply sent \n")

The following C# program creates a LookupClient and performs TXT and SRV resolution:

using System;
using System.Linq; // needed for ToList() unless implicit usings are enabled
using System.Net;
using DnsClient;

var endpoint = new IPEndPoint(IPAddress.Parse("127.0.0.1"), 53535);
var lookupClient = new LookupClient(endpoint);

var service = "cluster0.dqrcc.mongodb.net";

var txtResponse = lookupClient.Query(service, QueryType.TXT, QueryClass.IN);
DumpDnsResponse("TXT", txtResponse);
var srvResponse = lookupClient.Query("_mongodb._tcp." + service, QueryType.SRV, QueryClass.IN);
DumpDnsResponse("SRV", srvResponse);

static void DumpDnsResponse(string title, IDnsQueryResponse response) {
  Console.WriteLine($"*** {title} ***");
  foreach (var txtRecord in response.Answers.TxtRecords().ToList()) {
    Console.WriteLine(txtRecord);
  }

  foreach (var srvRecord in response.Answers.SrvRecords().ToList()) {
    Console.WriteLine(srvRecord);
  }
}

Half the time it will output the correct result:

> dotnet run
*** TXT ***
cluster0.dqrcc.mongodb.net. 30 IN TXT "authSource=admin&replicaSet=Cluster0-shard-0"
*** SRV ***
_mongodb._tcp.cluster0.dqrcc.mongodb.net. 30 IN SRV 0 0 27017 cluster0-shard-00-00.dqrcc.mongodb.net.
_mongodb._tcp.cluster0.dqrcc.mongodb.net. 30 IN SRV 0 0 27017 cluster0-shard-00-01.dqrcc.mongodb.net.
_mongodb._tcp.cluster0.dqrcc.mongodb.net. 30 IN SRV 0 0 27017 cluster0-shard-00-02.dqrcc.mongodb.net.

The other half of the time LookupClient will return the TXT response for both the TXT and SRV lookups:

> dotnet run
*** TXT ***
cluster0.dqrcc.mongodb.net. 30 IN TXT "authSource=admin&replicaSet=Cluster0-shard-0"
*** SRV ***
cluster0.dqrcc.mongodb.net. 30 IN TXT "authSource=admin&replicaSet=Cluster0-shard-0"
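
The wrong output is consistent with a client that simply reads the next datagram waiting on its socket. The following self-contained sketch reproduces that mechanism on loopback, without DnsClient. The "server" is a hypothetical echo stand-in, not a real DNS server, and the two-byte prefix plays the role of the DNS header ID.

```python
import socket
import struct

# Hypothetical stand-ins for a DNS server and a naive client, both on loopback.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.settimeout(2.0)
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)


def serve_once(duplicate):
    """Echo one datagram back to its sender, optionally sending it twice."""
    data, peer = server.recvfrom(4096)
    server.sendto(data, peer)
    if duplicate:
        server.sendto(data, peer)


def naive_query(query_id, duplicate):
    """Send a query and return the ID of whatever datagram arrives next."""
    client.sendto(struct.pack("!H", query_id) + b"payload", server.getsockname())
    serve_once(duplicate)
    reply, _ = client.recvfrom(4096)  # no check that the ID matches the query
    return struct.unpack("!H", reply[:2])[0]


first = naive_query(0x0001, duplicate=True)    # server answers twice
second = naive_query(0x0002, duplicate=False)  # duplicate of the first reply is still queued
print(hex(first), hex(second))

server.close()
client.close()
```

On loopback, where datagrams are normally delivered in order, the second call should pick up the queued duplicate of the first reply, so both IDs printed are that of the first query. This mirrors the TXT record showing up under the SRV heading.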

NOTE: This issue is causing sporadic failures of the MongoDB .NET/C# driver in containerized environments. The related MongoDB issue is CSHARP-4001.

Please let me know if you have questions or require additional information.

@MichaCo
Owner

MichaCo commented Jan 30, 2022

Your PR will be in 1.6. Closing.

Thanks again for the contribution!

@MichaCo MichaCo closed this as completed Jan 30, 2022