Enabling mDNS causes hang, then reboot of Teensy 4.1 #23

Open
playaspec opened this issue Aug 16, 2021 · 6 comments

Comments

@playaspec

playaspec commented Aug 16, 2021

I'm putting together a device that reads two ADCs (simultaneous sampling) and sends the values over TouchOSC. I originally had all the networking stuff worked out on the esp32, but switched to the Teensy because of more flexible SPI hardware, plus availability of the esp32 boards with onboard ethernet wasn't great so it looked like getting them wouldn't meet my deadline.

I stripped my original code down to the bare minimum.

#include <NativeEthernet.h>
//#include <NativeEthernetUdp.h>

// Extract the hardware MAC from uC itself, and stuff it into an array
uint8_t mac[6];
void teensyMAC(uint8_t *mac) {
    for(uint8_t by=0; by<2; by++) mac[by]=(HW_OCOTP_MAC1 >> ((1-by)*8)) & 0xFF;
    for(uint8_t by=0; by<4; by++) mac[by+2]=(HW_OCOTP_MAC0 >> ((3-by)*8)) & 0xFF;
    Serial.printf("MAC: %02x:%02x:%02x:%02x:%02x:%02x\n", mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}

// Mirror the enum EthernetHardwareStatus in NativeEthernet.h so we can print a human readable value.
// Shouldn't this enum be updated to have a "NativeEthernet" value?
char HWStatus[][14] = {"EthNoHardware",
                     "EthernetW5100",
                     "EthernetW5200",
                     "EthernetW5500"};

//// buffers for receiving and sending data. Currently unused.
//char packetBuffer[UDP_TX_PACKET_MAX_SIZE];  // buffer to hold incoming packet,
//char ReplyBuffer[] = "acknowledged";        // a string to send back
//
//// An EthernetUDP instance to let us send and receive packets over UDP
//EthernetUDP udp;
//
////IP address to send UDP data to:
//const char * udpAddress = "192.168.1.50";
const int udpPort = 8000;
//
//// Not connected yet!
//boolean connected = false;

int ledPin = 13;
int value = 0;

void setup(void) {

  Serial.begin(115200);

  pinMode(ledPin, OUTPUT);
  digitalWrite(ledPin, HIGH);

  Ethernet.setStackHeap(1024 * 64);
  Ethernet.setSocketSize(1024 * 16);
  Ethernet.setSocketNum(1);

  teensyMAC(mac);
  Ethernet.begin(mac);

  Serial.print("IP  address: ");
  Serial.println(Ethernet.localIP());
  Serial.send_now();
  Serial.println(HWStatus[Ethernet.hardwareStatus()]);
  Serial.send_now();

//  This sketch will run indefinitely with mDNS commented out, hangs then reboots when enabled.
  MDNS.begin("Teensy41", 1); //.local Domain name and number of services
  MDNS.setServiceName("Teensy41_OSC"); //Uncomment to change service name
  MDNS.addService("_osc._udp", udpPort); 

}

void loop(void) {
  // Count loop iterations, toggle LED and print value every 10 million loops as an alternative to blocking with delay()
  if((value % 10000000) == 0) {
    digitalWrite(ledPin, !digitalRead(ledPin));
    Serial.print("Loop:  ");
    Serial.println(value / 10000000);
    Serial.send_now();
    fnet_service_poll();  // Is this even necessary to call by the user? Ideally, the ethernet library should
                          // run a timer to automatically execute housekeeping tasks. Is this already done?
                          // It didn't seem to make a difference when the mDNS service was enabled.
    if (value >= 1000000000 ) value = 0;
  }

  value++;
}

The number of iterations before the hang varies, and occasionally it won't reboot at all, requiring manual intervention.

@playaspec
Author

Shortly after posting this I continued my due diligence in trying to track the problem down and may have found it. I was originally testing on a very busy university LAN. There are close to 1000 machines on this subnet, and over 50 different mDNS services being advertised, many with dozens of hosts per service. In an effort to better debug this, I attached the Teensy directly to the second Ethernet port on my workstation, fired up dnsmasq on that port (for DHCP and DNS I control), started Wireshark to see what was going on, and the problem went away!

I suspect that the heavy mDNS traffic eventually ran the Teensy out of memory trying to keep track of so many machines/services coming and going. I'm currently only announcing presence of my service, but may want to later add finding the client via mDNS.

Is there a way to tell the mDNS server to ignore all service types except the ones I'm interested in? Filtering/ignoring irrelevant services seems a necessity for memory constrained systems like microcontrollers. A method like subscribeService() or some such mechanism to only track services of interest is likely going to be a necessity for many situations.
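To make the request concrete, here is a rough sketch of the kind of filter meant above. It is purely illustrative: `isServiceOfInterest` and its helpers are hypothetical names, not part of NativeEthernet or FNET, and it assumes names arrive as plain DNS-SD strings under `.local`. It only shows the matching logic, not how it would hook into the mDNS server.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: lower-case a name, since DNS names compare case-insensitively.
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true if s ends with suffix.
static bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical filter: returns true if a DNS-SD name such as
// "Teensy41_OSC._osc._udp.local." belongs to one of the wanted service
// types (for example "_osc._udp"). Everything else would be ignored.
bool isServiceOfInterest(std::string name, const std::vector<std::string>& wantedTypes) {
    name = toLower(name);
    if (!name.empty() && name.back() == '.') name.pop_back();  // drop trailing root dot
    for (const std::string& type : wantedTypes) {
        const std::string full = toLower(type) + ".local";
        // Match either the bare service type or "<instance>.<type>.local".
        if (name == full || endsWith(name, "." + full)) return true;
    }
    return false;
}
```

With `{"_osc._udp"}` as the wanted list, an `_ipp._tcp` printer announcement would be rejected before anything is stored for it.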

@vjmuzik
Owner

vjmuzik commented Aug 16, 2021

The mDNS code does not allocate any more memory by itself no matter how many clients try to send to it since it does not store anything about said clients. That being said, if there is a ton of network traffic it may be running out of memory in the stack, there's no way around that if it's happening before it reaches the mDNS server. As far as the stack knows everything coming into the mDNS port is valid data that you want which it very well could be. You can easily check FNET's stack size every so often in your code, if there are any memory leaks in the stack you will see it start to dwindle.

Serial.printf("FNET_Free: %d  FNET_Max: %d \n", fnet_free_mem_status(), fnet_malloc_max());
Serial.send_now();
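If watching the raw numbers scroll by is too noisy, one option is to report only new minimums. The sketch below is a minimal, hardware-independent version of that idea; `LowWaterMark` is a hypothetical helper, not something FNET or NativeEthernet provides, and on the Teensy the reading passed to `update()` would presumably come from `fnet_free_mem_status()`.

```cpp
#include <cstddef>

// Hypothetical helper: remembers the smallest free-memory reading seen so far.
class LowWaterMark {
public:
    // Feed one reading; returns true when it is a new minimum
    // (the first reading always counts as one).
    bool update(std::size_t freeBytes) {
        if (!seen_ || freeBytes < min_) {
            min_ = freeBytes;
            seen_ = true;
            return true;
        }
        return false;
    }

    // Smallest reading so far (0 if update() was never called).
    std::size_t minimum() const { return min_; }

private:
    std::size_t min_ = 0;
    bool seen_ = false;
};

// Assumed usage inside loop() on the Teensy (not compiled here):
//   static LowWaterMark mark;
//   if (mark.update(fnet_free_mem_status())) {
//       Serial.printf("New FNET low: %u bytes free\n", (unsigned)mark.minimum());
//   }
```

A steadily falling minimum would point to a leak, while a single sharp dip right before the hang would point more toward a burst of traffic.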

@playaspec
Author

> You can easily check FNET's stack size every so often in your code, if there are any memory leaks in the stack you will see it start to dwindle.

Memory leaks didn't seem to be the issue as memory use seemed more or less stable, but moving the Teensy to an isolated network stopped the crashing. There's quite a bit of hash on the public network fuzzing away at the FNET stack, so maybe the problem is there. Once I get out from under this deadline, I'm going to dig into it deeper by capturing traffic until it crashes, then try drip feeding the capture back until I can isolate what's causing the crash. If it's in FNET, I'll follow up with a closing comment and take the issue over there. Thanks for getting back so quickly.

@natcl

natcl commented Feb 28, 2022

I have the exact same issue, however it's at home so the network is much smaller than a university's. Were you able to investigate this further?
Thanks!

@natcl

natcl commented Mar 2, 2022

Also I tested with the lines you suggested to check free memory and there doesn't seem to be a leak so it's probably blocking somewhere...

@playaspec
Author

I have not had a chance to look deeper into it, although it would be fairly easy to set up. What would be useful is a mechanism to detect the crash so I can stop Wireshark. Captures on this network get very big, very fast.
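One way this could work is a heartbeat timeout on the capture host: the sketch above already prints a "Loop:" line at regular intervals, so a host-side program reading the serial port could treat each line as a heartbeat and stop the capture when they stop arriving. The class below is only a sketch of that timeout logic; `HeartbeatMonitor` is a hypothetical name, and reading the serial port and stopping Wireshark are left out.

```cpp
#include <chrono>

// Hypothetical host-side helper: tracks when the last heartbeat arrived
// and reports when the device has been silent for longer than the timeout.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    HeartbeatMonitor(std::chrono::milliseconds timeout, Clock::time_point start)
        : timeout_(timeout), last_(start) {}

    // Call whenever a heartbeat (e.g. a "Loop:" line) is received.
    void beat(Clock::time_point now) { last_ = now; }

    // True once more than `timeout` has passed since the last heartbeat.
    bool expired(Clock::time_point now) const { return now - last_ > timeout_; }

private:
    std::chrono::milliseconds timeout_;
    Clock::time_point last_;
};
```

The timeout would need to be shorter than the time the Teensy takes to reboot, otherwise the gap goes unnoticed. A "Loop:" counter that restarts from 0 is a second signal that a reboot happened.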
