
[BUG] ACME challenge fails #400

Open
@bejchi


Describe the bug
We recently updated our setup from a version more than 4 years old to the latest version and went through all the steps. Things are working fine on our staging server, but new certificates cannot be issued on our production server. The logs show that the ACME challenge fails, and we are quite sure this is because the generated server definition is wrong for the new domain.

The $is_https variable is set to false, even though the LETSENCRYPT_HOST .env variable is set for the project.

We can see that the Let's Encrypt challenge stanza is not being included in <my-path>/nginx-proxy-automation/data/conf.d/default.conf, and when printing out the variables in nginx.tmpl with

# Debug statements to log variable values
# https_method {{ $https_method }}

# cert {{ $cert }}

# cert exists  {{ exists (printf "/etc/nginx/certs/%s.crt" $cert) }}

# Key file exists  {{ exists (printf "/etc/nginx/certs/%s.key" $cert) }}

we get the following:

# Debug statements to log variable values
# https_method redirect
# cert <my-domain>
# cert exists  false
# Key file exists  false
server {
	server_name <my-domain>;
	listen 80 ;
	access_log /var/log/nginx/access.log vhost;
	include /etc/nginx/vhost.d/default;
	location / {
		proxy_pass http://<my-domain>;
	}
}

So for some reason the key and the cert are not in place, or the variables are not being assigned the correct values.
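
One way to double-check the two `exists` results outside the template is to look for the files directly. The sketch below is a hypothetical helper (the name `check_cert_files` is not part of nginx-proxy-automation). It assumes the certificate and key are named `<domain>.crt` and `<domain>.key`, as in the debug statements above.

```shell
#!/bin/sh
# check_cert_files DIR DOMAIN
# Hypothetical helper: reports whether DIR/DOMAIN.crt and DIR/DOMAIN.key exist,
# mirroring the two `exists` checks printed from nginx.tmpl.
# Returns 0 if both files are present, 1 if at least one is missing.
check_cert_files() {
    dir=$1
    domain=$2
    status=0
    for ext in crt key; do
        # -e follows symlinks, so a dangling symlink counts as missing
        if [ -e "$dir/$domain.$ext" ]; then
            echo "$domain.$ext: present"
        else
            echo "$domain.$ext: missing"
            status=1
        fi
    done
    return "$status"
}
```

Inside the proxy container it could be called as `check_cert_files /etc/nginx/certs <my-domain>`. A non-zero exit status means at least one of the two files is missing.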

The main difference between the two setups is that we accidentally did not clone the submodules on prod before running the fresh-start.sh script. When we found out, we got the submodules and ran the script again.
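
Since the submodules were missing on the first run, it may be worth confirming that all of them are initialized now. `git submodule status` prefixes a line with `-` when that submodule is not initialized. The helper below (a hypothetical name, not something shipped with the project) filters for those lines.

```shell
#!/bin/sh
# list_uninitialized_submodules
# Hypothetical helper: reads `git submodule status` output on stdin and prints
# the path of every submodule whose status line starts with "-", which is how
# git marks a submodule that has not been initialized.
list_uninitialized_submodules() {
    awk '/^-/ { print $2 }'
}
```

Run from the nginx-proxy-automation checkout, `git submodule status --recursive | list_uninitialized_submodules` should print nothing once every submodule has been fetched.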

We are considering starting over again and following the exact steps we did on staging, but it would be really nice if we could figure out why we get the current behavior.

We also get the error when just spinning up an empty container like this:

docker run -d -e VIRTUAL_HOST=ssl-test.<my-domain> \
            -e LETSENCRYPT_HOST=ssl-test.<my-domain> \
            -e LETSENCRYPT_EMAIL=<my-email> \
            --network=webproxy \
            --name my_app \
            httpd:alpine

To Reproduce
Steps to reproduce the behavior:

  1. Clone https://github.com/evertramos/nginx-proxy-automation without the submodules
  2. Stop all containers, run compose down for the old nginx-proxy project, and remove the old network
  3. Run the fresh-start.sh script
  4. Whoops: no certificates are working
  5. Run git submodule update --init --recursive && git fetch
  6. Run the fresh-start.sh script again
  7. The existing sites do not come back immediately, so run compose down and compose up -d in all project folders
  8. Existing sites now get their certificates, but new ones do not

Server info (please complete the following information):

  • Linux release: Ubuntu 20.04.6 LTS
  • Server type: Standalone server
  • Docker version 24.0.7, build 24.0.7-0ubuntu2~20.04.1
  • Docker Compose version v2.4.1

Logs:

  • basescript.log (send only last execution log with an error) - This log is in ./bin folder: no errors
  • nginx container logs: no errors
  • letsencrypt/acme container logs (if related to ssl):
[Thu Feb 20 08:01:51 UTC 2025] mydomain:Verify error:<myIP>: Invalid response from http://mydomain/.well-known/acme-challenge/<token>:: 503
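
The failing request can also be reproduced by hand to see which status code nginx returns for the challenge path. This is a minimal sketch: both function names are hypothetical, and the token can be any made-up string, since only the status code is of interest.

```shell
#!/bin/sh
# acme_challenge_url DOMAIN TOKEN
# Hypothetical helper: builds the HTTP-01 challenge URL in the same form as
# the one shown in the acme container log.
acme_challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# probe_acme_challenge DOMAIN TOKEN
# Prints only the HTTP status code returned for the challenge URL.
probe_acme_challenge() {
    curl -s -o /dev/null -w '%{http_code}\n' --max-time 10 \
        "$(acme_challenge_url "$1" "$2")"
}
```

For example, `probe_acme_challenge ssl-test.<my-domain> test-token` printing 503 would match the error in the log, whereas a different code for a made-up token would suggest the challenge path is at least being handled differently from the rest of the site.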
