Campaign is marked finished though email is not sent to all subscribers #1762

Closed
jackraj97 opened this issue Mar 1, 2024 · 21 comments
Labels
needs-investigation Potential bug. Needs investigation

Comments

@jackraj97

Version:

  • listmonk: v3.0.0

Description of the bug and steps to reproduce:
Most of my campaigns are sent to only a small fraction of the subscribers and are then marked finished. I'm also unable to resend the campaign.

  1. Can someone please help check why this happens?
  2. Shouldn't listmonk retry the failed subscribers (in case of errors) instead of simply marking the campaign as finished?
  3. Also, why is the campaign send button disabled, when the campaign is not fully sent?

I have attached screenshots of the campaign list page and the performance settings.
I'm using Brevo SMTP on port 587 with the LOGIN auth protocol.

Screenshots:
[Two screenshots: campaign list page and performance settings]

@MaximilianKohler
Contributor

What do your logs show?

@jackraj97
Author

Logs are always empty.

@knadh knadh added the needs-investigation Potential bug. Needs investigation label Mar 5, 2024
@knadh
Owner

knadh commented Mar 14, 2024

listmonk retries e-mails N times as configured in SMTP settings. The lower count indicates that there were errors in sending (despite retries). However, the errors should definitely be logged. You should set an error threshold so that the campaigns are paused on errors and check the error log immediately. Changing settings restarts listmonk and wipes the logs.

@rjocoleman

I'm seeing this too, FWIW. My listmonk logs simply show EOF per subscriber, and the SMTP server seems to know nothing about it. I'll try to reproduce it locally when I get a bit of time next week and see what's up.

@knadh
Owner

knadh commented Mar 15, 2024

Ah, EOF indicates a broken network connection with the SMTP server.

@MaximilianKohler
Contributor

MaximilianKohler commented Mar 25, 2024

Well my issue of getting lots of errors #1717 (comment) progressed to the campaign completing/failing with no errors. This was sent to 150k subscribers at 10x10 rate:

2024/03/25 09:00:03 start processing campaign (Campaign name 2024-03-25 150k)
2024/03/25 09:00:05 campaign (Campaign name 2024-03-25 150k) finished

Screenshot 2024-03-25 140950

A verbose log is essential in this situation, as I have no clue who was sent the email, so the only option I have is to send it again to everyone, and people don't like being spammed with the same email.

@MaximilianKohler
Contributor

MaximilianKohler commented Mar 26, 2024

In case it's helpful, I found this open-source app that works like a verbose log, listing all the emails sent by SES:

SES Dashboard https://sesdashboard.com/ - https://github.com/Nikeev/sesdashboard

It would be greatly preferred to have that built into listmonk though.

Supposedly it can be done in CloudWatch, but I haven't been able to figure out a way to get it to list the emails; it only lists the number of emails.

@MaximilianKohler
Contributor

MaximilianKohler commented Mar 26, 2024

Well I figured out a way to kind of get the list of 49 people it was sent to. Please let me know if there's a better way of doing this.

#686 gave me the idea of searching the campaign_views table. So I checked the campaign ID of the failed campaign (525) and did:

psql -U listmonk -h localhost -p 5432 listmonk
\dt
SELECT * from campaign_views where campaign_id=525;

It outputs subscriber IDs, so to get the emails you may be able to modify this command: #1629 (comment)

Or maybe these commands can be modified to do something similar #1562 (comment)

But I'm not sure how exactly for either one.

It would be better to use a command that directly saves it to a file. This might work https://stackoverflow.com/questions/5331320/psql-save-results-of-command-to-a-file.
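
A minimal sketch of that idea, in case it helps. It assumes the subscribers table has id and email columns and that campaign_views has campaign_id and subscriber_id columns (check with \d in psql first). Note that campaign_views records tracked views, so it likely lists people who opened the email rather than everyone it was delivered to. The sketch runs against an in-memory SQLite stand-in with made-up rows so that it works anywhere; the function names are illustrative.

```python
import csv
import io
import sqlite3

# In-memory stand-in for the listmonk database. The column names
# (subscribers.id, subscribers.email, campaign_views.campaign_id,
# campaign_views.subscriber_id) are assumptions; verify them in psql.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscribers (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE campaign_views (campaign_id INTEGER, subscriber_id INTEGER);
INSERT INTO subscribers VALUES
    (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO campaign_views VALUES (525, 1), (525, 3), (525, 3), (524, 2);
""")


def viewer_emails(conn, campaign_id):
    """Return the distinct emails of subscribers with a recorded view."""
    rows = conn.execute(
        """
        SELECT DISTINCT s.email
        FROM campaign_views v
        JOIN subscribers s ON s.id = v.subscriber_id
        WHERE v.campaign_id = ?
        ORDER BY s.email
        """,
        (campaign_id,),
    )
    return [email for (email,) in rows]


def to_csv(emails):
    """Serialize the emails as a one-column CSV string with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["email"])
    writer.writerows([e] for e in emails)
    return buf.getvalue()


emails = viewer_emails(conn, 525)
csv_text = to_csv(emails)
```

Against the real PostgreSQL database, the same JOIN could be wrapped in psql's \copy (SELECT ...) TO 'viewers.csv' CSV HEADER to save the result straight to a file; the file name here is just an example.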

@stephdin

stephdin commented Mar 27, 2024

Hey everyone, I ran into the same issue, fortunately while testing my setup locally.

Version v3.0.0 (f9120d9 2024-02-04T11:20:27Z, linux/amd64)

I am running a test setup with Mailpit (https://mailpit.axllent.org/) as an SMTP server.
Performance configuration:
Concurrency: 1
Message rate: 1
Batch size: 1000
Maximum error threshold: 25
Sliding window limit: 300 Messages/hour

While testing there were no errors shown in the log. I think there is another bug with the sliding window limit, which might be related to this issue.

Steps to reproduce:

  1. Set a sliding window limit
  2. Start sending your campaign
  3. Pause the campaign
  4. BUG: The log shows "pipe.go:122: messages exceeded (300) for the window (1h0m0s since 27 Mar 24 15:30 +0000). Sleeping for 59m12s." even though the limit was not reached
  5. Disable the limit -> listmonk restarts
  6. Unpause the campaign
  7. The campaign is immediately finished, in my case "Sent: 186 / 345 "

screenshot

If the sliding window limit is disabled when the campaign starts, I can pause and unpause the campaign without issues, and the logs show "start processing campaign" and "stop processing campaign". Notice the second line in the logs: this was the moment I paused the campaign, and no "stop processing campaign" was logged. At 15:32:20 I disabled the sliding window. A warning in the UI said that I should pause my campaigns. The campaign overview showed it as paused, but maybe listmonk did not pause it internally?
Notice how at 15:32:46 the campaign jumped to finished in an instant.

Here are the logs:

listmonk_app  | 2024/03/27 15:31:15 manager.go:409: start processing campaign (Copy of Copy of Testkampagne)
listmonk_app  | 2024/03/27 15:31:38 pipe.go:122: messages exceeded (300) for the window (1h0m0s since 27 Mar 24 15:30 +0000). Sleeping for 59m12s.
listmonk_app  | 2024/03/27 15:32:20 init.go:843: reloading on signal ...
listmonk_app  | 2024/03/27 15:32:20 init.go:796: HTTP server shut down
listmonk_app  | 2024/03/27 15:32:21 main.go:102: v3.0.0 (f9120d9 2024-02-04T11:20:27Z, linux/amd64)
listmonk_app  | 2024/03/27 15:32:21 init.go:150: reading config: config.toml
listmonk_app  | 2024/03/27 15:32:21 init.go:289: connecting to db: listmonk_db:5432/listmonk
listmonk_app  | 2024/03/27 15:32:21 init.go:618: media upload provider: filesystem
listmonk_app  | 2024/03/27 15:32:21 init.go:541: loaded email (SMTP) messenger: username@mailpit
listmonk_app  | ⇨ http server started on [::]:9000
listmonk_app  | 2024/03/27 15:32:46 manager.go:409: start processing campaign (Copy of Copy of Testkampagne)
listmonk_app  | 2024/03/27 15:32:46 pipe.go:217: campaign (Copy of Copy of Testkampagne) finished

I hope this helps in finding the issue. I am quite satisfied with listmonk in general, but I am worried that my campaigns will stop randomly and that I will need to send mail campaigns twice.

Let me know if I should open another issue for the sliding window limit warning when pausing campaigns.

Thanks!

@MaximilianKohler
Contributor

Hey @knadh good news! I think I figured out the problem. In my report a couple comments up #1762 (comment) I was sending out a campaign to 150k people.

  • I discovered that the campaigns are sent out in the reverse order of the list /admin/subscribers/lists/423. So you go to the last page (7500), and those subscribers are emailed first.
  • I clicked through a few of those pages and noticed there were a few pages with only blocklisted emails.
  • I looked in /admin/settings -> performance -> batch size, and saw the number was 500.

I'm pretty sure the issue occurs when all the subscribers in the batch are blocklisted.
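
One way to picture this hypothesis, as a sketch only: if a batch were sliced by ID first and the blocklisted subscribers filtered out afterwards, a run of blocklisted IDs as long as the batch size would yield an empty batch, and a sender that treats an empty batch as "nothing left to send" would stop early. The function names, the stop condition, and the tiny batch size are all illustrative assumptions, not listmonk's actual code.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    id: int
    blocklisted: bool


def slice_then_filter(subs, last_id, batch_size):
    """Hypothetical batching: take the next IDs, then drop blocklisted ones."""
    window = [s for s in subs if s.id > last_id][:batch_size]
    sendable = [s for s in window if not s.blocklisted]
    new_last_id = window[-1].id if window else last_id
    return sendable, new_last_id


def run_campaign(subs, batch_size):
    """Send until a batch comes back empty (the assumed stop condition)."""
    sent, last_id = 0, 0
    while True:
        batch, last_id = slice_then_filter(subs, last_id, batch_size)
        if not batch:
            return sent
        sent += len(batch)


# IDs 1-4 are active, 5-8 are blocklisted, 9-12 are active; batch size 4.
subs = [Subscriber(i, blocklisted=5 <= i <= 8) for i in range(1, 13)]
sendable_total = sum(not s.blocklisted for s in subs)
sent = run_campaign(subs, batch_size=4)
```

In this toy run the second batch contains only blocklisted subscribers, so the loop stops after the first batch even though active subscribers remain after ID 8.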

@jackraj97
Author

@knadh I finally got a chance to save the logs. Below is the error I get when a campaign runs.
At the end only 50% of the subscribers receive the emails.

Can you please let me know how I can fix this issue?

2024/06/06 09:16:55 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2376: timed out waiting for free conn in pool
2024/06/06 09:17:06 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2382: timed out waiting for free conn in pool
2024/06/06 09:17:17 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2383: timed out waiting for free conn in pool
2024/06/06 09:17:27 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2393: timed out waiting for free conn in pool
2024/06/06 09:17:37 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2397: timed out waiting for free conn in pool
2024/06/06 09:17:48 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2401: timed out waiting for free conn in pool
2024/06/06 09:17:58 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2405: timed out waiting for free conn in pool
2024/06/06 09:18:08 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2408: timed out waiting for free conn in pool
2024/06/06 09:18:20 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2412: timed out waiting for free conn in pool
2024/06/06 09:18:30 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2413: timed out waiting for free conn in pool
2024/06/06 09:18:40 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2416: timed out waiting for free conn in pool
2024/06/06 09:18:40 pipe.go:217: campaign (WordPress Collection #73 - How to Use MailChimp with WordPress and more) finished

performance configuration:
image

@MaximilianKohler
Contributor

@jackraj97 did you look through the other issues that cover that error? https://github.com/knadh/listmonk/issues?q=is%3Aissue+timed+out+waiting+for+free+conn+in+pool

@subhash-ngowda

Hi @knadh, we are also facing the same issue as @MaximilianKohler. The list had 200K subscribers and the campaign was marked as finished after sending to just 37 subscribers. We observed that the list has many blocklisted users and that the email IDs exist in multiple lists. We could not find anything in the logs.

  1. Is there a way to turn on detailed logging?
  2. We observed that the total campaign count displayed in the UI includes the blocklisted users in the list and exactly matches the list size (it should have excluded the blocklisted users).
  3. How do we extract the remaining unsent recipients so that we can re-target the campaign?
    listmonk_error
    listmonk_error2

@mitexleo

Any solution to this problem? For my client email is not being sent to anyone.

@Cyrix126

Cyrix126 commented Sep 5, 2024

@mitexleo wrote: "Any solution to this problem? For my client email is not being sent to anyone."

@MaximilianKohler is right

I have this issue but without any blocked users.

@Cyrix126

Cyrix126 commented Sep 6, 2024

As a note, if the campaign is "finished" and you set the "status" column of its row in the "campaigns" table to "paused", you can continue the campaign, but it will stop again, sometimes after sending some emails and sometimes immediately.

@rkcreation

Any updates on how to solve this?
My campaign was marked as finished after sending 596 / 1500 emails (SES rate limiting). I want to re-send the emails.

@ohaeusler

As a workaround, drastically increasing the batch size worked for me. I have only very small lists (<100 subscribers), so I do not know the consequences for larger lists.

For other updates regarding this issue, you can refer to #1931

@knadh
Owner

knadh commented Sep 10, 2024

This is being actively tracked and investigated here: #1931 - I'll close this thread so that we can consolidate the discussions in one place.

@ohaeusler wrote: "As a workaround, drastically increasing the batch size worked for me. I have only very small lists (<100 subscribers), so I do not know the consequences for larger lists."

This seems to be a clue. I still have not been able to reproduce this (please check the thread on #1931)

@knadh knadh closed this as completed Sep 10, 2024
@MaximilianKohler
Contributor

@knadh wrote: "This seems to be a clue. I still have not been able to reproduce this"

You couldn't reproduce the blocklist issue I described? #1762 (comment)

However, I think the batch size defaults to 500, and since @ohaeusler is sending to <100 at a time, it couldn't be the same blocklist issue.

@knadh
Owner

knadh commented Sep 10, 2024

Hi @MaximilianKohler. I couldn't. Please see #1931 (comment)

The subscribers are always ordered in the ascending order of their ID when batching. The condition to pull the batch is > last_subscriber_id (which is 0 to begin with and is updated with the last ID in each batch when it's done) and then < max_subscriber_id, which is the ID of the last subscriber in all the batches to be processed, so the ID of 150k-th subscriber. The batch query cannot return 0 items because it's essentially doing: Fetch all subscribers between ID 0 and ID 150k and not unsubscribed, and then from the results, slice and get $batch number.
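
The batching described above can be sketched roughly as follows. This is a simplified in-memory illustration, not the actual SQL query: the names are hypothetical, and the upper bound is treated as inclusive so that the last subscriber is covered, which is an assumption on top of the description.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    id: int
    unsubscribed: bool


def next_batch(subs, last_id, max_id, batch_size):
    """Filter eligible subscribers between last_id and max_id, then slice.

    The inclusive upper bound (<= max_id) is an assumption.
    """
    eligible = sorted(
        (s for s in subs if last_id < s.id <= max_id and not s.unsubscribed),
        key=lambda s: s.id,
    )
    return eligible[:batch_size]


# IDs 1-4 are active, 5-8 are unsubscribed, 9-12 are active; batch size 4.
subs = [Subscriber(i, unsubscribed=5 <= i <= 8) for i in range(1, 13)]
max_id = subs[-1].id

last_id, batches = 0, []
while True:
    batch = next_batch(subs, last_id, max_id, batch_size=4)
    if not batch:
        break  # empty only when no eligible subscriber remains
    batches.append([s.id for s in batch])
    last_id = batch[-1].id  # advance to the last ID actually processed
```

Because the filter runs before the slice, the run of unsubscribed IDs 5 to 8 is skipped and the second batch starts at ID 9; a batch is empty only when nothing eligible is left.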


10 participants