Campaign Status "Finished" Before All Emails Were Sent #1931
Comments
Hi @subhash-ngowda , I tried to replicate the above bug with |
@nayanthulkar28 We were only able to reproduce this when users are in multiple lists and are blocklisted in any of those lists. Take a look at the image attached below.
Understood! |
I couldn't find the case where a subscriber is Considering this usecase, here we can fix this query accordingly:
|
I think it happens when a subscriber clicks the "unsubscribe" link in an email and chooses "permanent block list", or it may happen when you manually blocklist someone. I think it used to happen when the system would automatically blocklist them for bounce/complaint, but I think at some point that changed to only blocklisting them and not unsubscribing them. |
I'd like to add to this and state that I ran into the same problem. 166 of 238 e-mails were sent out. The problem with the "unsubscribe" or "block list" theory is that this is a brand new installation with the very first campaign I've ever created. Our employees haven't even had the chance to unsubscribe yet because they've never received an e-mail like this from us in the past.
Hi folks, just to add to this issue: we were facing the same problem with our campaigns. After reading this issue we deleted all blocked users from our lists, and now our campaigns no longer stop midway. Today we were able to send 1.5 million e-mails again without problems, so the point about importing a lot of blocked users from CSV could be an important factor in this investigation.
Oops, closed this accidentally. Investigating this issue currently. |
Did quite a few tests on this, and it's trickier than I thought.
The max_subscriber_id is computed once when the campaign starts. That's the ID of the last subscriber expected to be processed when the campaign finishes. If there are 200k subscribers, assuming sequential IDs, last_subscriber_id = 0 and max_subscriber_id = 200k when the campaign starts. That's the entire range. With a batch size of 1000 (which is default), no matter how many blocklisted subscriptions exist between 0 and 200k, this query will always return non-blocklisted rows. Unless, max_subscriber_id changes after a campaign's creation, where there are subscribers beyond the max point. @subhash-ngowda In your case, were subscribers blocklisted/changed after the campaign's creation? |
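A minimal Go sketch of the batching described in the comment above, assuming a simplified in-memory model rather than listmonk's actual SQL. The `Subscriber` type and `nextBatch` function are hypothetical names introduced for illustration. It shows why, when every filter is applied before the batch limit, a run of blocklisted IDs inside the range cannot produce an empty batch while eligible subscribers remain below `maxID`.

```go
package main

import "fmt"

// Subscriber is a hypothetical, simplified stand-in for a subscriber row.
type Subscriber struct {
	ID          int
	Blocklisted bool
}

// nextBatch returns up to batchSize eligible IDs in the range (lastID, maxID].
// Blocklisted rows are skipped BEFORE the limit is applied.
// subs is assumed to be sorted by ID.
func nextBatch(subs []Subscriber, lastID, maxID, batchSize int) []int {
	out := make([]int, 0, batchSize)
	for _, s := range subs {
		if s.ID <= lastID || s.ID > maxID || s.Blocklisted {
			continue
		}
		out = append(out, s.ID)
		if len(out) == batchSize {
			break
		}
	}
	return out
}

func main() {
	// IDs 1..20, where IDs 6..15 are blocklisted.
	var subs []Subscriber
	for id := 1; id <= 20; id++ {
		subs = append(subs, Subscriber{ID: id, Blocklisted: id >= 6 && id <= 15})
	}

	lastID, maxID := 0, 20
	for {
		batch := nextBatch(subs, lastID, maxID, 5)
		if len(batch) == 0 {
			break // no eligible subscriber is left in (lastID, maxID]
		}
		fmt.Println(batch)
		lastID = batch[len(batch)-1] // advance the cursor
	}
}
```

With this model, the second batch jumps straight from ID 5 to IDs 16..20. An empty batch only occurs once nothing eligible is left in the range.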
@knadh No, the subscribers were not changed/blocklisted after the campaign's creation.
hmm, @subhash-ngowda then it is not possible that in the subIDs query, the conditions will ever return 0 rows before max_subscriber_id is reached, right? Referring to your observation here:
|
As mentioned in #1762, I've been able to mitigate the issue by increasing the batch size. As my mail server hasn't built up a reputation yet and isn't used for newsletters regularly, I decreased the batch size immediately after setting up listmonk a few months ago. Back then, it all worked fine when sending out a campaign to 500ish subscribers. Now I've got two new lists, both double opt-in, as opposed to last time, when I managed this outside of listmonk. Before I increased the batch size to 100, the campaign finished early at the last confirmed subscriber whenever there were no confirmed subscribers in the next batch. I remember seeing a finished campaign with last_subscriber_id at 1102. After manually increasing last_subscriber_id to 1119, it worked until finishing again at 1150. I increased last_subscriber_id to 1168, it did 1169, then it finished again. At that point I increased the batch size to 100, which made the campaign finish properly. EDIT: I deleted all blocklisted subscribers before these steps, so that shouldn't be an issue.
In my case, it happened when I imported the list and selected "blocklisted", which automatically marks them as unsubscribed as well.
Ah! Then I think your experience does match mine. #1762 (comment) Knadh commented here #1762 (comment) about how the process works. I just checked the list of 150k subscribers that failed and there's a large (4890) number of blocklisted+unsubscribed contacts grouped up right where the campaign ended prematurely. So the last subscriber who was sent an email was the one right before the beginning of the blocklisted group. My batch size at the time was the default 500. The list of 150k subscribers includes subscribers from 10 to 1,443,624. It definitely seems to be an issue of "if there are no valid/confirmed subscribers in the current/next batch, the campaign is marked finished". |
After a marathon debugging session with @vividvilla, it looks like we've nailed the issue. Thank you everyone who debugged and shared cues and clues on this thread! This happens when:
Working on fixing both of these. @subhash-ngowda, in this screenshot, you have a blocklisted user with a non-unsubscribed list. Can you recollect how you ended up in that state? It's the same thing @nayanthulkar28 had pointed out. There's a clue, or potentially another subtle bug, there as well. The only path I've found so far is bounces. Was that the case?
In my case, where blocklisted users have list subscriptions that are not unsubscribed: I have a local CSV list of contacts. I've imported some of them as One reason I do it this way is because I usually send out 2-3k emails per day (a new list & campaign each day), but my full list is 1.2+ million users. According to various guidelines, I need to gradually increase the number of emails I send per day, so I generally start with ~5k and then double that each day till I get to ~500k per day. I create a new list+campaign for each day. |
@knadh
Note: We have tested this on both double opt-in and single opt-in lists and the results are the same. Subscribers are imported as confirmed users. This is very similar to our real-world use case, where we run campaigns daily and also use other systems along with Listmonk. When we import our daily target segment into a new list, the import may contain records that duplicate data already present in the system. If the duplicate user is already in blocklisted status (due to the customer's unsubscription action), we end up with a user who is "blocklisted" but not unsubscribed in one of the lists they belong to.
Thanks @subhash-ngowda. I was able to reproduce the issue. Working on a fix. |
This has been a hair-pulling rabbit hole of an issue. #1931 and others.
When the `next-campaign-subscribers` query that fetches $N subscribers per batch for a campaign returns no results, the manager assumes that the campaign is done and marks it as finished. Marathon debugging revealed fundamental flaws in the query's logic that would incorrectly return 0 rows under certain conditions:
- The "layout" of subscribers, e.g. a series of blocklisted subscribers between confirmed subscribers, or a series of unconfirmed subscribers in a batch belonging to a double opt-in list.
- Bulk import blocklisting users, but not marking their subscriptions as 'unsubscribed'.
- Conditions spread across multiple CTEs resulted in returning an arbitrary number of rows rather than $N per batch, as the selected $N rows would get filtered out elsewhere, possibly even becoming 0.
After fixing this and testing it on our prod instance, which has 15 million subscribers and ~70 million subscriptions in the `subscriber_lists` table, we ended up discovering significant inefficiencies in Postgres query planning. When `subscriber_lists` and campaign list IDs are joined dynamically (CTE or ANY() or any kind of JOIN that involves a query), the Postgres query planner is unable to use the right indexes. After testing dozens of approaches, we discovered that when the values to join on are passed statically (hardcoded or passed via parametrized $1 vars), the query uses the right indexes. The difference is staggering. For the particular scenario on our large prod DB to pull a batch, ~15 seconds vs. ~50ms, a whopping 300x improvement!
This patch splits `next-campaign-subscribers` into two separate queries: one fetches campaign metadata and list_ids, whose values are then passed statically to the next query to fetch subscribers by batch. In addition, it fixes and refactors broken filtering and counting logic in the `create-campaign` and `next-campaign` queries.
Closes #1931, #1993, #1986.
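To illustrate the third flaw listed in the commit message, here is a small Go sketch under stated assumptions: it is an in-memory model, not the project's SQL, and `Row`, `limitThenFilter` and `filterThenLimit` are hypothetical names. It contrasts picking $N rows first and filtering them later (the flawed shape) with applying every condition before the limit.

```go
package main

import "fmt"

// Row is a hypothetical, simplified subscriber row, assumed sorted by ID.
type Row struct {
	ID          int
	Blocklisted bool
}

// limitThenFilter mimics the flawed shape: one stage picks the next n rows
// after lastID, and a later stage drops the blocklisted ones.
func limitThenFilter(rows []Row, lastID, n int) []int {
	var picked []Row
	for _, r := range rows {
		if r.ID > lastID {
			picked = append(picked, r)
			if len(picked) == n {
				break
			}
		}
	}
	var out []int
	for _, r := range picked {
		if !r.Blocklisted {
			out = append(out, r.ID)
		}
	}
	return out
}

// filterThenLimit applies every condition first and only then limits to n.
func filterThenLimit(rows []Row, lastID, n int) []int {
	var out []int
	for _, r := range rows {
		if r.ID > lastID && !r.Blocklisted {
			out = append(out, r.ID)
			if len(out) == n {
				break
			}
		}
	}
	return out
}

func main() {
	// IDs 1..12, where IDs 4..9 are blocklisted.
	var rows []Row
	for id := 1; id <= 12; id++ {
		rows = append(rows, Row{ID: id, Blocklisted: id >= 4 && id <= 9})
	}
	// The cursor sits at ID 3 and the batch size is 3.
	fmt.Println(limitThenFilter(rows, 3, 3)) // prints [] (IDs 4..6 picked, all dropped)
	fmt.Println(filterThenLimit(rows, 3, 3)) // prints [10 11 12]
}
```

In the first variant the empty result is indistinguishable from "no subscribers left", which is what would lead a manager to mark the campaign as finished.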
This has been fixed and has been tested in production on our large prod instance. The fix (bf26ec8) was way more complex than I imagined, and while working on it, I stumbled upon significant performance improvements in those queries (30x speed up on subscriber batch pulling on large installations). The changes are all documented in the commit message. This fix is on the |
Version:
Description of the bug:
We are encountering an issue where the campaign status is marked as "finished" before all emails have been sent.
Campaign Details
Analysis and Findings
Upon further investigation, we identified a potential bug in the query/code that might be causing this issue. Below are the details (Code Snippet from Pipe.go):
Campaign Scheduling Mechanism
The campaign scheduling mechanism uses `last_subscriber_id` and `max_subscriber_id` to process subscribers in batches. The `last_subscriber_id` acts as a pointer until `max_subscriber_id` is reached.
The `subIDs` query fetches distinct subscriber IDs and their subscription status from `subscriber_lists` where the list ID is among the lists from `campLists`. Additionally, it filters out subscribers with the status unsubscribed.
If all subscribers that meet the criteria in `subIDs` are `blocklisted`, the subs CTE will return zero records due to the condition that excludes blocklisted subscribers.
Observed Behaviour in Our Case
In our scenario, during the second iteration, all 200 records were blocklisted, resulting in zero records being returned.
Consequently, the `NextSubscribers` process returned false, causing the campaign status to be changed to "finished" without sending emails to the entire list.
Relevant Code in `Pipe.go`
Conclusion
The issue stems from the query filtering out blocklisted subscribers and returning zero records if all remaining subscribers in a batch are blocklisted. This causes the `NextSubscribers` process to return false prematurely, changing the campaign status to "finished".
Recommendations
To address this issue, we recommend modifying the query and/or logic to ensure that the campaign does not prematurely finish when encountering batches of `blocklisted` subscribers. This will ensure that emails are sent to all eligible subscribers in the list.
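One way the recommendation could be sketched in Go, purely as an illustration and not as the fix that was eventually shipped: keep the batch loop running until the cursor reaches `max_subscriber_id`, instead of stopping at the first empty batch. The `Sub` type and the `scanWindow` and `runCampaign` functions are hypothetical, and the in-memory slice stands in for the database.

```go
package main

import "fmt"

// Sub is a hypothetical, simplified subscriber row, assumed sorted by ID.
type Sub struct {
	ID          int
	Blocklisted bool
}

// scanWindow inspects the next `size` subscribers after lastID (up to maxID).
// It returns the eligible IDs and the highest ID it inspected, so the caller
// can advance the cursor even when nothing in the window was eligible.
func scanWindow(subs []Sub, lastID, maxID, size int) (eligible []int, scannedTo int) {
	scannedTo = lastID
	seen := 0
	for _, s := range subs {
		if s.ID <= lastID {
			continue
		}
		if s.ID > maxID || seen == size {
			break
		}
		seen++
		scannedTo = s.ID
		if !s.Blocklisted {
			eligible = append(eligible, s.ID)
		}
	}
	return eligible, scannedTo
}

// runCampaign treats an empty batch as "keep going" and only stops once the
// cursor reaches maxID or no further rows can be scanned.
func runCampaign(subs []Sub, maxID, size int) []int {
	var sent []int
	lastID := 0
	for lastID < maxID {
		ids, scannedTo := scanWindow(subs, lastID, maxID, size)
		if scannedTo == lastID {
			break // nothing left to scan at all
		}
		sent = append(sent, ids...) // may be empty for an all-blocklisted window
		lastID = scannedTo
	}
	return sent
}

func main() {
	// IDs 1..12, where IDs 4..9 are blocklisted; two whole windows are empty.
	var subs []Sub
	for id := 1; id <= 12; id++ {
		subs = append(subs, Sub{ID: id, Blocklisted: id >= 4 && id <= 9})
	}
	fmt.Println(runCampaign(subs, 12, 3)) // prints [1 2 3 10 11 12]
}
```

The design choice here is that "finished" is derived from the cursor position rather than from the size of the last batch, so windows containing only blocklisted subscribers are skipped rather than ending the campaign.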