Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Notifo stops processing events? #225

Open
benjyiw opened this issue Mar 19, 2024 · 6 comments
Open

Notifo stops processing events? #225

benjyiw opened this issue Mar 19, 2024 · 6 comments

Comments

@benjyiw
Copy link

benjyiw commented Mar 19, 2024

Hi again,

I have ran into a bug and unfortunately I don’t have many details to provide. I figure I would just open the issue now to see if that sparked any ideas from your end. I intend to provide more information if/when it happens again, so that I can drill down and provide more logs and perhaps poke around in the DB.

At some point in the past 90 days, Notifo seemingly stopped processing events. I am running Notifo in k8s, so my first hail mary attempt at a resolution was to kill the Notifo pod, and unsatisfyingly that fixed it. This happened for two separate environments, so this is a real bug, but unfortunately I have nothing that I think would be very helpful to provide as I can’t cause the bug to reoccur in order to get more information. I'll provide what I do have below.


Prior to restart of the pod the only logs I saw were simple HTTP access logs.

During pod shutdown, Notifo logged the following errors, I don't know if this is actually problematic:

2024-03-19 11:05:32	
{"logLevel":"Error","stoppedSystemFailed":"Messaging.Consumer(user-events)","timestamp":"2024-03-19T17:05:32Z","severity":"ERROR","exception":{"type":"System.NullReferenceException","message":"Object reference not set to an instance of an object.","stackTrace":"   at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)\n   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicCancel(String consumerTag)\n   at Squidex.Messaging.RabbitMq.RabbitMqSubscription.DisposeAsync()\n   at Squidex.Messaging.Implementation.DelegatingConsumer.StopAsync(CancellationToken ct)\n   at Squidex.Hosting.BackgroundHost.StopAsync(CancellationToken cancellationToken)"}}

2024-03-19 11:05:32	
{"logLevel":"Error","stoppedSystemFailed":"Messaging.Consumer(events)","timestamp":"2024-03-19T17:05:32Z","severity":"ERROR","exception":{"type":"System.NullReferenceException","message":"Object reference not set to an instance of an object.","stackTrace":"   at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)\n   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicCancel(String consumerTag)\n   at Squidex.Messaging.RabbitMq.RabbitMqSubscription.DisposeAsync()\n   at Squidex.Messaging.Implementation.DelegatingConsumer.StopAsync(CancellationToken ct)\n   at Squidex.Hosting.BackgroundHost.StopAsync(CancellationToken cancellationToken)"}}

2024-03-19 11:05:32	
{"logLevel":"Error","stoppedSystemFailed":"Messaging.Consumer(confirms)","timestamp":"2024-03-19T17:05:32Z","severity":"ERROR","exception":{"type":"System.NullReferenceException","message":"Object reference not set to an instance of an object.","stackTrace":"   at RabbitMQ.Client.Impl.ModelBase.BasicCancel(String consumerTag)\n   at RabbitMQ.Client.Impl.AutorecoveringModel.BasicCancel(String consumerTag)\n   at Squidex.Messaging.RabbitMq.RabbitMqSubscription.DisposeAsync()\n   at Squidex.Messaging.Implementation.DelegatingConsumer.StopAsync(CancellationToken ct)\n   at Squidex.Hosting.BackgroundHost.StopAsync(CancellationToken cancellationToken)"}}

After I restarted the application, I saw dozens of logs around new events being processed, here's one example:

{"logLevel":"Information","message":"Received event for app 026fa665-d935-4b55-a99b-edd47647039a with ID 84e0ae4b-7c86-4991-adf8-57860c858fa7 to topic users/9739ce9f-bc50-4c2b-abeb-1ace7da20ab7.","appId":"026fa665-d935-4b55-a99b-edd47647039a","id":"84e0ae4b-7c86-4991-adf8-57860c858fa7","topic":"users/9739ce9f-bc50-4c2b-abeb-1ace7da20ab7","timestamp":"2024-03-19T17:06:03Z","severity":"INFO","category":"Notifo.Domain.UserEvents.Pipeline.UserEventPublisher"}
@SebastianStehle
Copy link
Collaborator

So you are using RabbitMQ as a backbone for events?

@benjyiw
Copy link
Author

benjyiw commented Mar 19, 2024

So you are using RabbitMQ as a backbone for events?

Yep, that's what we are currently using.

@SebastianStehle
Copy link
Collaborator

I had a look to the code and I think right here a log statement could help: https://github.com/Squidex/libs/blob/main/messaging/Squidex.Messaging.RabbitMq/RabbitMqSubscription.cs#L51

I will provide that asap. Perhaps it helps. We could also think about adding a method to get the number of pending messages and add that to the health check endpoint.,

@SebastianStehle
Copy link
Collaborator

I have pushed a new version with the improved logging.

@benjyiw
Copy link
Author

benjyiw commented Mar 22, 2024

Thank you! I found the squidex/notifo-dev image and saw that it has a new tag on it. I'll try that out and update this issue if I learn anything more.

@benjyiw
Copy link
Author

benjyiw commented Mar 22, 2024

I was curious, so I checked the most recent commits, and I don't actually see the logging changes to the RabbitMQ messaging implementation. Am I missing something?

Ah, never mind, I see those changes in the lib repo. I had assumed that the link before was in this repo and didn't catch that it was actually elsewhere.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants