Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

RemoteServiceException when using foreground service on Android API 26 #127

Closed
mars-lan opened this issue Apr 27, 2021 · 48 comments
Closed

Comments

@mars-lan
Copy link
Contributor

Start seeing the following crash when rolling out our new version that uses foreground service

Fatal Exception: android.app.RemoteServiceException
Context.startForegroundService() did not then call Service.startForeground()

android.app.ActivityThread$H.handleMessage (ActivityThread.java:1881)
android.os.Handler.dispatchMessage (Handler.java:105)
android.os.Looper.loop (Looper.java:164)
android.app.ActivityThread.main (ActivityThread.java:6938)
java.lang.reflect.Method.invoke (Method.java)
com.android.internal.os.Zygote$MethodAndArgsCaller.run (Zygote.java:327)
com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1374)

So far the crash is limited to only Android 8.0 devices, but not sure if it's an artifact of the small rollout percentage or an actual version-specific issue.

Based on https://developer.android.com/about/versions/oreo/android-8.0-changes & https://developer.android.com/reference/android/app/Service#startForeground(int,%20android.app.Notification) it seems to suggest that startForeground wasn't called within 5 seconds of foreground service creation. Without any visibility into the code base, it's difficult to debug the issue further.

@mars-lan
Copy link
Contributor Author

mars-lan commented Apr 27, 2021

Can confirm that this is an Android 8 (API 26) specific issue. Was able to reproduce the crash on the simulator by calling notifee.displayNotification with android.asForegroundService set to true. The same code works on API 24-25, 27-30 (foreground notifications are not shown in API 22-25 but at least it didn't crash)

@mars-lan mars-lan changed the title RemoteServiceException when using foreground service RemoteServiceException when using foreground service on Android API 26 Apr 27, 2021
@mikehardy
Copy link
Collaborator

Strange - I wonder if this is at all related to something happening with disk access (thus causing it to take longer than 5 seconds) as a proximate cause, using StrictMode to clean up things in general may be a useful pass in the codebase. I'm trying to understand (in my mental model) how a 5s violation could be API specific though, it doesn't really fit my model as I understand things. Odd.

@mars-lan
Copy link
Contributor Author

Yeah I'm not so sure about the 5s thingy either after narrowing it down to only API 26. Since it's so readily reproducible hopefully we'll hear from the notifee team soon.

@helenaford
Copy link
Member

@mars-lan thanks for the info and sorry you've experienced this. I had a quick look, nothing stands out what could be causing the issue. I'll run it locally with the emulator and get back to you with more details. At the very least, I can put something in place to stop the crashing (i.e. disable foreground service for api 26) like you said. Hopefully, we can resolve this though, and get the other versions working too.

@helenaford
Copy link
Member

@mars-lan I'm struggling to recreate this issue with API 26, with the following emulator:
Screenshot 2021-04-27 at 17 08 40

Also tried with API 23:
Screenshot 2021-04-27 at 17 14 37

with notification:

   await notifee.displayNotification({
      title: "Notification Title",
      body: "Main body content of the notification",
      android: {
        channelId,
        asForegroundService: true,
      },
    });

Could you can paste some example code or show what payload you are passing to displayNotification please? Or, list out steps you are doing? Are you calling displayNotification in the foreground? Did you reproduce this in debug mode?

@mikehardy
Copy link
Collaborator

I wonder if it is timing dependent (the 5 second limit), and if the notification does some super slow network fetch or something it would trigger it? Is that possible - for the notification in any way to trigger some slow process that might take more than 5 seconds? Or maybe just for testing hack right into the code a delay to see if it reproduces the stack

@mikehardy
Copy link
Collaborator

mikehardy commented Apr 27, 2021

Main reason I am paying a lot of attention to this is that StrictMode and 5-seconds-to-start-foreground etc - if not pursued to a conclusive end that usually results in starting a worker thread and hiving slow things off into it - will result in flaky behavior forever, and future reports that are hard to reproduce / make working with the library touchy. I've seen that happen in the other lib (react-native-background-geolocation-android) I use that does foreground service (which could crash due to timing) and/or could ANR on startup until it was thoroughly scrubbed

@helenaford
Copy link
Member

yeah that's what i was thinking it could be, like it's downloading an image and takes a while to start the foreground service 🤔 And, maybe it happens to be more common on older API versions. So the bug doesn't necessarily relate to the api version.

@helenaford
Copy link
Member

@mikehardy just had a look at the library you mentioned, same issue was reported here transistorsoft/react-native-background-geolocation#976. I'm not sure though if there's something we can do in notifee to help.

@helenaford
Copy link
Member

and here transistorsoft/react-native-background-geolocation#810 (comment). very similar logs @mars-lan. is this something you could try enabling, StrictMode to help debug? it could be something blocking the main thread?

@mikehardy
Copy link
Collaborator

My guess it will end up being an "all of the above" approach - there are likely StrictMode violations in notifee and in @mars-lan app, and they all add up to a 5s failure

I'm saying that because I'd bet 1% of apps or less are StrictMode clean, it's a safe bet :)

@mars-lan
Copy link
Contributor Author

@helenaford Whoa! Found the root cause! If I set largeIcon to a slightly large local file (256x256 25KB PNG), it'll lead to a crash. Resizing the image down to 128x128 fixed the problem. I suspect Android 8 has a slow/buggy PNG decoding that leads to the eventual 5s timeout. Here's the canonical code to reproduce it.

import React from 'react';
import { AppRegistry } from 'react-native';
import notifee from '@notifee/react-native';

notifee.registerForegroundService(async () => {
  return new Promise(() => {});
});

notifee.createChannel({
  id: 'channel',
  name: 'channel',
});

function Test() {
  React.useEffect(() => {
    notifee.displayNotification({
      title: 'Title',
      body: 'Body',
      android: {
        channelId: 'channel',
        largeIcon: require('./image.png'),
        asForegroundService: true,
      },
    });
  }, []);
  return null;
}

AppRegistry.registerComponent('Test', () => Test);

@helenaford
Copy link
Member

ah nice. OK, so would putting this in the docs to say be cautious about image sizes be a good fix for this?

@mars-lan
Copy link
Contributor Author

I think so. Maybe note that this seems to be an Android 8-specific issue. Another option is to move the image to Android drawable resource, which doesn't seem to suffer the same performance issue as local files.

@helenaford
Copy link
Member

helenaford commented Apr 27, 2021

sounds good. thanks for bringing this issue to our attention and finding the bug. Would this be the same reason for the other api versions 22 - 25? If not, we can reopen invertase/react-native-notifee#291

@mikehardy
Copy link
Collaborator

Maybe put the file decode into bitmap (I think Notifee is handling the file fetch and decode?) into a background thread since it is disk access and update the notification with the bitmap once it's decoded? Is that possible? All the StrictMode violations and timeout items end up with solutions like that

@mars-lan
Copy link
Contributor Author

mars-lan commented Apr 28, 2021

@helenaford I can confirm that by using an image with smaller dimension (128x128) or Android drawables directly for largeIcon, I was able to get the notification shown in all API versions (22-28). I did discover another somewhat-related strange bug (invertase/react-native-notifee#292) with notifee 1.5.0 though.

@mars-lan
Copy link
Contributor Author

@helenaford after rolling out an update that uses Android drawable for largeIcon, we still observed a handful of crashes on Android 8 in production. Looks like we may need to implement a proper fix as per @mikehardy's suggestion to avoid the 5s timeout.

@mikehardy
Copy link
Collaborator

I need a reaction for that where it's clear I'm unhappy but not at all at you personally. 😩 - thanks for the report

@mars-lan
Copy link
Contributor Author

mars-lan commented May 5, 2021

FWIW also started seeing a few crashs on Android 10 (API 29). Definitely less common than API 26 though.

@mars-lan
Copy link
Contributor Author

mars-lan commented May 6, 2021

@helenaford I assume notifee does the following when displayNotification is called

  1. Call startForegroundService
  2. Build Notification
  3. Call startForeground with the notification

How about simply swapping 1 & 2 to minimze the time between startForegroundService & startForeground?

@helenaford
Copy link
Member

Hey @mars-lan , I'm looking into if we can fix the image loading, not sure how it's blocking as it's done with tasks but on my TODO is to check it in 'strict' mode.

Unfortunately, it already does that, builds the notification, then does startForegroundService, and startForeground in that order. It was one of the first things I checked 😢 I think it's about the main thread being blocked somehow with the image decoding, even after the notification is built. We pass the built notification to startForegroundService.

@mikehardy
Copy link
Collaborator

Sounds like it is actually doing 2 then 1 and 3 per the comment above descriptors, vs 1 3 then 2. Is it even possible to startForegroundService before you have the notification built? sounds like maybe not

@mars-lan
Copy link
Contributor Author

mars-lan commented May 7, 2021

@mikehardy 3 takes the notification as one of the input so 1-3-2 isn't really an option here.

Thanks for the explanation, @helenaford. Hopefully strict mode will reveal more. At least it's consistently reproducible on Andorid 8 emulator so hopefully it makes debugging easier.

A somewhat hacky solution is to use startForegound instead if displayNotification is called in the foreground, which I assume is the majority of the cases. This way it won't be subjected to the 5s ANR.

@mars-lan
Copy link
Contributor Author

@helenaford Based on https://stackoverflow.com/questions/44425584/context-startforegroundservice-did-not-then-call-service-startforeground & https://issuetracker.google.com/issues/76112072, this seemss like a common issue for Android. In fact, there's simply no way to guarantee that the system will even start the foreground service within 5s, regardless how little work there is beteween startForegroundService & startForeground. This is consistent with what we observed in the wild even after dropping largeIcon altogether.

Looks like the only foolproof way to avoid ANR is context.bindService + startService + service.startForeground. As a developer, I'd rather lose the ability to invoke displayNotification in background, then dealing with potential crashes that I can't control.

@mikehardy
Copy link
Collaborator

Wow! Great digging @mars-lan - lot of electrons spent discussing this one.
Looks like https://stackoverflow.com/a/57521350/9910298 is a great example of how to use bindService.

I'm a little confused on your last paragraph though - not to read too much in to it, but could your last sentence ("As a developer...") be considered a new paragraph?

To be precise, are you saying that unless this API is fixed in general ("the ability to invoke displayNotification in background") you'd rather not touch it, but you think the bindService path could fix it? Or, are you saying that by your read on the links you posted it is unfixable?

I ask because it seemed as though the bindService style (or even JobService maybe - but bindService easier...) would fix it.

Then you'd have displayNotification in the background and no crashes, which would be ideal. But I might be missing something you've already realized

@mars-lan
Copy link
Contributor Author

@mikehardy Thanks for the info. The trick in https://stackoverflow.com/a/57521350/9910298 seems promsing. However, https://stackoverflow.com/a/53286232/15875597 and other users' comments claimed that the approach doesn't work 100%.

https://issuetracker.google.com/issues/76112072#comment158 says that the only reliable fix is to replace startForegroundService with startService, which means that you won't be able to invoke displayNotifiation when the app is runnning in background (aka Headless mode). What I meant is that I can certainly live with this limitation if that's the price to pay for stability.

@mars-lan
Copy link
Contributor Author

Can confirm that the same crash happens to Android 8-10 in production. Can we prioritize a fix?

@mikehardy
Copy link
Collaborator

Likely yes - crash issues need priority. You can see from the timing on previous comment (11 days ago) this was immediately prior to google IO and I think you've actually participated in react-native-firebase repo in the intervening time on related issues? Hopefully not mis-remembering, at any rate that pulled my full attention for a while and I'm still unwinding the stack from there so I haven't been able to turn attention back here. Will tag this up at least

@wilau2
Copy link

wilau2 commented May 31, 2021

I think we have occurrences on the crash for android 11 as well. We are still at the beginning of our investigation, we've halted our rollout until we have a better idea of the impact and reproduction.

image

@youssef-makhoul
Copy link

I know it is a really tricky bug to fix, but any idea when the fix can be delivered? and is there any way that we can contribute to the native side of the project?

@sambegin
Copy link

@mikehardy sorry for poking again. We have some bandwidth on our end and could help investigating/resolving this issue. Any way we can contribute to the project? We still have occurrences of that in production and we can't really do anything about it. Not having news from notifee's side is also scary as we need to know if we'll have to implement our own solution or not.

Could you please at least provide us with some insight on the progress of this issue ? 🙏

Thanks a lot

@mikehardy
Copy link
Collaborator

I understand the desire to fix it, and I really appreciate the offer to help. Last time I asked internally we couldn't come up with a good way to extend access for collaboration but perhaps @Salakar has new ideas since then?

In prep for development here I swept through just yesterday and got it cleaned up so changes here on my part are safe (#327 invertase/react-native-notifee#328 invertase/react-native-notifee#326). Steady forward progress but doesn't look exciting. Thanks for your patience

@sambegin
Copy link

@Salakar @mikehardy It's been 14 days without any news on this. We also tried to contact you via the "Contact Us" section of your website and still no signs of answers. I will re-iterate my question again regarding the help we can provide?

This is becoming quite time sensitive for us and we'd like to at least have the sentiment that this is not becoming an inactive library without any support. We'd like to know if someone is looking at this or not and if we can participate in the elaboration of a solution.

Also another question I would have regarding Notifee's support: Any idea when the Extended License including the High priority support could come out? It seems something we could really beneficiate... Please let us know, at least being transparent for all the people using this library 🙂

Thank you very much!

@mikehardy
Copy link
Collaborator

Only speaking to this:

we'd like to at least have the sentiment that this is not becoming an inactive library without any support. We'd like to know if someone is looking at this or not and if we can participate in the elaboration of a solution.

Not including the core library changes, invertase/react-native-notifee@v1.5.0...master

6 releases since this issue was logged

Obviously it would be better if this issue was fixed in one of those releases, but the library is receiving a steady amount of attention and effort, it is definitely under active development

@mars-lan
Copy link
Contributor Author

mars-lan commented Sep 1, 2021

Any progress on this one? :)

@mikehardy
Copy link
Collaborator

I must admit I was dreading the ping on this one - all I can offer is that the previous one from @sambegin is one of only a handful of messages I leave marked as unread in my software mail box. I have not made specific progress on it but am still slowly working towards it on the stack of todo items I've got in my Invertase queue. Cold comfort probably, but it is on the radar.

@mars-lan
Copy link
Contributor Author

mars-lan commented Sep 7, 2021

@mikehardy thanks for the transparency. I understand that there's always competing priorities. It took me nearly a year to get invertase/react-native-notifee#126 to close so this one is still young :P

@mikehardy
Copy link
Collaborator

You always log the tough ones @mars-lan ! 😆

@Salakar Salakar transferred this issue from invertase/react-native-notifee Sep 24, 2021
@mars-lan
Copy link
Contributor Author

mars-lan commented Oct 5, 2021

Thanks for open sourcing the library! Will definitely take a deeper look into this one during the weekends.

@camboYY
Copy link

camboYY commented Nov 14, 2022

any updates ?

@mieszko4
Copy link
Contributor

mieszko4 commented Dec 6, 2022

We just got hit by this issue today - after releasing to production we saw at least 20% of crashes. Locally it is extremely hard to reproduce (both debug and release variant).

Before integrating Notifee for foreground push, we used our own solution which never crashed so I was wondering why Notifee does crash.
The only difference in our case is that we always used ContextHolder.getApplicationContext().startService(intent) regardless of Android version but we got hit by trampoline issue hence we integrated Notifee for a fix.

I am wondering does Notifee really has to use ContextHolder.getApplicationContext().startForegroundService(intent) for Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) in ForegroundService.java...

I see that in startForegroundService it says:

Note: Beginning with SDK Version Build.VERSION_CODES.S, apps targeting SDK Version Build.VERSION_CODES.S or higher are not allowed to start foreground services from the background. See Behavior changes: Apps targeting Android 12 for more details.

So I guess ContextHolder.getApplicationContext().startForegroundService(intent) has to be used to avoid issue with trampoline issue, isn't it?

But then it looks like Notifee could change the condition from Build.VERSION.SDK_INT >= Build.VERSION_CODES.O to Build.VERSION.SDK_INT >= Build.VERSION_CODES.S so it only affects Android 12+.

At least until the real fix is implemented.

@mieszko4
Copy link
Contributor

mieszko4 commented Dec 7, 2022

So I guess ContextHolder.getApplicationContext().startForegroundService(intent) has to be used to avoid issue with trampoline issue, isn't it?

I dig in a bit more. From my finding it is not related to trampoline effect. It is only used because Notifee assumes that foreground push may be started from background and since Buld.VERSION_CODES.O it is not allowed except for few conditions.

So:

  1. If foreground service is started from foreground we can always use ContextHolder.getApplicationContext().startService(intent) and not worry about 5s limit.
  2. Based on this doc foreground service can still be started from background when some conditions are satisfied, e.g. after getting high-priority message.

I think that starting foreground service from background that does not satisfy the conditions linked in (2) is rather an edge-case.

Hence I propose we add a setting in displayNotification like startServiceWitPromise to be used alongside asForegroundService. If startServiceWithPromise is false then we would always use older ContextHolder.getApplicationContext().startService(intent).

diff --git a/android/src/main/java/app/notifee/core/ForegroundService.java b/android/src/main/java/app/notifee/core/ForegroundService.java
index 9150047..7d16768 100644
--- a/android/src/main/java/app/notifee/core/ForegroundService.java
+++ b/android/src/main/java/app/notifee/core/ForegroundService.java
@@ -45,7 +45,7 @@ public class ForegroundService extends Service {
     intent.putExtra("notification", notification);
     intent.putExtra("notificationBundle", notificationBundle);
 
-    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
+    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O && startServiceWithPromise) {
       ContextHolder.getApplicationContext().startForegroundService(intent);
     } else {
       // TODO test this on older device

For example, startServiceWithPromise: false could be used when a high-priority push that initialises a call.

@github-actions
Copy link

github-actions bot commented Jan 4, 2023

Hello 👋, to help manage issues we automatically close stale issues.

This issue has been automatically marked as stale because it has not had activity for quite some time.Has this issue been fixed, or does it still require attention?

This issue will be closed in 15 days if no further activity occurs.

Thank you for your contributions.

@rcidt
Copy link

rcidt commented Jan 9, 2023

Hello @mikehardy, was there a resolution to this issue?

@github-actions
Copy link

github-actions bot commented Feb 6, 2023

Hello 👋, to help manage issues we automatically close stale issues.

This issue has been automatically marked as stale because it has not had activity for quite some time.Has this issue been fixed, or does it still require attention?

This issue will be closed in 15 days if no further activity occurs.

Thank you for your contributions.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 21, 2023
@abdoutech19
Copy link

abdoutech19 commented Jan 20, 2025

Still getting this issue on Android 14 notifee version 9.1.8 :(

@ajitpatel28
Copy link

i've tried to solve this in this pr - #1202

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests