[REF][Rewrite 26/10] What the Android OS usage is and what it's not

AndreiLux

Senior Member
Jul 9, 2011
3,209
14,597
[Complete rewrite 26/10]

I'm writing this because there is abundant misinformation about the topic spread across many threads, and the other Android OS threads are so big that the real information is easy to overlook.

After several weeks of analyzing and monitoring the behavior of the device, things are starting to clear up and we can come to a conclusion.

What's the Android OS usage?

Many have noticed that on our Galaxy S2s and their variants (Epic 4G Touch, etc.), the Android OS entry in the Android battery stats is supposedly eating away a large percentage of the battery used.

s1wcE.png

There have been many threads, old and new, discussing and theorizing about it, with many users magically fixing it by making god knows what changes.

The Android OS entry is a bundle of processes, in this case all root processes used by the system kernel. Please keep this in mind while reading on, as there is much more to it than meets the eye, and only one specific aspect of the whole story is treated here.

A diagnosis of the main problem:

b4ab1.png

Download BetterBatteryStats (see the thread "[APP] BetterBatteryStats adds battery history back to Gingerbread"). It is your #1 friend for finding out what uses up CPU power and causes problems.

Note: The CPU time in BetterBatteryStats is normalized: 100% load at full frequency for a minute shows up as 1 minute of CPU time, while 100% load at half frequency for a minute shows up as 30 seconds of CPU time. Keep this in mind when monitoring processes. The CPU time shown in the Android battery stats is useless.
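To make the normalization concrete, here is a minimal sketch of the arithmetic described in the note. The function name and the 1200MHz reference clock (the stock maximum) are my assumptions for illustration; I have not checked how BetterBatteryStats computes it internally.

```python
def normalized_cpu_seconds(wall_seconds, load, freq_mhz, max_freq_mhz=1200):
    """Scale busy time by the fraction of the maximum clock it ran at.

    load is 0.0..1.0; max_freq_mhz=1200 assumes the stock maximum frequency.
    """
    return wall_seconds * load * (freq_mhz / max_freq_mhz)


full_speed = normalized_cpu_seconds(60, 1.0, 1200)  # a minute at full frequency
half_speed = normalized_cpu_seconds(60, 1.0, 600)   # a minute at half frequency
print(full_speed, half_speed)
```

So a process that looks cheap in normalized CPU time may still have kept the CPU busy for much longer in wall-clock terms if it ran at a low frequency.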

As you can see in the screenshot, "suspend" and "events/0" pop up as the big consumers of CPU power. Red means it's a system/kernel process and blue means it's a user process.

The suspend process is the program in charge of telling the hardware to go into deep sleep mode and to reawaken from it.
events/0 is just the general events manager of the system kernel.

You can monitor this with Watchdog Task Manager. Set it up so that it monitors system processes and raises notifications. Depending on the severity of the bug at the time and some black magic, you'll see the 99% load on the system.

h5B7a.png
6Klow.png

Due to how the hardware of the phone is designed, the default kernel settings lock its frequency to 800MHz before it falls into deep sleep. This is believed to be one of the main reasons why so much battery is drained. Compared to other devices, the Exynos 4210 in our Galaxy S2s takes much longer to enter deep sleep (suspend) and to wake up from it again (resume). This can be seen in the device message logs:
<6>[ 749.242666] PM: suspend of devices complete after 171.413 msecs
<6>[ 749.250855] PM: late suspend of devices complete after 0.396 msecs
<4>[ 749.250864] Disabling non-boot CPUs ...
<7>[ 749.250909] s3c_pm_enter(3)
<7>[ 749.250916] s3c_sleep_save_phys=0x40d9fe80
<7>[ 749.250959] sleep: irq wakeup masks: 000fffdd,cb37b97f
<6>[ 0.000000] WAKEUP_STAT: 0x80000001
<6>[ 0.000000] WAKUP_INT0_PEND: 0x80
<6>[ 0.000000] WAKUP_INT1_PEND: 0x0
<6>[ 0.000000] WAKUP_INT2_PEND: 0x0
<6>[ 0.000000] WAKUP_INT3_PEND: 0x0
<7>[ 0.000000] s3c_pm_enter: post sleep, preparing to return
<7>[ 0.000000] S3C PM Resume (post-restore)
<7>[ 0.000000] Wakeup from sleep, 0x80000001
<6>[ 0.000000] L310 cache controller enabled
<6>[ 0.000000] l2x0: 16 ways, CACHE_ID 0x4100c4c5, AUX_CTRL 0x7e470001, Cache Size: 1048576 B
<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
..
..
<6>[ 749.748189] PM: resume of devices complete after 496.599 msecs
Because these operations run at a higher frequency, and a suspend-wake cycle takes anywhere from 0.6 to 0.8 seconds while doing nothing else, more power is drained than actually needed.
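If you want to check the cycle time on your own device, the "PM: ... of devices complete" lines in a log like the one above can be added up with a few lines of Python. This is only a sketch: the regular expression is written against the log format shown in this post, and the function name is made up. It counts the device suspend/resume phases only, not the time spent in the CPU and platform code in between.

```python
import re

# Matches e.g. "PM: late suspend of devices complete after 0.396 msecs"
PM_LINE = re.compile(
    r"PM: (?P<phase>[a-z ]+?) of devices complete after (?P<ms>[\d.]+) msecs"
)


def pm_phase_times(log_text):
    """Return {phase: total milliseconds} for the PM phase lines in a log."""
    times = {}
    for match in PM_LINE.finditer(log_text):
        phase = match.group("phase")
        times[phase] = times.get(phase, 0.0) + float(match.group("ms"))
    return times


# The four phase lines from the log excerpt in this post
SAMPLE = """\
<6>[ 749.242666] PM: suspend of devices complete after 171.413 msecs
<6>[ 749.250855] PM: late suspend of devices complete after 0.396 msecs
<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
<6>[ 749.748189] PM: resume of devices complete after 496.599 msecs
"""

times = pm_phase_times(SAMPLE)
total_ms = sum(times.values())
print(times)
print(f"device suspend + resume: {total_ms:.3f} ms")
```

For the excerpt above this adds up to roughly 0.67 seconds for a single cycle, which is in line with the 0.6 to 0.8 seconds mentioned.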

The screenshot on the right covers a period of 15 minutes during which the processes were bugged: the timers in CPU Spy were reset and the screen was immediately turned off to collect the statistics. A screen-off profile limited the maximum frequency to 200MHz, so in theory no standard process is allowed to use any frequency state above that. Nevertheless, you can see the device sitting in the 800MHz state for roughly half the time. This means that half of the time the device spends awake is wasted on just entering and leaving the deep sleep state.

Note: Deep sleep is a state in which the CPU is no longer clocked and caches are no longer kept coherent. It basically turns off most of the CPU and thus conserves a lot of power.
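The same kind of breakdown can be computed by hand from the kernel's cpufreq statistics, which (as far as I know) is also what CPU Spy reads: /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state, one line per frequency with the frequency in kHz and the time spent there in 10ms units. The sketch below turns such a dump into shares of awake time. The function name is made up and the sample numbers are invented for illustration; they are not my measurements.

```python
def awake_share(time_in_state_text):
    """Map each frequency (in MHz) to its share of the total awake time."""
    ticks = {}
    for line in time_in_state_text.splitlines():
        if not line.strip():
            continue
        freq_khz, count = line.split()
        ticks[int(freq_khz) // 1000] = int(count)
    total = sum(ticks.values())
    if total == 0:
        return {}
    return {mhz: n / total for mhz, n in ticks.items()}


# Invented example: a device capped at 200MHz that still spends half of its
# awake time at 800MHz. Values are in 10ms units, as in time_in_state.
SAMPLE = """\
1200000 0
1000000 0
800000 3000
500000 0
200000 3000
"""

shares = awake_share(SAMPLE)
for mhz, share in sorted(shares.items(), reverse=True):
    print(f"{mhz:>5} MHz  {share:6.1%}")
```

Since the counters only ever go up, take one dump right before turning the screen off and one afterwards, and subtract the first from the second to get the figures for that period.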

What's the cause of high Android OS usage?

This is where things get interesting.

Network usage

The most common cause of the device waking up often enough to cause a problem and drive up the suspend process, and with it the AOS usage, is incoming and outgoing network traffic. This usage is attributed to the operating system rather than to the service or application that causes it.

Wild wakelocks

The second most common cause is high-frequency wakelocks caused by some application or service. Refer to the next section for this topic.

Incorrect statistical interpretation

Keep in mind that the percentages shown in the Android battery stats are only estimates. The ROM has a file called power_profile.xml which lists the current drawn by different components and states. These values come from Samsung. For example, the CPU running at 1200MHz is listed as drawing 577mA, so a task that runs for 10 seconds at 1200MHz is attributed a consumption of about 1603µAh (577mA × 10s / 3600s per hour). AOSP ROMs like MIUI actually show these values in the battery stats. Each entry in the battery usage list thus has an estimated consumption value, from which the percentages are calculated.

The problem is that these are all just estimates, and whether they are calculated correctly is up for debate. It is unclear whether dual-core use is taken into account in these estimates, and whether the values provided by power_profile.xml are representative of real use at all. One thing is certain: the estimates will be wrong for people who underclock, overclock or undervolt. There are no entries for states above 1200MHz or below 200MHz, and the ones present are meant to represent default/stock voltages. If you're running 100MHz in idle, the system will use the 200MHz estimate and thus overestimate your power consumption; if you're overclocking to 1400MHz, for example, the system will underestimate it.

This is also why kernels which claim to fix the problem are nothing but red herrings: somewhere in the patching of the kernels to 2.6.35.12 there is a change which makes the whole suspend and wakeup process invisible to the system, so it is no longer registered. Because of that, there are no statistics for it and the drain cannot be calculated. Tests have shown that this merely hides the problem rather than being a tangible fix.
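The estimate described above is plain arithmetic, sketched here in Python. The 577 figure for 1200MHz is the one quoted from power_profile.xml; the function names and the two battery entries at the bottom are invented purely to show how the percentages fall out of the per-entry estimates.

```python
def attributed_mah(current_ma, seconds):
    """Charge attributed to a task: current draw times duration, in mAh."""
    return current_ma * seconds / 3600.0


def usage_percentages(estimates_mah):
    """Turn per-entry mAh estimates into the percentages shown in the stats."""
    total = sum(estimates_mah.values())
    return {name: 100.0 * mah / total for name, mah in estimates_mah.items()}


# 10 seconds at 1200MHz, using the 577 value quoted above
charge = attributed_mah(577, 10)
print(f"{charge:.3f} mAh = {charge * 1000:.0f} µAh")

# Hypothetical entries, only to illustrate the percentage step
example = {"Android OS": 30.0, "Display": 90.0}
print(usage_percentages(example))
```

Note that every percentage depends on all the other estimates too, so one badly estimated entry shifts the share of everything else in the list.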

What you can do:

First of all you need to find the cause of the problem: study your wakelocks. You can do this with BetterBatteryStats (BBS) as mentioned above, but this only shows part of the story, as it lists user wakelocks only, not system wakelocks. If there is no obvious villain listed there, you must do some more advanced troubleshooting:

Dump your wakelocks file from a terminal, or over ADB if you are familiar with it, as follows:
cat /proc/wakelocks > /sdcard/wakelocks.csv
To read this data you need to import it into a spreadsheet application like Excel or OpenOffice, and you'll end up with something like this: (Read the thread around here on how to format your spreadsheet)

zOGey.png
Look mainly for high-duration wakelocks and for high wake_count values, meaning how often your device has woken up.

While this is all a bit advanced, there is a simpler approach:
Avoid applications and services that keep a permanent internet connection open. Avoid applications which poll for connectivity and instead use ones which support C2DM push notifications. This will vastly increase your battery life by vastly decreasing network traffic. If none of this helps, try a network analysis (packet capture) to find the source of the traffic; please refer to the discussions later in the thread for this.
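If you'd rather not fiddle with a spreadsheet, the dump can also be sorted with a short script. The sketch below assumes the usual layout of /proc/wakelocks on these kernels: tab-separated columns with a header row (name, count, expire_count, wake_count, active_since, total_time, sleep_time, max_time, last_change), names in double quotes and times in nanoseconds. Check this against your own dump before trusting it. The function name and the sample rows are made up.

```python
import csv
import io


def top_wakelocks(dump_text, key="total_time", limit=5):
    """Return the (name, value) pairs with the highest value in column `key`."""
    reader = csv.DictReader(io.StringIO(dump_text), delimiter="\t")
    rows = sorted(reader, key=lambda row: int(row[key]), reverse=True)
    return [(row["name"], int(row[key])) for row in rows[:limit]]


# Made-up sample in the assumed /proc/wakelocks layout
HEADER = ["name", "count", "expire_count", "wake_count", "active_since",
          "total_time", "sleep_time", "max_time", "last_change"]
ROWS = [
    ['"example_lock_a"', 40, 0, 35, 0, 9_000_000_000, 8_000_000_000, 500_000_000, 0],
    ['"example_lock_b"', 900, 0, 2, 0, 120_000_000_000, 90_000_000_000, 4_000_000_000, 0],
    ['"example_lock_c"', 5, 0, 0, 0, 1_000_000_000, 0, 300_000_000, 0],
]
SAMPLE = "\n".join("\t".join(str(v) for v in row) for row in [HEADER] + ROWS)

for name, ns in top_wakelocks(SAMPLE, "total_time"):
    print(f"{name:<16} held for {ns / 1e9:8.1f} s")
for name, wakes in top_wakelocks(SAMPLE, "wake_count"):
    print(f"{name:<16} wake_count {wakes}")
```

To use it on a real dump, read /sdcard/wakelocks.csv into a string and pass that in place of SAMPLE.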

What your device can be capable of:


NHGdg.png
kXRnh.png
In the first screenshot you can see two periods of around 18 hours where the battery drain is relatively flat. In these periods the suspend process gained merely a few seconds of CPU time. Then I use the phone for a bit, it goes haywire, calms down again, and then starts draining again. You can see this in the step-wise drain in the curve. It should be smooth and not like that.

The second screenshot was taken after a full charge, where the bug didn't trigger for 8 hours. My battery is perfectly calibrated, so that is a real 4% of battery. So if things were to go smoothly on the software side, the phone idles at around 0.4% of battery per hour on Wifi, and other people have been reporting down to 0.8% or even less with a good 3G signal.

Left side: low network usage, how it should be; Right side: high network usage, how it behaves sometimes in severe cases.
Nt4tg.png
E25ph.png
Same network, same location, same signal strength, same ROM (Changing my font maybe fixed it....), same wakelocks between them, same background apps running. Not even 5 minutes of screen time on either. The only difference I'm experiencing with .14 patched kernels is that it no longer shows up in the stats, but it drains just as much as in the right screenshot and shows the same CPU state behavior. It's also happening less often, and many say it's not happening at all anymore, but how would they know if it's hidden now?


"tvout resume wo"

You might have noticed the high usage of "tvout resume wo" (wo = work, btw) in BetterBatteryStats. This is part of the TVOut functionality/driver, and it seems that every time the CPU resumes it goes through this work whether you use TVOut or not. Since it is a kernel process, it also adds to the Android OS usage. If you are suffering from high-frequency wakeups, this will go up as well, since the system routines for this functionality are called every time the device wakes up.
<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
<3>[ 749.251350] [MHL]mhl_int_irq_handler() is called


Last words

The device is still superb. Even with this bug the battery life is equivalent to many other high-end devices out there. The thing is, it could be much better. If you want to squeeze the most out of your device, read on through the discussions in this thread to try to find out what's making the phone waste power. Forward this to people so they're informed and aware of it.
 

xak944

Senior Member
Jun 9, 2010
379
211
North Carolina
I really appreciate all the effort you've put into researching this!

Summarized quick-facts:

  • It's caused by a bug in the drivers.

While I certainly do believe this, what leads you to believe it is a driver issue? And what particular driver have you found is the culprit?

We can just hope that Samsung will fix this issue soon. For all we know they're aware of it and are working on it. The original Galaxy S had this very same issue and it has been supposedly fixed in a firmware update earlier this year. The Galaxy S2 is not the only phone having this issue, but it's the one that has it guaranteed most of the time.

I know that the 2.3.4 update for the original GX2 was supposed to have fixed this exact bug. You're saying that it is not in fact fixed? Shouldn't Samsung be alerted to this blunder? :confused:

What your device would be capable of.


NHGdg.png
kXRnh.png
In the first screenshot you can see two periods of around 18 hours where the battery drain is relatively flat. In this period the suspend process gained merely a few seconds of CPU time. Then I use the phone for a bit, it goes haywire, and then calms down again and then starts draining again. You can see this in the step-wise drain in the curve. It should be smooth and not like that.

In the second screenshot it's after a full charge where it didn't trigger for 8 hours. My battery is perfectly calibrated so those are real 4% of battery. So if things were to go smoothly on the software part then the phone is idling at around 0.4% of battery per hour on Wifi, and other people have been reporting down to 0.8% or even less with a good 3G signal.

Basically Android OS should not be even showing up in the battery stats, like on many other devices. Lowest people are getting on the Galaxy S2 is about 5-10%. There's also some other issues like high usage of "tvout resume wo", but that's something irrelevant compared to the elephant in the room that is "suspend" & "events/0".

So you're saying it was just dumb luck and you happened to not trigger the bug? So this could account for people finding "solutions," but they were only coincidental of the bug randomly not manifesting itself?
 

Molitro

Senior Member
Oct 19, 2010
744
167
Google Pixel 5
ASUS ZenFone 8
Very well explained.

As for the first SGS, I can confirm that the bug was solved as of the first 2.3.4.
I was experiencing a fantastic battery life with my SGS, and I was quite shocked when I had to watch helpless as Android OS went up on the S2.
 

Jebus99

Senior Member
Jun 4, 2011
139
25
That was a fantastic explanation. It does explain the vast difference in AOS between my Nexus S and SGSII. Hope this thread gets pinned to help put away the vast number of AOS threads present. And hopefully it'll help get more people to star the issue and get it the attention it badly needs!
 

AndreiLux

Senior Member
Jul 9, 2011
3,209
14,597
While I certainly do believe this, what leads you to believe it is a driver issue? And what particular driver have you found is the culprit?
It's been confirmed by a Samsung person on the original Galaxy S where it had the very same issue. Which driver exactly it is, I don't know. Supposedly it was the Wifi-driver but the bug also happens to some extent without Wifi, so I don't know. If you want you're welcome to go through 120MB of kernel source code to find the issue and change it.
I know that the 2.3.4 update for the original GX2 was supposed to have fixed this exact bug. You're saying that it is not in fact fixed? Shouldn't Samsung be alerted to this blunder? :confused:
Look. There's Android, and then there's a complete phone firmware. People tend to call firmware versions by the Android version they're on. Android version itself is completely meaningless and it's not Google's job to fix this, the manufacturers are the ones who deliver the drivers and everything that makes Android run on the phone. There's been for example about 8 leaked firmware versions for 2.3.4 on the Galaxy S2.

So you're saying it was just dumb luck and you happened to not trigger the bug? So this could account for people finding "solutions," but they were only coincidental of the bug randomly not manifesting itself?
Yes. Go wave a dead chicken over the phone and practice voodoo magic on it and you'll be just as successful at changing the behavior. The issue goes way too deep for the user to have any reproducible impact on it.
 

SiSL

Senior Member
Jan 14, 2011
130
45
Istanbul
Thanks for great explanation and having a bit more light on the issue. I'm going to star those as soon as it goes off maintenance.
 

xak944

Senior Member
Jun 9, 2010
379
211
North Carolina
It's been confirmed by a Samsung person on the original Galaxy S where it had the very same issue.

Was this in a support case you opened with Samsung? A forum? Where and how?

Look. There's Android, and then there's a complete phone firmware. People tend to call firmware versions by the Android version they're on. Android version itself is completely meaningless and it's not Google's job to fix this, the manufacturers are the ones who deliver the drivers and everything that makes Android run on the phone. There's been for example about 8 leaked firmware versions for 2.3.4 on the Galaxy S2.

Then why are we flagging the issues on the Android project? Shouldn't we be hammering on Samsung support channels for a fix if the problem is located in their proprietary drivers?
 

AndreiLux

Senior Member
Jul 9, 2011
3,209
14,597
Was this in a support case you opened with Samsung? A forum? Where and how?
Galaxy S XDA thread ; post in question ; response by Todd who is a Google employee.
Then why are we flagging the issues on the Android project? Shouldn't we be hammering on Samsung support channels for a fix if the problem is located in their proprietary drivers?
You're more than welcome to show me the proper non-Korean Samsung channels. You're the one who said in big bold red letters to flag it to Google. :p In any way, Google forwards big issues directly to the manufacturers themselves.
 

xak944

Senior Member
Jun 9, 2010
379
211
North Carolina

Oh good grief. We've really gone full circle now. That entire thread claims it was fixed in 2.3.4! :eek:


You're the one who said in big bold red letters to flag it to Google.
Yeah, you're right. I should've done my own research and not assumed you knew what you were talking about.

You're more than welcome to show me the proper non-Korean Samsung channels
http://www.samsung.com/us/SUPPORT/
http://twitter.com/SamsungSupport
http://facebook.com/SamsungSupport
 

jonny68

Senior Member
Mar 27, 2010
5,744
600
Dublin
Very good post, probably the best on the subject to date.

Personally, running 2.3.5, I don't find it as bad, but there is no denying the problem is still there, and until Samsung sorts it out it will continue to be there.
 

AndreiLux

Senior Member
Jul 9, 2011
3,209
14,597
Oh good grief. We've really gone full circle now. That entire thread claims it was fixed in 2.3.4! :eek:
It was... for the Galaxy S, we're on another phone. XXJVP fixed it, not 2.3.4. I said it earlier, firmware version != Android version.
Well, I'm not that stupid and have been that far myself. The US site doesn't even let me select my model because it doesn't officially exist there, and the international site I'm rerouted to doesn't even have a report feature; the only contact is a phone number. And I'm not going to start FB'ing or Twittering about driver issues to a mega-conglomerate corporation, I'll leave that to others. Hardly as "proper" a tech support channel as Google Code. Only the Korean site has that, but sadly, I can't speak Korean.
 

spridell

Senior Member
Aug 17, 2010
564
111
Thanks for the research.

Let's all star this so we can get it to Samsung and they can fix this ASAP.
 

claimui

Senior Member
Sep 11, 2011
173
105
Taipei
Great post. Lots of people are talking about battery life but without understanding how the stats work and how to interpret them.

The battery usage screenshot is funny though. Yeah, I understand it shouldn't have those steep fall-offs. But still, must be pretty nice to have 40% battery left after 2.5 days!
 

marcadam

Senior Member
Nov 23, 2010
697
186
Thanks for your explanation, I often wondered what the android os bug was and now I know. I'm quite fortunate by the sound of things my android os is usually around 15%. Is this issue likely to be resolved with ICS?

Sent from my GT-I9100 using Tapatalk
 

Shafty

Senior Member
Sep 20, 2007
187
23
Here's what I don't get. I've had the phone since week one back in may and from may to September - no os bug. So why now? Have I done something to trigger it? Can we isolate the cause to prevent others getting it? I can list what I've done different recently to previous months if it helps?

And if it's drivers then why doesn't it access the CPU when the charger is plugged in?
5adc6b86-8cd1-86a6.jpg


I share everyone's frustration over all this.

Sent from my GT-I9100 using XDA App
 

AndreiLux

Senior Member
Jul 9, 2011
3,209
14,597
Here's what I don't get. I've had the phone since week one back in may and from may to September - no os bug. So why now? Have I done something to trigger it? Can we isolate the cause to prevent others getting it? I can list what I've done different recently to previous months if it helps?

Sent from my GT-I9100 using XDA App
I have no idea.

I've tried it on half a dozen ROMs, wiped everything clean, removed every app to the point of the ROM being pretty naked, and still didn't manage to change it. There are a lot of people like you who say they didn't have the bug and then suddenly have it. It may be a hardware fault that doesn't trigger until certain circumstances are fulfilled. Maybe it's just some stupid bug in the drivers, some synchronization problem between registers. It may be cosmic rays that flip a bit in memory. Who the **** knows.

All I know is that nobody has been able to isolate it to any common cause, and I sure haven't managed in the last 3 months, and I've had it since July myself. At the same time you ask yourself why there's such a wide variety of devices having the same issue, but different hardware. If I would be paid full time to look through the source code and find the bug then I'd do it. But until then, I just wanted to share it with everybody and just tell them to stop wasting their time.
And if it's drivers then why doesn't it access the CPU when the charger is plugged in?
Because when the device is plugged in, a wakelock is active and the device doesn't go to deep sleep at all. Since going to deep sleep is the part that is bugged, holding a wakelock prevents it from bugging out. I mentioned this already in the post. Since you have MIUI, you can for example tell it in the battery settings not to go to deep sleep at all (there's a checkbox), and see the results for yourself.
 

shotta35

Senior Member
Nov 30, 2008
1,578
449
NYC & Germany
I'll just say I got my SGS2 recently and it came with 2.3.4. There was no AOS bug at first, then one day I started noticing it. I upgraded to 2.3.5 via XXKI3 and it's better, as it's "only" 11% now rather than 29-40%.


sc20111005130237.png
sc20111005130255.png

sc20111005130300.png


Guess I should check with CPU Spy to see if the phone is actually deep sleeping. I would assume so, however, as it's been getting better battery life since I upgraded to 2.3.5.
 

Top Liked Posts

  • There are no posts matching your filters.
  • 211
    [Complete rewrite 26/10]

    I'm writing this since the abundant misinformation about it coming across many many threads, and all other Android OS threads are too big and real information is easy to overlook.

    After several weeks of analyzing and monitoring of the behavior of the device, things are starting to clear up and we can come to a conclusion.

    What's the Android OS usage?

    Many have noticed that on our Galaxy S2's and various other variations (Epic 4G Touch, etc) that in the Android battery stats the Android OS entry is supposedly eating away a large percentage of the battery used.

    s1wcE.png

    There's been many threads old and new discussing it and theorizing and many users magically fixing it by doing god knows what changes.

    The Android OS package is a bundle of processes, in this case, all root processes used by the system kernel. Please remember this when continuing to read on, as there is much more to it than meets the eye and only once specific aspect of the whole story will be treated.

    A diagnosis of the main problem:

    b4ab1.png

    Download [APP] BetterBatteryStats adds battery history back to Gingerbread. It is your #1 friend on finding out what uses up CPU power and causes problems.

    Note: The CPU time in BetterBatteryStats is normalized, meaning 100% load during full frequency for a minute will show up as 1 minute CPU time, 100% load at half-frequency for a minute will show 30 seconds CPU time. Keep this in mind when monitoring processes. The CPU time shown in the Android battery stats is useless.

    As you can see in the screenshot, "suspend" and "events/0" pop up as the big consumers of CPU power. Red means it's a system/kernel process and blue means it's a user process.

    The suspend process is the program that's in charge of telling the hardware to go into deep sleep mode and reawaken from it.
    The events/0 is just the general events manager for the system kernel.

    You can monitor this with Watchdog Task Manager. Set it up so that it monitors system processes and raises notifications. Depending on the severity of the bug at the time and some black magic, you'll see the 99% load on the system.

    h5B7a.png
    6Klow.png

    Due to how the hardware of the phone is designed, the default kernel settings of the phone lock its frequency to 800MHz before falling into deep sleep. This is believed to be one of the main problems as to why so much battery is drained. Compared to other the devices the Exynos 4210 in our Galaxy S2's takes a much longer time to enter deep sleep (suspend) and wake up again from sleep (resume). This can be seen in the device message logs:
    <6>[ 749.242666] PM: suspend of devices complete after 171.413 msecs
    <6>[ 749.250855] PM: late suspend of devices complete after 0.396 msecs
    <4>[ 749.250864] Disabling non-boot CPUs ...
    <7>[ 749.250909] s3c_pm_enter(3)
    <7>[ 749.250916] s3c_sleep_save_phys=0x40d9fe80
    <7>[ 749.250959] sleep: irq wakeup masks: 000fffdd,cb37b97f
    <6>[ 0.000000] WAKEUP_STAT: 0x80000001
    <6>[ 0.000000] WAKUP_INT0_PEND: 0x80
    <6>[ 0.000000] WAKUP_INT1_PEND: 0x0
    <6>[ 0.000000] WAKUP_INT2_PEND: 0x0
    <6>[ 0.000000] WAKUP_INT3_PEND: 0x0
    <7>[ 0.000000] s3c_pm_enter: post sleep, preparing to return
    <7>[ 0.000000] S3C PM Resume (post-restore)
    <7>[ 0.000000] Wakeup from sleep, 0x80000001
    <6>[ 0.000000] L310 cache controller enabled
    <6>[ 0.000000] l2x0: 16 ways, CACHE_ID 0x4100c4c5, AUX_CTRL 0x7e470001, Cache Size: 1048576 B
    <6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
    ..
    ..
    <6>[ 749.748189] PM: resume of devices complete after 496.599 msecs
    The fact that these operations are done at a higher frequency, together with the fact that a suspend-wake cycle takes around anything from 0.6 to 0.8 seconds, while doing nothing else, will drain more power than actually needed.

    The screenshot on the right is a period of 15 minutes where the processes were bugged, the timers in CPU Spy were reset, and the screen immediately turned off to collect the statistics. A screen-off profile was used to limit the maximum frequency to 200MHz, so theoretically any standard process is not allowed to use any of the frequency-states above that. Nevertheless, There you can see the device being in the 800MHz state for roughly half the time. This means that half the time the device is spending on being awake is being wasted on just entering and leaving the deep sleep state.

    Note: The deep sleep state is a state in which the CPU isn't clocked anymore and caches are no longer kept coherent, basically it's turning off most of the CPU, and thus conserving a lot of power.

    What's the cause of high Android OS usage?

    This is where things get interesting.

    Network usage

    The most common cause of the device waking up often enough to cause a problem and raise the suspend process up, and thus also the AOS usage, are incoming and outgoing network traffic. These will be attributed to the operating system rather than the service or application at cause of the problem.

    Wild wakelocks

    Second most common cause are high-frequency wakelocks caused by some application or service. Refer to the next section for this topic.

    Incorrect statistical interpretation

    One has to keep in mind that when generating the percentages shown under the battery stats in Android, that those values are just an estimate. The ROM has a file called power_profile.xml in which power drain values for different uses are listed. These values are coming from Samsung. For example, the CPU running at 1200MHz is listed as using 577mAh of power, so when a task is running for 10 seconds at 1200MHz, then it means that that process will be getting attributed a power consumption of 1602µAh (577mAh/3600s *10s). ASOP ROMs like MIUI will actually list these values in the battery stats. Each entry in the battery usage as such has an estimated power consumption value, and out of which the percentages are calculated from.

    The problem is, that these are all just estimates and wether they are calculated correctly is up to debate. There are issues if either dual core use is taken into account in these estimates, or not, and even if the values provided by power_profile.xml are representative of real use or not. One issue for certain is that power consumption will be wrong for people who underclock/overclock, and undervolting. There are no entries for states above 1200MHz or below 200MHz, and the ones already present are meant to be representative of default/stock voltages. If you're running 100MHz in idle, the system will use the 200MHz estimates and thus overestimate your power consumption, if you're overclocking and using 1400MHz for example, the system will underestimate your power consumption.

    This part is also what causes kernels which claim to fix the problem to be nothing but red herrings: somewhere in the patching of the kernels to 2.6.35.12 there is a change which makes the whole suspend and wakeup process no longer visible to the system and thus is no longer registered. Because of that, there are no statistics of these and the drain can not be calculated. Tests have shown that this is nothing other than just hiding of the problem rather than a tangible fix.

    What you can do:

    First of all you need to find the cause of the problem; study your wakelocks. You can do this by using BBS as mentioned above but this will only show part of the story as it will not tell you about system wakelocks, but only user wakelocks. If there is no obvious villain listed there, then you must do some more advanced troubleshooting:

    Dump your wakelocks file using a terminal or over ADB if you are familiar with it, as follows:
    cat /proc/wakelocks > /sdcard/wakelocks.csv​
    To read this data you need to import it into a spreadsheet application like Excel or OpenOffice and you'll end up getting something like this: (Read the thread around here on how to format your spreadsheet)

    zOGey.png
    You need to mainly look for high-duration wakelocks, and the wake_count frequency, meaning how often your device has woken up.

    While this is all a bit advanced and all, you can do it simpler:
    Avoid using application and services that use a permanent internet connections. Avoid using applications which use polling for their connectivity and instead use those who support C2DM push-notifications. This will vastly increase your battery life by vastly decreasing network traffic. If this all doesn't help, try doing a network analysis (packet capture) to inspect the source of it, please refer to discussions later in the thread for this.

    What your device can be capable of:


    NHGdg.png
    kXRnh.png
    In the first screenshot you can see two periods of around 18 hours where the battery drain is relatively flat. In this period the suspend process gained merely a few seconds of CPU time. Then I use the phone for a bit, it goes haywire, and then calms down again and then starts draining again. You can see this in the step-wise drain in the curve. It should be smooth and not like that.

    In the second screenshot it's after a full charge where it didn't trigger for 8 hours. My battery is perfectly calibrated so those are real 4% of battery. So if things were to go smoothly on the software part then the phone is idling at around 0.4% of battery per hour on Wifi, and other people have been reporting down to 0.8% or even less with a good 3G signal.

    Left side: low network usage, how it should be; Right side: high network usage, how it behaves sometimes in severe cases.
    Nt4tg.png
    E25ph.png
    Same network, same location, same signal strength, same ROM (changing my font maybe fixed it...), same wakelocks between them, same background apps running. Not even 5 minutes of screen time on either. The only difference I'm experiencing with .14 patched kernels is that it no longer shows up in the stats, but it drains just as much as in the right screen, showing the same CPU state behavior. It's also happening less often, and many say it's not happening at all anymore, but how would they know if it's hidden now?


    "tvout resume wo"

    One might have noticed the high usage of "tvout resume wo" (wo = work, btw) in BetterBatteryStats. This is part of the TVOut functionality/driver, and it seems that every time the CPU resumes it goes through this work whether you use TVOut or not. Because it is a kernel process, it also adds to the Android OS package usage. If you are suffering from high-frequency wakeups, this will also go up, as the system routines for this functionality are called every time the device wakes up.
    <6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
    <3>[ 749.251350] [MHL]mhl_int_irq_handler() is called
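    To get a rough feel for how often the device resumes, you can count these "early resume" lines in a saved dmesg dump. This is just a sketch under assumptions: it expects lines shaped like the two above, and the function names are mine. The bracketed timestamps are kernel-clock seconds, which may not advance while the device is suspended, so treat the resulting rate as a relative figure to compare between runs, not as wakeups per wall-clock hour.

```python
import re

# Matches lines like:
# <6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
RESUME_RE = re.compile(
    r"^<\d+>\[\s*(\d+\.\d+)\]\s+PM: early resume of devices complete"
)


def resume_timestamps(dmesg_text):
    """Return the kernel timestamps (seconds) of every early-resume line."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = RESUME_RE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def wakeups_per_hour(stamps):
    """Resume events per hour of kernel-clock time between first and last."""
    if len(stamps) < 2:
        return 0.0
    span = stamps[-1] - stamps[0]
    if span <= 0:
        return 0.0
    return (len(stamps) - 1) / span * 3600.0


if __name__ == "__main__":
    sample = (
        "<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs\n"
        "<3>[ 749.251350] [MHL]mhl_int_irq_handler() is called\n"
    )
    print(len(resume_timestamps(sample)), "resume event(s) found")
```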


    Last words

    The device is still superb. Even with this bug the battery life is equivalent to that of many other high-end devices out there. The thing is, it could be much better. If you want to squeeze the most out of your device, read the discussions in this thread to try to find out what's making the phone waste power. Forward this to people so they're informed and aware of it.
    13
    I really appreciate all the effort you've put into researching this!

    Summarized quick-facts:

    • It's caused by a bug in the drivers.

    While I certainly do believe this, what leads you to believe it is a driver issue? And what particular driver have you found is the culprit?

    We can just hope that Samsung will fix this issue soon. For all we know they're aware of it and are working on it. The original Galaxy S had this very same issue and it has been supposedly fixed in a firmware update earlier this year. The Galaxy S2 is not the only phone having this issue, but it's the one that has it guaranteed most of the time.

    I know that the 2.3.4 update for the original GX2 was supposed to have fixed this exact bug. You're saying that it is not in fact fixed? Shouldn't Samsung be alerted to this blunder? :confused:

    What your device would be capable of.


    NHGdg.png
    kXRnh.png
    In the first screenshot you can see two periods of around 18 hours where the battery drain is relatively flat. In this period the suspend process gained merely a few seconds of CPU time. Then I use the phone for a bit, it goes haywire, and then calms down again and then starts draining again. You can see this in the step-wise drain in the curve. It should be smooth and not like that.

    In the second screenshot it's after a full charge where it didn't trigger for 8 hours. My battery is perfectly calibrated so those are real 4% of battery. So if things were to go smoothly on the software part then the phone is idling at around 0.4% of battery per hour on Wifi, and other people have been reporting down to 0.8% or even less with a good 3G signal.

    Basically Android OS should not be even showing up in the battery stats, like on many other devices. Lowest people are getting on the Galaxy S2 is about 5-10%. There's also some other issues like high usage of "tvout resume wo", but that's something irrelevant compared to the elephant in the room that is "suspend" & "events/0".

    So you're saying it was just dumb luck and you happened to not trigger the bug? So this could account for people finding "solutions," but they were only coincidental of the bug randomly not manifesting itself?
    8
    It is not signal related. The international version has been out for 4+ months now, I've had mine for 3 months and think I know enough of the behavior of the device and how it acts than you who has had for not even 2 weeks on AT&T. I already responded to you in the AT&T thread so I won't bother repeating myself here once again.
    And in those two weeks, I've run more controlled tests (Top off battery, reset CPUSpy timers, then let it drain in known signal conditions, approx. one every night) than I have seen from you (0 controlled tests - just a bunch of flailing about with no methodology and poor data collection practices, we'll get to that later.)

    I took a while to reply because I wanted to collect more data. Signal strength matters for battery life in a cell phone - over a decade of owning cell phones of various technologies, and battery life being negatively affected by poor signal strength has been a universal truth on all of them. It matters less so for GSM/EDGE (due to them using timeslots for multiple channel access, which necessitates less aggressive transmit power control) than for CDMA-based technologies (cdmaOne, CDMA2000, UMTS) which have strict requirements on handset transmitter power control to solve the near/far problem. UMTS has been notorious for high battery drain since its inception, which is why people STILL like to try and force GSM/EDGE mode on modern phones. The situation has improved a lot since the early days of UMTS, but we've been at the physical limits of transmit power amplifier efficiency for a while. You seem to have ridiculous expectations that compared to recent predecessors, the GS2 is some sort of mythical god-phone that can violate basic laws of physics, such as conservation of energy. Sorry, it isn't - just like any other phone, the radio is bound by the laws of physics.

    You asked me what my usage patterns were that I would consider less than 1%/hour idle drain to be good, since you think that should be bad.
    1) I'm in the United States. Population density is lower, which means tower density is lower, which means signal strengths on average are much lower than is likely typical in Europe.
    2) I'm on AT&T. There's no conclusive evidence that the network supports fast dormancy, but most anecdotal evidence indicates that it does not, despite AT&T labs having an article that talks about it (although it also talks about how AT&T is, as usual, working on their own way of doing things. AT&T is a major sufferer of "not invented here" syndrome). (Enabling fast dormancy on the Infuse caused all sorts of problems, enabling it on the GS2 does nothing, so at least the GS2 has saner fallback with enabling it on a non-supporting network.) Fast dormancy is a known battery-saving feature.
    3) I'm on AT&T - There's enough known about their network that it's obvious they're one of the "big two" in this poorly anonymized paper, and the rash of recent people getting tethering nastygrams makes it obvious they're the carrier doing DPI - http://research.microsoft.com/en-us/people/mzh/netpiculet.pdf - Unfortunately NetPiculet is no longer available, but I'd be shocked if AT&T is not one of the carriers with ridiculously low TCP timeouts, also known for causing poor drain.

    Unfortunately I'm a bit short on weak-signal tests due to a persistent sinus infection - I've missed a lot of work (which is my classic weak-signal test case) in the past two weeks, but on the data runs I've taken, the pattern is consistent - poor signal has a significant detrimental effect on battery life, both because the radio has to work harder, and also deep sleep percentages start dropping as sync processes hold wakelocks longer.

    BTW, I've never seen events/0 or suspend take more than a minute out of a 12-hour run, even on .7 kernels. Yet AOS is claimed to have 40-60% usage in a situation where drain performance is excellent. The AOS reported usage is just plain inconsistent with reality.

    The CPU architecture is similar? What the **** are you on? The S5PV210/S5PC210 are last generation architectures based on Cortex A8 designs while the 4210 are A9's, and googling even for 10 seconds will give you quotes like:
    Cortex-A9 also boasts numerous improvements in efficiency, claiming power consumption numbers nearly half that of the Cortex-A8, as well as the ability to use multiple-core technology to scale processing power in accordance with energy limitations.
    They are as similar as saying a Core2 is similar to a Pentium 4 because they have the same instruction set.
    Resorting to profanity now, are we? A clear sign you know your arguments are falling apart.

    I've been working as an engineer long enough to know not to blindly accept marketing material as truth. "claims" power consumption "nearly half" in marketing material - what's reality? Anyway, you can reduce CPU usage all you want, as I said above, all that will do is make signal strength dominate even more.

    Yes, they are both 45nm processes, but with a 2 year gap in between them, the 4210 being the last one before shrinking to 32nm so 45nm is as mature right now as it will ever get, and if you think there's no yield and process improvements between mask shrinks then you are just plain ignorant.
    Go read on the tear-down article of the S2 and you'll see how it has almost exclusively new, never used, high-end components for almost everything.
    Not a 2 year gap - the S5PV210 in the Infuse is a refined variant of what most first-gen GalaxyS phones had, and hence includes most of those yield and process improvements (which is why it's sold at 1.2 GHz instead of 800 or 1.0, and reliably hits 1.6 for more people than previous chips). It's architecturally midway between the original GS and the GS2 - for example, it has the exact same wifi chip as the GS2, and probably the same baseband. Anyway the baseband doesn't matter, even if the baseband consumed no power, see my previous comments regarding the RF power amp.

    Yes, it pulls current drain statistics out of its ass. Go read on it for your ****ing self. And yes, I've seen that there's a fuel gauge in the source code and hardware, just as you've said, but it's not used, so it might just as well not be there.
    More profanity, huh? Especially amusing considering that the link you gave me confirmed my claims, it's pulling current numbers out of its ass. If it isn't measured directly with a current sensor circuit, it's worthless. So please stop talking about current draw until you start using a method that actually makes sense and provides valid data. (Either a backport of the MAX17042 current sensor support from the Linux 3.1 kernel trees, or replacing the device battery with a dummy battery that includes a current sensor, connected to an external battery)

    And yet again, you provide excellent evidence that you don't know what you're talking about. The fuel gauge isn't used? Where the hell do you think the state of charge estimates come from? Thin air?

    Nope, they come from here:
    Code:
    /sys/class/power_supply/battery/batt_soc
    If you actually bother to read the code, you'll see where this comes from - the MAX17042. While the 42 supports some additional features not present in the 17040, those features aren't used in our device. But the fuel gauge IS there and IS used.
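    If you want to log the gauge's estimate yourself instead of trusting an app, something like the sketch below would do. It assumes the batt_soc node holds a single integer percentage (check on your own device), and the helper names are mine, not anything from the kernel source.

```python
from pathlib import Path

# Path quoted above; the file format (one integer percent) is an assumption.
BATT_SOC_PATH = "/sys/class/power_supply/battery/batt_soc"


def read_soc(path=BATT_SOC_PATH):
    """Read the state-of-charge value exposed by the fuel gauge driver."""
    raw = Path(path).read_text().strip()
    return int(raw)


def drain_per_hour(soc_start, soc_end, hours):
    """Average drain in percentage points per hour between two readings."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (soc_start - soc_end) / hours


if __name__ == "__main__":
    if Path(BATT_SOC_PATH).exists():
        print("State of charge:", read_soc(), "%")
    else:
        print("batt_soc node not found; run this on the device itself")
```

    Take one reading at the start of a controlled test and one at the end, then feed both into drain_per_hour along with the elapsed time.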

    Oh really? First of all, go put that thing in damn airplane mode so I don't have to listen about your damn radio power rhetoric anymore. Turn on Wifi to reproduce the bug, again, it's a different chip, the Broadcom one, vs the Infineon baseband. It's as far away as it gets from radio usage, don't you agree?
    Did that last night. I'll attach the results as a .zip because individual images is a pain in the ass - nothing unusual, no major unusual wakelocking, although the battery monitor wakelock is a bit higher than I'd like, that's because AT&T apparently wants it polled at 10 seconds instead of 40. AT&T is kind of notorious for doing things that break battery performance...

    Well if it's a reporting bug, under the conditions above: then it's damn ****ing big one as:
    1) the kernel is reporting false CPU real time % use on suspend and events/0
    2) the kernel is reporting false frequency state statistics
    3) why is my CPU out of deep sleep for amounts of time way larger than the sum of time of all wakelocks for that period?
    More profanity - and more evidence that you don't have any proper debugging methodology. You claim it's a huge kernel bug, without a single direct kernel artifact. In the entirety of this thread, you have not posted a single dmesg, nor have you posted a single /proc/wakelocks dump. All you've done is blindly trust screenshots of your point-click-drool apps. More on this soon.

    Look, I'm not trying to play the messiah here or pretend to know it all, because I don't. All that I'm doing is collecting evidence on the issue, and the evidence points out to that there's something wrong. If your explanation that battery drain can increase 10 fold for the same circumstances (and yes, the battery dies, it's not just playing around and acting dead), on the same desk, under the same signal reception conditions, without even data connectivity for that matter as it's idling on GSM when under Wifi, and that's all how it should be, then you need to get a clue.

    What's the matter with this bullshit attitude? I presented my findings and argued them and all you're doing is saying I'm talking out of my ass while you're not even backing up the things you're saying as seen above.
    Because the people in this world that piss me off the most are those that try to pass themselves off as experts, even though their data collection techniques are piss-poor and they clearly know less than they're representing themselves as knowing.

    In your case, your findings and debugging have collected data from:
    1) An app (Battery Monitor Widget) that is clearly marked by its author as providing inaccurate data on our device
    2) An app (BetterBatteryStats) that doesn't show all wakelocks in the system - yet you claim there's a kernel bug because you trusted the app instead of looking at /proc/wakelocks. See the ODF spreadsheet in bln_wakelocks.zip, which is derived from a /proc/wakelocks dump of the one case where I've had severe battery drain caused by a kernel bug (as we said in the Infuse community - ****ing BLN...) but calculates human-readable values for the total time and sleep (screen-off) wakelock time. In this particular controlled test (enable BLN, enable the "test notification", turn the screen off for a good long time), BLN held a wakelock for 60 minutes. Yet BetterBatteryStats didn't show a thing.
    3) A system app (Settings->About Phone->Battery Usage) that is known to have bugs (such as displaying 100% time with no signal when that clearly isn't the case) - so why do you trust it?

    Your data collection has not included, in the entirety of this thread:
    1) A single dmesg dump
    2) A single dump of /proc/wakelocks

    Yet you claim definitively there's a kernel bug, despite not having a shred of evidence that is directly derived from the kernel?

    So now someone from Samsung may be watching this thread, which poorly represents XDA as a bunch of idiots who do "boy who cries wolf" with no proper data collection methodology. It'll make it harder to have them pay attention when there are real problems with properly collected data to back it up.

    p.s., if SamsungJohn is still following this thread:
    1) MAX17042 datasheet please? We had it for the MAX17040 functionality inside the MAX8998 in the GalaxyS/Infuse, and it looks like it was available at one point for the 17042 within the MAX8997 of the I9100/I777, but it doesn't seem to be there any more.
    2) I know this is far less likely, but MAX8997/8998 datasheets please?
    3) Even less likely, Exynos 4210 datasheets, or at least better documentation of clock divider registers and clocking architecture. We've got leaks of the S5PC110 which seem to translate well to the S5PV210 in those regards, but it's clear there's a lot of new clocking stuff in the Exynos 4210 - I realize why you wouldn't want to support/condone overclocking, but people are doing it, and since they are, why not make sure they're doing it armed with proper information on what is what?
    4) Why does the Infuse count CPU/screen usage against battery charge current limits? This leads to battery drain when running navigation on a charger at full brightness (confirmed behavior), and also potentially may lead to improper battery charge termination, potentially damaging the battery (not confirmed, there may be charge cutoff safeguards beyond the check_chg_current loop in the kernel). PM me for more details and my test methodology if you want, it's a bit offtopic for this thread. This behavior also seems to be present in the I9100 and the I777, although at least with the I777 there's more evidence of safeguards beyond the cutoff loop in the kernel. I don't have enough data on these devices to call "Problem/bug!" though. Also I'm in the process of trying to get data from I9000s and Captivates, the kernel code and what I know of the MAX8998 from the kernel code say they should have the same problem, but user anecdotes indicate something seems different and at least the Cappy charges much faster stock, which is odd since its current limits are the same and battery is only slightly smaller.
    7
    It's a myth. AndreiLux keeps harping on CPU power usage when he forgets that other components in the system use power.

    Statistics from the Infuse, which has a similar sized battery:
    In airplane mode, with the cellular radio completely shut off - almost no drain at all. Maybe 5% per day.
    In normal operation, with moderate signal - around 0.8-1% per hour drain
    In normal operation, with extremely weak signal - around 1.5-2% an hour
    Deep sleep percentages and clocks used change minimally - 93-95% deep sleep in both normal operation cases. Yet despite CPU power-on times changing insignificantly, battery drain is at least 50% higher in the weak-signal case. Proof that CPU usage in this regime is NOT the dominant consumer of power, but the radio is. Android OS usage in these cases is low - single digits. Cell Standby is the #1 user of power, as it should be. 60-70% in moderate signal, I've seen it as high as 80-85% at my desk at work where signal is extremely weak.
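    For anyone who wants to check the "at least 50% higher" figure, here is the arithmetic as a small sketch. The only inputs are the drain ranges quoted above; the function name is just for illustration.

```python
def relative_increase(baseline, observed):
    """Percentage by which observed exceeds baseline."""
    return (observed - baseline) / baseline * 100.0


# Drain ranges quoted above, in percent of battery per hour.
MODERATE_SIGNAL = (0.8, 1.0)
WEAK_SIGNAL = (1.5, 2.0)

# Most conservative comparison: worst moderate-signal case vs best weak-signal case.
smallest_gap = relative_increase(MODERATE_SIGNAL[1], WEAK_SIGNAL[0])

# Most extreme comparison: best moderate-signal case vs worst weak-signal case.
largest_gap = relative_increase(MODERATE_SIGNAL[0], WEAK_SIGNAL[1])

if __name__ == "__main__":
    print(f"Weak signal drains {smallest_gap:.0f}% to {largest_gap:.0f}% more per hour")
```

    Even the most conservative pairing of those ranges gives a 50% increase, while deep sleep time barely moves.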

    The SGH-I777 achieves similar battery life to the Infuse - less than 1% per hour in moderate signal, a bit over 1.5% per hour when in extreme weak signal. Again - signal strength results in a significant delta in power consumption, which is expected as the power amplifier must crank up the transmit power to handle the near/far problem at the tower. (If all handsets transmitted at the same power level in a CDMA-based system such as UMTS, near handsets would "drown out" the far handsets. As a result handsets are kept in a strict power control loop to ensure that the receive signal strength of all handsets is approximately the same at the tower. The higher the RF path loss, the higher the required transmit power to maintain the power control loop.)
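    The power control loop described above can be illustrated with a toy model. This is only a sketch: the target receive level and the transmit limits are illustrative assumptions, not values taken from any network or handset, and it ignores power amplifier efficiency, so it shows radiated power rather than battery drain.

```python
def required_tx_power_dbm(path_loss_db, target_rx_dbm=-100.0,
                          max_tx_dbm=24.0, min_tx_dbm=-50.0):
    """Transmit power needed so the tower receives target_rx_dbm.

    All default values are illustrative assumptions. The result is clamped
    to the handset's assumed transmit range.
    """
    wanted = target_rx_dbm + path_loss_db
    return max(min_tx_dbm, min(max_tx_dbm, wanted))


def dbm_to_milliwatts(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10.0)


if __name__ == "__main__":
    for loss in (100.0, 110.0, 120.0, 130.0):
        tx = required_tx_power_dbm(loss)
        print(f"path loss {loss:.0f} dB -> {tx:.0f} dBm "
              f"({dbm_to_milliwatts(tx):.1f} mW)")
```

    The point of the model is the scaling: every extra 10 dB of path loss means ten times the radiated power, until the handset hits its limit.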
    It is not signal related. The international version has been out for 4+ months now, I've had mine for 3 months and think I know enough of the behavior of the device and how it acts than you who has had for not even 2 weeks on AT&T. I already responded to you in the AT&T thread so I won't bother repeating myself here once again.
    So this is clear evidence you don't really know what you're talking about. The CPU architecture of our phone is very similar to that of its predecessor - no major power consumption miracles here, other than minimum frequency with most kernels being twice that of its predecessors. I'm having trouble finding information on the process feature size of each CPU, but I'm fairly certain Samsung didn't do a process shrink between the S5PV210 and the Exynos 4210 - evidence contributing to this is the fact that for each clock frequency supported by the two processors, core voltage is the same. If there were a process shrink, core voltages for the 4210 would be lower. So the only real difference is we've got a second core to potentially eat power - without a doubt it's not going to be a major improvement from its predecessor in terms of CPU power.
    The CPU architecture is similar? What the **** are you on? The S5PV210/S5PC210 are last generation architectures based on Cortex A8 designs while the 4210 are A9's, and googling even for 10 seconds will give you quotes like:
    Cortex-A9 also boasts numerous improvements in efficiency, claiming power consumption numbers nearly half that of the Cortex-A8, as well as the ability to use multiple-core technology to scale processing power in accordance with energy limitations.
    They are as similar as saying a Core2 is similar to a Pentium 4 because they have the same instruction set.

    Yes, they are both 45nm processes, but with a 2 year gap in between them, the 4210 being the last one before shrinking to 32nm so 45nm is as mature right now as it will ever get, and if you think there's no yield and process improvements between mask shrinks then you are just plain ignorant.
    Go read on the tear-down article of the S2 and you'll see how it has almost exclusively new, never used, high-end components for almost everything.

    More evidence you're talking out of your ***. So, our fuel gauge driver has no way to measure current drain (actually, it is buried in the fuel gauge chip but not enabled in our kernels - check max17042-battery.c in newer Linux kernels for some neat stuff), but battery monitor widget somehow pulls current drain statistics out of its ass? WORTHLESS.
    Yes, it pulls current drain statistics out of its ass. Go read on it for your ****ing self. And yes, I've seen that there's a fuel gauge in the source code and hardware, just as you've said, but it's not used, so it might just as well not be there.

    In short - if in a situation of weak to moderate signal, with drain on the order of 1% an hour, if Android OS is getting reported as higher than Cell Standby, it is a reporting bug, plain and simple.
    Oh really? First of all, go put that thing in damn airplane mode so I don't have to listen about your damn radio power rhetoric anymore. Turn on Wifi to reproduce the bug, again, it's a different chip, the Broadcom one, vs the Infineon baseband. It's as far away as it gets from radio usage, don't you agree?

    Well if it's a reporting bug, under the conditions above: then it's damn ****ing big one as:
    1) the kernel is reporting false CPU real time % use on suspend and events/0
    2) the kernel is reporting false frequency state statistics
    3) why is my CPU out of deep sleep for amounts of time way larger than the sum of time of all wakelocks for that period?

    Look, I'm not trying to play the messiah here or pretend to know it all, because I don't. All that I'm doing is collecting evidence on the issue, and the evidence points out to that there's something wrong. If your explanation that battery drain can increase 10 fold for the same circumstances (and yes, the battery dies, it's not just playing around and acting dead), on the same desk, under the same signal reception conditions, without even data connectivity for that matter as it's idling on GSM when under Wifi, and that's all how it should be, then you need to get a clue.

    What's the matter with this bullshit attitude? I presented my findings and argued them and all you're doing is saying I'm talking out of my ass while you're not even backing up the things you're saying as seen above.

    ________________________________

    Edit:

    Just looked again through my screencaps.

    Left side: not bugged, how it should be; Right side: bugged, how it behaves sometimes in severe cases.
    Nt4tg.png
    E25ph.png
    Same network, same location, same signal strength, same ROM (changing my font maybe fixed it...), same wakelocks between them, same background apps running. Not even 5 minutes of screen time on either. The only difference I'm experiencing with .14 patched kernels is that it no longer shows up in the stats, but it drains just as much as in the right screen, showing the same CPU state behavior. It's also happening less often, and many say it's not happening at all anymore, but how would they know if it's hidden now?
    6
    Great thread, once I'm done fixing my car and my house I can dive into the kernel and take a look, as well as ***** at Samsung.