I'm writing this because of the abundant misinformation on this topic spread across many threads; the other Android OS threads have grown too big, and real information is easy to overlook in them.
After several weeks of analyzing and monitoring the behavior of the device, things are starting to clear up and we can come to a conclusion.
What's the Android OS usage?
Many have noticed that on our Galaxy S2s and their variants (Epic 4G Touch, etc.), the Android OS entry in the Android battery stats is supposedly eating away a large percentage of the battery used.
There have been many threads, old and new, discussing it and theorizing about it, with many users magically fixing it by making god knows what changes.
The Android OS package is a bundle of processes, in this case, all root processes used by the system kernel. Please remember this when continuing to read on, as there is much more to it than meets the eye and only one specific aspect of the whole story will be treated.
A diagnosis of the main problem:
Download BetterBatteryStats (see the thread "[APP] BetterBatteryStats adds battery history back to Gingerbread"). It is your #1 friend for finding out what uses up CPU power and causes problems.
Note: The CPU time in BetterBatteryStats is normalized, meaning 100% load at full frequency for a minute will show up as 1 minute of CPU time, while 100% load at half frequency for a minute will show up as 30 seconds of CPU time. Keep this in mind when monitoring processes. The CPU time shown in the Android battery stats is useless.
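To make the normalization in the note concrete, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the 600MHz figure is only a stand-in for "half frequency" on a 1200MHz part; this is not how BetterBatteryStats is actually implemented, just the scaling the note describes.

```python
def normalized_cpu_seconds(wall_seconds, load_fraction, freq_mhz, max_freq_mhz):
    """Scale busy time by the ratio of the current to the maximum frequency."""
    return wall_seconds * load_fraction * (freq_mhz / max_freq_mhz)


# One minute at 100% load, at full and at half frequency.
full = normalized_cpu_seconds(60, 1.0, 1200, 1200)
half = normalized_cpu_seconds(60, 1.0, 600, 1200)
print(full, half)
```

So a process that looks cheap in normalized CPU time may still have kept the CPU awake for much longer in wall-clock terms.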
As you can see in the screenshot, "suspend" and "events/0" pop up as the big consumers of CPU power. Red means it's a system/kernel process and blue means it's a user process.
The suspend process is the program that's in charge of telling the hardware to go into deep sleep mode and reawaken from it.
events/0 is just the general events manager for the system kernel.
You can monitor this with Watchdog Task Manager. Set it up so that it monitors system processes and raises notifications. Depending on the severity of the bug at the time and some black magic, you'll see the 99% load on the system.
Due to how the hardware of the phone is designed, the default kernel settings lock the CPU frequency to 800MHz before the phone falls into deep sleep. This is believed to be one of the main reasons why so much battery is drained. Compared to other devices, the Exynos 4210 in our Galaxy S2s takes a much longer time to enter deep sleep (suspend) and to wake up again from it (resume). Because these operations run at the higher frequency, and a single suspend-wake cycle takes anywhere from 0.6 to 0.8 seconds while doing nothing else, they drain more power than actually needed. The timings can be seen in the device message logs:
<6>[ 749.242666] PM: suspend of devices complete after 171.413 msecs
<6>[ 749.250855] PM: late suspend of devices complete after 0.396 msecs
<4>[ 749.250864] Disabling non-boot CPUs ...
<7>[ 749.250909] s3c_pm_enter(3)
<7>[ 749.250916] s3c_sleep_save_phys=0x40d9fe80
<7>[ 749.250959] sleep: irq wakeup masks: 000fffdd,cb37b97f
<6>[ 0.000000] WAKEUP_STAT: 0x80000001
<6>[ 0.000000] WAKUP_INT0_PEND: 0x80
<6>[ 0.000000] WAKUP_INT1_PEND: 0x0
<6>[ 0.000000] WAKUP_INT2_PEND: 0x0
<6>[ 0.000000] WAKUP_INT3_PEND: 0x0
<7>[ 0.000000] s3c_pm_enter: post sleep, preparing to return
<7>[ 0.000000] S3C PM Resume (post-restore)
<7>[ 0.000000] Wakeup from sleep, 0x80000001
<6>[ 0.000000] L310 cache controller enabled
<6>[ 0.000000] l2x0: 16 ways, CACHE_ID 0x4100c4c5, AUX_CTRL 0x7e470001, Cache Size: 1048576 B
<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
<6>[ 749.748189] PM: resume of devices complete after 496.599 msecs
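If you want to add up the time spent in one suspend-wake cycle yourself, a small script along these lines can do it. It is only a sketch: the four "PM:" lines are copied from the log above, while the regular expression and function name are my own and may need adjusting for other kernels' message formats.

```python
import re

LOG = """\
<6>[ 749.242666] PM: suspend of devices complete after 171.413 msecs
<6>[ 749.250855] PM: late suspend of devices complete after 0.396 msecs
<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
<6>[ 749.748189] PM: resume of devices complete after 496.599 msecs
"""

# Captures the phase name ("suspend", "late suspend", ...) and its duration.
PM_LINE = re.compile(r"PM: (.+?) of devices complete after ([\d.]+) msecs")


def pm_phase_durations(log_text):
    """Return a list of (phase, milliseconds) for each PM timing line."""
    return [(m.group(1), float(m.group(2))) for m in PM_LINE.finditer(log_text)]


phases = pm_phase_durations(LOG)
total_ms = sum(ms for _, ms in phases)
for name, ms in phases:
    print(f"{name:>13}: {ms:8.3f} ms")
print(f"{'total':>13}: {total_ms:8.3f} ms")
```

For the cycle logged above the four phases add up to roughly 669 ms, almost all of it in the device suspend and resume steps.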
The screenshot on the right covers a period of 15 minutes during which the processes were bugged; the timers in CPU Spy were reset and the screen was immediately turned off to collect the statistics. A screen-off profile was used to limit the maximum frequency to 200MHz, so theoretically no standard process is allowed to use any of the frequency states above that. Nevertheless, you can see the device being in the 800MHz state for roughly half the time. This means that half the time the device spends awake is wasted on just entering and leaving the deep sleep state.
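The same kind of breakdown can be computed by hand from the kernel's cpufreq statistics. The sketch below assumes the usual time_in_state layout (one "frequency-in-kHz ticks" pair per line, ticks in units of 10 ms); the sample numbers are invented to resemble the situation described and are not taken from the screenshot.

```python
# Hypothetical contents of
# /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
SAMPLE = """\
1200000 0
1000000 0
800000 4400
500000 300
200000 4300
"""


def residency_shares(text):
    """Map each frequency in MHz to its share of the total awake time."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    ticks = {int(khz) // 1000: int(count) for khz, count in rows}
    total = sum(ticks.values())
    if total == 0:
        return {mhz: 0.0 for mhz in ticks}
    return {mhz: count / total for mhz, count in ticks.items()}


shares = residency_shares(SAMPLE)
for mhz, share in sorted(shares.items(), reverse=True):
    print(f"{mhz:>5} MHz: {share:6.1%}")
```

Deep sleep does not appear in this file, so the shares are fractions of awake time only, which is the figure that matters here.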
Note: The deep sleep state is a state in which the CPU isn't clocked anymore and caches are no longer kept coherent, basically it's turning off most of the CPU, and thus conserving a lot of power.
What's the cause of high Android OS usage?
This is where things get interesting.
The most common cause of the device waking up often enough to cause a problem and drive up the suspend process, and thus also the AOS usage, is incoming and outgoing network traffic. This usage gets attributed to the operating system rather than to the service or application causing the problem.
The second most common cause is high-frequency wakelocks caused by some application or service. Refer to the next section for this topic.
Incorrect statistical interpretation
One has to keep in mind that the percentages shown under the battery stats in Android are just estimates. The ROM has a file called power_profile.xml in which power drain values for different uses are listed. These values come from Samsung. For example, the CPU running at 1200MHz is listed as drawing 577mA, so a task running for 10 seconds at 1200MHz gets attributed a consumption of roughly 1603µAh (577mA × 10s / 3600s per hour ≈ 1.603mAh). AOSP ROMs like MIUI will actually list these values in the battery stats. Each entry in the battery usage list has such an estimated consumption value, and the percentages are calculated from those.
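The attribution arithmetic from the example can be written out as follows. This is only a sketch of the calculation, treating the 577 figure as a current draw in mA; the function name is made up, and the real battery stats code does more bookkeeping than this.

```python
def estimated_drain_uah(current_ma, seconds):
    """Charge in microamp-hours drawn by current_ma milliamps over seconds."""
    hours = seconds / 3600
    return current_ma * hours * 1000  # convert mAh to µAh


# 10 seconds at the 1200MHz figure from power_profile.xml.
drain = estimated_drain_uah(577, 10)
print(f"{drain:.0f} µAh")
```

The percentage shown for an entry is then just its estimated value divided by the sum of all entries' estimates, so any error in the per-state figures feeds straight into the percentages.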
The problem is that these are all just estimates, and whether they are calculated correctly is up for debate. It is unclear whether dual-core use is taken into account in these estimates, and whether the values provided by power_profile.xml are even representative of real use. One issue is certain: the power consumption will be wrong for people who underclock, overclock, or undervolt. There are no entries for states above 1200MHz or below 200MHz, and the ones present are meant to represent default/stock voltages. If you're running 100MHz in idle, the system will use the 200MHz estimates and thus overestimate your power consumption; if you're overclocking to 1400MHz, for example, the system will underestimate your power consumption.
This part is also why kernels which claim to fix the problem are nothing but red herrings: somewhere in the patching of the kernels to the .14 patch level there is a change which makes the whole suspend and wakeup process no longer visible to the system, so it is no longer registered. Because of that, there are no statistics for it and the drain cannot be calculated. Tests have shown that this merely hides the problem rather than being a tangible fix.
What you can do:
First of all you need to find the cause of the problem: study your wakelocks. You can do this by using BBS as mentioned above, but this will only show part of the story, as it tells you only about user wakelocks and not about system wakelocks. If there is no obvious villain listed there, then you must do some more advanced troubleshooting:
Dump your wakelocks file using a terminal or over ADB if you are familiar with it, as follows:
cat /proc/wakelocks > /sdcard/wakelocks.csv
To read this data you need to import it into a spreadsheet application like Excel or OpenOffice and you'll end up getting something like this: (Read the thread around here on how to format your spreadsheet)
You need to mainly look for high-duration wakelocks, and the wake_count frequency, meaning how often your device has woken up.
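If you'd rather not use a spreadsheet, the dump can also be sorted with a short script. The sketch below assumes the file is tab-separated with a header row like the one shown, with quoted names and times in nanoseconds; the lock names and numbers are invented sample data, so check the assumptions against your own dump.

```python
import csv
import io

HEADER = ["name", "count", "expire_count", "wake_count", "active_since",
          "total_time", "sleep_time", "max_time", "last_change"]
# Hypothetical rows standing in for a real /proc/wakelocks dump.
ROWS = [
    ['"hypothetical_net_lock"', "310", "0", "290", "0",
     "95000000000", "90000000000", "2000000000", "700000000000"],
    ['"hypothetical_alarm"', "40", "2", "12", "0",
     "3000000000", "2500000000", "400000000", "650000000000"],
    ['"hypothetical_sensor"', "900", "0", "5", "0",
     "120000000000", "1000000000", "900000000", "690000000000"],
]
SAMPLE = "\n".join("\t".join(row) for row in [HEADER] + ROWS) + "\n"


def load_wakelocks(text):
    """Parse the dump into dicts with the name, wake count and total seconds."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {
            "name": row["name"],  # the csv module strips the quotes
            "wake_count": int(row["wake_count"]),
            "total_s": int(row["total_time"]) / 1e9,  # assumed nanoseconds
        }
        for row in reader
    ]


locks = load_wakelocks(SAMPLE)
by_wakes = sorted(locks, key=lambda lock: lock["wake_count"], reverse=True)
by_time = sorted(locks, key=lambda lock: lock["total_s"], reverse=True)
for lock in by_wakes:
    print(f"{lock['name']:<24} wakes={lock['wake_count']:<5} total={lock['total_s']:.1f}s")
```

To use it on a real dump, read /sdcard/wakelocks.csv into a string and pass that to load_wakelocks instead of SAMPLE.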
While this is all a bit advanced, you can keep it simpler:
Avoid using applications and services that keep a permanent internet connection open. Avoid applications which use polling for their connectivity and instead use ones that support C2DM push notifications. This will vastly increase your battery life by vastly decreasing network traffic. If none of this helps, try doing a network analysis (packet capture) to find the source of the traffic; please refer to the discussions later in the thread for this.
What your device can be capable of:
In the first screenshot you can see two periods of around 18 hours where the battery drain is relatively flat. In these periods the suspend process gained merely a few seconds of CPU time. Then I use the phone for a bit, it goes haywire, calms down again, and then starts draining again. You can see this in the step-wise drain in the curve. It should be smooth and not like that.
The second screenshot was taken after a full charge, where the bug didn't trigger for 8 hours. My battery is perfectly calibrated, so those are a real 4% of battery. So if things were to go smoothly on the software side, the phone would idle at around 0.4% of battery per hour on Wifi, and other people have been reporting down to 0.8% or even less with a good 3G signal.
Left side: low network usage, how it should be; Right side: high network usage, how it behaves sometimes in severe cases.
Same network, same location, same signal strength, same ROM (Changing my font maybe fixed it....), same wakelocks between them, same background apps running. Not even 5 minutes of screen time on either of the two. The only difference I'm experiencing with .14 patched kernels is that it no longer shows up in the stats, but it drains just as much as in the right screenshot, showing the same CPU state behavior. It's also happening less often, and many say it's not happening at all anymore, but how would they know if it's hidden now?
"tvout resume wo"
One might have noticed the high usage of "tvout resume wo" (wo = work, btw) in BetterBatteryStats. This is part of the TVOut functionality/driver, and it seems that every time the CPU resumes it goes through this work whether you use TVOut or not. Because it is a kernel process, it also adds to the Android OS package usage. If you are suffering from high-frequency wakeups, then this will go up as well, since the system routines for this functionality are called every time the device wakes up.
<6>[ 749.251252] PM: early resume of devices complete after 0.255 msecs
<3>[ 749.251350] [MHL]mhl_int_irq_handler() is called
The device is still superb. Even with this bug the battery life is equivalent to that of many other high-end devices out there. The thing is, it could be much better. If you want to squeeze the most out of your device, read on through the discussions in this thread to try to find out what's making the phone waste power. Forward this to people so they're informed and aware of it.