[PATCH] [Kernel] The spontaneous reboot (solution?) thread

toadlife

Inactive Recognized Developer
Aug 19, 2008
1,208
1,012
Lemoore, CA
I've been getting occasional random reboots on EI22, and I've seen reports of the same in the general forum.

There seem to be two main causes of reboots:


WIFI Related (bcm4329 DHD Bus Module TX Queue overflow)
The reboots always happen when WIFI is on. The system will freeze up hard for about 10-15 seconds and then just reset. The LAST_KMSG file shows 60KB of nothing but the following:

Code:
[ 5315.816399] dhd_bus_txdata: out of bus->txq !!!
[ 5315.820979] dhd_bus_txdata: out of bus->txq !!!
[ 5315.825411] dhd_bus_txdata: out of bus->txq !!!
[ 5315.829922] dhd_bus_txdata: out of bus->txq !!!
[ 5315.834499] dhd_bus_txdata: out of bus->txq !!!
[ 5315.838929] dhd_bus_txdata: out of bus->txq !!!
[ 5315.843505] dhd_bus_txdata: out of bus->txq !!!


OneDRAM driver related reboots

After patching the WIFI related bugs I started to get these more often. Here is an example of the log leading up to the reboot.

Code:
[   13.319028] [OneDRAM](dpram_write) Failed to get a Semaphore. sem:0, PHONE_ACTIVE:HIGH, fail_cnt:1
[   13.339579] [OneDram] dpram_drop_data, head: 319, tail: 319
[   13.351715] [OneDRAM](dpram_write) Failed to get a Semaphore. sem:0, PHONE_ACTIVE:HIGH, fail_cnt:1
[   13.366965] [OneDram] dpram_drop_data, head: 319, tail: 319
[   13.379178] [OneDRAM](dpram_write) Failed to get a Semaphore. sem:0, PHONE_ACTIVE:HIGH, fail_cnt:1

Another log sample from a reboot, right before it restarts...
Code:
[   11.585629]
[   11.585665] dpram_shutdown !!!!!!!!!!!!!!!!!!!!!
[   11.590957]
[   11.590986] dpram_shutdown ret : 0
[   11.598695] Restarting system.
[   11.600629] KERNEL:magic_number=0 CLEAR_UPLOAD_MAGIC_NUMBER
[   11.608974] arch_reset: attempting watchdog reset
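
The two kinds of reboot leave different signatures in the log, so a saved log can be sorted into one bucket or the other automatically. The following is a minimal sketch, not part of the kernel patch: the patterns are copied from the log excerpts above, while the category names and function names are made up for illustration.

```python
import re
from collections import Counter

# Patterns copied from the log excerpts in this post.
# The category names on the left are hypothetical labels, not kernel terms.
SIGNATURES = {
    "wifi_txq_overflow": re.compile(r"dhd_bus_txdata: out of bus->txq"),
    "onedram_semaphore": re.compile(
        r"\[OneDRAM\]\(dpram_write\) Failed to get a Semaphore"
    ),
    "onedram_shutdown": re.compile(r"dpram_shutdown"),
}


def classify_kmsg(lines):
    """Count how many log lines match each known reboot signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def likely_cause(lines):
    """Return the most frequent signature name, or 'unknown' if none match."""
    counts = classify_kmsg(lines)
    if not counts:
        return "unknown"
    return counts.most_common(1)[0][0]
```

Feeding it the lines of a last_kmsg dump gives a rough first guess only; a log can contain both signatures, so the raw counts from `classify_kmsg` are worth a look too.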

For the WIFI issue, I searched and found a set of patches on the linux driver project mailing list that address this issue.

See: http://driverdev.linuxdriverproject.org/pipermail/devel/2011-March/012948.html

After applying the patch verbatim I still got the WIFI related reboots, so I slowly pushed the TXQLEN values up until the reboots stopped. Here is the commit from my github that contains the patch that seems to have done the trick.
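
To illustrate why the queue length matters, here is a toy model of a fixed-length transmit queue. It is not the bcm4329 driver code: the class and function names are made up, and it assumes the "out of bus->txq" message is printed when a packet cannot be queued because the queue is already full.

```python
from collections import deque


class BoundedTxQueue:
    """Toy fixed-length transmit queue (hypothetical, not the real dhd code)."""

    def __init__(self, maxlen):
        self.maxlen = maxlen      # plays the role of TXQLEN in this model
        self.packets = deque()
        self.rejected = 0         # each rejection stands for one "out of txq" line

    def enqueue(self, packet):
        """Queue a packet, or count a rejection if the queue is full."""
        if len(self.packets) >= self.maxlen:
            self.rejected += 1
            return False
        self.packets.append(packet)
        return True

    def drain(self, n):
        """Send up to n queued packets and return how many were sent."""
        sent = min(n, len(self.packets))
        for _ in range(sent):
            self.packets.popleft()
        return sent


def simulate(maxlen, burst, drain_per_tick, ticks):
    """Each tick, enqueue `burst` packets and then drain `drain_per_tick`."""
    queue = BoundedTxQueue(maxlen)
    for _ in range(ticks):
        for i in range(burst):
            queue.enqueue(i)
        queue.drain(drain_per_tick)
    return queue.rejected
```

In this model a larger queue absorbs a short burst without rejections, but if packets keep arriving faster than they are sent, any fixed length is eventually exceeded. Whether the real driver behaves this way under load is an assumption, not something the logs prove.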

As for the OneDRAM related reboots, I think I may have found the solution to these too. The problem seems to have gone away when I switched to the specific ARM toolchain that is mentioned in the Samsung kernel source README. Previously I was using the 4.4.x toolchain from Google's repo. Thanks to Earthbound for working with me on this one. Aside from Earthbound (he uses the recommended CodeSourcery one), I don't know what toolchains others use, but I do know that I have not had one reboot in the four days since switching to the CodeSourcery toolchain.

Here is a kernel (source) with the WIFI fix that was compiled with the 2009-q3-68 toolchain. Aside from the WIFI fix, the kernel contains the following features:



  • root/su/busybox
  • init.d script support
  • bootanimation.zip support
  • overclock to 1.4 GHz/voltage control
  • keyboard delay fix
  • ext4/rfs support
  • CWM5/ROM Manager support (reboot bml8 recovery)

If you are having WIFI or dpram/onedram related reboots, you might want to give this kernel a try and report back.

NOTE: If you overclock or over/undervolt, please stop doing so before you test this kernel and post logs.
 

mjben

Senior Member
I've experienced a few random freezes and reboots lately, even when overclocked to previously stable frequencies, ever since I successfully ran with 1.5 GHz enabled for two minutes.

I gave this a go and overclocked to 1.4 GHz, and my phone went into a reboot loop: the ROM booted up, but would freeze and reboot shortly afterwards.

I'm stable again now after running the VC CWM zip to restore settings to 1 GHz.

Is there any useful feedback I could provide? Some sort of log or something?

Sent from my secret underground bunker
 

toadlife

I just got another reboot...same issue. :(

So much for that!!

This bug is definitely causing the reboots though.
 

toadlife

mjben said:
Is there any useful feedback I could provide? Some sort of log or something?

The system keeps logs in /data/system/dropbox

After a reboot, the file with a name like "SYSTEM_LAST_KMSG@xxxxxxxxxxxxxx.txt.gz" will contain a copy of the last system log buffer before your phone restarted.
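
Once the dropbox directory has been copied off the phone (for example with adb pull, which may need root for this path), a short script can pick out the newest log and show its last lines. This is a minimal sketch: the function names are made up, and it assumes the part after the @ in the file name sorts in time order.

```python
import gzip
from pathlib import Path


def newest_last_kmsg(dropbox_dir):
    """Return the newest SYSTEM_LAST_KMSG@*.txt.gz in dropbox_dir, or None.

    Assumption: sorting by file name puts the most recent log last.
    """
    candidates = sorted(Path(dropbox_dir).glob("SYSTEM_LAST_KMSG@*.txt.gz"))
    return candidates[-1] if candidates else None


def tail_of_log(path, n=20):
    """Decompress the log and return its last n lines without newlines."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in fh][-n:]


if __name__ == "__main__":
    # "dropbox" is a hypothetical local copy of /data/system/dropbox.
    log = newest_last_kmsg("dropbox")
    if log is None:
        print("no SYSTEM_LAST_KMSG file found")
    else:
        print("\n".join(tail_of_log(log)))
```

The last lines are usually the interesting ones, since they show what the kernel printed right before the restart.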


But like I said I just got another reboot. I ran a speedtest over wifi using the speedtest.net app just before my last two reboots so I might be onto a way of triggering the bug.
 

toadlife

I'm going to continue working on this. I'm experimenting with larger and smaller TXQLEN values.

Right now I'm running with a TXQLEN of 4096 instead of 2048. It may be that larger is not better and that I need to go smaller.

I'll keep my github updated.
 

darkierawr

Senior Member
Feb 2, 2011
1,520
895
Excellent work!

I'd jump all over this but I stopped getting them after IAP 1.0.5+.

If they start up again I will report back.
 

SenseiSimple

Senior Member
Jun 17, 2008
340
543
Austin, TX
the only time i have experienced this particular reboot concerning wifi, was having wifi + 4g enabled at home after a dalvik cache wipe (where the phone starts up, 4g starts first, then wifi simultaneously gets on LAN - both icons are showing, locks up - then reboot) once it restarts, it's all ok, wifi starts first, and doesn't happen again.

(running IAP samurai 1.0.4)
 

toadlife

darkierawr said:
Excellent work!

I'd jump all over this but I stopped getting them after IAP 1.0.5+.

If they start up again I will report back.

AFAIK, Earthbound is only mucking around with frequencies and voltages in his kernel. Of course, I'm basing this on the only source he has posted, which was from two or three versions of his kernel back. He hasn't posted the source for any of his new builds, so apparently he still views GPL compliance as optional.

The reboots don't happen that often. I've gone many days without a single reboot and then gotten them multiple times in one day.
 

toadlife

SenseiSimple said:
the only time i have experienced this particular reboot concerning wifi, was having wifi + 4g enabled at home after a dalvik cache wipe (where the phone starts up, 4g starts first, then wifi simultaneously gets on LAN - both icons are showing, locks up - then reboot) once it restarts, it's all ok, wifi starts first, and doesn't happen again.

(running IAP samurai 1.0.4)

No 4G service for me within 40 miles, so I never turn it on, but now that you mention it the lockups I've had have seemed to happen during that split second when both 3G data and Wifi icons show up in the top bar.
 

SenseiSimple

toadlife said:
No 4G service for me within 40 miles, so I never turn it on, but now that you mention it the lockups I've had have seemed to happen during that split second when both 3G data and Wifi icons show up in the top bar.

yeah maybe something about having two sources of internet... being assigned two addresses, no idea...

earthbound is stuck on the frequency tuning, but i think it's so stable because of the trunks he started with.
 

toadlife

SenseiSimple said:
yeah maybe something about having two sources of internet... being assigned two addresses, no idea...

earthbound is stuck on the frequency tuning, but i think it's so stable because of the trunks he started with.

What trunk did he start from? He's got no history in github.

I'm assuming nubecoder's, since his initial kernels had nubecoder's broken video recording. I took a quick look at nubecoder's tree and nothing network related is touched there.

Anyhow, the general and Q&A forums have two reports of reboots with Samurai 1.0.4, so that kernel is definitely not immune.
 

SenseiSimple

toadlife said:
What trunk did he start from? He's got no history in github.

I'm assuming nubecoder's, since his initial kernels had nubecoder's broken video recording. I took a quick look at nubecoder's tree and nothing network related is touched there.

Anyhow, the general and Q&A forums have two reports of reboots with Samurai 1.0.4, so that kernel is definitely not immune.

yeah it's not immune, as i said it's happened to me, but it's appeared to me he started picking and choosing from nubernel into rodderick's clean kernel, and as he's said in his OP for samurai, there are elements of Genocide and other forerunners.. i forgot which, he's since removed the part where he says where his mods come from... he did at one point put in a fix for wifi issues, i'm not sure if it's in relation to the same issue (i think it was a general connectivity thing).

i have a short list of todo's for a kernel and have made headway but i really have no desire to compete with the 4 (including yours) on here.
 

toadlife

SenseiSimple said:
i have a short list of todo's for a kernel and have made headway but i really have no desire to compete with the 4 (including yours) on here.

We are not competing. I just want my phone to be stable. Collaboration is what makes open source work. If you have ideas on how to make our kernels better, share them.
 

BeerChameleon

Senior Member
Aug 21, 2008
16,212
1,174
Tucson,Arizona.
@ toad

I've been using your ROM for over a week with the EI22 modem and have not had any reboots.

I do not overclock.

I use 4G when I drive in a certain area to stream Pandora faster, and it connects and reconnects to 4G.

I've connected WiFi and 3G at the same time for picture mail and charging to my Sprint account, and still no reboot.
 

D-Buckner

Member
Mar 7, 2011
36
1
Toad,

Running your Kernel and Clean GB I had WiFi reboots every 3 to 4 minutes. WiFi would drop and go to the router for DHCP to hand out the IP again. I figured I would give you a chance to work on it and run rooted stock for a few days before trying Clean GB again as I like it! ;) This same issue was even worse for me with ACS ICS v4.

Running rooted stock is a different story altogether! I don't get the random WiFi drops; instead I get Android reboots with the freezing that's mentioned. Things run great for a while before I get stuck in a loop of reboots. I noticed this last time that TouchWiz lost all the bar icons besides the applications drawer icon. The bar also doubled in size along with the icon. Very strange!

Anyway, I was able to drop the default home between reboots this last time. On the next reboot I launched ADW instead of TW. ADW instantly crashed and offered to email a bug report, so I did. In the stack trace I see that the Android package manager has died as the cause of the exception. I am not a Java programmer, so I am at a loss as to what's really happening.

Anyway, I hope this long story may be of some help in tracking this issue down. It looks like quite a few people are having it, and I myself don't want to go back to Froyo ;)

Cheers,
Dave

---------- Post added at 02:02 AM ---------- Previous post was at 01:53 AM ----------

Here is a copy of the stack trace I mentioned...

Stack :
=======
java.lang.RuntimeException: Package manager has died
at android.app.ContextImpl$ApplicationPackageManager.getApplicationInfo(ContextImpl.java:2062)
at android.app.ContextImpl$ApplicationPackageManager.getResourcesForApplication(ContextImpl.java:2554)
at org.adwfreak.launcher.LauncherModel.a(Unknown Source)
at org.adwfreak.launcher.LauncherModel.b(Unknown Source)
at org.adwfreak.launcher.LauncherModel.b(Unknown Source)
at org.adwfreak.launcher.LauncherModel.a(Unknown Source)
at org.adwfreak.launcher.ck.run(Unknown Source)
at java.lang.Thread.run(Thread.java:1019)
Caused by: android.os.DeadObjectException
at android.os.BinderProxy.transact(Native Method)
at android.content.pm.IPackageManager$Stub$Proxy.getApplicationInfo(IPackageManager.java:1280)
at android.app.ContextImpl$ApplicationPackageManager.getApplicationInfo(ContextImpl.java:2057)
... 7 more

Cause :
=======
java.lang.RuntimeException: Package manager has died
at android.app.ContextImpl$ApplicationPackageManager.getApplicationInfo(ContextImpl.java:2062)
at android.app.ContextImpl$ApplicationPackageManager.getResourcesForApplication(ContextImpl.java:2554)
at org.adwfreak.launcher.LauncherModel.a(Unknown Source)
at org.adwfreak.launcher.LauncherModel.b(Unknown Source)
at org.adwfreak.launcher.LauncherModel.b(Unknown Source)
at org.adwfreak.launcher.LauncherModel.a(Unknown Source)
at org.adwfreak.launcher.ck.run(Unknown Source)
at java.lang.Thread.run(Thread.java:1019)
Caused by: android.os.DeadObjectException
at android.os.BinderProxy.transact(Native Method)
at android.content.pm.IPackageManager$Stub$Proxy.getApplicationInfo(IPackageManager.java:1280)
at android.app.ContextImpl$ApplicationPackageManager.getApplicationInfo(ContextImpl.java:2057)
... 7 more
android.os.DeadObjectException
at android.os.BinderProxy.transact(Native Method)
at android.content.pm.IPackageManager$Stub$Proxy.getApplicationInfo(IPackageManager.java:1280)
at android.app.ContextImpl$ApplicationPackageManager.getApplicationInfo(ContextImpl.java:2057)
at android.app.ContextImpl$ApplicationPackageManager.getResourcesForApplication(ContextImpl.java:2554)
at org.adwfreak.launcher.LauncherModel.a(Unknown Source)
at org.adwfreak.launcher.LauncherModel.b(Unknown Source)
at org.adwfreak.launcher.LauncherModel.b(Unknown Source)
at org.adwfreak.launcher.LauncherModel.a(Unknown Source)
at org.adwfreak.launcher.ck.run(Unknown Source)
at java.lang.Thread.run(Thread.java:1019)
**** End of current Report ***
 

Top Liked Posts

    3
    Lovely, got my first spontaneous reboot on EI22. Haven't patched my kernel yet, will do so later - but I'm going to pull kmsg and lastkmsg and attach them here before I forget. Consider this a placeholder for now, I'll read through the logs a bit later to see if there's any new info.

    EDIT: last_kmsg added; I'll have to grab the current one after a (requested) reboot. Forgot to cat kmsg as the device booted. It appears that I may have run into a WiMAX bug - and also, I forgot to chmod /system/vendor/bin in my updater-script. D'oh.
    2
    by competing, i don't mean as in a contest, i mean that i wouldn't want to work on anything that someone else has already in the works, is all.

    i do think devs see their effort as a contest for superiority which hinders everyone in the process - there needs to be a bit more transparency involved, like - not to stoke an old fire but "IAP samurai's" latest source should be checked in/available by this point, which makes possible a comparison of upstream/stock/rodderick's vs nubernel vs samurai's to get at each of their strengths for a unified kernel since everyone focuses on their area of enjoyment (i.e. frequency tuning)

    i feel like some of the mods that were ported or kept along from EH17 and earlier may or may not bring with them old bugs or the mods are being made without consideration or attention to what other areas they will affect (adding/editing one feature damages another), since i presume the fully stock kernel doesn't have these issues? but that's the price of improvement, it's just that with more eyes and expertise allowed to approach an issue, the more well rounded it becomes. i know you know this, i was just clarifying my stance.

    as far as the reboot issue, it's so sporadic that i haven't yet caught it again since discussing... it only happened for the first time yesterday, but it occurred multiple times, which was a cause for alarm so i'm still looking out for it.



    It has been random and rare, a special combination of two radios (3g/wifi, 4g/wifi) coming on at once before either has an address at a certain time such as during media scanning during startup where everything is momentarily slow (before the launcher is responsive). Also on 4g when browsing internet with stock browser (with very low 4g signal, where i'm assuming it's going to drop/rescan), i haven't looked at logs but saw your post in other thread about this so i dunno.

    Come to think of it, could this be solvable with some sort of timeout or brief pause, considering that most mods aim to solve lag or remove delays, maybe some delay is required in the preamble to establish a connection.. or actually, in how the network chain is prioritized, as in 3g > 4g > wifi (if they are all enabled) because of the apparent conflict - perhaps a ConnectionManager/android.net issue (as noted when switching on Wifi while still connected to 4g, wifi connects > 4g disconnects > 4g pops up momentarily trying to connect and goes away again when wifi gets its IP) probably having nothing to do with the kernel...

    any thoughts?

    EDIT: This is all really only happening to me with enabling/disabling wimax... and it's getting nastier with every reboot (currently it just rebooted, then hung at the bootanim till i pulled the battery then it was fine)... if i leave 4g off, i have no problem

    IAP won't even listen to the community's suggestions anyways.
    i tried to get some sort of failsafe for noobs and all these crazy overclocks
    and was told it was just garbage.

    also I've only had a few reboots, and one was when i had first flashed an incremental update from CleanGB, which in a sense was in sync with what you guys describe about both 3g and wifi being on. but I also think maybe data syncs/activation of 3g/4g could be a part of this? cause i noticed on certain roms i couldn't data sync while using wifi, or that wifi would interrupt the data sync on 3g, which would cause the double signals and therefore sometimes cause a reboot.
    2
    to throw in my $0.02, using samurai 2.0.8 which includes your patches in my rom, aside from installation data issues relating to working with backups, reboots have been non-existent even dealing with the 4G and those reporting any reboots have been strictly related to data corruption or incompatibility

    -- in integrating and testing the lockscreen mod from scratch (not counting TSMParts - strictly lockscreens) in SleeperROM, i explicitly decided not to include it because it adds variable instability depending on too many things - not to mention an increased chance of data incompatibilities when restoring from backups, so while it's a neat way to customize the phone at will, it could potentially be detrimental so it's an unfortunate decision between customizability and stability.

    And overclocking/undervolting -- i should add, reboots/shutdowns have been heavily reported for even slightest undervoltages on some phones. I know it should be obvious, but i've personally experienced a random reboot and in investigating the problem, had forgotten that i was playing with the cpu clock/voltage settings.. and it wasted a couple of hours on a wild goose chase when the solution was to return the clock settings to default since my phone can't handle it.
    1
    @ toadlife...
    Mtd may actually solve these random reboots, because we won't be using samsung's onedram driver.

    sent from my always aosp epic