Help tracking down kernel panic...

12th March 2014, 04:36 PM   |  #1  
CWGSM3VO (OP)
Hi Everyone,

I was hoping a few people could help me out here. Since around Feb 28th, I've been getting random kernel panics on CM11 and on any ROM based on the unified moto_msm8960 tree.

To help me work out whether it's just me or a wider problem, all you need to do is install this app:

https://play.google.com/store/apps/d...mumu21.bootlog

It keeps track of your uptime. It needs root privileges in order to copy last_kmsg. Trying to view the logs from inside the app will force-close (FC) it, but that's fine: the logs are still copied, and you can view them from your file manager (by default they are saved to the internal SD card).

Once the app is installed, you don't need to do anything more than grant it root privileges (head into the app's preferences and enable all 4 check marks). If your phone reboots, there will be a record of it, and the app can capture the last_kmsg file, which can hopefully be passed on to the CM team if there's an issue.

The app is smart enough to distinguish between user-initiated and unexpected reboots; unexpected ones are listed as "crash" in the logs.
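
For anyone who would rather grab the log by hand, here is a minimal Python sketch that pulls it from a PC. It is not how the app works internally (that isn't documented here). It assumes adb is on the PC's PATH, USB debugging is enabled, the ROM ships a `su` binary, and the kernel exposes the previous boot's log at `/proc/last_kmsg`. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_pull_command(adb="adb"):
    """Build the adb command that prints the previous boot's kernel log.

    The remote part is passed as a single string so that the device shell,
    not adb, handles the quoting around the command given to su.
    Assumption: the log lives at /proc/last_kmsg and needs root to read.
    """
    return [adb, "shell", "su -c 'cat /proc/last_kmsg'"]


def save_last_kmsg(dest, adb="adb"):
    """Run the command and write whatever the device returns to dest."""
    result = subprocess.run(build_pull_command(adb), capture_output=True, check=True)
    Path(dest).write_bytes(result.stdout)
    return len(result.stdout)  # number of bytes saved
```

Calling `save_last_kmsg("last_kmsg.txt")` right after an unexpected reboot should leave a copy next to the script, provided the assumptions above hold on your build.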

Thanks to anyone who decides to help out!

EDIT:

For reference, I'm including links to my last_kmsg from the two times I caught it, on two separate builds (same error):

http://forum.xda-developers.com/show...postcount=1567
http://forum.xda-developers.com/show...postcount=1477

At the end of the day, for me it appears to be here:

Code:
[11746.588339,0] PM: Preparing system for mem sleep
[11746.588522,0] pm_debug: suspend uah=154646
[11746.591849,0] Freezing user space processes ... 
[11746.596732,1] msm_server_control: wait_event error -512 for command10
[11746.596855,1] msm_open send open server failed
[11746.624903,0] ov8820_power_down
[11746.670164,0] msm_open: destroy ion client(elapsed 0.08 seconds) done.
[11746.924399,0] Freezing remaining freezable tasks ... 
[11746.925681,0] active wake lock alarm_rtc, time left 73
[11746.926291,0] 
[11746.926536,0] Freezing of tasks  aborted
[11746.930625,0] 
[11746.930869,0] Restarting tasks ... 
[11746.949212,0] Unable to handle kernel NULL pointer dereference at virtual address 00000000

The log shows the system going into mem sleep plenty of times before without trouble; this time it fails. The panic also happens on AOSP builds (moto_msm8960), but there the error is not with msm_server_control but rather:

Code:
[ 5630.051304,1] PM: Preparing system for mem sleep
[ 5630.051548,1] pm_debug: suspend uah=106686
[ 5678.901571,0] [0:986:E :DAT] Set DXE Power state 0
[ 5680.681793,0] [0:986:E :DAT] Set DXE Power state 2
[ 5680.689881,0] [WLAN][987:E :TL ] WLAN TL:No station registered with TL at this point
[ 5690.081214,1] **** Suspend timeout 
[ 5690.081366,1] kworker/u:16    D c08131e0     0 22912      2 0x00000000
[ 5690.081671,1] [<c08131e0>] (__schedule+0x780/0x90c) from [<c081457c>] (__mutex_lock_slowpath+0x178/0x1e4)
[ 5690.081824,1] [<c081457c>] (__mutex_lock_slowpath+0x178/0x1e4) from [<c081463c>] (mutex_lock+0x54/0x6c)
[ 5690.082007,1] [<c081463c>] (mutex_lock+0x54/0x6c) from [<c0172480>] (cpu_hotplug_disable_before_freeze+0x8/0x20)
[ 5690.082190,1] [<c0172480>] (cpu_hotplug_disable_before_freeze+0x8/0x20) from [<c01724e0>] (cpu_hotplug_pm_callback+0x28/0x40)
[ 5690.082373,1] [<c01724e0>] (cpu_hotplug_pm_callback+0x28/0x40) from [<c08181c8>] (notifier_call_chain+0x38/0x68)
[ 5690.082526,1] [<c08181c8>] (notifier_call_chain+0x38/0x68) from [<c0193a1c>] (__blocking_notifier_call_chain+0x40/0x54)
[ 5690.082709,1] [<c0193a1c>] (__blocking_notifier_call_chain+0x40/0x54) from [<c0193a44>] (blocking_notifier_call_chain+0x14/0x18)
[ 5690.082892,1] [<c0193a44>] (blocking_notifier_call_chain+0x14/0x18) from [<c01a70d8>] (pm_notifier_call_chain+0x14/0x2c)
[ 5690.083075,1] [<c01a70d8>] (pm_notifier_call_chain+0x14/0x2c) from [<c01a8010>] (enter_state+0x68/0x13c)
[ 5690.083167,1] [<c01a8010>] (enter_state+0x68/0x13c) from [<c01a92f0>] (suspend+0x68/0x180)
[ 5690.083350,1] [<c01a92f0>] (suspend+0x68/0x180) from [<c0189aa8>] (process_one_work+0x228/0x434)
[ 5690.083533,1] [<c0189aa8>] (process_one_work+0x228/0x434) from [<c0189e88>] (worker_thread+0x1a8/0x2c8)
[ 5690.083686,1] [<c0189e88>] (worker_thread+0x1a8/0x2c8) from [<c018e2a8>] (kthread+0x80/0x8c)
[ 5690.083869,1] [<c018e2a8>] (kthread+0x80/0x8c) from [<c0106714>] (kernel_thread_exit+0x0/0x8)
[ 5690.083991,1] Unable to handle kernel NULL pointer dereference at virtual address 00000000
Curious to see if others are hitting the same thing.
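
If you want to check your own last_kmsg for the same pattern, a rough Python sketch along these lines may help. It only assumes the line format seen in the two logs above (`[seconds,cpu] message`). It looks for the NULL pointer line and reports how long after the most recent "Preparing system for mem sleep" line it occurred. The function and field names are illustrative, not part of any tool.

```python
import re
import sys

# Matches lines such as "[ 5690.083991,1] message" (timestamp, CPU, text).
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+),(\d+)\]\s?(.*)$")
SUSPEND_MARK = "PM: Preparing system for mem sleep"
OOPS_MARK = "Unable to handle kernel NULL pointer dereference"


def find_suspend_oopses(text):
    """Return one dict per NULL-pointer oops, with the suspend attempt before it."""
    results = []
    last_suspend = None  # timestamp of the most recent suspend attempt
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that do not look like kernel log entries
        ts, msg = float(m.group(1)), m.group(3)
        if SUSPEND_MARK in msg:
            last_suspend = ts
        elif OOPS_MARK in msg:
            gap = None if last_suspend is None else round(ts - last_suspend, 3)
            results.append({
                "oops_at": ts,
                "suspend_at": last_suspend,
                "seconds_after_suspend": gap,
            })
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_kmsg.py last_kmsg.txt
    with open(sys.argv[1], errors="replace") as fh:
        for hit in find_suspend_oopses(fh.read()):
            print(hit)
```

An oops reported a fraction of a second after a suspend attempt would look like the first log; one reported about a minute later, after a "Suspend timeout" line, would look like the second.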
Last edited by CWGSM3VO; 12th March 2014 at 08:20 PM.
12th March 2014, 07:13 PM   |  #2  
kernelmoron (Member)
Had a random reboot a few hours ago on CM11 (March 11th). Just installed the app, let's see what we can get now.
12th March 2014, 10:30 PM   |  #3  
kabaldan (Recognized Developer)
Quote:
Originally Posted by CWGSM3VO

Hi Everyone,
...

The issue captured in the second log has long been known to us.
It's caused by the msm-dcvs CPU governor, and despite a few kernel patches that have already been applied, it remains problematic.
So the current recommendation is: do not use the msm-dcvs CPU governor.
(For reference, see https://github.com/razrqcom-dev-team...mmon/issues/12 )
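
To check which governor is actually active, a minimal Python sketch like the one below reads the standard Linux cpufreq sysfs files. It assumes you have some way to run Python against the device's `/sys` tree; running `cat` on the same `scaling_governor` files in a root shell tells you the same thing. The function names are only illustrative.

```python
from pathlib import Path

RISKY_GOVERNOR = "msm-dcvs"  # the governor the recommendation above says to avoid


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (cpu0, cpu1, ...) to its current governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(sysfs_root).glob(pattern)):
        cpu = gov_file.parent.parent.name
        try:
            governors[cpu] = gov_file.read_text().strip()
        except OSError:
            continue  # the CPU may have gone offline between listing and reading
    return governors


def cpus_using(governors, name=RISKY_GOVERNOR):
    """Return the sorted names of the CPUs currently set to the given governor."""
    return sorted(cpu for cpu, gov in governors.items() if gov == name)


if __name__ == "__main__":
    current = read_governors()
    print("governors:", current)
    print("cores on", RISKY_GOVERNOR + ":", cpus_using(current))
```

Cores that are offline have no cpufreq directory to read, so an empty result for a core does not prove which governor it will use once it comes online.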

The issue captured by the first log is something new, and I experienced it myself today (for the first time).
It's obviously a race condition during suspend/resume of the camera kernel drivers.
It needs to be investigated. At this point I have no idea which commit brought this issue to the surface.
Last edited by kabaldan; 12th March 2014 at 10:41 PM.
12th March 2014, 10:49 PM   |  #4  
CWGSM3VO (OP)
Quote:
Originally Posted by kabaldan

The issue captured in the second log has long been known to us.
It's caused by the msm-dcvs CPU governor, and despite a few kernel patches that have already been applied, it remains problematic.
So the current recommendation is: do not use the msm-dcvs CPU governor.
(For reference, see https://github.com/razrqcom-dev-team...mmon/issues/12 )

The issue captured by the first log is something new, and I experienced it myself today (for the first time).
It's obviously a race condition during suspend/resume of the camera kernel drivers.
It needs to be investigated. At this point I have no idea which commit brought this issue to the surface.

Regarding the second capture, that makes sense. I knew Epinter's commits had been merged, but I'll stay away from that governor again. The explanation of the first one is much appreciated. I do know, however, that I've also experienced the panic while using interactive, in an effort to stay as "stock" as possible.

I'll switch back to interactive again (so far so good on 03-12) and go from there.

Thanks again for responding.

Tags
moto_msm8960 cm11 nightly unified