Hi,
Thanks @kabaldan for your work
The main issues remain the same:
- While the modem subsystem can occasionally run stably for tens of minutes, it can also be restarted by a watchdog bite within minutes or even seconds (i.e. too often).
- Occasional kernel panic (NULL pointer dereference) at process_one_work, coming immediately after hitting this warning: https://github.com/CyanogenMod/andr...t-common/blob/cm-11.0/kernel/workqueue.c#L550
I think these two issues are connected: the kernel panic happens when the kernel tries to notify other parts of the system about state changes and the corresponding list has been corrupted by the crashing modem.
Just to write down what I found out over the weekend:
- The crash mostly occurs when one of the CPU cores changes power state, usually 2 seconds after CPU1 is shut down.
- Quite often there are IPC messages to the modem in the time between the CPU shutdown and the modem crash.
- Sometimes the modem crashes without any event when the system has been idle for 20 s or more.
Sample log for points 1 and 2 (CPU1 doesn't have to come back up, and the RTC error usually appears only the first time the modem crashes after a system boot; IPC dst 1:xxx is the modem):
Code:
<6>[ 52.114512,0] CPU1: shutdown
<6>[ 53.010834,0] [HDR] [w rr_h] ver=1,type=data ,src_nid=00000001,src_port_id=0000001a,control_flag=0,size= 25,dst_pid=00000000,dst_cid=00000007
<6>[ 53.011231,0] alarm_set_rtc: Failed to set RTC, time will be lost on reboot
<6>[ 53.115153,1] CPU1 is up
<6>[ 53.365603,0] [RAW] ver=1 type=1 src=0:00000007 crx=0 siz=14 dst=1:0000001a
<6>[ 53.499374,1] [HDR] [w rr_h] ver=1,type=data ,src_nid=00000001,src_port_id=00000034,control_flag=0,size= 25,dst_pid=00000000,dst_cid=0000000c
<3>[ 54.555470,0] Watchdog bite received from modem software!
<3>[ 54.555562,0] modem subsystem failure reason: (unknown, smem_get_entry failed).
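To check how consistent the timing is across many crashes, the log can be scanned automatically. The following is only a rough sketch, assuming the log lines keep the `<level>[ timestamp,cpu] message` layout shown above; the function names are made up for illustration, and treating every `[HDR]` or `[RAW]` line as an IPC message is my assumption.

```python
import re

# Matches lines such as "<6>[ 52.114512,0] CPU1: shutdown":
# log level, timestamp in seconds, reporting CPU, then the message.
LINE_RE = re.compile(r"^<(\d+)>\[\s*(\d+\.\d+),(\d+)\]\s?(.*)$")

WATCHDOG_MARKER = "Watchdog bite received from modem"

SAMPLE_LOG = """\
<6>[ 52.114512,0] CPU1: shutdown
<6>[ 53.010834,0] [HDR] [w rr_h] ver=1,type=data ,src_nid=00000001,src_port_id=0000001a,control_flag=0,size= 25,dst_pid=00000000,dst_cid=00000007
<6>[ 53.011231,0] alarm_set_rtc: Failed to set RTC, time will be lost on reboot
<6>[ 53.115153,1] CPU1 is up
<6>[ 53.365603,0] [RAW] ver=1 type=1 src=0:00000007 crx=0 siz=14 dst=1:0000001a
<6>[ 53.499374,1] [HDR] [w rr_h] ver=1,type=data ,src_nid=00000001,src_port_id=00000034,control_flag=0,size= 25,dst_pid=00000000,dst_cid=0000000c
<3>[ 54.555470,0] Watchdog bite received from modem software!
<3>[ 54.555562,0] modem subsystem failure reason: (unknown, smem_get_entry failed).
"""


def parse_line(line):
    """Return (timestamp, cpu, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return float(m.group(2)), int(m.group(3)), m.group(4)


def crash_delays(lines):
    """For each watchdog bite, return (shutdown_ts, bite_ts, delay, ipc_count).

    shutdown_ts is the most recent "CPUn: shutdown" seen before the bite;
    ipc_count is the number of [HDR]/[RAW] lines logged in between.
    """
    results = []
    last_shutdown = None
    ipc_count = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip anything that is not a kernel log line
        ts, _cpu, msg = parsed
        if re.match(r"CPU\d+: shutdown", msg):
            last_shutdown = ts
            ipc_count = 0
        elif msg.startswith("[HDR]") or msg.startswith("[RAW]"):
            ipc_count += 1
        elif WATCHDOG_MARKER in msg and last_shutdown is not None:
            delay = round(ts - last_shutdown, 6)
            results.append((last_shutdown, ts, delay, ipc_count))
            last_shutdown = None
    return results


if __name__ == "__main__":
    for shutdown_ts, bite_ts, delay, ipc in crash_delays(SAMPLE_LOG.splitlines()):
        print(f"shutdown at {shutdown_ts}, bite at {bite_ts}, "
              f"delay {delay} s, IPC lines in between: {ipc}")
```

For the sample above, the watchdog bite follows the CPU1 shutdown by roughly 2.44 s. Crashes with no preceding CPU shutdown (point 3) are skipped by this sketch and would need separate handling.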
I have the feeling that the problem is not software-related at all. Instead, we may have a power problem with the modem: it crashes when the power levels fluctuate while it has work to do.
Comparing the asanti device tree to the decompiled ghost one reveals major differences (mostly because ghost is based on msm8960ab). For example, ghost has some PM regulators defined as always-on. I've attached the decompiled ghost device tree and the decompiled old asanti device tree (more or less useless, and it has some byte-order problems) to this post.
I've also tried basing asanti on msm8960ab (which also lacked the always-on parts). This didn't help, although the device did boot up correctly.
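To make the always-on comparison less tedious, the two decompiled trees can be compared with a small script. This is a minimal sketch, not a real DTS parser: it assumes dtc-style output with one node opening per line, it assumes the always-on flag appears as the standard `regulator-always-on` property, and the node names in the two sample strings are purely hypothetical placeholders (they are not taken from the attached files).

```python
import re

# A node opening such as "label: name@addr {" or "/ {"; the label is optional.
NODE_OPEN_RE = re.compile(r"^\s*(?:[\w-]+:\s*)?([\w,.+@/-]+)\s*\{")


def always_on_nodes(dts_text):
    """Return the set of node paths that carry a regulator-always-on property."""
    stack = []   # names of the nodes currently open
    found = set()
    for raw in dts_text.splitlines():
        line = raw.strip()
        m = NODE_OPEN_RE.match(line)
        if m:
            stack.append(m.group(1))
        elif line.startswith("}"):
            if stack:
                stack.pop()
        elif line.startswith("regulator-always-on") and stack:
            # The root node is "/", so collapse the doubled slash after joining.
            found.add("/".join(stack).replace("//", "/"))
    return found


# Hypothetical snippets standing in for the two decompiled device trees.
GHOST_SAMPLE = """
/ {
    regulators {
        example-l1 {
            regulator-name = "example_l1";
            regulator-always-on;
        };
        example-l2 {
            regulator-name = "example_l2";
        };
    };
};
"""

ASANTI_SAMPLE = """
/ {
    regulators {
        example-l1 {
            regulator-name = "example_l1";
        };
        example-l2 {
            regulator-name = "example_l2";
        };
    };
};
"""

if __name__ == "__main__":
    only_ghost = always_on_nodes(GHOST_SAMPLE) - always_on_nodes(ASANTI_SAMPLE)
    for path in sorted(only_ghost):
        print("always-on in ghost only:", path)
```

With the real files, the two sample strings would be replaced by the contents of the decompiled trees. Node paths may not line up exactly between the two boards, so the output is only a starting list to check by hand.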
Let's hope we find the problem soon.
Cheers,
Michael