3.4 kernel bring-up


stargo

Inactive Recognized Developer
Jan 7, 2011
538
1,718
Fürth
Hi,

Thanks @kabaldan for your work :)

The main issues remain the same:
- while the modem subsystem can occasionally run stably for tens of minutes, it can also be restarted by a watchdog bite within minutes or even seconds (i.e. too often).
- occasional kernel panic (NULL pointer dereference) at process_one_work coming immediately after hitting this warning https://github.com/CyanogenMod/andr...t-common/blob/cm-11.0/kernel/workqueue.c#L550

I think these two issues are connected: the kernel panic happens when the kernel tries to notify other parts of the system about state changes and the corresponding list has been corrupted by the crashing modem.

Just to write down what I found out on the weekend:

  1. The crash mostly occurs when one of the CPU-cores changes power-state, usually 2 seconds after CPU1 is shut down
  2. Quite often there are IPC messages to the modem in the time between the CPU-shutdown and the modem-crash
  3. Sometimes the modem crashes without any event when the system was idle for 20s or more.

Sample log for points 1 and 2:
Code:
<6>[   52.114512,0] CPU1: shutdown
<6>[   53.010834,0] [HDR] [w rr_h] ver=1,type=data    ,src_nid=00000001,src_port_id=0000001a,control_flag=0,size= 25,dst_pid=00000000,dst_cid=00000007
<6>[   53.011231,0] alarm_set_rtc: Failed to set RTC, time will be lost on reboot
<6>[   53.115153,1] CPU1 is up
<6>[   53.365603,0] [RAW] ver=1 type=1 src=0:00000007 crx=0 siz=14 dst=1:0000001a
<6>[   53.499374,1] [HDR] [w rr_h] ver=1,type=data    ,src_nid=00000001,src_port_id=00000034,control_flag=0,size= 25,dst_pid=00000000,dst_cid=0000000c
<3>[   54.555470,0] Watchdog bite received from modem software!
<3>[   54.555562,0] modem subsystem failure reason: (unknown, smem_get_entry failed).
(CPU1 doesn't have to come back up for the crash to occur. The RTC error usually shows up the first time the modem crashes after a system boot. IPC dst 1:xxx is the modem.)
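
To check how consistent the "about 2 seconds after CPU1 is shut down" pattern is across many logs, something like the following could be run over saved dmesg output. This is only a rough sketch: the function name is made up for illustration, and it assumes every line carries the `<level>[ seconds,cpu]` prefix seen in the log above.

```python
import re

# Matches kernel log lines of the form "<6>[   52.114512,0] message".
LINE_RE = re.compile(r"<\d>\[\s*(\d+\.\d+),\d\]\s*(.*)")

def seconds_from_cpu_shutdown_to_bite(log_text):
    """Return the delay between each watchdog bite and the last preceding CPU1 shutdown."""
    last_shutdown = None
    delays = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines without the expected prefix
        timestamp, message = float(m.group(1)), m.group(2)
        if message.startswith("CPU1: shutdown"):
            last_shutdown = timestamp
        elif message.startswith("Watchdog bite received from modem") and last_shutdown is not None:
            delays.append(round(timestamp - last_shutdown, 6))
            last_shutdown = None  # pair each shutdown with at most one bite
    return delays

sample = """\
<6>[   52.114512,0] CPU1: shutdown
<6>[   53.115153,1] CPU1 is up
<3>[   54.555470,0] Watchdog bite received from modem software!
"""
print(seconds_from_cpu_shutdown_to_bite(sample))
```

For the sample above the delay comes out at roughly 2.44 s. Bites with no earlier CPU1 shutdown are simply skipped, so point 3 (crashes while idle) would need a separate check.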

I have the feeling that the problem is not software related at all, but rather a power problem with the modem: it crashes when the power levels fluctuate and it has to do some work at that moment.

Comparing the Asanti device-tree to the decompiled ghost one reveals major differences (mostly because ghost is based on msm8960ab). Ghost has some pm-regulators defined as always-on for example. I've attached the decompiled ghost-devtree and the decompiled old asanti-devtree (more or less useless, and has some byteorder-problems) to this post.
I've also tried basing asanti on msm8960ab (which was also missing the always-on parts). It booted correctly, but this didn't help.

Let's hope we find the problem soon :)

Cheers,
Michael
 

Attachments

  • ghost.stock.kk.txt
    44.5 KB · Views: 36
  • asanti.stock.jb.txt
    7.2 KB · Views: 20

mrvek

Senior Member
Feb 10, 2011
579
460
/home
(CPU1 doesn't have to come back up for the crash to occur. The RTC error usually shows up the first time the modem crashes after a system boot. IPC dst 1:xxx is the modem.)

You can enable RTC writes ( https://github.com/CyanogenMod/andr...11.0/arch/arm/mach-msm/board-mmi-pmic.c#L1241 ) but the modem will still happily crash. Having them disabled seems to be common: some modems, most notably CDMA ones, misbehave in some way unknown to me when RTC writes are enabled or the time changes.
We had it enabled in the 3.0 kernel for a long time without negative side effects.

There are logs in dmesg of rild being killed. Is that supposed to happen, or is it not important? Every time before a modem crash, I can observe that ril-daemon was killed. But I have no clue whether it crashed because the modem crashed first (and that was just not logged until later, perhaps because bp-dumps were being written).
 

stargo

Inactive Recognized Developer
Jan 7, 2011
538
1,718
Fürth
There are logs in dmesg of rild being killed. Is that supposed to happen, or is it not important? Every time before a modem crash, I can observe that ril-daemon was killed. But I have no clue whether it crashed because the modem crashed first (and that was just not logged until later, perhaps because bp-dumps were being written).

Interesting, rild is not dying for me when the modem crashes.

I have disabled bp-dumps as they filled up my storage pretty quickly.

Cheers,
Michael
 

kabaldan

Inactive Recognized Developer
Dec 15, 2009
1,640
3,926
Prague
android.doshaska.net
Just a status update:
- the workqueue kernel panics have been fixed by a revert to JB sensor blobs
- the modem subsystem watchdog bites are still happening. The most easily reproducible trigger is GPS activity (which most likely corresponds to those IPC messages noticed by stargo). With GPS disabled, the modem subsystem can remain stable even for hours, but the watchdog bites irregularly happen anyway, just not so often.
 

mrvek

Senior Member
Feb 10, 2011
579
460
/home
@kabaldan: how did you solve the initial issue with the modem going into restart as soon as it was brought out of reset by PIL? What was the issue there?

A snippet of rild dying up until the watchdog bite:
Code:
<6>[   77.662719,0] init: command 'write' r=0
<7>[   78.103891,0] rmnet0: no IPv6 routers present
<6>[   78.396276,0] init: waitpid returned pid 2697, status = 00000000
<5>[   78.396429,0] init: process 'start_fd', pid 2697 exited
<6>[   78.948512,0] init: waitpid returned pid 367, status = 00000009
<5>[   78.948634,0] init: process 'ril-daemon', pid 367 exited
<5>[   78.948756,0] init: process 'ril-daemon' killing any children in process group
<6>[   78.949122,0] init: computing context for service '/system/bin/rild'
<5>[   78.949305,0] init: starting 'ril-daemon'
<6>[   78.950770,1] init: Created socket '/dev/socket/rild-debug' with mode '660', user '1001', group '1000'
<6>[   78.951167,1] init: Created socket '/dev/socket/rild' with mode '660', user '0', group '1001'
<3>[   78.974515,0] init: sys_prop: permission denied uid:1001  name:ro.radio.adb_log.on
<6>[   78.998351,0] init: waitpid returned pid 463, status = 00000009
<3>[   78.998443,0] init: untracked pid 463 exited
<6>[   78.998596,0] init: waitpid returned pid 366, status = 00000009
<5>[   78.998687,0] init: process 'debuggerd', pid 366 exited
<5>[   78.998809,0] init: process 'debuggerd' killing any children in process group
<6>[   78.999084,0] init: computing context for service '/system/bin/debuggerd'
<5>[   78.999328,0] init: starting 'debuggerd'
<6>[   82.025209,0] init: waitpid returned pid 2773, status = 00000100
<3>[   82.025331,0] init: untracked pid 2773 exited
<6>[   82.045627,0] init: waitpid returned pid 2779, status = 00000100
<3>[   82.045780,0] init: untracked pid 2779 exited
<6>[   88.985258,0] binder: undelivered transaction 28903
<6>[   89.183274,0] init: waitpid returned pid 2033, status = 00000000
<3>[   89.183366,0] init: untracked pid 2033 exited
<6>[   89.254814,0] init: waitpid returned pid 2030, status = 00000000
<3>[   89.254906,0] init: untracked pid 2030 exited
<6>[   91.212604,0] init: processing action 0x824c8 (property:gsm.network.type=Unknown)
<6>[   91.212849,0] init: computing context for service '/system/bin/sh'
<5>[   91.213093,0] init: starting 'stop_fd'
<6>[   91.213428,0] init: command 'start' r=0
<6>[   91.237967,0] init: processing action 0x824c8 (property:gsm.network.type=Unknown)
<6>[   91.238150,0] init: command 'start' r=0
<6>[   91.669372,0] init: processing action 0x57e40 (property:sys.sysctl.tcp_def_init_rwnd=*)
<6>[   91.669555,0] init: command 'write' r=0
<6>[   92.224599,1] init: waitpid returned pid 2877, status = 00000000
<5>[   92.224721,1] init: process 'stop_fd', pid 2877 exited
<3>[   94.730016,0] Watchdog bite received from modem software!
<3>[   94.730383,0] modem subsystem failure reason: (unknown, smem_get_entry failed).
I'm not sure what to make of it since you and stargo do not see the same thing. I can also see other services dying, sometimes ueventd, etc., though the system seems to be functioning normally.
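
One thing that may help when reading the init lines above: the `status = ...` value looks like the raw wait status printed in hex (that is an assumption about init's log format, not something confirmed in this thread). If so, it can be decoded with the usual Linux wait status layout, as in this small sketch. The function name is made up for illustration.

```python
def describe_wait_status(hex_status):
    """Decode a raw Linux wait status given as a hex string, e.g. '00000009'."""
    status = int(hex_status, 16)
    signal_number = status & 0x7F            # low 7 bits: terminating signal, 0 if none
    if signal_number == 0:
        exit_code = (status >> 8) & 0xFF     # bits 8..15: exit code of a normal exit
        return f"exited normally with code {exit_code}"
    if signal_number == 0x7F:
        return "stopped (not terminated)"
    core = " (core dumped)" if status & 0x80 else ""
    return f"killed by signal {signal_number}{core}"

for value in ("00000000", "00000009", "00000100"):
    print(value, "->", describe_wait_status(value))
```

Under that assumption, `00000009` for ril-daemon, debuggerd and pid 463 would mean they were killed by signal 9 (SIGKILL) rather than crashing on their own, while `00000100` would be a normal exit with code 1.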

Has anyone noticed the modem going to the SMSM_RESET state and restarting? On my builds it is less common than the watchdog bite, but it does happen.
Code:
<6>[  466.541217,0] request_suspend_state: sleep (0->3) at 466527380624 (2014-07-23 19:00:03.705349418 UTC)
<6>[  466.541706,0] lm3532_early_suspend: suspended
<3>[  466.563009,0] dtv_pipe is not configured yet
<6>[  466.586570,0] pm_debug: sleep uah=-1163
<6>[  466.586906,0] atmxt_set_ic_state: IC state ACTIVE -> SLEEP
<3>[  467.149549,0] 
<3>[  467.149549,0] SMSM: Modem SMSM state changed to SMSM_RESET.
<3>[  467.149763,1] Notify: start reset
<3>[  467.149763,0] 
<3>[  467.149763,0] SMSM: Modem SMSM state changed to SMSM_RESET.
<3>[  467.149854,0] Probable fatal error on the modem.
<3>[  467.149854,0] modem subsystem failure reason: (unknown, smem_get_entry failed).
<6>[  467.149854,0] subsys-restart: subsystem_restart_dev(): Restart sequence requested for modem, restart_level = 3.
<3>[  467.150282,1] Notify: start reset
<6>[  467.263909,0] subsys-restart: subsystem_shutdown(): [e3b639c0]: Shutting down modem

I've noticed the same on moto's drop for the razr m (3.4.42, patched with kabaldan's changes, using the CM defconfig): the modem was up about 1 min after power-up and went to smsm_reset about 18 mins later, soon followed by a watchdog bite. That build had other issues, though (display was derpy, it took forever to boot, "MDP ioctl unsupported" filling the log, sensors and audio broken...), so it may not be fully reliable. I'm not really sure; I didn't have much time to play with it.
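
To put numbers on "less common than watchdog bite", the two failure types can be tallied from a saved kernel log. A minimal sketch, assuming the message texts stay exactly as in the logs posted here; the dictionary keys and function name are just labels picked for this example.

```python
from collections import Counter

# Marker strings taken from the logs in this thread. "Probable fatal error"
# is used for SMSM_RESET because the "SMSM state changed" line is printed twice.
PATTERNS = {
    "watchdog_bite": "Watchdog bite received from modem",
    "smsm_reset": "Probable fatal error on the modem",
}

def count_modem_crashes(log_text):
    """Count modem crashes by type in a kernel log given as one string."""
    counts = Counter()
    for line in log_text.splitlines():
        for kind, needle in PATTERNS.items():
            if needle in line:
                counts[kind] += 1
    return counts

sample = """\
<3>[  467.149763,0] SMSM: Modem SMSM state changed to SMSM_RESET.
<3>[  467.149854,0] Probable fatal error on the modem.
<3>[  138.293270,0] Watchdog bite received from modem software!
"""
print(dict(count_modem_crashes(sample)))
```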

I've also prevented the modem restart right after "Watchdog timeout..." is logged, which left the modem in a non-working state, so it might be safe to assume that we are not getting false BP panics.

Could someone please post their kernel log so I'd have something to compare it to before I start writing more crap?
Thanks.
 

kabaldan

Inactive Recognized Developer
Dec 15, 2009
1,640
3,926
Prague
android.doshaska.net
@mrvek : Thanks for testing the stock motorola 3.4 kernel release and what happens when the watchdog bite is ignored, much appreciated.

Currently, I'm locally running as close to JB blobs as possible - JB mpdecision, qseecomd and thermald in /system/bin/, no KK firmware images besides wcnss (q6 and tzapps images removed from /system/etc/firmware).

Regarding SMSM_RESET state, yes, I've seen it as well.

What I observe now is that when GPS is disabled (Location off or set to Battery saving mode) and radio tech is set to 2G only, there are no modem subsystem watchdog bites at all; the modem seems to work just fine under such conditions.

No rild crashes observed here as mentioned before. I'll post some logs for you later.

Judging by some (unfortunately log-less) reports by users of KK-updated xt907/xt926 devices running on GSM networks, they are also experiencing some (not clearly specified) modem issues... (Even the soak testers of beta xt925 KK builds are reporting modem crashes.)
At this point, I have a feeling (maybe too optimistic) that we may see some important commits from Motorola with the KK release for moto_msm8960 GSM devices. So far only CDMA devices have received the official KK update...
 

mrvek

Senior Member
Feb 10, 2011
579
460
/home
Please disregard the info about the modem being stable when set to 2G only - after more than 22 hours uptime without a single BP dump, soon after I posted the above, the watchdog has bitten again.

It might still be related somehow: I've noticed that with mobile data disabled the modem crashes a bit less frequently. The difference is not dramatic, but it is noticeable.

Since we won't be getting updated firmware images, I honestly hope that whatever moto or qc fix won't leave us in a worse state than we are in now.

On a related note, /data/misc/ril/modem.log suggests that in over 90% of cases the reported error comes from the same file and the same task:
Code:
Error in file wl1m.c, line 6266
Time of crash (m-d-y h:m:s): 07-21-2014 14:12:19
Uptime (h:m:s): 0:49:23
Build ID: M8960A-AAAANAZM-1.0.103205
baseband ver: ASANTI_C_BP_1139.000.18.12P-FKA011-837FE52_84646EA-2279

REX_TCB ptr: 0x8cbf0d90
tcb.task_name: IST418

Coredump ARCH type is: ERR_ARCH_QDSP6

on one occasion it changed
Code:
Error in file smd_main.c, line 1424
Time of crash (m-d-y h:m:s): 07-21-2014 18:18:52
Uptime (h:m:s): 0:05:49
Build ID: M8960A-AAAANAZM-1.0.103205
baseband ver: ASANTI_C_BP_1139.000.18.12P-FKA011-837FE52_84646EA-2279

REX_TCB ptr: 0x8c0aae18
tcb.task_name: ds

and on another occasion it reported dog.c but I haven't saved that one.

I guess those are modem fw/sw internal sources which we will never see...
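
Even without the sources, the "over 90% of cases" figure can be tracked by tallying file, line and task name across saved crash reports. A rough sketch, assuming each saved report keeps the `Error in file ..., line ...` and `tcb.task_name:` lines as shown above; the function name is invented for this example.

```python
import re
from collections import Counter

ERROR_RE = re.compile(r"Error in file (\S+), line (\d+)")
TASK_RE = re.compile(r"tcb\.task_name:\s*(\S+)")

def summarize_crash_reports(reports):
    """Count (file, line, task) triples across a list of crash report texts."""
    summary = Counter()
    for text in reports:
        err = ERROR_RE.search(text)
        if not err:
            continue  # e.g. an all-zero dump with no readable report
        task = TASK_RE.search(text)
        key = (err.group(1), int(err.group(2)), task.group(1) if task else None)
        summary[key] += 1
    return summary

reports = [
    "Error in file wl1m.c, line 6266\ntcb.task_name: IST418\n",
    "Error in file wl1m.c, line 6266\ntcb.task_name: IST418\n",
    "Error in file smd_main.c, line 1424\ntcb.task_name: ds\n",
]
for key, count in summarize_crash_reports(reports).most_common():
    print(count, key)
```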
 

kabaldan

Inactive Recognized Developer
Dec 15, 2009
1,640
3,926
Prague
android.doshaska.net

mrvek

Senior Member
Feb 10, 2011
579
460
/home
@kabaldan : are those reports about modem crashes you mentioned exclusively from CM users? The only ones I found are on the CM bug tracker. Are you perhaps aware of others, possibly from official KK testers?
 

thewraith420

Senior Member
Sep 3, 2011
395
79
Google Pixel 6
Hi guys. I apologize for the slightly off topic interruption but only someone who is into the kernel code would have a good chance at being able to answer my question.
I'm running my RAZR M on Page Plus, and for that to work I need to be able to run the ICS radio. Every JB version was able to use it just fine, but the KK update broke that compatibility. My question is: do you think it could be possible to patch the 3.4 kernel to work with the ICS/JB radio? In my situation I can still use the KK bootloader, so I'm thinking it might be somewhat simpler than your situation on the Photon Q. Also, since I'm making this post, I might as well thank you guys for working on this. I really appreciate it, and even if my idea has no ground to stand on, I know your work on this could possibly bring a 3.4 kernel build to our JB bootloader as well. Thanks again and I won't interrupt again ☺
 

mrvek

Senior Member
Feb 10, 2011
579
460
/home
Hi.

Just to point out a few more or less interesting findings.

1) the modem watchdog bites in airplane mode
2) the modem watchdog bites without a SIM card present
3) the modem crash log (/data/misc/ril/bp-dumps/modem.log) seems to be written successfully only when the modem crashes on SMSM_RESET (which seems to be some bit set by the modem). On watchdog bites, 16 KB of nothing (0x0) is written
4) according to some reports of other qc baseband crashes, log_modem_sfr() should result in "SFR Init: wdog or kernel error suspected." (as seen in the smem ramdump), but instead we get "modem subsystem failure reason: (unknown, smem_get_entry failed)."
5) ramdump_modem_{sw/fw}.bin is mostly 0x0 - this and 4) might indicate some memory misconfiguration or a similar issue
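
To quantify "a lot of 0x0" in points 3 and 5, the share of zero bytes in a dump file can be measured directly. A minimal sketch; the function name is made up, and the temporary file below only stands in for a real dump such as modem.log or a ramdump.

```python
import os
import tempfile

def zero_byte_fraction(path, chunk_size=1 << 20):
    """Return the fraction of bytes in the file that are 0x00 (0.0 for an empty file)."""
    total = 0
    zeros = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            zeros += chunk.count(0)  # bytes.count accepts an integer byte value
    return zeros / total if total else 0.0

# Stand-in for a dump: 16 KiB of zeros, like the modem.log written on watchdog bites.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 16384)
    dump_path = tmp.name

print(zero_byte_fraction(dump_path))
os.remove(dump_path)
```

A value of 1.0 means the file is entirely empty; comparing the figure for dumps taken under 3.0 and 3.4 might show whether the 3.4 ramdumps are unusually blank.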

The most common "error" seen in the smem ramdump is: "gfw_command_process.c
Invalid value passed to GFW in generic config command value passed %d, setting to %d", followed by some binary data which I have no clue how to interpret. The number of occurrences of this string can vary significantly, so it might not be fatal (dumping smem during normal operation and on 3.0.y might be interesting).

Stack dump is as below in vast majority of cases. Assuming it is correct, the issue is strongly related to idle states/suspend. On the other hand, according to some docs, modem disables its watchdog during certain power collapse periods (not clearly specified which).
The most reliable way I was able to find to trigger the modem crash is by toggling screen on/off several times (could be 2-3, might take more, but never fails).

Also, modem's tasks timeout is probably 60s for all tasks except one: "rxtx" (judging by modem crash log and assuming my interpretation of "Timeout" and "Count" table headers is correct. "Is_Blocked" always points to "xtm" and "tlm" tasks, whatever it may mean)
Code:
<3>[  138.293270,0] Watchdog bite received from modem software!
<4>[  138.293728,0] [<c00133bc>] (unwind_backtrace+0x0/0xe0) from [<c0057a3c>] (modem_wdog_bite_irq+0x20/0x58)
<4>[  138.294094,0] [<c0057a3c>] (modem_wdog_bite_irq+0x20/0x58) from [<c00d1f28>] (handle_irq_event_percpu+0x84/0x26c)
<4>[  138.294491,0] [<c00d1f28>] (handle_irq_event_percpu+0x84/0x26c) from [<c00d214c>] (handle_irq_event+0x3c/0x5c)
<4>[  138.294704,0] [<c00d214c>] (handle_irq_event+0x3c/0x5c) from [<c00d4c60>] (handle_fasteoi_irq+0xd8/0x124)
<4>[  138.295070,0] [<c00d4c60>] (handle_fasteoi_irq+0xd8/0x124) from [<c00d1834>] (generic_handle_irq+0x20/0x30)
<4>[  138.295467,0] [<c00d1834>] (generic_handle_irq+0x20/0x30) from [<c000e240>] (handle_IRQ+0x7c/0xc0)
<4>[  138.295833,0] [<c000e240>] (handle_IRQ+0x7c/0xc0) from [<c00085e8>] (gic_handle_irq+0x64/0xb4)
<4>[  138.296200,0] [<c00085e8>] (gic_handle_irq+0x64/0xb4) from [<c07fd7c0>] (__irq_svc+0x40/0x74)
<4>[  138.296413,0] Exception stack(0xc0e05f18 to 0xc0e05f60)
<4>[  138.296749,0] 5f00:                                                       00000000 00000018
<4>[  138.297085,0] 5f20: 00000004 00000000 c2de75c0 00000000 c2de75c0 c0e51178 00000000 511f04d0
<4>[  138.297298,0] 5f40: 00000000 00000000 00000000 c0e05f60 c0052704 c0058160 60000013 ffffffff
<4>[  138.297695,0] [<c07fd7c0>] (__irq_svc+0x40/0x74) from [<c0058160>] (msm_cpuidle_enter+0x54/0x58)
<4>[  138.298061,0] [<c0058160>] (msm_cpuidle_enter+0x54/0x58) from [<c0563404>] (cpuidle_enter+0x14/0x18)
<4>[  138.298428,0] [<c0563404>] (cpuidle_enter+0x14/0x18) from [<c05637d4>] (cpuidle_enter_state+0x14/0x6c)
<4>[  138.298794,0] [<c05637d4>] (cpuidle_enter_state+0x14/0x6c) from [<c056399c>] (cpuidle_idle_call+0x170/0x33c)
<4>[  138.299038,0] [<c056399c>] (cpuidle_idle_call+0x170/0x33c) from [<c000e7e0>] (cpu_idle+0x58/0xe8)
<4>[  138.299404,0] [<c000e7e0>] (cpu_idle+0x58/0xe8) from [<c07c8a14>] (rest_init+0x88/0xa0)
<4>[  138.299771,0] [<c07c8a14>] (rest_init+0x88/0xa0) from [<c0d00b4c>] (start_kernel+0x41c/0x480)
<3>[  138.300137,0] modem subsystem failure reason: (unknown, smem_get_entry failed).
<6>[  138.300350,0] subsys-restart: subsystem_restart_dev(): Restart sequence requested for modem, restart_level = 3.

How do you keep track of proprietary blobs and which ones are compatible? I remember that some blob (libaducal?) was from a samsung device (S3?) at some point. More importantly, how do you know which blobs need to be updated when the kernel is updated? Is that perhaps documented somewhere? Keeping track of git history is causing me a lot of problems with this, so I'd appreciate any pointers in the right direction.

Btw., why are OEMs and chipset manufacturers so rigid about their baseband, ril, etc. sources, documentation and datasheets? Solely because of "security through obscurity"?
 

mrvek

Senior Member
Feb 10, 2011
579
460
/home
Just to correct the meaning of dog_state_table[] :

"Count" is time in seconds that the task has left to respond to watchdog task and bite occurs when it reaches 0 (src):
- timeout - the timeout value (currently in seconds) of the task
- count - amount of seconds left the task has to respond
- is_blocked - boolean that is set to 1/true when the task is not currently monitored by dog
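
To make the three fields concrete, here is a toy model of how such a table could behave. It is purely an illustration built from the description above: the real modem code is not available, and both the reset-on-report behaviour and the example task values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class DogEntry:
    task_name: str
    timeout: int      # seconds the task is given to report in
    count: int        # seconds left before the watchdog would bite
    is_blocked: bool  # True when the task is not currently monitored

def tick(table, reported):
    """Advance the table by one second; return the tasks whose count ran out."""
    expired = []
    for entry in table:
        if entry.is_blocked:
            continue                      # unmonitored tasks never expire
        if entry.task_name in reported:
            entry.count = entry.timeout   # assumed: reporting in resets the countdown
            continue
        entry.count -= 1
        if entry.count <= 0:
            expired.append(entry.task_name)
    return expired

# Hypothetical table: one task about to expire, one task blocked.
table = [
    DogEntry("ds", timeout=60, count=2, is_blocked=False),
    DogEntry("xtm", timeout=60, count=1, is_blocked=True),
]
print(tick(table, reported=set()))   # no bite yet
print(tick(table, reported=set()))   # "ds" reaches 0
```

In this model a bite only needs one monitored task to miss its deadline, which would fit a single stuck task taking the whole modem down.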

This became awfully frustrating.
Good luck.
 

kabaldan

Inactive Recognized Developer
Dec 15, 2009
1,640
3,926
Prague
android.doshaska.net
@mrvek: I took a break from 3.4 kernel bring-up attempts for a while as I was overwhelmed by the tasks at work and also the demands from the usual family life :) .
But I'd like to get back to it.
I agree that there seems to be an issue revolving around the shared memory (SMEM).
I plan to investigate it some more.

Regarding the vendor blobs, there's no easy way to cope with it, but basically I chose to use the new blobs everywhere (to match the updated kernel space), unless the blobs relate to modem (qmi - qualcomm modem interface), which includes e.g. the gps related proprietaries.

The audio lib you mention is libacdbloader.so - we needed to switch the kernel to use ION instead of PMEM everywhere to stay on the track, so we also needed to find a compatible proprietary libacdbloader lib. Initially, mako (Nexus 4) lib was used, but it was not a good match (mako is using APQ8064, not MSM8960, so while close, not getting there in the end). So the issue was later fixed by using HTC One S (ville - htc s4 family) lib, which was the only available good matching ION using alternative at that time.
 

mrvek

Senior Member
Feb 10, 2011
579
460
/home
@kabaldan: thank you very much for the explanation ;)

Btw., the previously mentioned "error" is not fatal as it happens in 3.0 on a running modem and can repeat a lot.
Code:
gfw_command_process.c
Invalid value passed to GFW in generic config command value passed 0, setting to 305
This one is generated by diag_mdlog binary which I noticed in stock JB xt897 release. Nifty little tool, hard to interpret output.
 

mifritscher

Senior Member
Aug 5, 2007
52
38
Is there any news regarding this?
Just as a silly idea: Is there a way to compare the status of the memory management unit between 3.0 and 3.4?
Or does a new bug in 3.4 sometimes corrupt something in the memory area of the modem (perhaps fixed in the meantime ;) )?
Or could we compare the memory area using a 3.0 with a 3.4 kernel to see any corruption?
 

mifritscher

Senior Member
Aug 5, 2007
52
38
On one of my Photons I get these crashes regularly as well - LineageOS 14 - kernel 3.0

[ 116.882313,0] Watchdog bite received from modem SW!
[ 116.882374,0] subsys-restart: subsystem_restart(): Restart sequence requested for modem, restart_level = 3.
[ 116.882618,0] subsys-restart: subsystem_restart_thread(): [c9ea6f80]: Shutting down modem
 

mifritscher

Senior Member
Aug 5, 2007
52
38
The biggest problem with the 3.0 kernel for me is that glibc 2.23 needs at least kernel 3.2. So, for example, a chroot into Ubuntu 16.04 does not work. A solution could be for the kernel to pretend to be 3.2, possibly with a few fixes/changes to back up this pretense to the extent that glibc works. Would that be possible?

Edit: Perhaps http://man7.org/tlpi/api_changes/#Linux-3.2 is of help?
 

Top Liked Posts

  • 11
    @stargo , @Skrilax_CZ , @mrvek , @arrrghhh and others:
    Sorry for the delay. I've been too busy at work and home again (surprise!).

    Anyway, I've finally managed to push my current local mess to razrqcom, reusing the old xt897 and msm8960-common repos for tentative 3.4 kernel on JB firmware builds.
    Local manifest:
    Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <manifest>
      <project name="razrqcom-dev-team/android_device_motorola_xt897" path="device/motorola/xt897" remote="github" revision="cm-11.0-3.4" />
      <project name="razrqcom-dev-team/android_device_motorola_msm8960-common" path="device/motorola/msm8960-common" remote="github" revision="cm-11.0-3.4" />
      <project name="razrqcom-dev-team/proprietary_vendor_motorola" path="vendor/motorola" remote="github" revision="cm-11.0-3.4" />
    </manifest>

    The main issues remain the same:
    - while the modem subsystem can occasionally run stably for tens of minutes, it can also be restarted by a watchdog bite within minutes or even seconds (i.e. too often).
    - occasional kernel panic (NULL pointer dereference) at process_one_work coming immediately after hitting this warning https://github.com/CyanogenMod/andr...t-common/blob/cm-11.0/kernel/workqueue.c#L550
    9
    With the KitKat update for xt897/asanti being canceled, we're missing at least:
    updated bootloaders (sbl1, sbl2, sbl3 and appsboot), updated trustzone (not sure if that is an issue at this point), updated modem firmware.

    Fortunately, at least the device tree for asanti seems to be ready and working fine, apart from not being handled by JB bootloader.

    As a base, I'm using https://github.com/CyanogenMod/android_kernel_motorola_msm8960dt-common/tree/cm-11.0 that has been updated by dhacker for support of KitKat updated Razr HD/M.

    To circumvent dt not being handled, I'm using appended dtb:
    CONFIG_ARM_APPENDED_DTB=y
    CONFIG_ARM_ATAG_DTB_COMPAT=y

    Device tree partition check revealed that my device is p2 or p2b revision (https://github.com/MotorolaMobility...razrm/arch/arm/boot/dts/msm8960-asanti-p2.dts and https://github.com/MotorolaMobility...azrm/arch/arm/boot/dts/msm8960-asanti-p2b.dts are identical), so I'm using msm8960-asanti-p2.dtb appended to zImage.
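
For anyone reproducing the appended-dtb setup: with CONFIG_ARM_APPENDED_DTB the blob is simply concatenated right after the zImage. The sketch below does that with a basic sanity check on the device tree header (magic and total size, both big-endian). The function name is made up and the byte strings in the demo are placeholders, not real images.

```python
import struct

FDT_MAGIC = 0xD00DFEED  # magic number at the start of a flattened device tree blob

def append_dtb(zimage: bytes, dtb: bytes) -> bytes:
    """Return zImage with the dtb appended, after checking the dtb header."""
    if len(dtb) < 8:
        raise ValueError("dtb too short to contain a header")
    magic, total_size = struct.unpack(">II", dtb[:8])
    if magic != FDT_MAGIC:
        raise ValueError("not a device tree blob (bad magic)")
    if total_size > len(dtb):
        raise ValueError("dtb is truncated")
    return zimage + dtb

# Placeholder data standing in for zImage and msm8960-asanti-p2.dtb.
fake_zimage = b"ZIMAGE-PLACEHOLDER"
fake_dtb = struct.pack(">II", FDT_MAGIC, 16) + b"\x00" * 8
combined = append_dtb(fake_zimage, fake_dtb)
print(len(combined))
```

With real files, the two inputs would be read in binary mode and the result written out as the image handed to the boot image packer.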

    The next issue is that the bootloader is supposed to add dynamic data at runtime in addition to the static data loaded from the dtb.

    To get the display working, I've added:
    Code:
    	chosen {
    		/* mipi_mot_cmd_auo_qhd_430 */
    		mmi,panel_name = [6d6970695f6d6f745f636d645f61756f5f7168645f34333000];
    	};
    to msm8960-asanti-p2.dts.
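
The byte array in that `chosen` node is just the panel name as a NUL-terminated ASCII string. A small sketch for converting in both directions, handy when adding other string properties the same way; the helper names are made up for this example.

```python
def dts_bytes_to_string(hex_bytes: str) -> str:
    """Decode the contents of a DTS [..] byte array into a string, dropping the trailing NUL."""
    raw = bytes.fromhex(hex_bytes)
    return raw.rstrip(b"\x00").decode("ascii")

def string_to_dts_bytes(text: str) -> str:
    """Encode a string as hex digits for a DTS [..] byte array, with a trailing NUL."""
    return (text.encode("ascii") + b"\x00").hex()

panel_hex = "6d6970695f6d6f745f636d645f61756f5f7168645f34333000"
print(dts_bytes_to_string(panel_hex))   # mipi_mot_cmd_auo_qhd_430
print(string_to_dts_bytes("mipi_mot_cmd_auo_qhd_430") == panel_hex)
```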

    That's where I am currently. The next step is to add additional entries we're missing there (e.g. "mmi,mbmprotocol" etc.).

    Regarding bootloaders, I see another issue - memory configuration.
    There's failing shared mem allocation in mmi_unit_info_init:
    https://github.com/CyanogenMod/andr...ob/cm-11.0/arch/arm/mach-msm/board-mmi.c#L433
    That memory block is supposed to be reserved by sbl3, as indicated by https://github.com/MotorolaMobilityLLC/kernel-msm/commit/585b70b3fcd1af84edcb8748d1fdf190b32784af

    Also, this commit https://github.com/MotorolaMobilityLLC/kernel-msm/commit/e3b9a040ca408d217a873df7994b9452ece04d16 indicates that Moto has done some tests also with older modem firmware, at least at some point...
    9
    OK, good news, I've got the BP running fine now, under 3.4 kernel.
    9
    Just to give a 3.4 kernel on Q current status report:
    It's not stable yet. Occasionally, there's a kernel panic, usually when going to suspend state.
    Will need to figure out the cause.
    Otherwise, everything is basically working: radio, gps, nfc, wlan (while using the new wcnss firmware), bt, audio, camera, usb...