XDA Developers Android and Mobile Development Forum

[Bug report] Music player stops playing music from sd after some time

 
zeitferne
Old
#521  
Junior Member
Thanks Meter 21
Posts: 12
Join Date: Jul 2014
Quote:
Originally Posted by Lanchon View Post
i think the best way to proceed would be to make a native synthetic workload that reliably triggers and detects the bug (see -fstack-protector-all) independent of android to test stock kernels and the old abandoned 4210 cm kernel.
The stack protector was actually a great idea! CM smdk4412 kernel on my SGS2 i9100 with the stack protector enabled (the normal kernel option CONFIG_CC_STACKPROTECTOR=y) crashes right while booting (full(er) last_kmsg):
Code:
<0>[   26.412257] c1 Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: bf013a2c
<0>[   26.412315] c1 
<4>[   26.412334] c1 Backtrace: 
<4>[   26.412382] c1 [<c064e5b8>] (dump_backtrace+0x0/0x10c) from [<c0b91e6c>] (dump_stack+0x18/0x1c)
<4>[   26.412439] c1  r6:e211e820 r5:c0ed4760 r4:c0f5c940 r3:271aed5c
<4>[   26.412496] c1 [<c0b91e54>] (dump_stack+0x0/0x1c) from [<c0b92204>] (panic+0x80/0x1ac)
<4>[   26.412561] c1 [<c0b92184>] (panic+0x0/0x1ac) from [<c0684be0>] (init_oops_id+0x0/0x58)
<4>[   26.412613] c1  r3:271aed5c r2:271aed00 r1:bf013a2c r0:c0cb8880
<4>[   26.412663] c1  r7:e273bc32
<4>[   26.412742] c1 [<c0684bc4>] (__stack_chk_fail+0x0/0x1c) from [<bf013a2c>] (dhd_write_macaddr+0x2e4/0x310 [dhd])
<4>[   26.412864] c1 [<bf013748>] (dhd_write_macaddr+0x0/0x310 [dhd]) from [<bf01a554>] (dhd_bus_start+0x1a4/0x2e0 [dhd])
<4>[   26.412985] c1 [<bf01a3b0>] (dhd_bus_start+0x0/0x2e0 [dhd]) from [<bf020558>] (dhdsdio_probe+0x4a4/0x72c [dhd])
<4>[   26.413097] c1 [<bf0200b4>] (dhdsdio_probe+0x0/0x72c [dhd]) from [<bf00c0ec>] (bcmsdh_probe+0xf8/0x150 [dhd])
<4>[   26.413206] c1 [<bf00bff4>] (bcmsdh_probe+0x0/0x150 [dhd]) from [<bf00e038>] (bcmsdh_sdmmc_probe+0x54/0xbc [dhd])
<4>[   26.413304] c1 [<bf00dfe4>] (bcmsdh_sdmmc_probe+0x0/0xbc [dhd]) from [<c09a7fe8>] (sdio_bus_probe+0xfc/0x108)
<4>[   26.413368] c1  r5:e2d97000 r4:e2d97008
<4>[   26.413414] c1 [<c09a7eec>] (sdio_bus_probe+0x0/0x108) from [<c0896764>] (driver_probe_device+0x94/0x1a8)
<4>[   26.413474] c1  r8:00000000 r7:bf067414 r6:e2d9703c r5:c0f6ddb8 r4:e2d97008
<4>[   26.413531] c1 r3:c09a7eec
<4>[   26.413563] c1 [<c08966d0>] (driver_probe_device+0x0/0x1a8) from [<c089690c>] (__driver_attach+0x94/0x98)
<4>[   26.413624] c1  r7:e2e631e0 r6:e2d9703c r5:bf067414 r4:e2d97008
<4>[   26.413683] c1 [<c0896878>] (__driver_attach+0x0/0x98) from [<c0895678>] (bus_for_each_dev+0x4c/0x94)
<4>[   26.413742] c1  r6:c0896878 r5:bf067414 r4:00000000 r3:c0896878
<4>[   26.413799] c1 [<c089562c>] (bus_for_each_dev+0x0/0x94) from [<c0896428>] (driver_attach+0x24/0x28)
<4>[   26.413857] c1  r6:c0f02af0 r5:bf067414 r4:bf067414
<4>[   26.413904] c1 [<c0896404>] (driver_attach+0x0/0x28) from [<c08960c8>] (bus_add_driver+0x180/0x250)
<4>[   26.413970] c1 [<c0895f48>] (bus_add_driver+0x0/0x250) from [<c0896e14>] (driver_register+0x80/0x150)
<4>[   26.414037] c1 [<c0896d94>] (driver_register+0x0/0x150) from [<c09a8128>] (sdio_register_driver+0x2c/0x30)
<4>[   26.414131] c1 [<c09a80fc>] (sdio_register_driver+0x0/0x30) from [<bf00e250>] (sdio_function_init+0x3c/0x8c [dhd])
<4>[   26.414244] c1 [<bf00e214>] (sdio_function_init+0x0/0x8c [dhd]) from [<bf00c19c>] (bcmsdh_register+0x1c/0x24 [dhd])
<4>[   26.414311] c1  r5:00000004 r4:bf06a3c4
<4>[   26.414398] c1 [<bf00c180>] (bcmsdh_register+0x0/0x24 [dhd]) from [<bf027990>] (dhd_bus_register+0x24/0x48 [dhd])
<4>[   26.414515] c1 [<bf02796c>] (dhd_bus_register+0x0/0x48 [dhd]) from [<bf07618c>] (init_module+0x18c/0x284 [dhd])
<4>[   26.414610] c1 [<bf076000>] (init_module+0x0/0x284 [dhd]) from [<c06448f8>] (do_one_initcall+0x128/0x1a8)
<4>[   26.414683] c1 [<c06447d0>] (do_one_initcall+0x0/0x1a8) from [<c06b9710>] (sys_init_module+0xdf8/0x1b1c)
<4>[   26.414756] c1 [<c06b8918>] (sys_init_module+0x0/0x1b1c) from [<c064a8c0>] (ret_fast_syscall+0x0/0x30)
<2>[   26.414861] c0 CPU0: stopping
<4>[   26.414886] c0 Backtrace: 
<4>[   26.414920] c0 [<c064e5b8>] (dump_backtrace+0x0/0x10c) from [<c0b91e6c>] (dump_stack+0x18/0x1c)
<4>[   26.414977] c0  r6:c0d54000 r5:c0eb5d08 r4:00000006 r3:271aed5c
<4>[   26.415039] c0 [<c0b91e54>] (dump_stack+0x0/0x1c) from [<c06444bc>] (do_IPI+0x258/0x29c)
<4>[   26.415102] c0 [<c0644264>] (do_IPI+0x0/0x29c) from [<c064a340>] (__irq_svc+0x80/0x130)
<4>[   26.415156] c0 Exception stack(0xc0d55ef0 to 0xc0d55f38)
<4>[   26.415197] c0 5ee0:                                     3b9ac9ff 540deacd 01c99e53 00072679
<4>[   26.415258] c0 5f00: c0f5a468 00000000 c0d54000 00000000 c1b540a8 412fc091 00000000 c0d55f64
<4>[   26.415317] c0 5f20: 540deacd c0d55f38 c06aa768 c065bd78 20000013 ffffffff
<4>[   26.415380] c0 [<c065bd3c>] (exynos4_enter_idle+0x0/0x174) from [<c099a890>] (cpuidle_idle_call+0xa4/0x120)
<4>[   26.415442] c0  r7:00000000 r6:00000001 r5:c0f815ac r4:c1b540b8
<4>[   26.415498] c0 [<c099a7ec>] (cpuidle_idle_call+0x0/0x120) from [<c064bd40>] (cpu_idle+0xc4/0x100)
<4>[   26.415554] c0  r8:4000406a r7:c0ba09a8 r6:c0f59ec4 r5:c0ebd8c4 r4:c0d54000
<4>[   26.415610] c0 r3:c099a7ec
<4>[   26.415641] c0 [<c064bc7c>] (cpu_idle+0x0/0x100) from [<c0b83238>] (rest_init+0x8c/0xa4)
<4>[   26.415694] c0  r7:c1b51180 r6:c0f59e00 r5:00000002 r4:c0d54000
<4>[   26.415752] c0 [<c0b831ac>] (rest_init+0x0/0xa4) from [<c00089c4>] (start_kernel+0x2dc/0x330)
<4>[   26.415807] c0  r5:c063d944 r4:c0eb5d34
<4>[   26.415845] c0 [<c00086e8>] (start_kernel+0x0/0x330) from [<40008044>] (0x40008044)
 
Lanchon
Old
#522  
Senior Member
Thanks Meter 148
Posts: 349
Join Date: Jun 2011
lol, amazing that android boots at all!

still, a synthetic workload would be better imho. we could run it on stock and other roms and see if any is not affected. i don't have a 4210 device so i can't test anything. please advise if you'd like help coming up with a suitable workload.
OnePlus One
If you would like to thank me, a OnePlus One invite would be a great way to do it

Dropbox Replacement
10 times as much space: 20GB for free (if you use this referral) instead of 2. Sync for Linux, Mac and Windows. Android and iOS apps.
 
HippyTed
Old
(Last edited by HippyTed; 9th September 2014 at 02:21 AM.)
#523  
Member
Thanks Meter 34
Posts: 88
Join Date: Jul 2012
Location: Nottingham
Quote:
Originally Posted by zeitferne View Post
The stack protector was actually a great idea! CM smdk4412 kernel on my SGS2 i9100 with the stack protector enabled (the normal kernel option CONFIG_CC_STACKPROTECTOR=y) crashes right while booting (full(er) last_kmsg):
[kernel panic backtrace snipped; see post #521]
Sorry if I'm teaching a grandmother how to suck eggs! I see dhd_write_macaddr() comes from drivers/net/wireless/bcmdhd/dhd_custom_sec.c which ends up in the dhd.ko kernel module.

Trying to run
prebuilt/linux-x86/toolchain/arm-eabi-4.4.3/arm-eabi/bin/objdump -d out/target/product/i9100/system/lib/modules/dhd.ko | less

The output doesn't seem to tie up with the backtrace for me. Maybe I'm using a different compiler than you, and/or the stack protector has changed the addresses.
Have you managed to see why it derps? I don't really know ARM assembly language much, anyway.
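
One thing that may help when matching the backtrace against the disassembly: objdump -d on an unloaded dhd.ko prints section-relative addresses, so the bf0... runtime addresses from the panic will not appear there. What can be compared is the offset inside the function. The small sketch below (symbol_offset is a made-up helper name, not part of any tool) recovers that offset from two addresses taken from the log above: the faulting address bf013a2c and the start of dhd_write_macaddr, bf013748.

```c
#include <assert.h>

/* Hypothetical helper: offset of a runtime address inside a function,
 * given the function's runtime start address. Both values are read
 * straight from the backtrace. */
unsigned long symbol_offset(unsigned long addr, unsigned long func_start)
{
    assert(addr >= func_start);   /* the address must lie at or after the start */
    return addr - func_start;
}
```

symbol_offset(0xbf013a2c, 0xbf013748) gives 0x2e4, which agrees with the dhd_write_macaddr+0x2e4/0x310 entry in the backtrace. The instruction at that offset will only line up with your disassembly if the module was built with the same compiler and options.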
 
Lanchon
Old
#524  
Senior Member
Thanks Meter 148
Posts: 349
Join Date: Jun 2011
Quote:
Originally Posted by HippyTed View Post
Sorry if I'm teaching a grandmother how to suck eggs! I see dhd_write_macaddr() comes from drivers/net/wireless/bcmdhd/dhd_custom_sec.c which ends up in the dhd.ko kernel module.

Trying to run
prebuilt/linux-x86/toolchain/arm-eabi-4.4.3/arm-eabi/bin/objdump -d out/target/product/i9100/system/lib/modules/dhd.ko | less

The output doesn't seem to tie up with the backtrace for me. Maybe I'm using a different compiler than you, and/or the stack protector has changed the addresses.
Have you managed to see why it derps? I don't really know ARM assembly language much, anyway.
the failure is probably non-deterministic and happens when a hardware interrupt is triggered. by the way @zeitferne, can you confirm that the kernel dies in a random place each time?
 
zeitferne
Old
#525  
Junior Member
Thanks Meter 21
Posts: 12
Join Date: Jul 2014
Quote:
Originally Posted by Lanchon View Post
the failure is probably non-deterministic and happens when a hardware interrupt is triggered. by the way @zeitferne, can you confirm that the kernel dies in a random place each time?
Sorry, this is a very deterministic and probably unrelated bug. In drivers/net/wireless/bcmdhd/dhd_custom_sec.c:11032 replace
Code:
char buf[18]   = {0};
with
Code:
char buf[sizeof("00:11:22:33:44:55\n")]   = {0};
(where the sizeof expression evaluates to 19).
This was "just" an off-by-one error in the wireless driver initialization, causing the sprintf in line 1110 to write a rogue null byte.
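
To make the arithmetic concrete, here is a minimal standalone sketch. It is not the driver's code: format_mac_line and MAC_LINE_SIZE are names invented for this example, and the format string is only assumed to resemble what the driver prints. It uses snprintf so that the undersized case is reported through the return value instead of overflowing.

```c
#include <assert.h>
#include <stdio.h>

/* sizeof on a string literal counts the trailing NUL:
 * 17 characters of MAC + '\n' + '\0' = 19 bytes. */
#define MAC_LINE_SIZE sizeof("00:11:22:33:44:55\n")

/* Hypothetical helper: format a MAC address followed by a newline.
 * Returns the number of characters snprintf wanted to write,
 * not counting the terminating NUL. */
int format_mac_line(char *buf, size_t size, const unsigned char mac[6])
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%02X:%02X:%02X:%02X:%02X:%02X\n",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

With a 19-byte buffer the call returns 18 and the whole line fits. With an 18-byte buffer it still returns 18, which is not smaller than the buffer size, so the output was truncated. A plain sprintf has no size argument and would instead put the NUL one byte past the end of the buffer, which is the kind of write the stack protector flagged.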

After changing this, the kernel boots fine, but still exhibits the sdcard bug. Even enabling -fstack-protector-all (which required changing the option from -fstack-protector in arch/arm/Makefile:41 and adding -fno-stack-protector at arch/arm/boot/compressed/Makefile:103 to avoid linker errors) does not find any more stack corruptions, even while triggering the sdcard bug.
 
Lanchon
Old
#526  
Senior Member
Thanks Meter 148
Posts: 349
Join Date: Jun 2011
Quote:
Originally Posted by zeitferne View Post
Sorry, this is a very deterministic and probably unrelated bug. In drivers/net/wireless/bcmdhd/dhd_custom_sec.c:11032 replace
Code:
char buf[18]   = {0};
with
Code:
char buf[sizeof("00:11:22:33:44:55\n")]   = {0};
(where the sizeof expression evaluates to 19).
This was "just" an off-by-one error in the wireless driver initialization, causing the sprintf in line 1110 to write a rogue null byte.

After changing this, the kernel boots fine, but still exhibits the sdcard bug. Even enabling -fstack-protector-all (which required changing the option from -fstack-protector in arch/arm/Makefile:41 and adding -fno-stack-protector at arch/arm/boot/compressed/Makefile:103 to avoid linker errors) does not find any more stack corruptions, even while triggering the sdcard bug.
well the kernel might not be the best place to stack-protect. i guess building sdcard.c with stack protect should trigger the protection fault. but sdcard is not portable, we should find a better workload.
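
Purely as an illustration of the idea, and not a tested reproducer: a portable workload could keep a known pattern on the stack while the process does other work, then check whether the pattern survived. Everything in the sketch below is an assumption (the names, the buffer size, the pattern byte), and the busy loop is only a stand-in for whatever SD card I/O actually triggers the problem.

```c
#include <assert.h>
#include <stddef.h>

#define PATTERN_BYTE 0xA5   /* arbitrary marker value (assumption) */
#define BUF_SIZE     256    /* arbitrary buffer size (assumption) */

/* Stand-in for real work; a real workload would do SD card I/O here. */
static unsigned long busy_work(unsigned long iterations)
{
    volatile unsigned long acc = 0;
    for (unsigned long i = 0; i < iterations; i++)
        acc += i ^ (acc << 1);
    return acc;
}

/* Fill a stack buffer with a known pattern, keep it live across the
 * work, then count the bytes that no longer match. Returns 0 if the
 * buffer is intact. */
size_t stack_pattern_errors(unsigned long iterations)
{
    volatile unsigned char buf[BUF_SIZE];
    size_t errors = 0;

    assert(iterations > 0);

    for (size_t i = 0; i < BUF_SIZE; i++)
        buf[i] = PATTERN_BYTE;

    (void)busy_work(iterations);

    for (size_t i = 0; i < BUF_SIZE; i++)
        if (buf[i] != PATTERN_BYTE)
            errors++;

    return errors;
}
```

On a healthy system stack_pattern_errors should always return 0. A nonzero count on an affected device would point towards memory being overwritten behind the program's back, though a clean run would not rule that out.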
 
zeitferne
Old
(Last edited by zeitferne; 9th September 2014 at 10:35 AM.)
#527  
Junior Member
Thanks Meter 21
Posts: 12
Join Date: Jul 2014
Quote:
Originally Posted by Lanchon View Post
well the kernel might not be the best place to stack-protect. i guess building sdcard.c with stack protect should trigger the protection fault. but sdcard is not portable, we should find a better workload.
Well, I don't know very much about Android or Linux kernel development, but I can certainly test things and probably also build them. If you come up with something, I would be happy to try it. If I should also test it on Stock, it would be best in APK form and without requiring root, so that I can test it on my brother's SGS2 without flashing around (I can't get nandroid to restore one ROM from the other).

But I also think that more important than checking if the bug appears on Stock or on the 4210 kernel is actually finding the bug. There are so many differences between these kernels that I don't know if this would be very much help in locating the bug. Still interesting to know though.

EDIT: I submitted the wireless driver fix to CyanogenMod's Gerrit: http://review.cyanogenmod.org/#/c/72657/
 
baldaz
Old
#528  
Junior Member
Thanks Meter 0
Posts: 4
Join Date: Sep 2012
Since we are in the OmniROM section of the forum, why not also submit this patch to OmniROM's Gerrit?
 
zeitferne
Old
#529  
Junior Member
Thanks Meter 21
Posts: 12
Join Date: Jul 2014
Quote:
Originally Posted by baldaz View Post
Since we are in the OmniROM section of the forum, why not also submit this patch to OmniROM's Gerrit?
I would need to download the whole repo, which takes hours with my bandwidth. Also, I haven't tested with OmniROM (you can never be sure, although this change should be more than safe). But anyone can commit this change to OmniROM; git commit even has an --author option for giving proper credit (although this change does not hold much original value).
 
Lanchon
Old
#530  
Senior Member
Thanks Meter 148
Posts: 349
Join Date: Jun 2011
Quote:
Originally Posted by zeitferne View Post
I also think that more important than checking if the bug appears on Stock or on the 4210 kernel is actually finding the bug. There are so many differences between these kernels that I don't know if this would be very much help in locating the bug.
if we have source without the bug (say, sammy's) then *maybe* we could find the bug in the current kernel. if all kernels have the bug, then this is probably a hardware issue for which no config or toolchain workaround (aka errata) was ever developed and could even be impossible to develop. think defective hardware, and since sammy is unlikely to replace old S2s, class action