[KERNEL] [KEXEC] Kernel EXECution for locked devices [N900V] [WIP]

CalcProgrammer1

Senior Member
Oct 8, 2007
649
756
0
Kansas City
I just loaded a new test kernel to play with. This time I used the standard cm-11.0 master branch (no hardboot or other changes) with CONFIG_KEXEC and CONFIG_PROC_DEVICETREE set, which I think should be enough for a non-hardboot kexec. My initramfs is very basic: it just opens a busybox ash shell on the default console (the serial port in this case) and doesn't even use the screen. I then mounted my /data partition (mmcblk0p25), where I keep my guest kernel, and ran kexec -e. I got the following errors on the console, followed by a watchdog-triggered reboot.

http://pastebin.com/uj3cDC3g

Here is the disassembly of the machine_kexec function:

Code:
 1dc:   00000000        .word   0x00000000
 1e0:   00000074        .word   0x00000074

000001e4 <machine_kexec>:
 1e4:   e92d40f8        push    {r3, r4, r5, r6, r7, lr}
 1e8:   e1a05000        mov     r5, r0
 1ec:   ebfffffe        bl      0 <arch_kexec>
 1f0:   e59f2098        ldr     r2, [pc, #152]  ; 290 <machine_kexec+0xac>
 1f4:   e5950014        ldr     r0, [r5, #20]
 1f8:   e5954000        ldr     r4, [r5]
 1fc:   e5926000        ldr     r6, [r2]
 200:   e3c44eff        bic     r4, r4, #4080   ; 0xff0
 204:   e3c4400f        bic     r4, r4, #15
 208:   e0666000        rsb     r6, r6, r0
 20c:   ebfffffe        bl      0 <page_address>
 210:   e5953010        ldr     r3, [r5, #16]
 214:   e1a07000        mov     r7, r0
 218:   e1a062c6        asr     r6, r6, #5
 21c:   e59f2070        ldr     r2, [pc, #112]  ; 294 <machine_kexec+0xb0>
 220:   e1a06606        lsl     r6, r6, #12
 224:   e5823000        str     r3, [r2]
 228:   e59f2068        ldr     r2, [pc, #104]  ; 298 <machine_kexec+0xb4>
 22c:   e2433a07        sub     r3, r3, #28672  ; 0x7000
 230:   e5824000        str     r4, [r2]
 234:   e59f2060        ldr     r2, [pc, #96]   ; 29c <machine_kexec+0xb8>
 238:   e5921000        ldr     r1, [r2]
 23c:   e59f205c        ldr     r2, [pc, #92]   ; 2a0 <machine_kexec+0xbc>
 240:   e5821000        str     r1, [r2]
 244:   e59f2058        ldr     r2, [pc, #88]   ; 2a4 <machine_kexec+0xc0>
 248:   e59f1058        ldr     r1, [pc, #88]   ; 2a8 <machine_kexec+0xc4>
 24c:   e5823000        str     r3, [r2]
 250:   e59f3054        ldr     r3, [pc, #84]   ; 2ac <machine_kexec+0xc8>
 254:   e5932000        ldr     r2, [r3]
 258:   ebfffffe        bl      0 <memcpy>
 25c:   e1a00007        mov     r0, r7
 260:   e2871a01        add     r1, r7, #4096   ; 0x1000
 264:   ebfffffe        bl      0 <v7_coherent_kern_range>
 268:   e59f0040        ldr     r0, [pc, #64]   ; 2b0 <machine_kexec+0xcc>
 26c:   ebfffffe        bl      0 <printk>
 270:   e59f303c        ldr     r3, [pc, #60]   ; 2b4 <machine_kexec+0xd0>
 274:   e5933004        ldr     r3, [r3, #4]
 278:   e3530000        cmp     r3, #0
 27c:   0a000000        beq     284 <machine_kexec+0xa0>
 280:   e12fff33        blx     r3
 284:   e1a00006        mov     r0, r6
 288:   e8bd40f8        pop     {r3, r4, r5, r6, r7, lr}
 28c:   eafffffe        b       0 <soft_restart>
        ...
 

hsbadr

Inactive Recognized Developer
May 18, 2014
3,930
22,397
0
It looks like a fault occurs on an invalid address (offending address: c010be10) and the kernel then prints the oops message. Check the debugging suggestions here!
 

CalcProgrammer1

Tracked it down further. It crashes after "preparing kexec_start_address" is printed in the following code:

Code:
void machine_kexec(struct kimage *image)
{
	unsigned long page_list;
	unsigned long reboot_code_buffer_phys;
	void *reboot_code_buffer;
	unsigned int temp;

	printk(KERN_EMERG "machine_kexec: calling arch_kexec\n");
	arch_kexec();

	printk(KERN_EMERG "machine_kexec: getting page list\n");
	page_list = image->head & PAGE_MASK;

	printk(KERN_EMERG "machine_kexec: getting reboot code buffer physical address\n");
	/* we need both effective and real address here */
	reboot_code_buffer_phys =
	    page_to_pfn(image->control_code_page) << PAGE_SHIFT;
	printk(KERN_EMERG "machine_kexec: getting reboot code buffer virtual address\n");
	reboot_code_buffer = page_address(image->control_code_page);

	printk(KERN_EMERG "machine_kexec: preparing parameters for reboot code buffer\n");
	printk(KERN_EMERG "machine_kexec: testing image->start pointer\n");
	temp = image->start;
	printk(KERN_EMERG "machine_kexec: tested image->start value: %X\n", temp);

	printk(KERN_EMERG "machine_kexec: preparing kexec_start_address\n");
	/* Prepare parameters for reboot_code_buffer*/
	kexec_start_address = image->start;
	printk(KERN_EMERG "machine_kexec: preparing kexec_indirection_page\n");
	kexec_indirection_page = page_list;
	printk(KERN_EMERG "machine_kexec: preparing kexec_mach_type\n");
	kexec_mach_type = machine_arch_type;
	printk(KERN_EMERG "machine_kexec: preparing kexec_boot_atags\n");
	kexec_boot_atags = image->start - KEXEC_ARM_ZIMAGE_OFFSET + KEXEC_ARM_ATAGS_OFFSET;

	printk(KERN_EMERG "machine_kexec: copying kernel relocation code to control code page\n");

	/* copy our kernel relocation code to the control code page */
	memcpy(reboot_code_buffer,
	       relocate_new_kernel, relocate_new_kernel_size);

	printk(KERN_EMERG "machine_kexec: flushing icache\n");
	flush_icache_range((unsigned long) reboot_code_buffer,
			   (unsigned long) reboot_code_buffer + KEXEC_CONTROL_PAGE_SIZE);
	printk(KERN_EMERG "Bye!\n");

	if (kexec_reinit)
		kexec_reinit();

	soft_restart(reboot_code_buffer_phys);
}
Edit: I added more print statements and a test. image->start is valid and prints value=0x8000, so the problem must be the write to kexec_start_address. It has to be failing on that line, or else the next message would be printed. kexec_start_address is an extern, so I'll have to track down where it's defined (probably an asm file somewhere).

Edit: Looks like we need to use mem_text_write_kernel_word() to set these values, same way the hardboot patch does it. Not sure why.
 

ryanbg

Inactive Recognized Developer
Jan 3, 2008
855
1,734
0
movr0.com
@CalcProgrammer1 @hsbadr

Here is the original NC2 .config. I compiled a kernel module to generate the retail NC2 config. Using this, you can compile from the NC2 source without the mismatched symbol CRCs that prevent kernel modules from loading (the config in the Hashcode/Samsung source was different). Enjoy!
 

hsbadr

Thanks @ryanbg! Glad you're contributing here! I already patched all the configs, and the modules load successfully into the NC2 & NC4 kernels. The current status is debugging the guest kernel boot. @CalcProgrammer1 is working on another device/problem; he wants to dual/multi-boot different OSs (including Debian) on the N900T via kexecboot.
 

CalcProgrammer1

What's the status of your kexec? Have you attempted to load a guest kernel on the N900V? If so, did you get a log of what happened up to the reboot? With the changes I mentioned in my previous post, it gets all the way through machine_kexec(), prints "Bye", and then my console stops responding, followed by a reboot 5 seconds later. I assume it's attempting to boot the new kernel and failing, so the watchdog triggers a reboot.

What might be useful is figuring out the physical address of the UART transmit register. As long as the driver doesn't shut down the UART, we should be able to write to it directly from ASM functions to transmit bytes to the serial console, even if it's just single-letter messages to show that something gets called. If you haven't already, I'd recommend building a UART cable so you can get a serial console. All you need is a USB micro cable you don't mind cutting apart, a TTL-level serial adapter (cheap on eBay), and a 615K ohm resistor (RadioShack; add values in series to make 615K). It's been super useful for debugging kernel messages.

Another thing I could do is apply your module-based kexec patches to my N900T kernel and test with your kexec method so we're all debugging the same thing. Better to have one working solution than three failing ones.

EDIT:

Regarding my crash, it appears a configuration option that protects kernel memory is causing it. Apparently turning off CONFIG_STRICT_MEMORY_RWX solves the issue, so I'm going to try it. I'm also going to pull in new commits for machine_kexec.c from mainline, as the Note 3 kernel appears to be quite a bit behind judging by the git logs.

EDIT 2:

That change worked: leaving CONFIG_STRICT_MEMORY_RWX unset eliminates the crash. This doesn't make kexec work, but the function at least makes it to the actual reset. On my serial console the last message I get (if I change KERN_INFO to KERN_EMERG in machine_kexec.c) is "Bye", then it reboots a few seconds later.

EDIT 3:

Going to do a test to see if the guest kernel is being loaded at all. I'm going to add a few lines in head.S to write a value to a defined location outside of normally-used memory (like way out in the far end of RAM) and see if I can read the value back after a reboot. To test this...test...I added code to head.S of the host kernel that sets the 32-bit word at address 0x50000000 to 0xFFFFFFFF. Booting up normally (since this is the host, not the guest) produces the following:

Code:
/ # viewmem 0x50000000 0x100 > test.bin
[INFO] Reading 256 bytes at 0x50000000...
/ # hexdump test.bin 
0000000 ffff ffff 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0008 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
0000080 0000 8000 0000 0000 0000 0000 0000 0000
0000090 0000 0000 0000 0000 0000 0000 0000 0000
*
0000100
So it should work, just have to use this zImage as guest instead of host. But I'm done for tonight, so I'll continue testing this tomorrow or over the weekend.
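For anyone following along without viewmem, the read-back step can be approximated with a few lines of C. This is only a sketch: `read_word_at` and `marker_present` are made-up helper names, it assumes physical memory is readable through a file such as /dev/mem (a kernel built with CONFIG_STRICT_DEVMEM may refuse that for RAM), and I have not checked how viewmem itself is implemented.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one 32-bit word at byte offset `addr` of `path`.
 * With path = "/dev/mem" the offset is a physical address (assumption: the
 * kernel allows /dev/mem reads of that region).
 * Returns 0 on success, -1 on any error or short read.
 */
int read_word_at(const char *path, off_t addr, uint32_t *out)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ssize_t n = pread(fd, out, sizeof(*out), addr);
	close(fd);

	return n == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* Returns 1 if the word at `addr` equals `marker`, else 0 (also on error). */
int marker_present(const char *path, off_t addr, uint32_t marker)
{
	uint32_t word = 0;

	if (read_word_at(path, addr, &word) != 0)
		return 0;
	return word == marker;
}
```

The check matching the test above would be `marker_present("/dev/mem", 0x50000000, 0xFFFFFFFFu)`, run once after a normal boot and once after the kexec attempt.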

Edit 4: The additional asm code in head.S:

Code:
		mov	r8, r2			@ save atags pointer

		ldr	r9, =0x50000000
		ldr	r10, =0xFFFFFFFF
		str	r10, [r9, #0]

#ifndef __ARM_ARCH_2__
Edit 5: Forget weekend, let's do this!

Edit 6: :D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D:D (I think you can figure out what happened!) More to come this weekend as I'll dig deeper with this memory tagging approach.
 

hsbadr

Good luck bro! The status of my kexec is similar to yours & probably other implementations (debugging some memory errors when booting the guest kernel). I'll send you a complete log to see if you can suggest a solution. I've been very busy in my RL/work & releasing 2 ROMs/ports for 2 different devices :)
 

ryanbg

Maybe all three of us should get together in Google Hangouts or something and hammer this thing out once and for all. Three minds are better than one! I still need to pick up a PL2302 or some TTL adapter other than my Bus Pirate.
 

hsbadr

I agree. I've also added both of you as contributors. I feel that we can easily make it for the unlocked N900T & hopefully other devices will follow.
 

ryanbg

Thanks! :) We should get a repo going on github so we can share code more efficiently and have a central location to report/fix bugs. Surely lots of great things on the road ahead.
 

KennyG123

Senior Moderator / Moderator Committee / Spider-Mo
Staff member
Nov 1, 2010
39,179
51,633
263
Right behind you!
Maybe all three of us should get together in Google Hangouts or something and hammer this thing out once and for all. Three minds are better than one! I still need to pick up a PL2302 or some TTL adapter other than my Bus Pirate.
Or you guys can hammer it out right here for all to see and learn. This is the reason and hopes of developer only sections. Maybe a 4th person will follow along, get an idea and join you 3. Keep up the great work all.
 

hsbadr

Thanks! :) We should get a repo going on github so we can share code more efficiently and have a central location to report/fix bugs. Surely lots of great things on the road ahead.
Yep, let's do it :) First we need to decide which implementation to develop on. I've been testing 3 implementations (Hash's kexec & 2 other kexec-mod sources from Sony devs). All of them build & load into the NC2/4 kernels with a few patches I made to the configs. It'd also be great to settle on the host/guest kernels (say the NC2 kernel as host & ? as guest); we might make a custom guest kernel too. I'm busy at work now, but we could start our Hangouts discussions this weekend.

Or you guys can hammer it out right here for all to see and learn. This is the reason and hopes of developer only sections. Maybe a 4th person will follow along, get an idea and join you 3. Keep up the great work all.
THANKS!

EDIT: we can post all technical information/discussions here; Hangouts chat will make communication easier & this thread will be cleaner & easy to follow.
 

hsbadr

Ok, let's build on Hashcode's NC2 kernel source, which includes commits for building kexec as a module + other kexec patches. I'll clone the VZW NC2 branch to the new repo here. @CalcProgrammer1 may create a branch for TMO with any required patches OR continue working on his own sources & add commits to VZW branch if necessary.

I can also upload the other kexec-mod sources & Samsung source code for N900V KK kernel to Github, if you wish.
 

CalcProgrammer1

If the kexec-mod changes are all in individual commits that don't change anything else I should just be able to cherry pick them into the CM11 kernel that I'm testing on so I can mess with kexec mod as well. Ultimately though, if it turns out we're all getting the guest kernel loaded, the issue is mainly the guest which we can all use the CM kernel for (it should work on N900V as the developer edition is officially supported).
 

hsbadr

Makes sense! I've created a new repo here & will clone the forked source + my own commits ASAP.
 

CalcProgrammer1

Looking at this code:

Code:
		mov	r8, r2			@ save atags pointer

		ldr	r9, =0x50000000
		str	r0, [r9, #0]
		str	r1, [r9, #4]
		str	r2, [r9, #8]
		ldr	r10, =0xFFFFFFFF
		str	r10, [r9, #12]

#ifndef __ARM_ARCH_2__
produces this output:

Code:
/ # hexdump test.bin
0000000 0000 0000 ffff ffff 1000 0000 ffff ffff
If these articles are correct:

http://www.simtec.co.uk/products/SWLINUX/files/booting_article.html#d0e428
https://www.kernel.org/doc/Documentation/arm/Booting

then r0 should be zero (it is), r1 should be machine type, and r2 should be the address of the parameter list (or device tree image in our case). I don't think machine type should be 0xFFFFFFFF. I don't know what it should be though.

Edit: This is what I get when I flash the kernel and boot it via bootloader:

Code:
/ # hexdump test.bin 
0000000 0000 0000 0000 0000 0000 0270 ffff ffff
The device tree location is different but that's probably OK if kexec is actually copying the dt image there, but the machine ID is 0 on bootloader and FFFFFFFF on kexec? That doesn't sound right at all.

Edit again: I don't know what the byte/bit order is on the hexdump printout but I'm guessing it's not 0x12345678. I'll change my dummy value to that and see what it is.

Edit: I edited machine_kexec.c to always return 0 for machine_arch_type (which is the value that ends up in r1). It successfully changed the result booting the guest kernel but that didn't fix whatever's blocking the guest from booting.

Code:
/ # hexdump test.bin 
0000000 0000 0000 0000 0000 1000 0000 ffff ffff
Edit:

Going by the bootloader log (left from sec_log at 0x10000008) we can see that the tags address is 0x2700000:

Code:
reboot_mode = 0x12345678, boot_mode = 0, por = 0x20040
LK Area: 0xF800000 - 0xFA00000
Kernel Addr: 0x8000
Ramdisk Addr: 0x2900000
Second Addr: 0xF00000
Tags Adddr: 0x2700000
This corresponds with the 0000 0270 in my recorded data, so the 16-bit word order appears reversed (0270 0000).
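To make the word-order point concrete, here is a small sketch that turns the 16-bit groups printed by plain hexdump back into the 32-bit register values that head.S stored. It assumes a little-endian CPU and that each register occupies two consecutive groups, low half first; `regs_from_dump` is just an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rebuild 32-bit little-endian values from the 16-bit groups that plain
 * `hexdump` prints. On a little-endian host, the first group of each pair
 * is the low half of the word and the second group is the high half.
 * `groups` holds 2 * nregs entries; results are written to `regs`.
 */
void regs_from_dump(const uint16_t *groups, size_t nregs, uint32_t *regs)
{
	for (size_t i = 0; i < nregs; i++) {
		uint32_t lo = groups[2 * i];
		uint32_t hi = groups[2 * i + 1];

		regs[i] = lo | (hi << 16);
	}
}
```

Feeding it the bootloader dump (0000 0000 0000 0000 0000 0270) gives r0 = 0, r1 = 0, r2 = 0x02700000, which matches the tags address in the log. The kexec dump (0000 0000 ffff ffff 1000 0000) gives r2 = 0x00001000, which is what image->start - KEXEC_ARM_ZIMAGE_OFFSET + KEXEC_ARM_ATAGS_OFFSET in machine_kexec() works out to when image->start is 0x8000 (the disassembly shows the net offset as sub r3, r3, #0x7000). If that reading is right, the guest is being handed the ATAGS-style address rather than wherever kexec placed the DTB segment.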

Here's what prints when loading kexec:

Code:
/ # kexec --load /mnt/debian/boot/zImage --append=`cat /proc/cmdline` --dtb=/mnt
/boot/dt.img --initrd=/mnt/boot/initrd.img
kernel: 0xb6ad0008 kernel_size: 517538
DTB: Using DTB from file /mnt/boot/dt.img
DTB: platform 2114060033 hw 7 soc 0x0 board 3087138561
DTB: match 0x0 3087138561, my id 0x0 3087138561, len 217355
DTB: add dtb segment 0x14f4000
kexec_load: entry = 0x8000 flags = 280000
nr_segments = 3
segment[0].buf   = 0xb6ad0008
segment[0].bufsz = 517538
segment[0].mem   = 0x8000
segment[0].memsz = 518000
segment[1].buf   = 0xb6a3a008
segment[1].bufsz = 9568b
segment[1].mem   = 0x145e000
segment[1].memsz = 96000
segment[2].buf   = 0xb68c1008
segment[2].bufsz = 3550b
segment[2].mem   = 0x14f4000
segment[2].memsz = 36000
This is getting longer and longer as I type out my train of thought...

Reading this: https://www.kernel.org/doc/Documentation/arm/Booting says that the device tree should start with the magic value 0xd00dfeed (lolwut?) and according to viewmem, it still does even after Android is booted:

Code:
[email protected]:/ # /data/local/viewmem 0x2700000 0x100 | hexdump                                                                                                                                                                              
[INFO] Reading 256 bytes at 0x2700000...
0000000 0dd0 edfe 0300 1055 0000 3800 0200 98f3
0000010 0000 2800 0000 1100 0000 1000 0000 0000
0000020 0000 7861 0200 60f3 0000 0000 0000 0000
I wonder if I can bypass the trouble of kexec-loading a dtb altogether and just reuse the one that's already there. Also it appears I was wrong about byte order, here it's 0x12345678 --> 0x34127856 (or Android ABI uses a different byte ordering idk).
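As a sanity check for "is there still a device tree at that address", the first four bytes can be compared against the magic number. A sketch, assuming the buffer holds the bytes exactly as they sit in memory: the flattened device tree header is stored big-endian, so the value is assembled byte by byte rather than read as a native word. `be32_at` and `fdt_magic_ok` are illustrative names, not kernel or libfdt functions.

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu	/* magic value from Documentation/arm/Booting */

/* Assemble a big-endian 32-bit value from four bytes in memory order. */
uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 if `buf` (at least 4 bytes) starts with the device tree magic. */
int fdt_magic_ok(const uint8_t *buf, size_t len)
{
	return len >= 4 && be32_at(buf) == FDT_MAGIC;
}
```

The groups 0dd0 edfe in the dump above correspond to the bytes d0 0d fe ed in memory (assuming plain hexdump's little-endian 16-bit grouping), so this check would pass on that buffer.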
 

hsbadr

I'll add a quick comment for now & extend my reply later, after reading the linked articles. The byte order is OK & it's reversed as a result of big/little-endian conversion. For example, you can see this when you check an address printed by "modprobe --dump-modversions" (list of module versioning information) & the address obtained from hexdump/hexedit for any kernel module. To test it yourself, pick any working kernel module xxx.ko from /system/lib/modules & run the following:
Code:
modprobe --dump-modversions xxx.ko
Then open the same kernel module in any hex editor & search for any header printed by the command above. You'll see the reversed byte order due to the endian conversion. It works as follows:
Code:
_atomic	0x70616d6b189dab8d
from "modprobe --dump-modversions" is equivalent to
Code:
8dab9d186b6d6170
in the hex editor.
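
To make the two byte-order effects in this thread concrete, here is a small Python sketch (the helper name is made up for illustration, it isn't from any tool). It assumes a little-endian host, which is the usual case on ARM Android devices. Plain hexdump prints 16-bit words in host byte order, so each pair of bytes appears swapped, and a 64-bit value stored by a little-endian machine shows up fully byte-reversed in a hex editor:

```python
import struct

def hexdump_words(data):
    """Mimic plain `hexdump` on a little-endian host: bytes grouped as 16-bit LE words."""
    words = struct.unpack("<%dH" % (len(data) // 2), data)
    return " ".join("%04x" % w for w in words)

# dtb fields are big-endian, so the magic 0xd00dfeed sits in memory as d0 0d fe ed.
dtb_start = bytes.fromhex("d00dfeed00035510")
print(hexdump_words(dtb_start))           # 0dd0 edfe 0300 1055

# The value printed by modprobe, written out least-significant byte first.
value = 0x70616d6b189dab8d
print(value.to_bytes(8, "little").hex())  # 8dab9d186b6d6170
```

The first line of output matches the start of the viewmem dump piped through plain hexdump, and the second matches the hex editor view.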

I hope this helps for now. I'll provide more info very soon.
 

CalcProgrammer1

I just found the -C option for hexdump which corrects the endianness issue:

Code:
[email protected]:/ # /data/local/viewmem 0x2700000 0x100 | hexdump -C
[INFO] Reading 256 bytes at 0x2700000...
00000000  d0 0d fe ed 00 03 55 10  00 00 00 38 00 02 f3 98  |......U....8....|
00000010  00 00 00 28 00 00 00 11  00 00 00 10 00 00 00 00  |...(............|
00000020  00 00 61 78 00 02 f3 60  00 00 00 00 00 00 00 00  |..ax...`........|
00000030  00 00 00 00 00 00 00 00  00 00 00 01 00 00 00 00  |................|
00000040  00 00 00 03 00 00 00 04  00 00 00 00 00 00 00 01  |................|
00000050  00 00 00 03 00 00 00 04  00 00 00 0f 00 00 00 01  |................|
00000060  00 00 00 03 00 00 00 14  00 00 00 1b 53 61 6d 73  |............Sams|
00000070  75 6e 67 20 48 4c 54 45  20 72 65 76 30 2e 37 00  |ung HLTE rev0.7.|
00000080  00 00 00 03 00 00 00 27  00 00 00 21 71 63 6f 6d  |.......'...!qcom|
00000090  2c 6d 73 6d 38 39 37 34  2d 63 64 70 00 71 63 6f  |,msm8974-cdp.qco|
000000a0  6d 2c 6d 73 6d 38 39 37  34 00 71 63 6f 6d 2c 63  |m,msm8974.qcom,c|
000000b0  64 70 00 00 00 00 00 03  00 00 00 04 00 00 00 2c  |dp.............,|
000000c0  00 00 00 01 00 00 00 03  00 00 00 24 00 00 00 3d  |...........$...=|
000000d0  7e 01 ff 01 00 00 00 07  00 00 00 00 b8 01 ff 01  |~...............|
000000e0  00 00 00 07 00 00 00 00  b9 01 ff 01 00 00 00 07  |................|
000000f0  00 00 00 00 00 00 00 01  63 68 6f 73 65 6e 00 00  |........chosen..|
00000100
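The first 40 bytes of that dump are the flattened device tree header: ten big-endian 32-bit fields, as laid out in the devicetree specification. A rough Python sketch that decodes them from the bytes shown above (the function name and the dict layout are just for illustration):

```python
import struct

FDT_MAGIC = 0xd00dfeed
FDT_HEADER_FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version",
    "boot_cpuid_phys", "size_dt_strings", "size_dt_struct",
)

def parse_fdt_header(blob):
    """Decode the 40-byte FDT header; every field is a big-endian u32."""
    values = struct.unpack(">10I", blob[:40])
    return dict(zip(FDT_HEADER_FIELDS, values))

# First 40 bytes read from 0x2700000, copied from the hexdump -C output.
header_bytes = bytes.fromhex(
    "d00dfeed00035510" "000000380002f398"
    "0000002800000011" "0000001000000000"
    "000061780002f360"
)

hdr = parse_fdt_header(header_bytes)
for name in FDT_HEADER_FIELDS:
    print("%-18s 0x%x" % (name, hdr[name]))

# Sanity checks: valid magic, and the strings block ends inside totalsize.
print(hdr["magic"] == FDT_MAGIC)
print(hdr["off_dt_strings"] + hdr["size_dt_strings"] <= hdr["totalsize"])
```

With these bytes the header reports version 17 and a total size of 0x35510 bytes, so whatever sits at 0x2700000 at least has a self-consistent dtb header.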
So again with the dtb address:
Code:
/ # viewmem 0x50000000 0x100 | hexdump -C
[INFO] Reading 256 bytes at 0x50000000...
00000000  00 00 00 00 ff ff ff ff  00 10 00 00 ff ff ff ff  |................|
00000010  03 aa 0c 99 33 46 ef f7  5b ff 0e 90 00 b1 e7 e7  |....3F..[.......|
00000020  03 9b c8 f8 00 30 0d a8  37 21 01 22 33 46 ed f7  |.....0..7!."3F..|
00000030  ac fa 18 b0 bd e8 f0 81  06 6d 0d 00 b8 f8 ff ff  |.........m......|
00000040  27 fa 0b 00 2d e9 f0 43  1e 46 33 4b 17 46 33 4a  |'...-..C.F3K.F3J|
I'm going to hard code dtb address as 0x2700000 and see what happens.

Edit: Nothing, doesn't solve the problem :(

Ok, I'm stepping through head.S now. I know that it's getting to the call to decompress_kernel, so now I'm going to see if it gets past that. If decompress_kernel fails it goes into an infinite loop which would definitely cause a watchdog reset.
 

hsbadr

Inactive Recognized Developer
May 18, 2014
3,930
22,397
0
I recall my old post on dtb, quoted below. It seems that you need to apply similar patches for the dtb address/size. "The boot loader must provide either a tagged list or a dtb image for passing configuration data to the kernel. The physical address of the boot data is passed to the kernel in register r2."
I'm still learning kexec & its challenges, but I'll try to answer your question as much as I can. To use kexec, device tree information needs to be passed to the kernel during boot, either by specifying a DTB file or by using legacy ATAGs. kexec checks that the location where it wants to load the guest kernel into memory is physically contiguous. This requires accurate memory computations + patches for kexec-tools & the machine driver. You may study the code changes in these patches (as an example) to understand how/why kexec uses DTB or ATAGs & some of the problems that make it fail on ARM, along with proposed solutions.
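
As far as I understand it, the early ARM boot code tells the two formats apart by looking at the first words at the address passed in r2: a dtb starts with the big-endian magic 0xd00dfeed, while a tagged list starts with an ATAG_CORE header. Here is a minimal Python sketch of that check. The constants are as I remember them from the ARM kernel sources, so treat them as an assumption to verify against your tree, and the function name is purely illustrative:

```python
import struct

OF_DT_MAGIC = 0xd00dfeed      # big-endian magic at the start of a dtb
ATAG_CORE = 0x54410001        # tag id of the first ATAG (assumed from kernel headers)
ATAG_CORE_SIZE = 5            # size in 32-bit words, header plus payload
ATAG_CORE_SIZE_EMPTY = 2      # size in 32-bit words, header only

def classify_boot_data(blob):
    """Guess what the first 8 bytes at the r2 address describe."""
    if len(blob) < 8:
        return "unknown"
    (magic,) = struct.unpack(">I", blob[:4])
    if magic == OF_DT_MAGIC:
        return "dtb"
    # ATAG headers are native-endian words: size first, then tag id.
    size, tag = struct.unpack("<2I", blob[:8])
    if tag == ATAG_CORE and size in (ATAG_CORE_SIZE, ATAG_CORE_SIZE_EMPTY):
        return "atags"
    return "unknown"

print(classify_boot_data(bytes.fromhex("d00dfeed00035510")))  # bytes seen at 0x2700000
print(classify_boot_data(bytes.fromhex("00000000ffffffff")))  # bytes seen at 0x50000000
print(classify_boot_data(struct.pack("<2I", ATAG_CORE_SIZE, ATAG_CORE)))
```

Run against the two dumps in this thread, the bytes at 0x2700000 classify as a dtb and the bytes at 0x50000000 as neither, which fits the guest kernel not finding usable boot data at the latter address.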