eMMC sudden death research

Oranav · Jan 22, 2013

As far as I can tell right now, it isn't caused by flash wear or anything like that. It seems to be caused by a bug that is triggered in a very specific case, which then makes the device corrupt its internal structures or its firmware - I'm not sure which one yet.

The specific bug is that they don't check the return value of some function returning a pointer, which may be NULL. It then leads to a NULL pointer dereference which corrupts things.

So, as far as it seems currently, there is no negative effect of using an unpatched kernel (except for the risk of it suddenly dying, of course).

By the way, it's worth noting that the firmware actually resides on the flash itself. There is a very small boot ROM (which is probably a mask ROM) that loads the firmware out of the NAND device.
Why am I mentioning this? It means that a bug in the firmware may actually corrupt the firmware itself, bricking the device.

Sent from my GT-I9300 using xda app-developers app

liamR · Jan 22, 2013

That is awesome research. Assuming that Samsung just made a quick "fix" with the new kernels (and it does cause random freezes), do you think that they can make a proper fix without side effects?

Assuming they have known all about it since the SGS2 and it still affects the SGS3, this makes Samsung a terrible company.

Oranav · Jan 22, 2013

liamR said:
Assuming that Samsung just made a quick "fix" with the new kernels (and it does cause random freezes), do you think that they can make a proper fix without side effects?

Absolutely yes.


odoto · Jan 23, 2013

Oranav said:
Absolutely yes.


The question is, will they be smart enough to actually do it?

(I'm not talking about their engineers, but their management)

Rob2222 · Jan 24, 2013

@Oranav:
Do you know if the fix is applied in download mode, too?
I assume that download mode does _not_ load a kernel or recovery, so it would follow that the eMMC is not protected in download mode.
Could that be?

BR
Rob

Product F(RED) · Jan 24, 2013

Rob2222 said:
@Oranav:
Do you know if the fix is applied in download mode, too?
I assume that download mode does _not_ load a kernel or recovery, so it would follow that the eMMC is not protected in download mode.
Could that be?

BR
Rob

You have to have a kernel. I'm sure it shares the recovery kernel since the recovery kernel is basically a backup/fail-safe kernel.

Rob2222 · Jan 24, 2013

Product F(RED) said:
You have to have a kernel. I'm sure it shares the recovery kernel since the recovery kernel is basically a backup/fail-safe kernel.

I am not sure about this. From my understanding, the (second?) bootloader already has eMMC and display drivers. So enough parts are already initialized to make the eMMC available for USB access. There is no real need to load the kernel for that.

If download mode needed the kernel/recovery, it would not be available after you flashed a wrong kernel/recovery. And if I remember right, I've seen wrong kernel and wrong recovery flashes get repaired just by flashing the correct kernel/recovery, so download mode was still working.

BR
Rob

Product F(RED) · Jan 24, 2013

Rob2222 said:
I am not sure about this. From my understanding, the (second?) bootloader already has eMMC and display drivers. So enough parts are already initialized to make the eMMC available for USB access. There is no real need to load the kernel for that.

If download mode needed the kernel/recovery, it would not be available after you flashed a wrong kernel/recovery. And if I remember right, I've seen wrong kernel and wrong recovery flashes get repaired just by flashing the correct kernel/recovery, so download mode was still working.

BR
Rob

You could be right but I know that recovery mode has its own separate kernel. That's why I thought maybe download mode shared it.

Sent from my GT-I9300 using Tapatalk 2

Oranav · Jan 24, 2013

Download mode has nothing in common with the recovery partition. It is implemented in sboot (the device's bootloader).
It has its own implementation of hardware drivers. If it doesn't patch the eMMC RAM, then it isn't safe!

However, I haven't looked into it enough yet to conclude whether it's safe or not. Right now, I'd recommend that everyone avoid flashing via download mode. Recovery and Mobile Odin (or just dd) are good enough.


AndreiLux · Jan 25, 2013

Oranav said:
Download mode has nothing in common with the recovery partition. It is implemented in sboot (the device's bootloader).
It has its own implementation of hardware drivers. If it doesn't patch the eMMC RAM, then it isn't safe!

However, I haven't looked into it enough yet to conclude whether it's safe or not. Right now, I'd recommend that everyone avoid flashing via download mode. Recovery and Mobile Odin (or just dd) are good enough.


That would explain why they upgraded the bootloader with LLA, then. The increased modification detection would just be a side effect of a newer bootloader version, which already had heightened warranty enforcement on the 9305 and the Note 2.

Entropy512 · Jan 28, 2013

AndreiLux said:
That would explain why they upgraded the bootloader with LLA, then. The increased modification detection would just be a side effect of a newer bootloader version, which already had heightened warranty enforcement on the 9305 and the Note 2.

That alone would be enough to upgrade bootloader...

I don't believe SBOOT is encrypted or compressed - just run strings on it. If you see lots of recognizable strings, but don't see the eMMC model number, you can be fairly certain the BL doesn't contain a fix.
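
For anyone without the strings tool at hand, here is a minimal Python sketch of the same check. It pulls runs of printable ASCII out of a binary blob and looks for an eMMC model string. The function names, the 4-character minimum and the sample bytes are my own assumptions for illustration; nothing here comes from a real SBOOT image.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len bytes, roughly like `strings`."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

def mentions_model(data: bytes, model: str) -> bool:
    """True if any extracted string contains the given model name."""
    return any(model in s for s in extract_strings(data))

# Made-up sample bytes standing in for a bootloader image.
sample = b"\x00\x01mmcdtest\x00\xff\x10VTU00M\x00ab\x00"
print(extract_strings(sample))
print(mentions_model(sample, "VTU00M"))
```

On a real dump you would read the file in binary mode first, e.g. `open("sboot.bin", "rb").read()`, where `sboot.bin` is just a placeholder name. A hit is a strong hint; a miss is only weak evidence that the fix is absent.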

drakester09 · Jan 28, 2013

Entropy512 said:
That alone would be enough to upgrade bootloader...

I don't believe SBOOT is encrypted or compressed - just run strings on it. If you see lots of recognizable strings, but don't see the eMMC model number, you can be fairly certain the BL doesn't contain a fix.

I ran strings in the following BL versions:
- My current EMA1
- First changed release ELLA
- ICS BL ALEF

I'm attaching the results and a diff for each version. A lot of the content is gibberish, but there are some quite interesting differences, mostly around line 1100.

Maybe it can help us understand a bit more or determine if BL plays a role.
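
If you want to reproduce this kind of comparison yourself, a rough sketch with Python's standard difflib module could look like this. The two string lists are small made-up stand-ins, not the real strings output of any bootloader version, and `diff_strings` is a name I picked for illustration.

```python
import difflib

def diff_strings(old: list[str], new: list[str],
                 old_name: str = "old", new_name: str = "new") -> list[str]:
    """Unified diff of two lists of extracted strings, one context line each side."""
    return list(difflib.unified_diff(old, new, fromfile=old_name,
                                     tofile=new_name, lineterm="", n=1))

# Made-up stand-ins for the strings output of two bootloader versions.
old = ["usb_write reg, val", "Read the usb ic register", "sdcard test command"]
new = old + ["mmcdtest"]

diff = diff_strings(old, new)
# Keep only real additions, skipping the "+++" file header line.
added = [line for line in diff if line.startswith("+") and not line.startswith("+++")]
print("\n".join(diff))
print(added)
```

Filtering for lines that start with a single `+` is a quick way to list only what the newer version added.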

AndreiLux · Jan 28, 2013

drakester09 said:
I ran strings in the following BL versions:
- My current EMA1
- First changed release ELLA
- ICS BL ALEF

I'm attaching the results and a diff for each version. A lot of the content is gibberish, but there are some quite interesting differences, mostly around line 1100.

Maybe it can help us understand a bit more or determine if BL plays a role.

There are some bootloader MMC changes, but whether they're related to SDS remains to be determined... no VTU00M string in there at least, but that still doesn't rule it out.

usb_write reg, val
Read the usb ic register
sdcard test command
+mmcdtest

They added that mmcdtest into the bootloader utility commands, I wonder what it does.

Oranav · Jan 29, 2013

I'm reversing sboot to see what has changed (no "VTU00M" string doesn't mean there's no fix).
It should be very easy since we have kernel sources (we know how to communicate with the eMMC controller - MMIO addresses etc.).

* If someone has a BinDiff license and wants to help, it'd be great!

AndreiLux said:
They added that mmcdtest into the bootloader utility commands, I wonder what it does.

It reads 0xFFC00 bytes from the eMMC boot partition and copies them to 0x50000000 (maybe this is an output buffer? I don't know yet).
I also think 0xFFC00 is the boot partition size, so it just reads it all...
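
A quick sanity check on that size, as a sketch: 0xFFC00 is exactly 1 KiB short of 1 MiB, and it is a whole number of 512-byte sectors. The 512-byte sector size is my assumption here; the variable names are just for illustration.

```python
SIZE = 0xFFC00          # bytes read by mmcdtest, per the post above
DEST = 0x50000000       # destination address, per the post above
SECTOR = 512            # assumed sector size

sectors, remainder = divmod(SIZE, SECTOR)
shortfall = 0x100000 - SIZE      # distance from a full 1 MiB
end_addr = DEST + SIZE           # first address past the copied data

print(SIZE, sectors, remainder, shortfall, hex(end_addr))
```

The odd-looking size (1 MiB minus 1 KiB) fits the guess that it is the whole boot partition minus some reserved area, but that part is speculation.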

AndreiLux · Jan 30, 2013

Oranav said:
I'm reversing sboot to see what has changed (no "VTU00M" string doesn't mean there's no fix).
It should be very easy since we have kernel sources (we know how to communicate with the eMMC controller - MMIO addresses etc.).

* If someone has a BinDiff license and wants to help, it'd be great!

It reads 0xFFC00 bytes from the eMMC boot partition and copies them to 0x50000000 (maybe this is an output buffer? I don't know yet).
I also think 0xFFC00 is the boot partition size, so it just reads it all...

The U-Boot sources might interest you, to get a general idea of how the bootloader is built up. I don't know what the practical differences between U-Boot and S-Boot will be, though, especially since the former will be outdated.

Rob2222 · Jan 30, 2013

@Oranav/All:

We have some news regarding the freezes that occur on some S3s with the new firmware...
Someone posted that he waited until the freeze was over, and I asked how long it took. He said that after 10-15 minutes the phone was back to normal without a reboot.

So when my phone froze with the screen on, I waited, and I was really surprised that after a freeze of around 23 minutes the phone just continued to work as if it had never frozen.

Maybe the eMMC has a watchdog after all.

It may be an interesting point that the phone is able to continue after such a long freeze.
Maybe we can get some information out of some log files?

For the last 2-3 days my S3 has had 5-20 freezes per day. I am on stock, unrooted XXELL5.

BR
Rob

Rob2222 · Jan 30, 2013

Got a kernel log from just after such a freeze.

I was about to power on the screen, but nothing happened. Then I waited around 10 minutes, the screen finally came up, and I dumped the log.

Is this interesting?

Full log is attached.

Code:

U/ 4002.738352  c0 [keys]PWR 1
U/ 4002.983296  c0 [keys]PWR 0
...
U/ 4587.514100  c0 mshci: ===========================================
W/ 4587.514336  c0 mmc0: it occurs a critical error on eMMC it'll try to recover eMMC to normal state
....
V/ 4587.850296  c0 mmc0: recovering eMMC has been done
...
W/ 4587.850849  c0 mmcblk0: unknown error -131 sending read/write command, card status 0x900
W/ 4587.851982  c0 end_request: I/O error, dev mmcblk0, sector 3126872
W/ 4587.852174  c0 end_request: I/O error, dev mmcblk0, sector 3126880
W/ 4587.852330  c0 end_request: I/O error, dev mmcblk0, sector 3126888
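
A rough sketch of how the numbers in this excerpt can be pulled out with Python. It measures the gap between the power key release and the first eMMC error line, and lists the failing sectors. Two assumptions on my side: the stall really spans those two lines (other lines were elided in between), and sectors are 512 bytes. The regex and variable names are only illustrative.

```python
import re

LOG = """\
U/ 4002.738352  c0 [keys]PWR 1
U/ 4002.983296  c0 [keys]PWR 0
U/ 4587.514100  c0 mshci: ===========================================
W/ 4587.514336  c0 mmc0: it occurs a critical error on eMMC it'll try to recover eMMC to normal state
V/ 4587.850296  c0 mmc0: recovering eMMC has been done
W/ 4587.851982  c0 end_request: I/O error, dev mmcblk0, sector 3126872
W/ 4587.852174  c0 end_request: I/O error, dev mmcblk0, sector 3126880
W/ 4587.852330  c0 end_request: I/O error, dev mmcblk0, sector 3126888
"""

# level/ timestamp  cpu message
LINE = re.compile(r"^(\w)/\s*(\d+\.\d+)\s+c\d+\s+(.*)$")

entries = []
for line in LOG.splitlines():
    m = LINE.match(line)
    if m:
        entries.append((float(m.group(2)), m.group(3)))

key_release = next(t for t, msg in entries if msg == "[keys]PWR 0")
first_error = next(t for t, msg in entries if "critical error" in msg)
gap = first_error - key_release            # seconds without any progress

sectors = [int(s) for s in re.findall(r"sector (\d+)", LOG)]
steps = {b - a for a, b in zip(sectors, sectors[1:])}

print(f"gap: {gap:.1f} s ({gap / 60:.1f} min)")
print("sectors:", sectors, "step:", steps)
```

The gap comes out at roughly 585 seconds, which matches the "around 10 minutes" of waiting, and the failing sectors are 8 apart, i.e. 4 KiB steps if sectors are 512 bytes.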

EDIT: Added another log. Will add more, if I get more.

BR
Rob

sodeknetters · Feb 1, 2013

Yes, so it looks like the freezes have emmc involvement indeed!

Entropy512 · Feb 1, 2013

Oranav said:
So I decided to do a small RAM dump after all.

Before the patch, 0x5C7EA reads FD F7 C2 FA, which is "BL 0x59D72".
As I thought, they replace a function call to the new one.

I will dump function 0x59D72 later this week.
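
The decoding in the quote can be checked by hand. Here is a small sketch that decodes the four bytes as a Thumb BL instruction (the 32-bit T1 encoding) located at 0x5C7EA. The function name is mine, and the sketch assumes the bytes are stored little-endian as two halfwords, which is how they are listed in the post.

```python
def decode_thumb_bl(code: bytes, address: int) -> int:
    """Decode a 4-byte Thumb BL instruction and return the branch target."""
    hw1 = int.from_bytes(code[0:2], "little")
    hw2 = int.from_bytes(code[2:4], "little")
    # First halfword: 11110 S imm10. Second halfword: 11 J1 1 J2 imm11.
    if (hw1 & 0xF800) != 0xF000 or (hw2 & 0xD000) != 0xD000:
        raise ValueError("not a Thumb BL instruction")
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)          # I1 = NOT(J1 XOR S)
    i2 = 1 - (j2 ^ s)          # I2 = NOT(J2 XOR S)
    offset = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    if s:
        offset -= 1 << 25      # sign-extend the 25-bit offset
    # In Thumb state the PC reads as the instruction address plus 4.
    return address + 4 + offset

target = decode_thumb_bl(bytes.fromhex("FDF7C2FA"), 0x5C7EA)
print(hex(target))
```

The result agrees with the "BL 0x59D72" reading in the post: the offset is -0x2A7C relative to 0x5C7EE.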

Could you perhaps post the RAM dump? I might be able to hand it to someone with a copy of hex-rays decompiler.

How big is the RAM?

I'm thinking of maybe dumping VYL00M fwrev 0x19 while I'm at it, and maybe seeing if someone else can dump 0x25.

Oranav · Feb 2, 2013

I have a Hex-Rays license. I actually do most of my reversing with it; I posted assembly code since, in my opinion, it's easier to follow for these short snippets.

I won't post a RAM dump since it contains (probably?) licensed code.
I can however post the memory map:
0x00000000 - 0x00020000 BootROM (I guess it's a mask ROM)
0x00040000 - 0x00060000 Firmware (resides in RAM, the BootROM reads it from the NAND chip itself so it's upgradable!)
0x00060000 - 0x00080000 Data (no dynamic memory there BTW)
0x20000000 - 0x20028000 eMMC interface MMIO
0x20080000 - 0x20080400 I don't know, maybe another eMMC interface MMIO?
0x40000000 - 0x40010000 NAND interface MMIO
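
The same map as a small lookup table, in case it helps when reading addresses in a disassembly. This is only a sketch: I am treating the end addresses as exclusive, and the "unknown" label and the function name are placeholders.

```python
# (start, end, name), with end treated as exclusive (an assumption)
REGIONS = [
    (0x00000000, 0x00020000, "BootROM"),
    (0x00040000, 0x00060000, "Firmware"),
    (0x00060000, 0x00080000, "Data"),
    (0x20000000, 0x20028000, "eMMC interface MMIO"),
    (0x20080000, 0x20080400, "unknown"),
    (0x40000000, 0x40010000, "NAND interface MMIO"),
]

def region_of(addr: int):
    """Return the name of the region containing addr, or None if unmapped."""
    for start, end, name in REGIONS:
        if start <= addr < end:
            return name
    return None

for start, end, name in REGIONS:
    print(f"{name:22s} {(end - start) // 1024:4d} KiB")

print(region_of(0x5C7EA), region_of(0x59D72), region_of(0x30000))
```

Both the patched call site (0x5C7EA) and the function it originally called (0x59D72) land in the firmware region, which fits the idea that the patch rewrites firmware code in RAM.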

I can send you my RAM dump over IRC if you'd like. Besides that, I contemplate posting a .ko which exports the RAM over a character device (this is how I dumped it).

And, yes, dumping the new firmwares to see what has changed is super-cool
