Discussion thread for /data EMMC lockup/corruption bug

 
By sfhub, Senior Member on 9th May 2012, 01:08 PM
31st May 2012, 04:01 AM |#411  
garwynn's Avatar
Retired Forum Moderator / Inactive Recognized Developer / XDA Portal Team
Flag NE Ohio
Thanks Meter: 8,731
 
Did you try using an ODIN one-click instead of CWM? I'll bet that will get you back to a safe point - then reload CWM and try again.
 
 
31st May 2012, 03:20 PM |#412  
OP Senior Member
Thanks Meter: 7,242
 
Updated the first post with select posts from our discussion that detail the call path in the source code of how the problem occurs and also the discussion on where a workaround can be placed and the pros/cons of each. Easier than reading 40 pages.
The Following 3 Users Say Thank You to sfhub For This Useful Post: [ View ] Gift sfhub Ad-Free
31st May 2012, 04:37 PM |#413  
musashiro's Avatar
Senior Member
Flag Tarlac
Thanks Meter: 89
 
I was reading posts on Twitter and stumbled upon the Samsung support page on Facebook.

I tried to explain the situation; please correct me if I stated any facts wrong. I tried my best, and I will post the whole conversation later.

They reply quickly!

Edit: too bad, they only support the US, and ICS is not available there. Now they are asking for my location so they can redirect me to the proper support team. I tried to reason that they should use the information to patch the bug in the US release.

I'm not sure if my "little" attempt contributed anything to the issue, but here is the full conversation. Please correct me if I stated anything wrong and I will message it to them.

ME
Quote:

Good day,

we have been seeing reports of bricks or phone unusable after doing factory reset on Stock ICS firmware on Galaxy note. and it is related to a "bug" that bricks the system when wiping.

we tried talking to an android developer Ken Sumrall and he confirmed that the bug is present.

here is some threads on xda discussing the bug.. http://forum.xda-developers.com/show....php?t=1644364

we are quite alarmed that even the stock or official ICS firmware elicits the bug. particularly the XXLPY and ZSLPF firmwares.

my apologies if you already know the problem, but right now, we are yet to hear any updates from Adam Panasiuk of samsung developers. and we need every information or action regarding the issue.

thanks a lot,

*my name*


16 minutes ago

Samsung Support

Quote:

Hi Ken! Thank you for this information! Can you provide me with the model number of your Note and where you are located? Thanks!

13 minutes ago

ME

Quote:

me myself avoided to trigger the bug at all cost since the only remedy to it is motherboard replacement and its quite expensive.
but if you are asking what device is affected, the N7000 International Note is affected. my location i think is unrelated to the case.

10 minutes ago

Samsung Support

Quote:

Thanks Ken! Unfortunately, we only support US based devices on this page and ICS has not yet been released to the note at this time. I wanted to know your location so I can direct you to the correct support team. I will go ahead and send this information over to your product specialists. Thank you again and have a great day!
Our*

5 minutes ago

me

Quote:

i hope this information is used so that the bug gets patched on the US release of the firmware. If ever the US release of the rom has the bug as well, wouldn't you be alarmed? thanks and have a nice day.. thanks for fast responses..

about a minute ago
Samsung Support

Quote:

No problem! We definitely understand your position and want to thank you for your support! Thanks so much Ken!


Sorry if I failed to go technical regarding the bug itself, but I gave them a link to this thread. My apologies.

By the way, my name is Ken, but I'm not in any way related to Ken Sumrall.
31st May 2012, 05:03 PM |#414  
Senior Member
Thanks Meter: 6
 
I have a question that I dont know if it has been answered, if we disable the MMC_CAP_ERASE flag, then wiping data as we do when we switch roms will never happen, right? how do we wipe then?
31st May 2012, 07:15 PM |#415  
Senior Recognized Developer
Flag Owego, NY
Thanks Meter: 25,477
 
Quote:
Originally Posted by sniperbr0

I have a question that I dont know if it has been answered, if we disable the MMC_CAP_ERASE flag, then wiping data as we do when we switch roms will never happen, right? how do we wipe then?

The full "wipe" offered by ERASE is only needed for privacy concerns, it's unnecessary for flashaholics - all a flashaholic needs is a format. (Which is all Gingerbread recoveries offered.)

(This is why Google added the MMC erase stuff to ICS recovery - better for privacy because it truly erases the data instead of just clobbering the data structures that point to it.)

In theory, if ERASE were working, you'd get better write performance to a filesystem after a full ERASE - but right now disabling MMC_CAP_ERASE makes you as good as Gingerbread, but not worse.

Mr. Sumrall's recommendation of replacing the wipe functions with zero-overwrite was only for those who want to ensure that wiping is as good for privacy as the ICS ERASE-based method - but if you're not prepping the phone for sale/return there is no need for this.
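To make the "zero-overwrite" idea concrete, here is a minimal sketch of what overwriting a region with zeros through ordinary write() calls could look like. It is not the recovery's code: `zero_fill()` is a hypothetical helper name, and the sketch assumes a POSIX system and an already-open file descriptor. The point is only that no erase or discard command is involved.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite the first `len` bytes behind `fd` with zeros using plain
 * write() calls; no erase or discard request is issued.
 * zero_fill() is a hypothetical helper, not a function from recovery.
 * Returns 0 on success, -1 on failure (errno set by the failing call). */
int zero_fill(int fd, off_t len)
{
    char buf[4096];

    assert(len >= 0);
    memset(buf, 0, sizeof buf);

    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;

    while (len > 0) {
        size_t chunk = len < (off_t)sizeof buf ? (size_t)len : sizeof buf;
        ssize_t n = write(fd, buf, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;
        }
        len -= n;           /* short writes are fine, just keep going */
    }
    return fsync(fd);       /* ask the OS to push the zeros to storage */
}
```

Whether zeros written this way actually replace the old data inside the flash chip depends on the eMMC controller, which is why this is only needed when prepping a phone for sale or return.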
The Following 5 Users Say Thank You to Entropy512 For This Useful Post: [ View ]
31st May 2012, 09:37 PM |#416  
OP Senior Member
Thanks Meter: 7,242
 
So I wanted to answer the question of whether "nandroid backup/restores" using ICS-based Recovery are safe.

Nandroid backups are definitely safe.

Nandroid restores have the same issue as wipe data/factory reset. Namely they will eventually call make_ext4fs() (from libext4_utils.a) when restoring *ext4* partitions (that means /system /data /cache).

For "images" like zImage, they do not call make_ext4fs() and I verified they don't actually erase anything when restoring images on our phone.

The conclusion is that when you restore, using an ICS-based Recovery compiled and linked against existing unpatched ICS libext4_utils.a, the boot.img is restored safely, but /system /data and /cache are NOT restored safely.

So right now, if we were NOT to make any changes to address the issue, feel free to *create* your backups using ICS-based CWM Recovery. However, if you want to *restore* your backups, flash a GB-based CWM Recovery, boot into it, then restore your CWM backup. This should allow you to back up and restore safely without needing to wait for any devs to fix things.

This is the call trace I used to come up with this conclusion (this is all pseudo-code with lots of stuff eliminated to aid readability):

nandroid.c
Code:
nandroid_restore()
  nandroid_restore_partition(/boot)
  nandroid_restore_partition(/system)
  nandroid_restore_partition(/data)
  nandroid_restore_partition_extended(/cache)
Code:
nandroid_restore_partition()
  if (vol->fs_type == "emmc") {
      format_volume()
      restore_raw_partition()
      return 0
  }
  return nandroid_restore_partition_extended()
Code:
nandroid_restore_partition_extended()
  if ("*.ext4.tar") {
      backup_filesystem = filesystem;
      restore_handler = tar_extract_wrapper;
  }
  format_device()
Code:
tar_extract_wrapper()
  cd FILESYSTEM_PATH ; tar xvf FILE.ext4.tar
roots.c
Code:
format_volume() [for emmc case]
  format_unknown_device()
extendedcommands.c
Code:
format_unknown_device() [for emmc case]
  erase_raw_partition()
Code:
format_device() [for ext4 case]
  if (strcmp(fs_type, "ext4") == 0) {
      reset_ext4fs_info();
      int result = make_ext4fs(device, NULL, NULL, 0, 0, 0);
      if (result != 0) {
          LOGE("format_volume: make_extf4fs failed on %s\n", device);
          return -1;
      }
      return 0;
  }
flashutils/flashutils.c
Code:
erase_raw_partition()
    switch (type) {
        case MTD:
            return cmd_mtd_erase_raw_partition(partition);
        case MMC:
            return cmd_mmc_erase_raw_partition(partition);
        case BML:
            return cmd_bml_erase_raw_partition(partition);
    }
Code:
restore_raw_partition()
    switch (type) {
        case MTD:
            return cmd_mtd_restore_raw_partition(partition, filename);
        case MMC:
            return cmd_mmc_restore_raw_partition(partition, filename);
        case BML:
            return cmd_bml_restore_raw_partition(partition, filename);
    }
mmcutils/mmcutils.c
Code:
cmd_mmc_erase_raw_partition()
  return 0;
Code:
cmd_mmc_restore_raw_partition()
  mmc_raw_copy()
Code:
mmc_raw_copy()
  while ( ( ch = fgetc ( in ) ) != EOF )
        fputc ( ch, out );
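The outcome of the trace above can be condensed into a small decision helper. This is only a sketch of the reasoning, not code from CWM: the enum and both function names are made up for illustration, and the only inputs are the `fs_type` strings that appear in the pseudo-code.

```c
#include <assert.h>
#include <string.h>

/* Which low-level action the format step ends up taking in the trace. */
enum format_action {
    FORMAT_NOOP,        /* "emmc": cmd_mmc_erase_raw_partition() returns 0 */
    FORMAT_MAKE_EXT4FS, /* "ext4": make_ext4fs() from libext4_utils.a      */
    FORMAT_UNKNOWN      /* any filesystem type the trace does not cover    */
};

/* Hypothetical helper: map a volume's fs_type to the action above. */
enum format_action restore_format_action(const char *fs_type)
{
    assert(fs_type != NULL);
    if (strcmp(fs_type, "emmc") == 0)
        return FORMAT_NOOP;
    if (strcmp(fs_type, "ext4") == 0)
        return FORMAT_MAKE_EXT4FS;
    return FORMAT_UNKNOWN;
}

/* A restore is a concern only when make_ext4fs() is reached AND the
 * recovery was linked against the unpatched ICS libext4_utils.a. */
int restore_needs_caution(const char *fs_type, int linked_against_ics_lib)
{
    return restore_format_action(fs_type) == FORMAT_MAKE_EXT4FS
           && linked_against_ics_lib != 0;
}
```

With these rules, a raw "emmc" image such as /boot never needs caution, while /system, /data and /cache do whenever the recovery was built against the ICS library.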
The Following 5 Users Say Thank You to sfhub For This Useful Post: [ View ] Gift sfhub Ad-Free
31st May 2012, 11:11 PM |#417  
OP Senior Member
Thanks Meter: 7,242
 
I've gone through and looked at the following publicly available checkins for recovery:
1) Cyanogen
2) CWM - chris41g
3) Rogue CWM - Steady

The latest from Cyanogen has the new ICS 2-argument interface (i.e., the one that can trigger the EMMC lockup). Technically the 2-argument change was checked in on 19-Jan-2011 while the "wipe" change was on 26-Jan-2011, but they are close enough in time that it is probably OK to use the interface change as a proxy for determining whether a version of CWM is linking with the "wipe" code.
https://github.com/CyanogenMod/andro...s/roots.c#L384
Code:
    if (strcmp(v->fs_type, "ext4") == 0) {
        int result = make_ext4fs(v->device, v->length);
        if (result != 0) {
            LOGE("format_volume: make_extf4fs failed on %s\n", v->device);
            return -1;
        }
        return 0;
    }
This is what chris41g has checked in for CWM:
https://github.com/chris41g/android_...d/roots.c#L349
Code:
    if (strcmp(v->fs_type, "ext4") == 0) {
        reset_ext4fs_info();
        int result = make_ext4fs(v->device, NULL, NULL, 0, 0, 0);
        if (result != 0) {
            LOGE("format_volume: make_extf4fs failed on %s\n", v->device);
            return -1;
        }
        return 0;
    }
Notice how this is the "old" multiple-argument GB interface to make_ext4fs()/libext4_utils.a. *IF* the version of CWM source code chris checked into github is the same one that is currently being used (and I believe it is, from conversations with him), then his CWM should be safe when repacked with an ICS kernel for:
1) wipe data/factory reset
2) restoring nandroid backups

It CANNOT BE DETERMINED whether CWM from chris41g would be safe for flashing ICS ROMs because that depends on the update-binary executable and updater-script that is bundled with the ROM install.


This is what Steady has checked in for CWM:
https://github.com/TDR/android_boota...y/roots.c#L410
Code:
    if (strcmp(v->fs_type, "ext4") == 0) {
        reset_ext4fs_info();
        int result = make_ext4fs(v->device, NULL, NULL, 0, 0, 0);
        if (result != 0) {
            LOGE("format_volume: make_extf4fs failed on %s\n", v->device);
            return -1;
        }
        return 0;
    }
Notice how this is the "old" multiple-argument GB interface to make_ext4fs()/libext4_utils.a. *IF* the version of CWM source code Steady checked into github is the same one that is currently being used, then his CWM should be safe when repacked with an ICS kernel for:
1) wipe data/factory reset
2) restoring nandroid backups

However, I stress I have NOT spoken to Steady about whether he has modified/updated the source code for Rogue CWM when building the ICS releases. If he had updated to latest CWM from Cyanogen's github, then he would have picked up the latest CWM changes and would be susceptible to the EMMC lockup/superbrick issue.

It CANNOT BE DETERMINED whether Rogue CWM would be safe for flashing ICS ROMs because that depends on the update-binary executable and updater-script that is bundled with the ROM install.
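The "count the arguments" check used on the three roots.c files above can be automated with a few lines of C. This is a rough sketch under stated assumptions: `count_call_args()` is a hypothetical helper, it looks only at the first occurrence of the name, and it does not parse string literals or comments. As noted above, the argument count is only a proxy for the interface in the checked-in source; it says nothing about which library a released binary was actually linked against.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the top-level arguments of the first call to `fn` found in `src`.
 * Returns -1 if the name is not found, is not followed by '(', or the
 * parentheses never close. String literals and comments are not parsed. */
int count_call_args(const char *src, const char *fn)
{
    assert(src != NULL && fn != NULL);

    const char *p = strstr(src, fn);
    if (p == NULL)
        return -1;
    p += strlen(fn);
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p != '(')
        return -1;
    p++;

    int depth = 1;       /* current parenthesis nesting level       */
    int commas = 0;      /* commas seen at the call's own level     */
    int seen_token = 0;  /* becomes 1 once any argument text appears */

    for (; *p != '\0'; p++) {
        if (*p == '(') {
            depth++;
            seen_token = 1;
        } else if (*p == ')') {
            if (--depth == 0)
                return seen_token ? commas + 1 : 0;
        } else if (*p == ',' && depth == 1) {
            commas++;
        } else if (*p != ' ' && *p != '\t' && *p != '\n') {
            seen_token = 1;
        }
    }
    return -1;           /* ran off the end before the call closed */
}
```

Applied to the snippets quoted above, a result of 2 would point at the ICS interface and a result of 6 at the older GB interface.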

====

I was not able to locate source code for CWM Touch. I suspect it is based off the latest from Cyanogen. This could explain why LOTS of people were bricking with it when doing the temp/fake flash of CWM Touch with the ICS kernel.

If someone can point me to the source for CWM Touch I'd like to look at it and verify these theories.
The Following 7 Users Say Thank You to sfhub For This Useful Post: [ View ] Gift sfhub Ad-Free
1st June 2012, 03:33 PM |#418  
Senior Member
Flag Hong Kong
Thanks Meter: 13
 
Quote:
Originally Posted by Entropy512

The full "wipe" offered by ERASE is only needed for privacy concerns, it's unnecessary for flashaholics - all a flashaholic needs is a format. (Which is all Gingerbread recoveries offered.)

Does it have anything to do with the TRIM command, which affects performance?
Or is the TRIM command working fine?
2nd June 2012, 12:21 AM |#419  
OP Senior Member
Thanks Meter: 7,242
 
I was able to confirm with chris41g and T.C.P that the update-binary, included with both the CM9 and AOKP ROM update.zip files, was compiled against ICS source code.

This means that this update-binary can potentially trigger the EMMC firmware lockup/superbrick bug if "format()" is used in the Edify install script. Both CM9 and AOKP updater-script format("/system") as part of their installs.

This explains why they brick so often with the bluelight-only style brick: the format("/system") elicits the lockup when make_ext4fs() is called on the /system partition and eventually results in an mmc_erase(). With a borked /system, you are left with the blue-light special and ODIN will hang on factoryfs.img.

The other style of brick is the one that hangs on the logo. That one is caused by Wipe Data/Factory Reset from an "unsafe" CWM Recovery. Just as above, the "unsafe" CWM Recovery would have been compiled against the ICS libext4_utils.a, and make_ext4fs() would be called on /data and /cache. Eventually that would lead to mmc_erase() being called. With a borked /data, you are left with a phone that hangs on the logo, and ODIN will hang on data.img.

As an aside, I now believe the main "unsafe" Recovery we were dealing with was CWM Touch fake-flashed onto the ICS kernel. I don't have the source code for CWM Touch, but I believe it was compiled against the ICS libext4_utils.a. I am not sure if Rogue CWM repacked with an ICS kernel was compiled against ICS sources, but the version that Steady checked into github was not.

Now previously I recall discussions where it was determined that GB source/kernels also had mmc_erase(), and it was an open question why, if both GB and ICS had mmc_erase(), Recovery Wipe Data/Factory Reset would brick one but not the other. Well, we finally got the answer to that: the GB version of make_ext4fs() from libext4_utils.a did not perform the ioctl() calls which would eventually lead to mmc_erase() being called, while the ICS version of libext4_utils.a had the ioctl() calls.

Now if you've been following closely so far, you may be wondering why if the update-binary that is bundled with both CM9 and AOKP ROM update.zip files is using the ICS version of libext4_utils.a (that makes the ioctl() calls which lead to mmc_erase()) it doesn't brick when run under a GB-kernel/CWM. Both ICS and GB have mmc_erase() functionality so it should brick both right?

Well, I wondered the same thing, went digging a little, and believe I have found the answer.

It turns out that even though the GB kernel MMC driver has mmc_erase(), the ioctl() calls that would eventually lead to mmc_erase() being called were ifdef'd out. So if you had an "unsafe" update-binary which made the ioctl() calls, they would NOT result in mmc_erase() being called when the ROM's update.zip was run under a GB kernel/recovery.

Previously the discussion centered around MMC_CAP_ERASE as the explanation, which I believe is a red herring. This is neither a pre-processor directive nor a direct control mechanism. It is a bit field used to specify whether an MMC device supports the capability of doing erases. It was not ifdef'd out of GB so the functionality was still there.
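For readers less familiar with the distinction, a capability bit is tested and changed while the code runs, whereas an #ifdef decides what gets compiled at all. The sketch below shows the run-time side only. It is a stripped-down model: `struct model_host` and both helpers are invented for illustration, and the numeric value given to MMC_CAP_ERASE here is an assumption that should be checked against include/linux/mmc/host.h in the kernel source you are building.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only (assumption); check the kernel's
 * include/linux/mmc/host.h for the real definition. */
#define MMC_CAP_ERASE (1UL << 10)

/* Hypothetical, stripped-down stand-in for the kernel's host structure. */
struct model_host {
    unsigned long caps;   /* bit field of advertised capabilities */
};

/* Run-time test: evaluated every time it is called. */
int host_advertises_erase(const struct model_host *host)
{
    assert(host != NULL);
    return (host->caps & MMC_CAP_ERASE) != 0;
}

/* Clearing the bit is also a run-time operation; all other capability
 * bits are left untouched. */
void host_clear_erase(struct model_host *host)
{
    assert(host != NULL);
    host->caps &= ~MMC_CAP_ERASE;
}
```

Because the flag lives in data rather than in the preprocessor, its mere presence in the GB source does not tell you whether any erase request can reach the driver.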

The actual conditional compile was on the ioctl() function for mmc cards, which was basically disabled under GB kernels. This is how I came to that conclusion:

Notice in the kernel config file Samsung provided, both CONFIG_MMC_DISCARD and CONFIG_TARGET_LOCALE_NTT are disabled.

.config
Code:
# CONFIG_MMC_DEBUG is not set
# CONFIG_MMC_DISCARD is not set
CONFIG_MMC_UNSAFE_RESUME=y
# CONFIG_MMC_EMBEDDED_SDIO is not set
# CONFIG_MMC_PARANOID_SD_INIT is not set
Code:
# CONFIG_MACH_C1_NA_SPR_EPIC2_REV00 is not set
# CONFIG_TARGET_LOCALE_EUR is not set
# CONFIG_TARGET_LOCALE_KOR is not set
# CONFIG_TARGET_LOCALE_NTT is not set
CONFIG_TARGET_LOCALE_NA=y
Now if you look at the mmc block device driver, due to the kernel config above, the ioctl() function call in the function table is left unpopulated. This basically means ioctl() calls are not supported for the mmc device (based on E4GT GB source, don't know about other platforms).

drivers/mmc/card/block.c
Code:
static const struct block_device_operations mmc_bdops = {
	.open			= mmc_blk_open,
	.release		= mmc_blk_release,
#ifdef CONFIG_MMC_DISCARD
	.ioctl			= mmc_blk_ioctl,
#endif
#if defined(CONFIG_TARGET_LOCALE_NTT)
#if 0 //def CONFIG_MMC_CPRM
	.ioctl			= mmc_blk_ioctl,	//int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);  in blkdev.h
#endif
#endif
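The effect of leaving a slot in a function table unpopulated can be modelled in plain userspace C. This is only a model of the table mechanism in the excerpt above, not kernel code: every `model_*` name and the `MODEL_MMC_DISCARD` macro are made up, and the sketch does not attempt to reproduce how the real block layer routes individual requests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model of a driver operations table (hypothetical names). */
struct model_bdops {
    const char *name;
    int (*ioctl)(unsigned int cmd, unsigned long arg);
};

/* Stand-in handler: pretends the request was carried out. */
int model_blk_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd;
    (void)arg;
    return 0;
}

/* Build with -DMODEL_MMC_DISCARD to populate the slot; without it the
 * .ioctl member is implicitly initialised to NULL. */
const struct model_bdops model_mmc_bdops = {
    .name = "model_mmc",
#ifdef MODEL_MMC_DISCARD
    .ioctl = model_blk_ioctl,
#endif
};

/* Dispatch through the table: an empty slot means "not supported",
 * reported here as -ENOTTY. */
int model_dispatch_ioctl(const struct model_bdops *ops,
                         unsigned int cmd, unsigned long arg)
{
    assert(ops != NULL);
    if (ops->ioctl == NULL)
        return -ENOTTY;
    return ops->ioctl(cmd, arg);
}
```

Compiled without the macro, every request sent through `model_mmc_bdops` is refused before any handler runs, which mirrors the situation described for the GB config.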
Now when the "unsafe" update-binary from CM9/AOKP is paired with a GB kernel the wipe.c from libext4_utils.a will try and make these calls:

system/extras/ext4_utils/wipe.c
Code:
int wipe_block_device(int fd, s64 len)
{
  u64 range[2];
  int ret;

  range[0] = 0;
  range[1] = len;

  ret = ioctl(fd, BLKSECDISCARD, &range);

  if (ret < 0) {
    range[0] = 0;
    range[1] = len;
    ret = ioctl(fd, BLKDISCARD, &range);
  }
  return 0;
}
I believe that both those calls have basically been disabled because ioctl() calls are not supported in the mmc block device driver under the GB kernel (for E4GT EL29 GB kernel source).

So what does that all mean?

I believe GB systems are doubly "safe". The kernel will not execute mmc_erase() even though the functionality is there because the ioctl() call entry point from user space to kernel space has been disabled and also GB recovery and GB update-binary never even attempt to make the ioctl() calls.

That explains why, if you pair an "unsafe" update-binary from the CM9/AOKP ROM update.zip with a GB kernel, it is still safe.
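Put as a truth table, the argument is that mmc_erase() is reachable only when both layers cooperate. A minimal sketch, with a hypothetical struct and function name:

```c
#include <stdbool.h>

/* The two independent conditions discussed above (hypothetical names). */
struct flash_setup {
    bool userspace_issues_ioctl; /* recovery or update-binary built
                                    against the ICS libext4_utils.a   */
    bool kernel_exposes_ioctl;   /* ioctl entry point compiled into
                                    the mmc block driver              */
};

/* mmc_erase() can be reached only if BOTH conditions hold. */
bool can_reach_mmc_erase(struct flash_setup s)
{
    return s.userspace_issues_ioctl && s.kernel_exposes_ioctl;
}
```

An ICS-built update-binary on a GB kernel gives (true, false), and a GB recovery on a GB kernel gives (false, false); neither combination reaches mmc_erase(), which is the "doubly safe" case.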
The Following 15 Users Say Thank You to sfhub For This Useful Post: [ View ] Gift sfhub Ad-Free
2nd June 2012, 12:28 AM |#420  
OP Senior Member
Thanks Meter: 7,242
 
I also looked into whether it made sense for CM9/AOKP to package their update.zip with the "safe" EL29/GB update-binary. It turns out their Edify install script uses some extended functionality introduced in the ICS source. Basically their updater-script is not compatible with a GB update-binary.

So are they out of luck? I believe with a "simple" change to their install script, they can make their ROM update.zip files safe for flashing in ICS recoveries. Basically just replace the format("/system") with delete_recursive("/system"). This will ensure make_ext4fs() is not called and thus the ioctl() in wipe.c will never be called.

Code:
*** updater-script-cm9  Thu Feb 28 20:33:00 2008
--- updater-script-cm9-new      Fri Jun 01 07:44:33 2012
***************
*** 9,17 ****
  set_perm(0, 0, 0644, "/tmp/backuptool.functions");
  run_program("/tmp/backuptool.sh", "backup");
  show_progress(0.500000, 0);
! unmount("/system");
! format("ext4", "EMMC", "/dev/block/mmcblk0p9", "0");
! mount("ext4", "EMMC", "/dev/block/mmcblk0p9", "/system");
  package_extract_dir("recovery", "/system");
  package_extract_dir("system", "/system");
  symlink("Roboto-Bold.ttf", "/system/fonts/DroidSans-Bold.ttf");
--- 9,15 ----
  set_perm(0, 0, 0644, "/tmp/backuptool.functions");
  run_program("/tmp/backuptool.sh", "backup");
  show_progress(0.500000, 0);
! delete_recursive("/system");
  package_extract_dir("recovery", "/system");
  package_extract_dir("system", "/system");
  symlink("Roboto-Bold.ttf", "/system/fonts/DroidSans-Bold.ttf");
The Following 7 Users Say Thank You to sfhub For This Useful Post: [ View ] Gift sfhub Ad-Free
2nd June 2012, 01:01 AM |#421  
Esoteric68's Avatar
Senior Member
Flag Hellabama
Thanks Meter: 1,482
 
Samsung really needs to hire you, garwynn and Entropy.
The Following 3 Users Say Thank You to Esoteric68 For This Useful Post: [ View ] Gift Esoteric68 Ad-Free