
Misc partition: Why does it die? How do I fix it?

 
MV10
(Last edited by MV10; 17th July 2010 at 06:15 PM.)
#1
Senior Member - OP
Thanks Meter 31
Posts: 378
Join Date: Jul 2010
Location: FL

This is a long thread because I'm trying to provide the maximum detail possible in the hopes of luring some experts to assist. I am a developer with 30+ years of experience, though little of it on *nix, since I hitched my wagon to WinTel when people stopped hiring assembly programmers and the term "GUI" began appearing in help wanted ads.

Yesterday, based upon my experience with one phone that I successfully upgraded to CM6-RC1 and another one that failed, I posted a new thread in the G1 General section, which was probably the wrong place for it. Both phones are US TMo G1's purchased within a few days of each other, around December 2009.

During the subsequent 12 hours I read everything I could find about the dreaded "E: Can't find MISC: / (No space left on device)" problem, which I eventually determined was preventing me from proceeding further.

I found many, many examples of people on all types of hardware who were (and many still are) stuck with a hosed-up misc with no idea how to proceed. This was somewhat alarming to me.

I found a few people who were apparently able to fix it by simply doing a flash_image of a misc.img copied from elsewhere. I found a few who seemed to have fixed it with dd. I found others who went through various combinations of installing other things until the problem mysteriously vanished. I found great info about what the misc partition is and how it's used.

What I did not find is:
(a) any clear explanation of how it gets hosed in the first place,
(b) any clear explanation of how to troubleshoot it,
and most importantly (c) any clear explanation of ways to fix it.

This thread is a request for an expert to step in and fill those gaps. Maybe if we can get some "misc lore" in a single place, other people who encounter the problem won't be left hanging.

So first the back story:

Two days ago I decided to install CM6-RC1 on my own G1. It went very smoothly. I was already on Cupcake, so I formatted the card, downgraded back to RC29, installed Cupcake, formatted again from the phone, used flashrec to install RA 1.7 (which is amazing, by the way; I may be a n00b to phone-guts but that is already apparent), verified the radio version, installed DangerSPL, installed CM6-RC1, and installed the Google Apps. Flawless process.

Loved it. CM6 is great. So the next morning I had my wife leave her phone at home with me. I had seen a thread which led me to believe that the card didn't necessarily have to be formatted twice. I was under the impression I could format it once and drop all the files out there -- only Cupcake needed to be named update.zip for the process outlined above.

So I connected her phone to my laptop, reformatted to FAT32 over USB from Win7, copied all 211 MB of files over, disconnected and went into flashboot. The RC29 downgrade worked fine. I restarted and logged in just to be sure RC29 was on there. I powered off and restarted in recovery mode -- and the misc problem was already there.

In the stock /!\ recovery screen, ALT+L showed the misc error. I couldn't remember if I had seen that previously (having only done this once before), so I hit ALT+S and hoped for the best. The progress bar went about halfway then bombed on an assert in line 4. And that's as far as I got updating my wife's phone: in theory my story could stop here, but being a lifelong geek-type, I decided to forge ahead. I didn't yet know the importance of misc or even recognize it as my main problem, so bear with me.

I rebooted, rooted via telnet, used flashrec to install RA, and tried installing Cupcake that way. I got a different error from RA: No signature, verification failed. I thought I might have a bad file, somehow, despite having used the same update.zip that went into my G1 just fine, so I downloaded it again from megaupload. Then I downloaded the other one named signed-kila-ota. Then I did a file compare and confirmed they're identical. Neither copy will load through RA. Not sure what's up with that.

But after thinking about it and doing more reading, I concluded I probably didn't need Cupcake for CM6-RC1, I just needed the correct radio image to support DangerSPL. So I grabbed the G1 2.22.23 radio image and tried installing that through RA. It extracts and installs ok, then dumps the Can't read Misc error, then tells me to reboot to complete. So I reboot -- and it goes back into the running OS, of course. And then the light goes on, since I did clearly remember on my own G1 it went back into RA, not into Android.

More digging uncovers the radio/SPL thread that explains how misc is used to control reboots, and I finally clearly realize that misc is my problem. (Actually I still don't know why Cupcake won't load from RA, but I still suspect if I can just load the right radio image, it shouldn't matter.)

Over the following six hours I tried a huge variety of things to fix misc, primarily working through an adb connection.

First I tried making a nandroid backup from my working G1. Took me awhile to figure out I had to do it from the command line to force it to backup misc, then I wasted time trying to get the command line to restore that backup, then I finally made another backup on the non-working G1 and copied the "good" misc over -- and still couldn't get it to restore (kept telling me something about being the current version, which I interpreted to mean it wasn't restoring because it thought the backup already matched the live filesystem).

Again, not knowing much about *nix, at this point I was convinced misc was simply dead and gone. I know what a disk partition is, but I didn't see misc (or the others like recovery) in parted, so I don't think I even understand what it means to say misc is a partition. But I didn't see it anywhere, so I thought it had been erased or overwritten or something along those lines.

Then I ran across a thread in which someone suggested doing a "cat /proc/mtd" which yielded the following:

Code:
dev:    size   erasesize  name
mtd0: 00040000 00020000 "misc"
mtd1: 00500000 00020000 "recovery"
mtd2: 00280000 00020000 "boot"
mtd3: 04380000 00020000 "system"
mtd4: 04380000 00020000 "cache"
mtd5: 04ac0000 00020000 "userdata"

I don't know what it means, but at least I see the system still knows something about misc.
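
For anyone else puzzling over that table, here is a minimal sketch of how it can be read, assuming the columns are what the header line says: total size and erase-block size, both in hexadecimal bytes. The helper name `parse_proc_mtd` is made up for this illustration.

```python
# Output of "cat /proc/mtd" as posted above.
PROC_MTD = '''\
dev:    size   erasesize  name
mtd0: 00040000 00020000 "misc"
mtd1: 00500000 00020000 "recovery"
mtd2: 00280000 00020000 "boot"
mtd3: 04380000 00020000 "system"
mtd4: 04380000 00020000 "cache"
mtd5: 04ac0000 00020000 "userdata"
'''

def parse_proc_mtd(text):
    """Return one dict per partition line (hypothetical helper)."""
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("dev:"):
            continue  # skip blanks and the header row
        dev, size_hex, erase_hex, name = line.split(None, 3)
        size = int(size_hex, 16)        # sizes are hexadecimal byte counts
        erasesize = int(erase_hex, 16)
        parts.append({
            "dev": dev.rstrip(":"),
            "name": name.strip('"'),
            "size": size,
            "erasesize": erasesize,
            "blocks": size // erasesize,  # number of erase blocks
        })
    return parts

for p in parse_proc_mtd(PROC_MTD):
    print(f'{p["dev"]:5} {p["name"]:9} {p["size"] // 1024:6d} KiB  '
          f'{p["blocks"]:4d} erase blocks')
```

Read this way, misc is 0x40000 bytes (256 KiB) made of two 128 KiB erase blocks.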

Someone else asked for "dump_image misc /dev/zero" for diagnostic purposes, which yields:

Code:
mtd: ECC errors (0 soft, 1 hard) at 0x00000000
mtd: ECC errors (0 soft, 1 hard) at 0x00020000

Someone suggested "cat /dev/zero > /dev/mtd/mtd0" which results in the error message "cat: write error: No space left on device".

I tried copying misc.img out of the backup folder to the sdcard root and doing "flash_image misc /sdcard/misc.img" and was rewarded with the following lines which I can't interpret, although they're clearly related to the output shown above (I assume flash_image is probably a script or something, which is just doing those same steps internally?):

Code:
mtd: ECC errors (0 soft, 1 hard) at 0x00000000
mtd: ECC errors (0 soft, 1 hard) at 0x00020000
mtd: erase failure at 0x00000000 (I/O error)
mtd: erase failure at 0x00000000 (I/O error)
mtd: skipping write block at 0x00000000
error writing misc: No space left on device

I ran across another thread which suggested the command "dd if=/sdcard/misc.img of=/dev/block/mtd0"... that produced this initially encouraging-looking output, though I don't know what it means and it didn't fix misc:

Code:
512+0 records in
512+0 records out
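
A rough way to interpret those two lines, assuming dd was using its default 512-byte block size (no bs= was given on the command line): the number before the "+" counts full blocks and the number after it counts partial ones. The helper name `dd_bytes` is hypothetical.

```python
DD_DEFAULT_BLOCK = 512  # bytes; dd's default when bs= is not specified

def dd_bytes(records_line, block_size=DD_DEFAULT_BLOCK):
    """Turn a dd 'N+M records ...' line into (bytes in full records, partial count).

    M counts partial records whose size is not reported, so only N is multiplied.
    """
    counts = records_line.split()[0]            # e.g. "512+0"
    full, partial = (int(x) for x in counts.split("+"))
    return full * block_size, partial

copied, partial = dd_bytes("512+0 records out")
misc_size = 0x00040000                          # misc size from /proc/mtd
print(copied, partial, copied == misc_size)
```

Under that assumption dd handed 262144 bytes to the device, which is exactly the size of misc. That only says how much dd passed along; it says nothing about whether the flash actually accepted the data.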

I also saw a few steps and suggestions relating to fastboot. I didn't try any of these since the only instructions I could find for setting up fastboot (in that stickied noob thread) require a version 2 radio image, which I can't install because misc is fried.

So, in short, searching xda and the Internet in general hasn't helped much, except perhaps to better prepare me to follow somebody else's instructions. In reality I have gone through several different sets of instructions multiple times and tried a variety of other things, but it always comes back to not being able to complete a radio image installation because of that problem with misc.

I'm willing to try just about anything... and I know there are quite a few others out there with a misc problem who can't seem to make any progress or get any input, so hopefully my exhaustive description of how I got here and what I've tried already will be useful to one of the local experts.
 
MV10
#2
I know that ECC refers to the error-correcting code used to detect memory errors... but I find it awfully suspicious that the two supposed ECC errors fall on the very first and last slots of the misc range -- particularly since everybody else with this problem who posts the results of attempts to troubleshoot it or fix it reports exactly the same thing.

In other words, I assume the error message is wrong. This is pretty much the only reason I don't just conclude that the memory is actually hosed and go shopping for a new phone.

Oh, and... bump.
 
lbcoder
#3  
Account currently disabled
Thanks Meter 95
Posts: 2,645
Join Date: Jan 2009
You are certainly telling the truth about it being quite long. That fact does, unfortunately, make it somewhat difficult to read.

I assume that you've seen a few of ezterry's and/or my own posts about the partitions, which is probably where you saw the info on the misc partition.

In any case, the misc partition isn't a "filesystem" partition as you are familiar with. It is actually just a simple data structure. In fact, only the system, cache, and userdata partitions are actually filesystem partitions, and the cache partition is only a filesystem partition part of the time -- during radio and spl updates, it also is used as a simple data structure with a header field and a payload field. That, along with the misc partition, instructs the SPL to perform a radio or spl update.
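
To make "simple data structure" a bit more concrete, here is a sketch of the kind of fixed-size record that Android's recovery keeps in misc. The field sizes (32-byte command, 32-byte status, 1024-byte recovery arguments) are recalled from the AOSP recovery sources of that era and should be treated as an assumption; the fields the HTC SPL itself reads, and the header/payload layout used in cache, may well differ and are not shown here.

```python
import struct

# ASSUMPTION: layout recalled from AOSP recovery's bootloader message;
# not verified against the G1's SPL.
BCB_FORMAT = "32s32s1024s"   # command, status, recovery arguments

def pack_bcb(command=b"", status=b"", recovery=b""):
    """Build the fixed-size record; struct pads short fields with NUL bytes."""
    return struct.pack(BCB_FORMAT, command, status, recovery)

def unpack_bcb(raw):
    """Split a raw record back into three NUL-terminated strings."""
    fields = struct.unpack(BCB_FORMAT, raw[:struct.calcsize(BCB_FORMAT)])
    return tuple(f.split(b"\0", 1)[0].decode("ascii", "replace") for f in fields)

# Illustrative values only: ask for a reboot into recovery with one argument.
raw = pack_bcb(command=b"boot-recovery", recovery=b"recovery\n--wipe_cache\n")
print(len(raw), unpack_bcb(raw))
```

The point is only that there is no filesystem involved: whoever reads misc pulls fixed-width fields from known offsets, so a block that cannot be read or erased leaves nothing to parse.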

Now, it may be possible to salvage the device without a working misc partition. Specifically, the requirement is that you get yourself a high-engineering SPL (one with the ability to fastboot a radio image -- note: it is FAST boot, not flashboot).

One important thing to note that might make things easier is that an error "finding" the misc partition *might not imply a failed misc partition*. It could possibly be a failed CACHE partition. Have you tried FORMATTING your cache partition?

In any case, you are no doubt really wondering about my statement that you might be able to update the SPL without the use of a misc partition.... Read THIS thread and you will see how the partition tables are defined and how they can be overridden. This suggests a way that you can actually DEFINE the SPL partition to the linux kernel, which in turn, should allow you to flash_image an SPL update. What you need to do is determine the starting offset and length of the SPL partition, and define it along with the rest of the partitions on the kernel command line. Once this is done, you should be able to fastboot flash a radio update to the device.

Note: Having just done an RC29 NBH file, there is PRECISELY ONE high-engineering SPL that you can install to the device safely.... 1.33.2003 (ending with a THREE -- very important, a 5 is a brick when combined with an rc29's radio).

Also note: I don't take any responsibility if you fry it completely trying this idiotic procedure without a jtag standing by. It is quite risky. I suggest it because it may be your best chance of getting through this.

Note: fastboot does NOT require a 2.x radio image. Fastboot requires an engineering SPL, which for the same reason, you can't install.


Now as for the location of the read/write errors.... you think that it is suspicious that they occur at the first and last slot of the memory range...

Well, this is not unexpected, since there are only two slots, each of 128 kB: the first at offset 0 from the start, the second at offset 0x20000 from the start. The ECC error itself says that each of the two blocks has failed whatever operation it was trying to perform.
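
The same point as a small sketch: mapping the addresses in the posted error lines onto erase-block numbers, using the misc size and erase size from the /proc/mtd output earlier in the thread. The function name `failed_blocks` is invented for this example.

```python
import re

# Error lines as posted in the first message.
LOG = """\
mtd: ECC errors (0 soft, 1 hard) at 0x00000000
mtd: ECC errors (0 soft, 1 hard) at 0x00020000
"""

MISC_SIZE = 0x00040000   # from /proc/mtd
ERASE_SIZE = 0x00020000  # from /proc/mtd

def failed_blocks(log, erasesize):
    """Return the sorted erase-block indexes that reported ECC errors."""
    addrs = {int(a, 16)
             for a in re.findall(r"ECC errors .* at 0x([0-9a-fA-F]+)", log)}
    return sorted(a // erasesize for a in addrs)

total = MISC_SIZE // ERASE_SIZE
bad = failed_blocks(LOG, ERASE_SIZE)
print(f"{len(bad)} of {total} blocks reported errors: {bad}")
```

With only two blocks in the partition, "first and last" is simply "all of them".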

I suggest that your first step might be to try again writing the RC29 NBH file.
 
MV10
#4
Thank you for the explanations and all the details.

I have actually reloaded RC29 quite a few times. I followed the directions from scratch a couple times in case I had gotten something wrong (of course, this was easy to do since I get stuck pretty early in the process).

I'll try formatting CACHE and I'll take a look at using the SPL you reference and report back later.

I really appreciate the assistance.
 
MV10
(Last edited by MV10; 19th July 2010 at 10:43 PM.)
#5
Ah, just realized that when you do "Wipe cache" from RA recovery, formatting cache is the second step. Since that is immediately followed by another "Can't read MISC" error message, I guess formatting doesn't fix my misc issue.

In this paragraph:
Quote:
In any case, you are no doubt really wondering about my statement that you might be able to update the SPL without the use of a misc partition.... Read THIS thread and you will see how the partition tables are defined and how they can be overridden.
Your "THIS" didn't link to anything. I'll go search for what you're referring to, since this would appear to be my only remaining solution. No JTAG handy, but if someone of your experience thinks this is probably my last-ditch option, I don't have much to lose anyway, right? I'll take it slowly.

Edit: I think this is it? forum.xda-developers.com/showthread.php?t=704560 Pretty clever... crazy and dangerous, sure, but what the hell, it's just a phone, lol...

Again, thanks for taking the time to help out.
 
ezterry
(Last edited by ezterry; 20th July 2010 at 03:04 AM.)
#6
Retired Recognized Developer
Thanks Meter 1000
Posts: 1,823
Join Date: Jan 2010
Location: Asheville, NC
Quote:
Originally Posted by MV10 View Post
Ah, just realized that when you do "Wipe cache" from RA recovery, formatting cache is the second step. Since that is immediately followed by another "Can't read MISC" error message, I guess formatting doesn't fix my misc issue.

In this paragraph:


Your "THIS" didn't link to anything. I'll go search for what you're referring to, since this would appear to be my only remaining solution. No JTAG handy, but if someone of your experience thinks this is probably my last-ditch option, I don't have much to lose anyway, right? I'll take it slowly.

Edit: I think this is it? forum.xda-developers.com/showthread.php?t=704560 Pretty clever... crazy and dangerous, sure, but what the hell, it's just a phone, lol...

Again, thanks for taking the time to help out.

Before rewriting partitions, find a recovery with 'erase_image' (I hear tell Clockwork has it), install it, and try:

erase_image misc

then

flash_image misc <misc.img>

where misc.img comes from an old nandroid backup of a phone from the same region as your own (at least, it's preferable that it's the same region, since your CID is in the structure)

It may correct the issue... if not we can try to flash an engineering SPL via flash_image..

I feel this is very safe in theory (as we don't have to worry about boot mode 3.. thus if a valid SPL is flashed you won't completely brick).. However we have no safeguards at this point in time so be careful that you really understand what is going on.. else you will write garbage to the SPL, and there is no helping that w/o JTAG.

(btw.. the SPL .. even the full engineering ones like 1.33.2003 and 1.33.2005 won't actually let you erase misc.. but will let you flash it)
Dream Sapphire:
> Radio 2708+ (+15MB kernels)
> MT3G ota froyo rom dream sapphire port
> ezGingerbread (rom/source): Dream/Sapphire
> DS JTAG: Soft load of SPL (to unbrick/re-root) / JTAG WIKI


Acer A500:
> ezT20 kernel A500 and A100
> A500 Public Recovery (Clockwork Mod based recovery + source for the A500)
> Acer A500 ICS Rooted w/ Busybox
> Acer_A500 OTA 7.014.14 --HC 3.2.1-- Rooted w/ Busybox


Donations for beer/rent are always appreciated.
Twitter: xdaterry -~- Google+: profile -~- GitHub: ezterry
 
MV10
#7
Thank you, I'll try it later today.

Not that it's relevant to getting me fixed, probably, but does anyone have any idea how or why this problem crops up? Or is it more a case of an error that can have multiple causes? I found it interesting that so many people were reporting it across the various Android forums, and there seemed to be no attempt to explain it. That kind of thing always makes me curious, particularly in an environment like this -- a room full of curious "dig in and figure it out" personalities...
 
lbcoder
#8
If it ever happened to me, I would certainly try to figure it out, however this is really difficult since it has never happened to me. I don't think that it is anywhere near as common as you think.

What I believe about the situation at the moment is that it is *probably* a failure somewhere else along the line that simply has this SIDE EFFECT.

ezterry: Do you remember which memory address ranges are written by an nbh file? I recall that the nbh file has divisions for the different partitions, so I suspect that it may not write *everything*. Maybe misc and/or cache are not written?

Note: I have seen plenty of instances of the cache partition getting borked and having weird side effects. The problem with the cache partition, and why IT gets into weird states, is that it is a dual-purpose partition -- sometimes a yaffs2 filesystem, sometimes a simple data structure -- so if it gets into the data structure mode and something tries to use it as a filesystem, you end up with some interesting side effects.
 
ezterry
#9
Quote:
Originally Posted by lbcoder View Post
ezterry: Do you remember which memory address ranges are written by an nbh file? I recall that the nbh file has divisions for the different partitions, so I suspect that it may not write *everything*. Maybe misc and/or cache are not written?

The nbh is just a custom archive. The header has 3 arrays of 32-bit values indicating the following for each partition included:

> Partition type (this determines the partition via some mapping to flash: radio, hboot, misc, cache, recovery, boot, system, splash1, diag)

> Partition offset from start of the nbh file (signature removed if included)

> size of image

The diagnostic nbh only has the fake 'diag' image.. however most others in the wild seem to have radio, hboot, splash1, recovery, system, cache, userdata...

I don't think I've seen one with misc.

Certainly none of my current collection have it. I wonder if they allow it?
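
As a rough illustration of the "three parallel arrays" idea described above, here is a sketch that reads such tables from a made-up header. Everything specific in it is an assumption for the example: the number of slots, little-endian byte order, the arrays sitting back to back at the start of the buffer, and the numeric type codes. A real nbh header has more in it than this and its exact layout is not given in this thread.

```python
import struct

def parse_tables(header, slots):
    """Read three back-to-back arrays of `slots` uint32 values each:
    partition type, offset within the file, and image size.
    HYPOTHETICAL layout, for illustration only."""
    fmt = f"<{slots}I"                      # assumed little-endian
    width = struct.calcsize(fmt)
    types = struct.unpack_from(fmt, header, 0)
    offsets = struct.unpack_from(fmt, header, width)
    sizes = struct.unpack_from(fmt, header, 2 * width)
    # Treat zero-size slots as unused entries.
    return [(t, o, s) for t, o, s in zip(types, offsets, sizes) if s]

# Synthetic header with 4 slots, 2 of them in use (type codes are invented).
fake = struct.pack("<4I", 1, 2, 0, 0)               # types
fake += struct.pack("<4I", 0x200, 0x4200, 0, 0)     # offsets
fake += struct.pack("<4I", 0x4000, 0x1000, 0, 0)    # sizes

for ptype, offset, size in parse_tables(fake, slots=4):
    print(f"type {ptype}: {size:#x} bytes at file offset {offset:#x}")
```

A partition that has no entry in the tables is simply never touched when the file is flashed, which would fit with misc not being rewritten by an nbh.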
 
MV10
#10
Clockwork's "erase_image misc" returns an error:

mtd: erase failure at 0x00000000

I also tried wiping and formatting the cache again, on the off chance that maybe clockwork did something differently. Nothing new to report there.

As for this kernel partition approach, do I correctly understand that I would be telling the kernel to create a new partition name mapped to a range which precedes misc where the SPL is located? I assume I can derive the size from an img of the stock SPL of the same version. Any tips on how I can figure out where it starts? (Apologies if it's in that thread lbcoder referenced, I haven't finished reading it yet.)

Or am I thinking about this completely wrong?
