Solving the thermal problems of the HD2 and other Snapdragon-powered devices


motoi_bogdan

Senior Member
Sep 8, 2007
319
337
UPDATE -

As of June 14, 2011, it appears the only viable way to solve this problem is to reheat the CPU with specialized equipment. I am currently testing whether this can be done in a standard gas/electric oven (NOT A MICROWAVE!!). If successful, this method will solve the problem, and the following guide (CPU cooling system) may help prevent it from happening again. Building this cooling system alone is not enough and will not solve your problems.


Hello dear members of XDA

First of all, please excuse my English; I will try to explain myself as well as I can. This will be a long post. It could be boring, it could be scary, or whatever you like, but bear with me on this one.

So... I got myself an HTC (T-Mobile) HD2. A bit late in the game, but hell, it's still a good phone. I'd dreamed of having one but couldn't afford it. Anyway, I finally got one. A second-hand, broken one, damn it. :p
As I found out, the problem is pretty common: the damn thing restarts itself. It's thermally related, the old CPU overheating problem. By searching the net I found out that it's pretty common with some HTC models. The HD2 has it, the Desire has it, the Nexus One has it, hell, even some Xperia models have it... about half of the devices powered by anything from the Snapdragon series could have it.
The problem can be easily described as: phone hanging, restarts, the dreaded sequence of 7-8 short vibrations, phone locked, etc.
Mine was worse than what I've seen on the forums or with other people. It locked up for just about every reason imaginable. Taking pictures, browsing the menu, using GPS, the browser, 3G or Wi-Fi, watching a movie... all ended in restarts or lock-ups after a couple of minutes. I found that keeping the phone at 4-5 degrees Celsius would solve my problems in most cases, but anything above 10-15 degrees would make the thing go crazy.

Well, I'm passionate about electronics, development in this area, trying to solve problems and things like that. I'm also experienced in heat and semiconductor-related problems. I also had a MacBook Air that suffered from core shutdown because of overheating (also a well-known problem for the MBA rev 1.0) and managed to design an alternate cooling system that solved the problem. So I gave it a shot. I know there are many users with similar problems, and although I don't explicitly suggest they make these hacks to their phones, this is one way to solve the problem if you bought your unit second-hand or don't have some form of warranty.

So here we go.

Big fat warning!!! Don't attempt these things with your phone unless you are familiar with the concepts and the tools involved in the process. Also, there is a real risk of permanently damaging your phone. Not just real... but big, if you get something wrong.

The first step is to run some simple tests to determine the cause of the problem and how far it extends.
So, I used a multimeter with a K-type thermocouple probe to measure the temperature of various components of the phone during intensive use.



This is the back of the mainboard of my HD2. As you may notice, HTC placed a blue-ish thermal pad over one metallic shield covering the back components. I don't know what its purpose is, as the back casing in that area is made of plastic, so there is no heat dissipation, or only poor dissipation. Anyway, that's a good place to put my probe. Some tape held the probe in position. Because we don't have perfect mechanical contact between the probe tip and the casing or chips, I expect that +1 or +2 degrees Celsius should be added to each measurement I describe later.



I then placed the battery over the back of the phone and secured it with some more tape and some toothpicks :D



We're at 19.3 degrees. That's where we're going to start from.

There's a useful little app that allows users to overclock or stress test their phone's CPU. I found it here on XDA, and I'll use it to generate some heat.



As you can see, we're already at 25.8 degrees after 5 minutes of testing, not to mention that the primary heat-generating suspect, the Qualcomm chipset, is on the other side. At 29.5 degrees the phone locked up. I repeated the experiment 2 more times and got exactly the same result, so at least the readings were consistent.
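To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the temperature rises linearly during the stress test, which is only an approximation (real heating curves flatten out), and the function names are made up for illustration.

```python
def heating_rate(t_start_c, t_end_c, minutes):
    """Average heating rate in degrees Celsius per minute."""
    return (t_end_c - t_start_c) / minutes


def minutes_until(t_now_c, t_limit_c, rate_c_per_min):
    """Minutes left until t_limit_c, ASSUMING the rate stays constant."""
    if rate_c_per_min <= 0:
        raise ValueError("rate must be positive to ever reach the limit")
    return (t_limit_c - t_now_c) / rate_c_per_min


# Readings from the test: 19.3 C at start, 25.8 C after 5 minutes,
# lockup at 29.5 C (all measured on the back shield, not on the CPU).
rate = heating_rate(19.3, 25.8, 5.0)
remaining = minutes_until(25.8, 29.5, rate)
print(f"rate: {rate:.2f} C/min, minutes left before lockup: {remaining:.1f}")
```

With these readings the shield warms by about 1.3 degrees per minute, so under the linear assumption the lockup would come a little under 3 minutes after the 25.8 degree reading.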

OK, I then removed the motherboard to take some readings from the actual CPU.



Same procedure, next readings: at around 33.4 - 34.2 degrees (it varies) on the CPU itself, the phone will either restart or lock up. So you can see how serious my problem is. Summer is coming, so I won't be able to use my phone... :p
Measures have to be taken.

Let's start with a short introduction to heat in semiconductors.

Well, simply put, a conductor (semiconductors behave similarly) generates some amount of heat when an electric current passes through it. This is because the electrons moving along the conductor (put simply, that's what an electric current is) will occasionally collide with the atoms of the material they're passing through. In the collision the electron loses some energy, and that energy becomes heat. Heat itself can be described at the atomic level as the intensification of the natural random thermal motion of the atoms. The more agitated the atoms are, the hotter the material is, and the more likely they are to be hit by passing electrons. So a hot conductor tends to get even hotter. There is a point where the heat generated makes the conductor's atoms prone to more hits from passing electrons, in something like a geometric progression. That's called thermal runaway. It tends to destroy electronics by overheating, melting or burning them up.
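The feedback loop described above can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the exponential growth of dissipated power with temperature, and every number in it (power, thermal resistance, heat capacity) are hypothetical values picked to show the behavior, not measurements from an HD2. Only the 19.3 degree ambient comes from the readings in this post.

```python
import math


def simulate(start_temp_c, ambient_c=19.3, p0_w=0.2, k_per_c=0.05,
             r_th_c_per_w=20.0, heat_cap_j_per_c=5.0,
             dt_s=0.1, duration_s=2000.0, runaway_rise_c=150.0):
    """Euler-integrate a lumped thermal model; return (final_temp, ran_away).

    Heat in:  p0 * exp(k * rise)   (ASSUMED: power grows with temperature)
    Heat out: rise / r_th          (linear cooling towards ambient)
    """
    temp = start_temp_c
    for _ in range(int(duration_s / dt_s)):
        rise = temp - ambient_c
        if rise > runaway_rise_c:
            return temp, True  # heating outpaced cooling: runaway
        power_in = p0_w * math.exp(k_per_c * rise)
        power_out = rise / r_th_c_per_w
        temp += (power_in - power_out) / heat_cap_j_per_c * dt_s
    return temp, False


cool_temp, cool_runaway = simulate(19.3)  # chip starts at room temperature
hot_temp, hot_runaway = simulate(74.3)    # chip starts already very hot
print(f"cold start: settles near {cool_temp:.1f} C, runaway={cool_runaway}")
print(f"hot start:  runaway={hot_runaway}")
```

With these made-up parameters a cold start settles a few degrees above ambient, while a start above the tipping point keeps climbing. Lowering `r_th_c_per_w` (better cooling) pushes the tipping point higher, which is the whole idea behind adding a heat sink.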
Back to our phones now. The CPU produces heat because of the same effect described above. The heat in this case will either melt or break the small "balls" that make up the BGA matrix these chips are mounted on. The small balls will either melt (in extreme cases) or expand with increasing temperature. However, it seems most of the new processors used by HTC are fixed in an epoxy resin that both starts to expand and melts at higher temperatures than the flux and solder used to attach those chips. So the chip itself tends to stay fixed in position, unable to expand or contract with temperature variations, while the balls of the BGA matrix underneath it do contract and expand. This can lead to a case where at least one of those balls (a couple of hundred in total) becomes "loose" or moves out of position, breaking the electrical contact it should make. Hence our problems. At first, a large amount of heat must be applied to actually break the bond between the CPU and the board, but once broken, the tiny links are very sensitive to temperature variations and will expand or contract freely.
Most users notice that at its core the problem seemed related to overheating (in the beginning), but over time its effects are degenerative: phones seem to restart for no apparent reason. It's still overheating, but things get worse and worse as the chip and its connections become more sensitive to heat variations. Thus, even small variations now produce these problems. My CPU restarts at 34 degrees... that sucks.

So, my only option was to try to reheat the CPU in an attempt to partially melt the broken "balls" in the BGA matrix and hopefully... I repeat, HOPEFULLY, have them remake contact with the mainboard. A reball of this chip is not possible, as the resin placed around it by HTC doesn't melt at the normal temperature at which I could remove the chip, so heating it to even higher temperatures would risk killing the CPU long before the resin melts. A strange move by HTC to do things like this.

Anyway.. here goes nothing..


I've placed the usual aluminium foil to protect the surrounding components from the heat generated by the rework station and the hot air used to heat up the CPU.



I preheated the CPU for about 10 minutes, from both sides of the board, then switched to heating it at 360 degrees. After it was heated, I applied even pressure on top of it to tighten the space between it and the board, just a little bit. THIS IS VERY RISKY. It's normally not recommended because of the risk of damaging the BGA. In this case the resin would prevent me from moving the chip too much, so it's less risky. Not safe... but less risky. :p

I let the board cool on its own for half an hour and repeated the temperature monitoring tests.
Now the maximum temperature before a restart had increased from 34 degrees on the CPU to about 42. It's not much, but it's a start. However, above this temperature the phone will still lock or restart.
I went for another round of reheating with the hot air station. After this, I got slightly better results, some 2-3 degrees more. My lucky break came when I suspected thermal runaway in the CPU. So I tried to make some sort of heat sink for that chip using some mica foils made for TO-220 transistors, some thermal grease and a bunch of aluminum and copper foils. My theory was that heat generation accelerates above a specific level, a point from which thermal runaway occurs. In my case, in the initial tests, even after the phone locked up and I manually restarted it (battery out, then in), the temperature continued to increase even faster although the phone wasn't doing anything intensive.
The role of my "heat sink" would be to dissipate more heat rapidly and, in some manner, to press the CPU against the board.
I placed the mica foils directly above the CPU with thermal grease over and around them, then mounted the metal shield back over that area. On it, I placed some more silicone paste and some thick copper foil (taken from some broken laptops I have over here). It looks ugly, but it's worth a shot:



After that I began making the rest of the heat sink using aluminum foil. I folded about 12 layers, with more thermal grease between each of them, and another mica foil at the 6th-7th layer.
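For anyone curious how much such a stack resists heat flowing straight through it, here is a minimal sketch that adds up the layers as thermal resistances in series (R = thickness / (conductivity x area)). Everything numeric is an assumption for illustration: the 1 cm² contact area, the layer thicknesses and the ballpark conductivities were not measured on my build. It also ignores sideways heat spreading, which is a big part of what the foils actually do.

```python
def layer_resistance(thickness_m, conductivity_w_mk, area_m2):
    """Thermal resistance (K/W) of one flat layer, heat flowing straight through."""
    return thickness_m / (conductivity_w_mk * area_m2)


def stack_resistance(layers, area_m2):
    """Series sum over (name, thickness_m, conductivity_w_mk) tuples."""
    return sum(layer_resistance(t, k, area_m2) for _, t, k in layers)


def build_stack(grease_thickness_m):
    """12 foil layers, grease between each pair, one mica foil after layer 6."""
    foil = ("aluminum foil", 16e-6, 237.0)                # ASSUMED ballpark values
    grease = ("thermal grease", grease_thickness_m, 5.0)  # ASSUMED mid-range grease
    mica = ("mica foil", 50e-6, 0.5)                      # ASSUMED ballpark values
    layers = []
    for i in range(12):
        layers.append(foil)
        if i < 11:
            layers.append(grease)
        if i == 5:
            layers.append(mica)
    return layers


AREA_M2 = 1.0e-4  # ASSUMED 1 cm^2 contact area
thin = stack_resistance(build_stack(20e-6), AREA_M2)    # thin grease films
thick = stack_resistance(build_stack(100e-6), AREA_M2)  # thick grease films
print(f"thin grease: {thin:.2f} K/W, thick grease: {thick:.2f} K/W")
```

Under these assumptions the metal foil itself contributes almost nothing; the grease films and the mica dominate, so the thinner the grease layers, the better.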

Here's the aluminum foil



I then pressed the foils very hard between two flat surfaces to squeeze out the excess thermal grease.
I "anodized" the first layer (the one in contact with the CPU shielding) with some ferric chloride. Before that, the board looked like this:



After the logic board was mounted back, I remade all the connections and, after some preliminary tests, put the phone back together. It now looks like this:



I only have to re-attach the plastic sticker with the serial no. and IMEI.

Of course I then ran tests. I heated up the phone with a hair dryer to simulate a hot summer day. About 40 degrees, just to be sure. I then ran CPU stress tests and a full DivX movie (impossible in the past). In preliminary testing, I had indications that I had avoided the thermal runaway, with the CPU now running stable at 24 degrees (19.3 in the room, ambient temperature). No more heating up by itself to about 40 degrees and then restarting.
In the final testing, with the phone put together, I heated it with the hair dryer to 40 degrees. I started it and ran stress tests. No more lockups or restarts, not even a single one. However, with the phone put together I can't measure the temperature of its internal components. As far as I can feel, it gets warmer and heats up to some degree, but now the heat is spread all over its surface. For whatever reason, it doesn't restart anymore.

I then tried a CPU stress test, a WLAN connection, a PC connection and browsing the net all at the same time. NO RESTART :D I watched a full 1 hour 30 minute movie at max playable quality; the phone was really hot (43-44 degrees at its surface) but still no problems.
It appears that for the moment I've saved the phone. However, future behavior is still to be determined.

I'll get back with more testing in the following days, and eventually I hope to devise a general method for building heat sinks for phones (yeah, ridiculous...) using combinations of metal and thermally conductive crystals. The idea is to find out whether reheating the chip with a hot air station can be avoided (this involves the most risk). But the start is promising. By the time warranties expire and phones like the new Droids or WinMo 7s start to break from thermal problems, maybe I'll have a more user-friendly solution.

EDIT JUNE 04, 2011
Since I have a dead HD2 motherboard here, I tried to remove the CPU to expose the BGA soldering. Just for fun, with no chance of a BGA reball, as there aren't any tools available for this particular chip. The resin prevents a proper removal: at about 450 degrees Celsius it was still fairly hard, so I had to forcefully remove the chip and break some of the BGA. The chip is very thin, kind of like a microSD card. It heats up quickly; the solder points underneath it melted in about 2-3 min at 370 degrees Celsius.
Here's how it looks.
This is the motherboard without the chip. The BGA matrix is broken; some balls were simply ripped out when I forcefully removed the chip.



This is the actual chip compared with a miniSD and a standard SD card.



...and this is the underside of the chip. Believe it or not, the chip is actually alive and its pins are OK. It cannot be used because it cannot be properly soldered to a board. Guess I'm gonna punch a hole through it and put it on my key chain, along with a laptop CPU already there ;)



In the following days I will experiment with the solder points and materials to try to produce a safer method of reheating future boards with thermal problems. It seems this board died because of overheating and a short circuit across the center of the array, made by 3 solder balls that came into contact once they melted.
 

T-Macgnolia

Senior Member
Sep 30, 2010
3,796
2,023
Shannon, Ms.
You sir are the man. :)

I personally do not have any overheating issues with my HD2. But there are so many people here on XDA that do. So your work will be greatly appreciated and followed here. From what you have posted, you may just be onto something that will be very useful not just to HD2 owners but to a large variety of smartphone and maybe even tablet owners, present and future. I too find the epoxy resin that HTC placed around the chip odd. It is almost like HTC did not want this part of the hardware to be replaceable, but to make you buy a whole new main board instead, corporate crap :mad:

Anyways, please keep up the good work and I will be following this thread very attentively. So please do post back here with your future findings.
 

d3m0n1

Senior Member
Mar 7, 2009
84
24
The epoxy resin is there to hold the CPU (which is the biggest and probably the heaviest part on the PCB) in its place, thus making it more resistant to mechanical forces (such as accidental drops, shocks, etc.).
It could also be that HTC wanted to prevent access to the CPU's I/O pins by making it impossible (or at least very difficult) to desolder it. This way it is difficult for "researchers/hackers" to chart a schematic diagram of the connections between the CPU, flash chip, RAM, etc., or to attach logic probes to the I/O lines, and that in turn makes it difficult to develop software hacks such as HardSPL, SIM unlocking, etc. I know that there are other methods to connect to the CPU (such as JTAG), but fewer options mean fewer chances to succeed.

B.r.:
d3m0n
 

xlr8me

Senior Member
Apr 19, 2010
612
61
This is an excellent post, and thank you for your testing and investigation. Do let us know how your unit does after a week of real-world testing.

The engineers at HTC should have incorporated a better CPU cooling solution in the HD2. Your testing and modifications are a wealth of information.
 

motoi_bogdan

Senior Member
Sep 8, 2007
319
337
Thanks for the support.

The resin on the CPU has a higher melting point than the solder joints between the CPU and the motherboard. I guess the reason for its presence is that when the phone is hot (and HTC knows the CPU can get hot) the solder joints become mechanically weaker and could be broken more easily. In case of a BGA mechanical failure, however, the resin poses a problem, as the chip can't be desoldered safely or even reheated. I took a risk there.
I don't think HTC wanted to stop us from desoldering the chip because of JTAG pinout mapping. Even with access to the pins, it would be very hard to identify them without some form of CPU datasheet. Same goes for a CPLD, for example.

Anyway, at this time I still don't know whether it was the reheating or the cooling system that helped me solve this problem. So far, so good: not a single problem noticed. I can now reflash the phone; before the procedure, I would get that vibration pattern during a flash. I've put Android on it, stress tested it even more, and I'm now trying to play some 720p video on it. It still heats up and feels as warm in the hand as before, but for whatever reason, it doesn't restart anymore.

However... more testing still to come.

On another note, I'm now working on a broken Eten M810 with some CPU problems. In this case, the CPU doesn't have that resin, but the NAND memory does. :p Different brand, different choices.
 

motoi_bogdan

Senior Member
Sep 8, 2007
319
337
As far as I've found out until now, the steps from a good working HD2 to these problems are something like this:

1. Phone working OK. The mainboard (lower part of the device) heats up in some conditions (demanding programs etc.); the battery can reach about 40-45 degrees max without problems. The phone will restart or freeze (CPU halt) in either of these situations:
- Battery temperature exceeds 45 degrees and stays over this value for at least 5-10 minutes, long enough to trigger the thermistor used to measure the temperature in this area over I2C. At this point it will prevent further charging and restart or lock the phone. This is normal behavior.
- CPU exceeds 60-65 degrees (exact value still to be determined; I'm trying to get access to some similar chipset datasheets). This produces a CPU halt. Depending on what you're doing, the halt will either reset the phone or simply lock it up. Restarting by soft reset or by itself will probably return the user to the home screen with the phone still working. This is also normal behavior, related to the Qualcomm chip.

2. Phone starts to malfunction. This condition starts with either a large variation in temperature (a mainboard at low temperature quickly going to full load) or simply sustained full load. All HTC HD2 revisions have the same type of soldering in the CPU area. Visually speaking (no conclusive data yet), the first revision used a bit more epoxy resin to secure the CPU in place. In the context of overheating and solder ball expansion, that's not really a good thing. Some sort of thermal spike must occur in order to break the contact between CPU and motherboard. Warning: if your phone locks up and doesn't restart by itself, it's imperative that you disconnect the battery, because as I measured, even with the phone locked, the CPU keeps overheating, thermal runaway occurs and the temperature climbs to dangerous levels. I never let the phone do this for a long time, so I don't know how much further it will overheat, but it does and it will. In the initial stage of the problem, only extended heavy-load use can trigger it. A common case is keeping the phone on in the car and using it for GPS navigation in a hot summer. If the phone restarts before reaching either 45 degrees at the battery or 60-65 degrees at CPU level (the latter is harder to measure), then you certainly have problems, and they are just starting.

3. Problems get worse. At this stage it is possible to notice the 7 short vibrations at boot time if the phone is warm or kept in a warm environment. You don't have to push it very hard; it only needs to be warm. The vibration pattern is an error code produced by the Qualcomm chipset itself, not sent by the bootloader, SPL or operating system. When it happens the CPU will lock up; however, file transfer (including NAND memory access, storage card access and basic operations) and other chipset functions will still work for some time. It appears only CPU processing is halted. So if this occurs when you boot the phone, it will lock up, but if it occurs while you are flashing a ROM, you might still see the progress bar filling. The vibration pattern signals that physical damage to the Qualcomm chipset has occurred. There's no way around it: once it occurs, it will never just... heal by itself.
You will notice that the temperatures needed to induce a restart/lockup decrease with time (both battery & CPU).

4. Problem at its worst. The CPU can lock up even at 35-40 degrees (measured at the CPU). An ambient temperature of only 10-15 degrees is enough for the phone to experience problems. The CPU starts to suddenly produce lock-ups or hard faults, or simply works intermittently. The OS may give errors relating to ARM core failure or fatal errors regarding execution of certain "lines" (code lines in the OS core). At this stage, the phone doesn't need to feel warm in your hands to produce these problems. This could trick some people into not relating this to thermal problems and looking for the cause or solution elsewhere. It's still related... but at its worst.

5. Total CPU collapse. If the phone locks and remains locked in whatever screen or program it was running, as I said before, it will still overheat. If a stage 4 phone is left overheating, chances are that more of the balls connecting the chipset to the motherboard will fail. If any ball needed to correctly initialise the chip or to power it on fails, then it's game over for that phone. It will simply stop working and never turn on. In other variants, the phone will only start if placed in a freezer, or will start but never complete a boot sequence (the OS, the bootloader... or both could be unable to start).
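The five stages above can be turned into a rough rule of thumb. The sketch below is just one possible reading of them: the function name and the exact cut points (40 and 60 degrees) are my own picks from inside the ranges given, and the stages overlap in practice, so treat the result as a hint rather than a diagnosis.

```python
def classify_stage(cpu_lockup_temp_c, seven_vibrates=False, powers_on=True):
    """Guess the failure stage (1-5) from a few observations.

    cpu_lockup_temp_c: CPU temperature at which the phone locks or restarts,
                       or None if it never locks up below normal limits.
    seven_vibrates:    True if the 7 short vibrations have been seen.
    powers_on:         False if the phone no longer starts or finishes booting.
    """
    if not powers_on:
        return 5  # total CPU collapse
    if cpu_lockup_temp_c is not None and cpu_lockup_temp_c <= 40.0:
        return 4  # locks up at 35-40 degrees or lower
    if seven_vibrates:
        return 3  # vibration code means physical damage has occurred
    if cpu_lockup_temp_c is not None and cpu_lockup_temp_c < 60.0:
        return 2  # restarts before the normal 60-65 degree halt
    return 1      # only the normal protective halts, if any


print(classify_stage(34.0, seven_vibrates=True))  # the phone from the first post
```

The phone from the first post, locking up at about 34 degrees on the CPU, lands in stage 4 under these cut points.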
 

modex

Member
Feb 10, 2008
17
1
My phone started freezing after it fell on the ground, LCD facing down, onto tiles. There is no physical damage: the screen is in perfect condition and the touch screen works very well, but since it hit the tiles my phone has been freezing randomly. It's been a month since the incident and I have tried a lot of ways to fix this. I even tried removing all the parts except the LCD and digitizer to see if there was something loose inside, but it's still not fixed... :( I'm a noob at all this and don't really know the names of all the parts. The cause I've found for the freeze is a slight press near the SIM card, where the main board is. Even a very light press from the back near the SIM card results in a freeze. I can say this because it won't freeze if I don't touch that area. Every time it freezes I have to remove the battery cover and press the red button, and it will freeze again if I put the cover back after it boots (because putting the cover back definitely presses that area... so I have to be very careful when putting it back). So can anyone help me with this? What could be the problem? :( Sorry for my English :)
 

xlr8me

Senior Member
Apr 19, 2010
612
61
An interesting observation: I have been in an air-conditioned room for the last 4 hrs and it really cools the HD2 down. Perhaps the glass digitizer conducts heat quite well rather than insulating.

During hot summer days I have hit 42C without any issues. I don't want to hit 45C though; this HD2 is a beast... Imagine staying at a 1100MHz o/c all the time... aiiii caramba... :p We could cook some eggs on it.
 

electronspeed

Member
Jul 6, 2010
10
0
NJ
Yeah, I've experienced some freezes when my HD2 got hot; it stayed like that until it cooled off. I've noticed that when I charge it in the car while running Google Maps it tends not to charge fully as it should, and it heats up after a while until it becomes unresponsive; then I let it cool off again.
I hope it hasn't affected the CPU's connections much, but at this point I'll have to monitor its heat situation to prevent future disruptions.

Good work/thinking on the "cooling adapter". I've seen a similar approach with IBM graphics cards which fail for the same reason; people would heat them up to reconnect the chip's connections to the PCB.
 

motoi_bogdan

Senior Member
Sep 8, 2007
319
337
The back of the LCD display (the actual LCD, not the touchscreen) is made of metal, and on top of it there is some copper foil. So... yes, cooling the screen helps cool the CPU. However, when I disassembled my HD2 I noticed that the actual CPU isn't in direct contact with the metallic back of the screen. So, although cooling the screen could help, it isn't very effective. I adapted my "cooling" pad to thermally connect the CPU to the back of the screen via that DIY aluminium and copper foil setup.

@modex: if the phone is dropped, the BGA connection of either the Qualcomm chipset or the NAND memory (the 2 largest chips on board) could get damaged. As we know the connection between the CPU (inside the Qualcomm chipset) and the motherboard can be faulty or become so over time, it's a prime suspect in your case. It is very difficult to predict the outcome if you send the phone in for repairs or have it reheated. I know of no service center that can effectively reball the CPU to the motherboard (that means the chip would actually have to be removed, the connections remade and the chip resoldered).


My phone is still doing well, one week after my intervention. About 14 ROMs installed, running WP7, Android builds, custom WM 6.5, Ubuntu, etc. Not a single restart or freeze. It does heat up, but the heat is now spread over its surface.
From what I can tell, the DIY heatsink helped more than the actual reheating of the chip with the hot air station. If in the future someone else without a warranty tries a similar hack on their phone, we will know for sure whether the problem can be solved by simply cooling the CPU to some extent.
 
Nov 2, 2008
28
0
London
When I get back from my holiday I will try your method on my HD2, as it does all the freaky stuff even when doing nothing.

Today I noticed the 7 vibrations and got scared. I've got an Artemis as a backup, which I'm using now, but after a week I will try the re-solder and report back.

Still, after a few years of not doing any electronics, I brought my Hermes back to life (white screen due to a faulty front keyboard PCB). I had a TP2 and exchanged it for the HD2, and I want to see it fully working for the price I paid for it.

regards
 

motoi_bogdan

Senior Member
Sep 8, 2007
319
337
First of all, heating up the Qualcomm chip is recommended only as a last-resort option. And if you do reheat it, pressing the chip against the board is VERY dangerous, as it could permanently damage the BGA connection.

Here's a sort of guide to doing this. You will need a screwdriver, 4-5 mica foil pads (you can get them from any electronic component store; get them for either a TO-3 or TO-220 package and cut them to the size of the CPU inside the HD2), some good thermal grease (Arctic Silver or something similar for PC CPUs) and an aluminum sheet to cut a piece from.
* I don't recommend silicone thermal pads; use only mica crystal pads.
* You can substitute the aluminum plate with aluminum or copper foil (the former is the kind used for food wrapping).
* I don't recommend using anything besides an SMD rework station (either hot air or infrared) to heat up the board. Although a heat gun can reach high temperatures, its airflow is too high (dangerous, as it can blow other components away) and you will lack the precise temperature control needed for this job.


1. Disassemble the phone following HTC's official videos. Completely remove the motherboard from the phone's casing.
2. Once you have the motherboard detached, remove all metallic shields on both sides. Normally these prevent EM interference from the outside from getting in and messing with the electric signals on the PCB. We can use them as part of the "cooling" system later.
3. OPTIONAL - efficiency yet to be determined/great risk involved - use either a special oven (not a microwave !! it WILL kill the phone!) or an SMD rework station to preheat the mainboard. The temperature must be set at around 95-110 degrees. The board must be heated from both sides, or at least one side at a time, beginning with the side opposite the CPU. Let it preheat for at least 10 minutes.
3a. After preheating, use aluminum foil to cover the rest of the components (anything other than the CPU itself), then get to the actual heating: switch first to 250 degrees and direct the air stream at the CPU itself (using a larger nozzle on the tip of the heating gun). After 2-3 minutes at 250 degrees, switch to 340-360 degrees and heat the chip for another 5 minutes. Move the heating gun around the surface of the chip and try to heat it evenly. If you have the guts and you are crazy enough, take a knife with a larger blade and put the tip of the blade in the hot air stream in front of the CPU. Let it heat for a while and keep heating the CPU as well. When the blade tip is hot enough, press the chip with it, starting from the center and then following each side. Apply even force on each press and try to keep the blade as parallel to the chip as possible. Don't press too hard: if you haven't killed the chip yet, that will kill it.
3b. Let the board cool down on its own, and during cooling try not to move it or do anything to it.

4. Place a small amount of thermal grease on top of the CPU, then place 1-2 mica foil pads (depending on thickness) over the CPU. Gently press the mica foil onto the CPU with one finger. Now place more thermal grease over the mica foil and fit the metallic shield over that area. If done successfully, the metallic shield should be in contact with the mica foil and the grease. Put all the shields back on the main board.
5. On the phone's casing, measure the back of the display and cut an aluminum sheet to exactly the same size. If the sheet you can find is too thick, polish it and place it in a solution of either caustic soda or ferric chloride. This will thin it, but you have to supervise the process: if you leave it in for too long, the sheet could dissolve completely. Check the sheet at short intervals (1 min) to see the progress. Always use gloves and eye protection, as both substances are dangerous (never mix them; use only one of them, whichever you can get or already have). Once done, you will have a thin, flexible aluminum sheet about 1 mm thick.
6. Notice there are some ribbons connecting the display to the motherboard, and other exposed metallic contacts. Before placing the aluminum sheet over the display's back, put some insulating tape over those metallic contacts to prevent any short circuit between them and the aluminum sheet. Next, place the aluminum sheet over the display's back. Be careful not to damage any connector or ribbon in the process.
7. Place more thermal grease on the CPU's metallic shield and check that the motherboard makes good thermal contact with the aluminum sheet you just placed over the display's back. If there is still some space between them, use another mica foil with thermal grease on both sides of it.
8. Reassemble the phone and run some tests to see if you get any improvement.
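
For anyone who wants the optional reheating stages from steps 3 and 3a written down in one place before touching the rework station, here is a minimal sketch in Python. It only restates the numbers from the guide above; the names (`Stage`, `PROFILE`, `total_minutes`, `check_setpoint`) are made up for illustration, the temperatures are assumed to be degrees Celsius, and the 10 minute preheat is a lower bound ("at least"), not a fixed time.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One stage of the reheating procedure (hypothetical helper type)."""
    name: str
    temp_min_c: int      # lowest setpoint named in the guide, assumed Celsius
    temp_max_c: int      # highest setpoint named in the guide, assumed Celsius
    minutes_min: float   # shortest duration named in the guide
    minutes_max: float   # longest duration named in the guide


# Numbers copied from steps 3 and 3a; the preheat is "at least 10 minutes",
# so its upper bound here is only a placeholder equal to the lower bound.
PROFILE = [
    Stage("preheat whole board", 95, 110, 10, 10),
    Stage("first pass on the CPU", 250, 250, 2, 3),
    Stage("second pass on the CPU", 340, 360, 5, 5),
]


def total_minutes(profile):
    """Return (shortest, longest) total heating time in minutes."""
    shortest = sum(s.minutes_min for s in profile)
    longest = sum(s.minutes_max for s in profile)
    return shortest, longest


def check_setpoint(stage, setpoint_c):
    """True if a station setpoint falls inside the range given for the stage."""
    return stage.temp_min_c <= setpoint_c <= stage.temp_max_c


if __name__ == "__main__":
    for s in PROFILE:
        print(f"{s.name}: {s.temp_min_c}-{s.temp_max_c} C, "
              f"{s.minutes_min}-{s.minutes_max} min")
    print("total heating time (min):", total_minutes(PROFILE))
```

This is just a checklist in code form. It does not make the procedure any safer, and the cool-down in step 3b is deliberately left out because the guide gives no duration for it.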

One more thing: this little project of ours is still in a "more to be seen/tested" state. As of now only one device was fixed by this method, mine, and it could have been simple luck. I don't know yet. :p More than a week later (strange weather too, about 20 degrees warmer outside than when I wrote the original post) the phone still works OK. It is now running overclocked at 1.3 GHz with NAND Android.


@ januszgorlewski I remember the first time the phone vibrated 7 times. I didn't know about this problem and I thought it was a WM6.5 Energy ROM feature :) .
 
Last edited:

motoi_bogdan

Senior Member
Sep 8, 2007
319
337
Yep.. more than 2 weeks have passed, and after I completed all possible tests the phone still works OK.
About 22-25 ROMs flashed (WP7, WM6.5, Android, Ubuntu); the phone was used either normally or heated with a hair dryer. At about 30 degrees ambient room temperature, I ran some 720p tests and managed to play sample videos until the battery died, then played the videos again while charging (charging adds more heat).
In all those 2 weeks I had only 2 restarts, both in WP7 (I can't remember which ROM version) and both occurring while I was setting up the phone after the phone update. The phone was cold at the time. I didn't manage to produce any more restarts, either when the phone heated up or when I ran intensive apps on it. I guess it was software related.
So.. I guess this problem is over.
 

profahmad

Senior Member
Sep 8, 2006
131
6
This thread is awesome. I've opened my HD2 a few times in the past week to replace the LCD. Between the first time and opening it last night the LCM (LCD and touchscreen module) was slightly loose from the chassis so the screen was protruding a little and the front AP buttons were sunken. This was the case for a period of about a week, during which I noticed the phone would get very, very hot towards the bottom on the posterior surface, beneath the battery cover (around the area of the main board). Last night I properly assembled the whole device and it's now completely flush. The overheating doesn't seem to be occurring now.

I wasn't experiencing any restarts or lockups during the time it was overheating.

When I can get hold of the materials I'm definitely trying your heatsink. Thanks for sharing this.
 

sqeeza

Member
Jul 21, 2008
28
0
Packaging Reliability - 7 vibration lock-ups

Thanks facdemol for your investigation and sharing.

I had been typing a lengthy description of what happened to me when my browser hung. Annoying, so now only the short version ;)
My long-awaited, factory-fresh HD2 failed exactly as described within 2 months, in the winter season. I was lucky to get the original SPL back after storing the phone on the balcony at minus 5 and flashing from SD. Please note that for me the issue got much worse, much faster, with the HD2 plugged in.

Yesterday I got mine back from warranty repair with the main board swapped. Since I am now anxious about this happening again, I asked whether others had reported similar experiences, which was denied. After reading this I recommend starting a petition, as this is obviously a wrong thermal design. I work as a packaging engineer and can assess this with X-ray, ultrasound microscopy (water bath only) and infrared imaging. Even though I don't have the time to start this petition, I would offer to help put some serious reliability research behind it.
So you could donate defective hardware for inspection, as mine is still in warranty.

A few years back I had a good RMA experience with my Canon camera that died in warm humidity. After some research on the net I found the policy that all models would be replaced for exactly this failure, no matter when it occurs, as it was a design error (wrong material for the CCD attach in that case).

So please, people with thermally unstable Snapdragon devices, STAND UP and ask HTC to handle these mistakes seriously. They should replace devices even after the warranty expires, if only they would admit it was their design flaw...

As for myself, I will probably try to stress my repaired HD2 in order to get this failure again, and then I can opt for exchanging the device. But then, what device to buy? For the dual cores this might be even worse. Suppliers do not have long enough life cycles for their products to really do a good redesign.

Keep it up
 
Nov 2, 2008
28
0
London
I had 7 vibrations while on the plane, just switched on, play solitaire 5min, then reset-7vibrations, took battery out, start it up, same 7vibrations, about five times same cycle.

Then i thought, ok I am done now-it's a brick, as PC's have BIOS which can tell you by beeping what the hell the problem is. In this case, no idea... Then i found this thread.

@ facdemol I might try only a heatsink - reheating cpu not needed, right? but i will wait for insurance exchange of my phone...
 

harrrysekhon

New member
Sep 9, 2010
3
0
Sir, first of all I want to thank you for this excellent post. Can you suggest some other material, easier to find than mica foil pads, that may be available? I have the same problem with my HD2, and its warranty has expired.
 

motoi_bogdan

Senior Member
Sep 8, 2007
319
337
The mica crystal pads should be available at any electronic components store. If you can't find any, you could try to substitute them with any other material made for a similar purpose. Use only thermal pads made for heat dissipation from semiconductors (transistors mostly). However, from what I know or can test, the mica ones are superior to other designs or materials.
Also, good quality thermal paste is a MUST. Cheap paste tends to dry out or lose effectiveness over time.

@ profahmad - yes, the back of the LCD unit is metallic. It was not intended to provide heat dissipation, nor is it in direct contact with the heat-producing components, but it takes some of the heat and spreads it over its surface. What I did was to deliberately use this piece of metal, along with the materials I used for the "heat sink", in order to get better thermal dissipation. The HD2 is built on the "edge": as you can see, if the display unit is removed or improperly mounted, losing even the small cooling effect it had on the board is enough to provoke some of the thermal issues.

@januszgorlewski reheating is very risky without solid previous experience. Simply reheating the CPU didn't solve the problem for me, it only improved it a bit. The new heat sink did the trick, so I suspect you can skip reheating without much loss in effectiveness. However, I would need to experiment with more devices in order to know for sure the effect of each stage of my experiment.

@sqeeza yes, a petition could be filed. However, there are 20-30 topics in this area about the HD2 freezing or restarting, and most people don't know there is a thermal problem behind these events. If we publicize the problem and its cause to these people, they could run some simple tests to determine whether their phones are also suffering from it.
 

Top Liked Posts

  • 61
    UPDATE -

    As of June 14. 2011 it appears the only viable way to solve this problem is to reheat the cpu with specialized equipment. I am currently testing if this can be done in a standard gas/electric oven (NOT MICROWAVE !!). If successful this method will solve the problem and the following guide (cpu cooling system) may help prevent it from happening again. Only building this cooling system is not enough and will not solve your problems.


    Hello dear members of XDA

    First of all please excuse my english, I will try to explain myself as well as I can. It will be a long post, it could be boring, it could be scary or whatever you like but bear with me on this one.

    So.. got myself a HTC (T-Mobile) HD2. A bit late in the game, but hell no, still a good phone. I've dreamed of having one but couldn't afford. Anyway finally i got one. A second hand broken one, damn it. :p
    As i found out the problem is pretty common: damn thing restart itself - thermally related - the old CPU overheat problem. By searching the net I found out that it's pretty common with some HTC models. HD2 has it, Desire has it, Nexus One has it, hell even some xperia models have it.. about half of the devices powered by anything from the Snapdragon series could have it.
    The problem could be easely described as : phone hanging, restarts, the dreaded 7-8 short vibrate sequence - phone locked etc.
    Mine was worst then i've seen on the forums or with other people. It locked itself for just about every reason i could get. Taking pictures, browsing the menu, using gps, the browser, 3g or wifi, watching a movie ... all concluded with restarts or lock-ups after some couple of minutes. I've found out that keeping the phone at 4-5 degrees celsius would solve my problems in most cases, but anything above 10-15 degrees would make the thing go crazy.

    Well, I'm pasionate about electronics, development in this area, trying to solve problems and things like that. Also experienced in heat and semiconductor related problems. I also had one macbook air that suffered from core shutdown because of overheating (also a well known problem for MBA rev 1.0) and managed to design an alternate cooling system that solved the problem. So i gave it a shot, i know there are many users that have similar problems and altrough i don't suggest them explicitly to make this hacks to their phones.. this is one way to solve the problem if you buy your unit second hand or don't have some form of warranty.

    So here we go.

    Big fat warning!!! Don't attempt these things with your phones unless you are familiar with the concepts or the tools involved in the process. Also, there is a real risk to permanently damage your phone. Not just real.. but big if you get something wrong.

    First step is to run some simple tests to determine the cause of the problem or the range it extends to.
    So, I used a multimeter with a K type thermal probe to measure the temperature of various components of the phone during intensive use.



    this is the back of the mainboard of my HD2. If you notice, HTC placed a blue-ish thermal pad over one metallic shield covering the back components. I don't know what's the purpose as the back casing in that area is made of plastic - no heat dissipation, or a bad one. Anyway that's a good place to place my probe. Some tape holded the probe in position. Because we don't have perfect mechanical contact between the probe tip and the casing or chips i expect +1 or +2 degrees celsius to be added to each measurement i will later describe.



    i now placed the battery over the back of the phone and secured it with some other tape and some toothpicks :D



    we're at 19.3 degrees. That's were we'll going to start from.

    there's a usefull little app that allows users to overclock or stress test their phones cpu. Found it here on XDA, i'll use it for some heat making purposes.



    as you can see.. we're already at 25.8 degrees, after 5 minutes of testing.. not to mention the actual heat making primary suspect - qualcomm chipset is on the other side. At 29.5 degrees at this point.. the phone locked itself. I reapeted the experiment 2 more times - got exactly the same result.. at least the readings were consistent.

    Ok, i then removed the motherboard to take some readings from the actual CPU.



    same procedure.. next readings. - at around 33.4 - 34.2 degrees (varies) on the CPU itself the phone will either restart or lock itself up. So you see how serious my problem is. Summer will come so I won't be able to use my phone... :p
    Measures have to be taken.

    Let's make a small introduction about heat related to semiconductors.

    Well, simply put, a conductor (semiconductors act the same way) generates some amount of heat when an electric current passes through it. This is because the electrons moving along the conductor (in simple terms, that's the definition of an electric current) will occasionally collide with the atoms of the material they are passing through. In the collision the electron loses some amount of energy. That energy is heat. Also, heat itself can be described at the atomic level as the intensification of the naturally occurring Brownian movement of atoms. If they move a lot, if they are more agitated, there is more heat. If they are more agitated, they are more likely to be hit by passing electrons. So a hot conductor is likely to get even hotter because of that. There is a point where the heat generated makes the conductor's atoms prone to more hits from passing electrons, in something like a geometric progression. That's called thermal runaway. It tends to destroy electronics, which overheat, melt or burn themselves up.
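
    The feedback loop described above can be sketched as a toy simulation in Python. This is only an illustration under loud assumptions, not a model of the HD2: the chip is treated as a single lump with heat capacity `c_j_per_c`, its power rises linearly with temperature by a factor `k_per_c`, and it loses heat through a thermal resistance `r_c_per_w`. All parameter names and numbers are invented for the example; only the 19.3 degree ambient comes from the measurements in this post. In this simple model the temperature settles when `k_per_c * p0_w * r_c_per_w` is below 1 and runs away when it is above 1, which is one way to see why a better heat sink (lower thermal resistance) could matter.

```python
def simulate(p0_w, k_per_c, r_c_per_w, c_j_per_c=5.0, t_amb_c=19.3,
             dt_s=1.0, steps=3600, t_limit_c=150.0):
    """Toy lumped thermal model, integrated with simple Euler steps.

    Returns (final_temperature_c, ran_away). All values are illustrative
    assumptions, not measurements from a real phone.
    """
    t = t_amb_c
    for _ in range(steps):
        # Heat generated grows with temperature (the positive feedback).
        power = p0_w * (1.0 + k_per_c * (t - t_amb_c))
        # Heat removed grows with the difference to ambient.
        cooling = (t - t_amb_c) / r_c_per_w
        t += dt_s * (power - cooling) / c_j_per_c
        if t >= t_limit_c:
            return t, True   # crossed the arbitrary "destroyed" limit
    return t, False


if __name__ == "__main__":
    # Good heat path (low thermal resistance): temperature levels off.
    print(simulate(p0_w=1.0, k_per_c=0.02, r_c_per_w=20.0))
    # Poor heat path (high thermal resistance): temperature keeps climbing.
    print(simulate(p0_w=1.0, k_per_c=0.02, r_c_per_w=60.0))
```

    With the low resistance the temperature approaches `t_amb_c + p0_w * r_c_per_w / (1 - k_per_c * p0_w * r_c_per_w)`; with the high resistance there is no such steady value and the loop exits at the limit.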
    Back to our phones now. The CPU produces heat, because of the same effect described above. The heat in this case will either melt or break the small "balls" that make up the BGA matrix these chips are mounted on. The small balls will either melt (extreme cases) or expand with increasing temperature. However, it seems most of the new processors used by HTC are set in some epoxy resin whose expansion point and melting point are both higher than those of the flux and solder compound used to attach those chips. So the chip itself will tend to stay fixed in a particular position, unable to expand or contract with temperature variations, while the balls in the BGA matrix underneath it will contract or expand with these variations. This could lead to a case where at least one of those balls (a couple of hundred in total) becomes "loose" or out of position, breaking the electrical contact it should have made. Hence our problems. At first, large amounts of heat must be applied in order to actually break the bond between the CPU and the board, but once broken, the tiny links are very sensitive to temperature variations and will expand or contract freely.
    Most users notice that at its core the problem seems related to overheating in the beginning, but over time its effects are degenerative: phones seem to restart for no apparent reason. It's still overheating, but things get worse and worse as the chip and its connections become more sensitive to temperature variations. Thus, even small variations now produce these problems: my CPU restarts at 34 degrees.. that sucks.

    So, my only option was to reheat the CPU in an attempt to partially melt the broken "balls" in the BGA matrix so that hopefully.. I repeat, HOPEFULLY, they remake contact with the mainboard. A reball of this chip is not possible, as the resin HTC placed around it doesn't melt at the normal temperature at which I could remove the chip, and heating it to even higher temperatures would risk killing the CPU long before the resin melts. Strange move by HTC to build things like this.

    Anyway.. here goes nothing..


    I've placed the usual aluminium foil designed to protect surrounding components by the heat generated by the rework station and the hot air used to heat up the CPU.



    I preheated the CPU for about 10 minutes, from both sides of the board, then switched to heating it at 360 degrees. After it was heated, I applied even pressure on top of it in order to tighten the space between it and the board, just a little bit. THIS IS VERY RISKY. Normally it is not recommended because of the risk of damaging the BGA. In this case the resin would prevent me from moving the chip too much, so it's less risky. Not safe.. but less risky. :p

    I let the board cool on its own for half an hour and repeated the temperature monitoring tests.
    Now the maximum temperature before a restart had increased from 34 degrees on the CPU to about 42. It's not much, but it's a start. However, above this temperature.. the phone would still lock up or restart.
    I went for another round of reheating with the hot air station. After this, I got slightly better results, some 2-3 degrees more. My lucky break came when I suspected thermal runaway in the CPU. So I tried to make some sort of heat sink for that chip using some mica foils for TO220 can transistors, some thermal grease and a bunch of aluminum and copper foils. My theory was that heat generation will eventually accelerate above a specific level, the point from which thermal runaway occurs. In my initial tests, even after the phone locked itself and I manually restarted it (battery out - in), the temperature continued to increase even faster, although the phone wasn't doing anything intensive.
    The role of my "heat sink" would be to dissipate more heat more rapidly and, in some manner, to press the CPU against the board.
    After I placed the mica foils directly above the CPU with thermal grease above and below them, I mounted the metal shield back over that area. On it, I placed some more silicone paste and some thick copper foil (taken from some broken laptops I have over here). It looks ugly but.. worth a shot:



    after that i begin making the rest of the heat sink using aluminum foil. I folded about 12 layers, between each of them having placed... more thermal grease and at the 6-7 layer another round of mica crystal foil.

    Here's the aluminum foil



    I then pressed the foils very hard between two flat surfaces in order to remove the excess thermal grease.
    I "anodized" the first layer (the one in contact with the cpu shielding) with some ferric chloride. Before that, the board looked like this:



    After the logic board was mounted back, i remade all the connections and after some preliminary tests, mounted the phone back together. It now looks like this



    I only have to re-attach the serial no. and imei, plastic sticker.

    Of course I then ran tests. I heated up the phone with a hair dryer to simulate a hot summer day. About 40 degrees, just to be sure. I then ran CPU stress tests and a full DivX movie (impossible in the past). In preliminary testing, I had indications that I had avoided the thermal runaway, with the CPU now running stable at 24 degrees (19.3 in the room, ambient temperature). No more heating up by itself to about 40 degrees and then restarting.
    In the final testing, with the phone put together, I heated it with the hair dryer and reached 40 degrees. I started it and ran stress tests. No more lock-ups or restarts, not even a single one. However, with the phone put together I can't measure the temperature of the components inside. As far as I can feel, it gets warmer and heats up to some degree, but the heat is now spread all over its surface. For some reason it doesn't restart anymore.

    I then tried, cpu stress test, wlan connection, pc connection and browsing the net all at the same time. NO RESTART :D I watched a full 1.30 hour movie at max playable quality, the phone was really hot (43-44 degree at it's surface) but still no problems.
    It appears that for the moment i saved the phone. However, future behavior is still to be determined.

    I'll get back with more testing, in the following days and eventually i hope to devise a general method for building heat sinks for phones (yeaah, ridiculous....) using combinations of metal and thermal conductive cristals. The ideea is to find out if reheating the chip by hot air station can be avoided (this involves the most risk). But the start is promising. By the time warranties will expire and phones like the new droids or winmo 7's start to break from thermal problems, maybe i'll have some sort of a more user friendly solution.

    EDIT JUNE 04.2011
    Since I have a dead HD2 motherboard here, I tried to remove the CPU to expose the BGA soldering. Just for fun, with no chance of a BGA reball, as there aren't any tools available for this particular chip. The resin prevents a proper removal: at about 450 degrees Celsius it was still fairly hard, so I had to remove the chip by force and broke some of the BGA. The chip is very thin, kind of like a microSD card. It heats up quickly; the solder points underneath it melted in about 2-3 min at 370 degrees Celsius.
    Here's how it looks.
    This is the motherboard without the chip. The BGA matrix is broken, some balls were simply ripped out when i forcefully removed the chip.



    This is the actual chip compared with a mini sd and and standard sd card.



    ...and this is the underside of the chip. belive it or not, the chip is actually alive and it's pins are ok. It cannot be used because it cannot be properly soldered to a board. Guess i'm gonna punch a hole through it and use it at my key chain, along with a laptop cpu already there ;)



    In the following days i will experiment with the solder points&materials in order to try to produce a more safer method to reheat future boards with thermal problems. It seems this board died because of overheating and a short circuit made over the center of the array by 3 solder balls that got in contact once they were melted.
    4
    the mica crystal pads should be available at any electronic components store. If you can't find any, you could try to substitute them with any other similar purpose material. Use only thermal pads used in electronics for semiconductor (transistors mostly) thermal dissipation. However from what i know or can test, the mica ones are superior to other designs or materials.
    Also, good quality thermal paste is a MUST. Cheap one tend to dry out or loose effectiveness over time.

    @ profahmad - yes, the back of the lcd unit is metallic. Normally it was not intended to provide heat dissipation, neither is in direct contact with the heat making components, but it takes some of the heat and spreads it over it's surface. What i did is to forcefully use this piece of metal along with the materials i used for the "heat sync" in order to facilitate better thermal dissipation. The HD2 is build on the "edge" as you can see, even if the display unit is removed or improperly mounted, the small effect in cooling the board it once had is enough now to provoke some of the thermal issues.

    @januszgorlewski reheating is very risky without solid previous experience. Simply reheating the cpu didn't solve the problem for me, it only ameliorated it a bit. The new heat sync did the trick so i suspect you can skip reheating with not much of a loss in effectiveness. However i should have experienced with more devices in order to know for sure the effects of each stage of my experiment.

    @sqeeza yes, a petition could be filed out. However, there are 20-30 topics in this area about hd2 freezing or restarting but most people don't know there is a thermal problem related with these events. If we advertise the problem and it's cause to these people they could run some simple test to determine if their phones are also suffering from this problem.
    4
    first of all heating up the qualcom chip is recomanded as a last resort option. however if you reheat it, pressing the chip to the board is VERRY dangerous, as it could permanently damage the BGA connection.

    Here's some sort of guide on doing this. You will need a screwdriver, some 4-5 mica foil pads (you can get them from any electronic component store (get them for either TO3 or TO220 casing and cut them to the size of the cpu inside hd2) some good thermal grease (arctic silver or something for pc cpu's) an aluminum sheet for you to cut a piece of it.
    * i don't recommend silicon thermal pads, use only mica crystal pads
    * you can substitute the aluminum plate with aluminum/copper foil - the first is the one used for food wrapping)
    * i don't recommend using anything beside a smd rework station (either hot air or infrared) to heat up the board. Although a heat gun can develop high temperatures, the air debit is to high (dangerous, you can blow up other components) and you will lack precise temperature control needed for this job.


    1. Disassemble the phone following HTC's official videos. Completely remove the motherboard from the phone's casing.
    2. Once you have the motherboard de-attached remove all metallic shields on both sides. Normally these prevent EM interferences from the outside to get in and mess with electric signals over the PCB. We can use them as part of the "cooling" system later.
    3. OPTIONAL - efficiency yet to be determined/great risk involved - use either a special oven (not microwave !! it WILL kill the phone!) or a smd rework station to pre-heat the mainboard. Temperature must be set at around 95-110 degrees. Board must be heated from both sides, or at least one at a time, beginning with the one opposing the cpu side. Let it preheat at least 10 minutes.
    3a. after preheating, use an aluminum foil to cover the rest of the components, anything other then the cpu itself then get to the actual heating, switching first to 250 degrees and directing the air stream on the cpu itself (using a larger nozzle for the tip of the heating gun). After 2-3 minutes of 250 degrees, swich to 340-360 degrees and heat the chip for another 5minutes. Move the heating gun around the surface of the chip and try to heat it evenly. If you have the guts and you are crazy enough use a knife with a larger blade and put the tip of the blade in the hot air stream in front of the cpu. Let it heat for a while, and also, continue heating the cpu. When the blade tip is hot enough press the chip with it , starting from the center and following each side. Apply even force on each press and try to have the blade as parallel with the chip possible. Don't press too hard, if you haven't kill the chip yet, that will kill it.
    3b. let the board to cool down on it's own and during cooling try not to move it or do anything to it.

    4. place a little amount of thermal grease on top of the cpu then place 1-2 mica foil pads (depending on thickness) over the cpu. Gently press the mica foil with one finger over the cpu. Now place more thermal grease over that mica foil and try to place the metalic shield over that area. If successfully done, the metallic shield should be in contact with the mica foil and the grease. Place back all shields on the main board.
    5. On the phone's casing, measure the back of the display and try to cut an aluminum sheet of exactly the same size. If the sheet you can find is too thick - polish it and place it in a solution of either caustic soda or ferric chloride. This will get it thinner, but you have to supervise the process as if you leave it for long, the sheet could get completely dissolved. Check the sheet on short intervals (1min) to see the progress. Always use gloves and eye protection as both substances are dangerous (never mix them, use only one of them, the one you can get or already have). Once done, you will have a thin aluminum sheet that's flexible and about 1mm thick.
    6. notice there are some ribbons connecting the display to the motherboard or other exposed metallic contacts. Before placing the aluminum sheet over the display's back, place some insulating tape over those metallic contacts to prevent any shortcircuit forming between them and the aluminum sheet. Next place the aluminum sheet over the display's back. Be careful not to damage any connector or ribbon in the process.
    7. place more thermal grease on the cpu's metallic shield and check to see if the motherboard gets in good thermal contact with the aluminum sheet you just placed over the display's back. If there is still some space between them, use another mica foil and place thermal grease on both sides of it.
    8. reassemble the phone, and make some tests to see if you get some improvements.

    One more thing, this little project of our is in a "more to be seen/tested" state. As of now... only one device was fixed by this method - mine, it could have been simple luck. I don't know yet. :p more then a week later (strange weather also, + 20 degrees outside then last time i wrote the original post) the phone still works ok. Now running 1.3Ghz overclocked with NAND Android


    @ januszgorlewski I remember the first time the phone vibrated 7 times. I didn't know about this problem, so I thought it was a WM6.5 Energy ROM feature :) .
    From what I've found so far, the steps from a properly working HD2 to these problems go something like this:

    1. Phone working OK. The mainboard (lower part of the device) heats up in some conditions, such as demanding programs, and the battery can reach about 40-45 degrees max. without problems. The phone will restart or freeze (CPU halt) in either of these situations:
    - Battery temperature exceeds 45 degrees and stays above this value for at least 5-10 minutes, long enough to trip the thermistor that measures the temperature in this area over I2C. At that point the phone stops charging and restarts or locks up. This is normal behavior.
    - CPU exceeds 60-65 degrees (exact value still to be determined; I'm trying to get access to datasheets for similar chipsets). This produces a CPU halt. Depending on what you're doing, the halt will either reset the phone or simply lock it up. After a soft reset or a spontaneous restart, the user will probably land back on the home screen with the phone still working. This is also normal behavior, related to the Qualcomm chip.

    2. Phone starts to malfunction. This condition is triggered either by a large variation in temperature (a mainboard at low temperature going quickly to full load) or simply by sustained full load. All HTC HD2 revisions have the same type of soldering in the CPU area. Visually speaking (no conclusive data yet), the first revision used a bit more epoxy resin to secure the CPU in place. In the context of overheating and thermal expansion of the solder balls, that's not a good thing. Some sort of thermal spike must occur to break the contact between CPU and motherboard. Warning: if your phone locks up and doesn't restart by itself, it's imperative that you disconnect the battery. As I measured, even with the phone locked the CPU keeps heating up, thermal runaway occurs and the temperature climbs to dangerous levels. I never let the phone do this for long, so I don't know how far it will overheat, but it does and it will. In the initial stage, only extended heavy-load use can trigger the problem. A common case is leaving the phone on in the car and using it for GPS navigation in a hot summer. If the phone restarts before reaching either 45 degrees at the battery or 60-65 degrees at CPU level (the latter is harder to measure), then you certainly have problems, and they are just beginning.

    3. Problems get worse. At this stage you may notice the 7 short vibrations at boot time if the phone is warm or kept in a warm environment. You don't have to push it very hard; it only needs to be warm. The vibration pattern is an error code produced by the Qualcomm chipset itself, not sent by the bootloader, SPL or operating system. When it happens the CPU locks up; however, file transfer (including NAND memory access, storage card access and basic operations) and other chipset functions will still work for some time. It appears that only CPU processing is halted. So if this occurs when you boot the phone, it will lock up, but if it occurs while you are flashing a ROM, you might still see the progress bar filling. The vibration pattern signals that physical damage to the Qualcomm chipset has occurred. There's no way around it: once it occurs, it will never just .. heal by itself.
    You will notice that the temperatures needed to induce a restart/lockup will decrease with time (both battery & cpu).

    4. Problem at its worst. The CPU can lock up even at 35-40 degrees (measured at CPU level). An ambient temperature of only 10-15 degrees is enough for the phone to experience problems. The CPU suddenly starts to produce lock-ups or hard faults, or simply works intermittently. The OS may give errors relating to ARM CORE failure, or fatal errors regarding the execution of certain "lines" (code lines in the OS core). At this stage the phone doesn't need to feel warm in your hands to produce these problems. This can mislead some people into no longer relating it to thermal problems and looking for the cause or the solution elsewhere. It's still related... but at its worst.

    5. Total CPU collapse. If the phone locks and stays locked on whatever screen or program it was running, then, as I've said before, it will keep overheating. If a stage 4 phone is left overheating, chances are that more of the balls connecting the chipset to the motherboard will fail. If any ball needed to initialise the chip correctly or to power it on fails, it's game over for that phone. It will simply stop working and never turn on. Other variants are that the phone will only start if placed in a freezer, or will start but never complete a boot sequence (the OS, the bootloader, or both could be unable to start).
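The five stages above can be roughly sketched as a small lookup. This is only an illustration under stated assumptions: the function name `classify_stage`, its parameters, and the exact cutoffs (45 for the battery, the lower end of the 60-65 CPU estimate, the upper end of the 35-40 range for stage 4) are picked for the example and are not measured limits. The temperatures are the ones read at the moment of a restart or lock-up.

```python
# Hypothetical helper, not a real diagnostic tool: maps what you observe
# at a restart/lock-up to the stage numbers used in the list above.

BATTERY_LIMIT_C = 45.0  # battery thermistor cutoff mentioned above
CPU_LIMIT_C = 60.0      # lower end of the 60-65 estimate (exact value unknown)
STAGE4_CPU_C = 40.0     # upper end of the 35-40 range given for stage 4


def classify_stage(boots, seven_vibrations, cpu_temp_c, battery_temp_c):
    """Return the stage (1-5) that best matches a fault observation."""
    if not boots:
        return 5  # never turns on or never completes a boot
    if cpu_temp_c <= STAGE4_CPU_C:
        return 4  # faults although the CPU is barely warm
    if seven_vibrations:
        return 3  # chipset error code at boot
    if battery_temp_c >= BATTERY_LIMIT_C or cpu_temp_c >= CPU_LIMIT_C:
        return 1  # normal protective restart
    return 2      # restart below both normal limits


if __name__ == "__main__":
    # Restart at 50 degrees CPU / 38 degrees battery, no vibration code.
    print(classify_stage(True, False, 50.0, 38.0))
```

The checks run from the worst stage down, so a phone that shows several symptoms at once is reported at the most severe stage it matches.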
    Yep.. more than 2 weeks have passed and, after I completed all the tests I could think of, the phone still works OK.
    About 22-25 ROMs were flashed (WP7, WM6.5, Android, Ubuntu), and the phone was either used normally or heated with a hair dryer. At about 30 degrees ambient room temperature, I ran some 720p tests and managed to play sample videos until the battery died, then replayed the videos while charging (charging adds more heat too).
    In those 2 weeks I had only 2 restarts, both in WP7 (can't remember which ROM version did that) and both while I was setting up the phone after the phone update. The phone was cold, however. I didn't manage to produce any more restarts, either when the phone heated up or when I tried running intensive apps on it. I guess it was software related.
    So.. I guess this problem is over.