[COMMIT] [AOSP] [LOLLIPOP] JustArchi's ArchiDroid Optimizations - Unleash the power!

18th May 2014, 02:38 AM |#1  
JustArchi (OP, Recognized Contributor / Recognized Developer)
Hello dear developers.

I'd like to share with you the result of nearly 200 hours spent trying to optimize Android and push it to its limits.

In general, you should already be experienced in setting up your buildbox, using git, building AOSP/CyanogenMod/OmniROM from source and cherry-picking things from review/gerrit. If you don't know how to build your own ROM from source, this is not something you can apply to your ROM. Also, as you have probably noticed, this cannot be applied to every ROM: these optimizations are applied during compilation, so only AOSP-based ROMs that are self-compiled from source can use this masterpiece.

So, what is it about? As we know, Android contains a bunch of low-level C/C++ code, which is compiled and acts as a backend for our Java frontend and Android apps. Unfortunately, Google didn't put much effort into optimization, so as a result we're using the same old flags set back in 2006 for Android Donut or whatever existed back then. As you can guess, in 2006 we didn't have devices as powerful as today's, so we had to sacrifice performance for smaller code size, to fit on our little devices and run well with a very small amount of memory. However, this is no longer the case, and by using newer compilers such as GCC 4.8 and setting the flags properly, we can achieve something which I call "Android in 2014".

You may have heard of some developers claiming to use "O3 Flags" in their ROMs. Well, while this may be true, they've applied them only to low-level ARM code, mostly used during kernel compilation. Additionally, it only overrides the O2 flag, which is already fast, so as you may guess, this is more likely a placebo effect and disappears right after you change the kernel. Take a look at the most cherry-picked "O3 Flags commit". See the big "-Os" in "TARGET_thumb_CFLAGS"? This is what I'm talking about.

However, the commit I'm about to present is not a placebo, as it applies flags to everything that is compiled and, most importantly, to target THUMB, which is about 90% of Android.

Now I'll tell you some facts. We have three interesting optimization levels: Os, O2 and O3. O2 enables all optimizations that do not involve a space-speed tradeoff. Os is similar to O2, but it disables all flags that increase code size. It also performs further optimizations to reduce code size to the minimum. O3 enables all O2 optimizations plus some extra optimizations that tend to increase code size and may, or may not, increase performance. If you want to ask whether there's something more, like O4, there is: Ofast. However, it breaks the IEEE standard and doesn't work with Android, as e.g. sqlite3 is not compatible with Ofast's -ffast-math flag. So it's a no-go for us.

Now here comes the fun part. Android by default is compiled with the O2 flag for target ARM (about 10% of Android, mostly kernel) and the Os flag for target THUMB (about 90% of Android, nearly everything apart from the kernel). Some guys think that Os is better than O2 or O3 because smaller code is faster due to fitting better in the CPU cache. Unfortunately, I have proven that this is a myth. Os was good back in 2006, as only with this flag was Google able to compile Dalvik and its virtual machine while keeping a good amount of free memory and space on eMMC cards. As of now, we have plenty of space, plenty of RAM, plenty of CPU power and still the good old Os flag for 90% of Android.
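To see which levels your own tree uses right now, a minimal sketch like the one below can help. It reads makefile text and prints the -O flag found in each TARGET_arm_CFLAGS / TARGET_thumb_CFLAGS assignment, including backslash-continued lines. The function name print_opt_levels is made up for illustration, and the file you feed it is an assumption: in CyanogenMod-era trees these variables usually sit in build/core/combo/TARGET_linux-arm.mk, but check your own source.

```shell
# print_opt_levels: read makefile text on stdin and print "<variable> <-O flag>"
# for every TARGET_arm_CFLAGS / TARGET_thumb_CFLAGS assignment that carries one.
# Hypothetical helper, not part of the Android build system.
print_opt_levels() {
    awk '
        # A new assignment starts: remember which variable it belongs to.
        /^[ \t]*TARGET_(arm|thumb)_CFLAGS[ \t]*[:+]?=/ {
            var = $1
            sub(/[:+]?=.*/, "", var)
        }
        # Inside an assignment (continued lines included): report -O flags.
        var != "" {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^-O/) print var, $i
            # The assignment ends on the first line without a trailing backslash.
            if ($0 !~ /\\[ \t]*$/) var = ""
        }
    '
}
```

Typical use, from the source root: `print_opt_levels < build/core/combo/TARGET_linux-arm.mk` (path assumed, see above).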

Now you should ask: where is your proof? Here I have it for you:

[Image: whetstone.c benchmark results for Os, O2 and O3]

As you may have noticed, I compiled the whetstone.c benchmark using three different optimization flags: Os, O2 and O3. I repeated each test two additional times, just to make sure that Android doesn't lie to me. The source code of this test is available here and you may download it, compile it for our beloved Android and try it yourself. As you can see, O3 > O2 >> Os: Os performs about 2.5x worse than O2, and about 3.0x worse than O3.
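If you want to repeat the comparison, a sketch along these lines builds one source file at each of the three levels. It assumes you saved the benchmark as a single C file (e.g. whetstone.c) that needs the math library; build_levels is a made-up helper name, and CC defaults to the host gcc, so point it at your cross compiler if you want binaries for the device.

```shell
# build_levels SRC: compile SRC once per optimization level (-Os, -O2, -O3)
# and print the size of each resulting binary.
# Hypothetical helper; CC defaults to the host gcc (an assumption).
build_levels() {
    src="$1"
    cc="${CC:-gcc}"
    for level in Os O2 O3; do
        out="${src%.c}-$level"            # e.g. whetstone-O3
        "$cc" "-$level" -o "$out" "$src" -lm || return 1
        printf '%s %s bytes\n' "-$level" "$(wc -c < "$out" | tr -d ' ')"
    done
}
```

Run it as `build_levels whetstone.c`, then start each binary a few times and compare the scores it reports, the same way the test above was repeated.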

But of course, Android is not a freaking benchmark, it's an operating system. We can't tell whether things are getting better or worse from a simple benchmark. I kept that in mind and provided the community with JustArchi's Mysterious Builds for testing. I gave out both mysterious builds and didn't tell anyone what the mysterious change was. Both builds were compiled with the same toolchain, same version, same commits. The one and only mysterious change was that every component compiled as target thumb (the major portion of Android) was optimized for speed (O3) in build #1 (experimental) and optimized for size (Os) in build #2 (normal behaviour). Check the poll yourself: 9 votes for build 1 in terms of performance, and 1 vote for build 2. I decided that this and the benchmark are enough to tell that O2/O3 for target thumb is something that we want.

Now the battle is: O2 or O3? This is a tough choice. Here are some facts:
1. The kernel compiled with O2 is 4902 KB, with O3 4944 KB, so O3 is 42 KB bigger.
2. The ROM compiled with O3 is 3 MB larger than with O2 after zip compression. Quick overview: 97 binaries in /system/bin and 2 binaries in /system/xbin + 283 libraries in /system/lib and other files, about 400 files in total. 3 MB / 400 = 7.5 KB size increase per file.
3. No issues encountered.
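The arithmetic behind points 1 and 2 is easy to re-run with your own numbers. This sketch only uses the figures quoted above and, like the post, counts 1 MB as 1000 KB; size_report is a made-up name.

```shell
# size_report: redo the O2-vs-O3 size arithmetic from the list above.
# Hypothetical helper; swap in the numbers from your own builds.
size_report() {
    awk 'BEGIN {
        kernel_o2_kb = 4902          # kernel size with -O2
        kernel_o3_kb = 4944          # kernel size with -O3
        rom_growth_kb = 3 * 1000     # 3 MB, counting 1 MB as 1000 KB
        files = 400                  # rough number of affected files

        delta = kernel_o3_kb - kernel_o2_kb
        printf "kernel growth: %d KB (%.2f%%)\n", delta, delta * 100 / kernel_o2_kb
        printf "average growth per file: %.1f KB\n", rom_growth_kb / files
    }'
}

size_report
```

With the figures from the post this reports a 42 KB kernel difference, which is under 1% of the kernel size, and 7.5 KB per file.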

In general, I doubt that this extra chunk of code can cause any significant memory usage or slower performance. I suggest using O3 if it doesn't cause you any issues compared to O2, but older devices may use O2 purely to save on code size, similar to the way Google did it back in 2006 with the Os flag.

Now let's get down to business.

Here is a list of important improvements:
Quote:

- Optimized all instructions, both ARM and THUMB, even further for speed (-O3)
- Also optimized for speed the parts which are compiled with Clang (-O3)
- Turned off all debugging code (lack of -g)
- Eliminated redundant loads that come after stores to the same memory location, both partial and full redundancies (-fgcse-las)
- Ran a store motion pass after global common subexpression elimination. This pass attempts to move stores out of loops (-fgcse-sm)
- Performed interprocedural pointer analysis and interprocedural modification and reference analysis (-fipa-pta)
- Performed induction variable optimizations (strength reduction, induction variable merging and induction variable elimination) on trees (-fivopts)
- Didn't keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up and restore frame pointers; it also makes an extra register available in many functions (-fomit-frame-pointer)
- Attempted to avoid false dependencies in scheduled code by making use of registers left over after register allocation. This optimization most benefits processors with lots of registers (-frename-registers)
- Tried to reduce the number of symbolic address calculations by using shared “anchor” symbols to address nearby objects. This transformation can help to reduce the number of GOT entries and GOT accesses on some targets (-fsection-anchors)
- Performed tail duplication to enlarge superblock size. This transformation simplifies the control flow of the function allowing other optimizations to do a better job (-ftracer)
- Performed loop invariant motion on trees. It also moved operands of conditions that are invariant out of the loop, so that we can use just trivial invariantness analysis in loop unswitching. The pass also includes store motion (-ftree-loop-im)
- Created a canonical counter for number of iterations in loops for which determining number of iterations requires complicated analysis. Later optimizations then may determine the number easily (-ftree-loop-ivcanon)
- Assumed that loop indices do not overflow, and that loops with nontrivial exit condition are not infinite. This enables a wider range of loop optimizations even if the loop optimizer itself cannot prove that these assumptions are valid (-funsafe-loop-optimizations)
- Moved branches with loop invariant conditions out of the loop (-funswitch-loops)
- Constructed webs as commonly used for register allocation purposes and assigned each web individual pseudo register. This allows the register allocation pass to operate on pseudos directly, but also strengthens several other optimization passes, such as CSE, loop optimizer and trivial dead code remover (-fweb)
- Sorted the common symbols by alignment in descending order. This is to prevent gaps between symbols due to alignment constraints (-Wl,--sort-common)
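A given toolchain may not accept every flag in this list, so before cherry-picking it can be worth asking your compiler directly. The sketch below is not part of the commit; check_flags is a made-up helper that tries each flag on an empty input and reports whether the compiler named in CC (host gcc by default, which is an assumption) takes it.

```shell
# check_flags FLAG...: report whether the compiler in $CC accepts each flag.
# Hypothetical helper. Each flag is tried on an empty C input in syntax-only
# mode; -Werror turns "unsupported option" warnings into failures.
check_flags() {
    cc="${CC:-gcc}"
    for flag in "$@"; do
        if "$cc" -Werror "$flag" -fsyntax-only -x c /dev/null >/dev/null 2>&1; then
            printf '%s %s\n' ok "$flag"
        else
            printf '%s %s\n' unsupported "$flag"
        fi
    done
}
```

For example: `CC=/path/to/arm-linux-androideabi-gcc check_flags -fgcse-las -fgcse-sm -fipa-pta -ftracer -fweb`. This only covers compiler flags; a linker option such as -Wl,--sort-common is not exercised by a syntax-only run.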

Looks badass? It is badass. Head over to my ArchiDroid project and see for yourself how people react after switching to my ROM. Take a look at just one small example, or another one. No bullsh*t guys, this is the future.

However, please read my commit carefully before you decide to cherry-pick it. You must understand that Google's flags haven't been touched for 7 years and nobody can assure you that the new ones will work properly for your ROM and your device. You may want to experiment with them a bit to find out whether they cause conflicts or other issues.

I can assure you that my ArchiDroid based on CM compiles fine with some fixes mentioned in the commit itself. Just don't forget to clean ccache (rm -rf /home/youruser/.ccache or rm -rf /root/.ccache) and make clean/clobber.

You can use, modify and share my commit any way you want, just please keep proper credits in changelogs and in the repo itself. If you feel generous, you may also buy me a coke for the massive amount of hours put into those experiments.

Now go ahead and show your users how things should be done.

Cherry-picking time!

Android "Lollipop" (5.1.1 & 5.0.2 tested)
JustArchi's ArchiDroid Optimizations V4 for CyanogenMod

Older entries are provided for reference only. I suggest using only the latest commit above.

Android "KitKat" 4.4.4:
JustArchi's ArchiDroid Optimizations V3 for CyanogenMod
JustArchi's ArchiDroid Optimizations V3 for OmniROM
JustArchi's ArchiDroid Optimizations V2
JustArchi's ArchiDroid Optimizations V1

AFTER applying above commit and AFTER EVERY CHANGE regarding flags, ALWAYS make clean/clobber AND empty ccache (rm -rf ~/.ccache)


Troubleshooting

Q: How do I properly change the toolchains used in the local manifest?
Open .repo/local_manifests/roomservice.xml from your source root directory (or create one). Here is a sample manifest that replaces the default 4.8 toolchains (both eabi and androideabi) with 4.8 SaberMod and 4.9 ArchiToolchain:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
    <!-- Replace the default androideabi 4.8 toolchain with SaberMod 4.8 -->
    <remove-project name="platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.8" />
    <project name="ArchiDroid/Toolchain" path="prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.8" remote="github" revision="sabermod-4.8-arm-linux-androideabi" />
    <!-- Replace the default eabi 4.8 toolchain with ArchiToolchain 4.9 -->
    <remove-project name="platform/prebuilts/gcc/linux-x86/arm/arm-eabi-4.8" />
    <project name="ArchiDroid/Toolchain" path="prebuilts/gcc/linux-x86/arm/arm-eabi-4.8" remote="github" revision="architoolchain-4.9-arm-linux-gnueabi-generic" />
</manifest>
This is only an example; you should use the toolchains that suit you best.
Your repos should appear in the specified "path" after the next repo sync. You can then confirm that they are in fact the right ones by going to the specified path and using a tool such as git log to compare commits.
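The git log check can be scripted. This is a minimal sketch, assuming the manifest keeps one element per line with path= before revision= (true for the sample, not guaranteed in general) and that you run it from the source root; both function names are made up.

```shell
# list_manifest_projects FILE: print "path revision" for every <project> line.
# Hypothetical helper; relies on path= appearing before revision= on one line.
list_manifest_projects() {
    sed -n 's/.*<project .*path="\([^"]*\)".*revision="\([^"]*\)".*/\1 \2/p' "$1"
}

# show_synced_heads FILE: for each listed path, print the newest commit that
# repo sync checked out, next to the revision the manifest asked for.
show_synced_heads() {
    list_manifest_projects "$1" | while read -r path revision; do
        printf '%s (manifest revision: %s)\n' "$path" "$revision"
        git -C "$path" log -1 --format='  %h %s'
    done
}
```

Call it as `show_synced_heads .repo/local_manifests/roomservice.xml` and compare the printed commits with the head of the matching branch in the toolchain repository.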


Q: Compiler error:
Code:
(...)/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.8/bin/../libexec/gcc/arm-linux-androideabi/4.8.x-sabermod/cc1: error while loading shared libraries: libcloog-isl.so.4: cannot open shared object file: No such file or directory
This error can be fixed by installing the missing library. libcloog-isl.so.4 is provided by the libcloog-isl4 package, so on Debian-like OSes you should be able to fix it with:
Code:
apt-get install libcloog-isl4
Q: Compiler error:
Code:
(...)/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.8/bin/../libexec/gcc/arm-linux-androideabi/4.8.x-sabermod/cc1: error while loading shared libraries: libisl.so.13: cannot open shared object file: No such file or directory
This error is very similar to the one above, but concerns another shared library. libisl.so.13 is provided by the libisl13 package. The problem is that this package is experimental and doesn't exist in Debian yet, so we'll need to install it from the experimental repo.

Add the following entries to your /etc/apt/sources.list:
Code:
deb http://ftp.debian.org/debian experimental main contrib non-free
deb-src http://ftp.debian.org/debian experimental main contrib non-free
Then apt-get update && apt-get install libisl13.
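To find out in one go which libraries a prebuilt toolchain binary is missing, you can ask ldd instead of waiting for the next build failure. A minimal sketch with made-up function names; pass it the cc1 path printed in your own error message.

```shell
# missing_libs_from_ldd: read `ldd` output on stdin and print the name of
# every library that ldd reports as "not found". Hypothetical helper.
missing_libs_from_ldd() {
    awk '/not found/ { print $1 }'
}

# missing_libs BINARY: list the shared libraries BINARY needs but the host
# cannot resolve (for example libcloog-isl.so.4 or libisl.so.13).
missing_libs() {
    ldd "$1" | missing_libs_from_ldd
}
```

Each name it prints maps to a package to install, as with libcloog-isl4 and libisl13 above. An empty result means the loader can resolve everything that binary needs.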


Issues below are for older commits and should be used for reference only


KitKat THUMB O2+ errors?
These are the most common issues.
* Change the -O3 flag in TARGET_thumb_CFLAGS back to -Os, make clean/clobber, empty ccache and try again. This fixes most of the issues.
* RIL problems on the Exynos 4210 family? Add -fno-tree-vectorize to TARGET_thumb_CFLAGS.
* Broken exFAT -> https://github.com/JustArchi/android...4bffccee650e0d

Errors caused by toolchain?
1. Try Google's GCC 4.8 if you used Linaro 4.8 or SaberMod 4.8
2. Fall back to Google's GCC 4.7 if the above didn't help (change TARGET_GCC_VERSION back to 4.7)

Errors caused by GCC 4.8+?
* ART Fix (bootloop) -> https://github.com/JustArchi/android...4443998d028407
* Not booting kernel -> https://github.com/JustArchi/android...e4bfb3cff64de9 and https://github.com/JustArchi/android...90528feb0c9bdd

Errors caused by GCC 4.9+?
* Graphical glitches in PlayStore -> https://github.com/JustArchi/android...d57f3982191662

Errors caused by Linaro?
* error: unknown CPU architecture -> https://github.com/JustArchi/android...85174baacecb03 (Keep in mind that this is a sample fix for the smdk4412 kernel; you may need a similar solution in your own case. Also, this error happens only with the Linaro toolchain, not with Google's GCC)

Other errors?
* error: undefined reference to 'memmove' -> https://github.com/XperiaSTE/android...2d8219c1e6807a


Credits
@IAmTheOneTheyCallNeo - For inspiration and first steps
@metalspring - For some nice commits
@sparksco - For SaberMod, some nice commits and support for the optimization idea
Last edited by JustArchi; 2nd May 2015 at 07:51 PM.

18th May 2014, 02:55 AM |#2
ΠΣΘ (Forum Moderator / Recognized Developer)
Bravo! I see flags in here that I have yet to try. Thanks for all your work and dedication to this effort.
Going to do a comparison this week against my "neofy" initiative

Thanks!

-Neo

18th May 2014, 03:51 AM |#3
Boss442 (Senior Member)
This is amazing man!

Sent from my Moto G using Tapatalk

18th May 2014, 06:25 AM |#4
DexedrineXR (Senior Member)
Damn... Wish I had an international GS3. I'd be on SlimKat, find your thread and be like:

[image]

I'm gonna be 20 years old in about an hour... Should I ask?

18th May 2014, 03:13 PM |#5
marcomarinho (Recognized Contributor)
Great work

I saw you already had a custom 4.10 Linaro. If I use it, do I need to change anything in your commit?

18th May 2014, 03:23 PM |#6
matrixzone (Senior Member)
Thanks

Sent from my SAMSUNG-SGH-I747 using Tapatalk

18th May 2014, 03:28 PM |#7
JustArchi (OP, Recognized Contributor / Recognized Developer)
Quote:
Originally Posted by _MarcoMarinho_

Great work

I saw you already had a custom 4.10 Linaro. If I use it, do I need to change anything in your commit?

It won't boot; stlport segfaults when using GCC 4.9+. If you still want to use it, sure, change TARGET_GCC_VERSION and include the proper toolchain.

18th May 2014, 03:38 PM |#8
marcomarinho (Recognized Contributor)
Quote:
Originally Posted by JustArchi

It won't boot; stlport segfaults when using GCC 4.9+. If you still want to use it, sure, change TARGET_GCC_VERSION and include the proper toolchain.

Is this the best toolchain [stability/performance] to compile a ROM?
https://github.com/JustArchi/Linaro/....8-androideabi

18th May 2014, 04:08 PM |#9
JustArchi (OP, Recognized Contributor / Recognized Developer)
Quote:
Originally Posted by _MarcoMarinho_

Is this the best toolchain [stability/performance] to compile a ROM?
https://github.com/JustArchi/Linaro/....8-androideabi

Yes.

18th May 2014, 04:36 PM |#10
marcomarinho (Recognized Contributor)
Quote:
Originally Posted by JustArchi

Yes.

Just one last question. Do I need to configure envsetup.sh to the custom toolchain directory?