[R&D] MYTHBUSTER Optimized compiler toolchains

Ezekeel

Retired Recognized Developer
Jun 21, 2011
715
1,680
There are several different cross-compiler toolchains (TCs) available for building the Android Linux kernel. There is the official TC released by Google as part of the Android NDK (http://developer.android.com/sdk/ndk/index.html), as well as several TCs released by third parties which claim to be more optimized for the target platform. The most popular of these third-party TCs is the one from CodeSourcery (https://sourcery.mentor.com/sgpp/lite/arm/portal/release1803); Linaro also offers a TC that is very popular with developers of custom kernels (http://www.linaro.org/downloads/), and there is also the Mjolnir TC by redstar3894 and intersectRaven (https://github.com/redstar3894/Manifest/blob/master/README.mkdn).

I have read a lot of praise for these optimized TCs from different kernel developers, who all claim that they do much better than the official Google TC. However, to my knowledge, nobody has actually made the effort to investigate the effect of the TC until now, and most devs simply assume that a TC marketed as optimized by its creators actually performs better.

So to investigate the performance of the different TCs, I compiled a kernel (GLaDOS kernel for ICS) with the following four TCs:

1. Official Google arm-linux-androideabi-4.4.3 (part of android-ndk-r7)
2. CodeSourcery arm-2011.03-41-arm-none-linux-gnueabi
3. Linaro android-toolchain-eabi-linaro-4.6-2011.11-4-2011-11-15_12-22-49-linux-x86
4. Mjolnir arm-eabi-4.6-mjolnir-20111006

Also while I was at it, I investigated the effect of the compiler flags '-fgcse-after-reload' and '-ftree-vectorize' (see https://github.com/Ezekeel/GLaDOS-nexus-s/commit/6e51fc5c0be5a9e7906573f78e00c9aee41f32ee#comments) by compiling a version with the CS TC which did not include these flags.
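
For anyone who wants to repeat the comparison: switching the TC for a kernel build usually comes down to pointing the kernel build system's CROSS_COMPILE variable at a different compiler prefix. Below is a minimal Python sketch that only assembles the make command and does not run it. The install directories and the helper name are hypothetical, and the prefixes are guesses derived from the TC names listed above, so check them against your own installation.

```python
import posixpath

# Hypothetical install locations; adjust to wherever the TCs are unpacked.
# The prefixes are assumptions based on the TC names, not taken from the post.
TOOLCHAINS = {
    "Google": ("/opt/tc/google/bin", "arm-linux-androideabi-"),
    "CS": ("/opt/tc/codesourcery/bin", "arm-none-linux-gnueabi-"),
}


def kernel_make_command(bin_dir, prefix, jobs=4):
    """Return the make invocation for an ARM kernel cross-build.

    ARCH and CROSS_COMPILE are the standard kernel build variables;
    everything else here is illustrative.
    """
    cross = posixpath.join(bin_dir, prefix)
    return ["make", "ARCH=arm", f"CROSS_COMPILE={cross}", f"-j{jobs}"]


for name, (bin_dir, prefix) in TOOLCHAINS.items():
    print(name, " ".join(kernel_make_command(bin_dir, prefix)))
```

Keeping the source tree and the kernel config identical and changing only the prefix is what makes the resulting kernels comparable.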


I performed the following two tests for the kernels:

Test I: Measured the bootup time including a rebuild of the Dalvik-cache after a wipe (2 times).
Test II: Performed a benchmark with AnTuTu Benchmark v2.4.3 including CPU/memory, 2D and 3D graphics (3 times).


Results:

Test I:

Code:
Google		1:34	1:34
CS		1:34	1:34
CS-flags	1:34	1:34
Linaro		1:34	1:34
Mjolnir		1:34	1:34


Test II:

Google TC:
g1rv351.png
g2zr3kl.png
g3ol3rl.png



CodeSourcery TC:
cs1mqve3.png
cs2zw6q2.png
cs39s2uf.png



CodeSourcery TC without flags:
csalt1mw5ax.png
csalt2qpw1b.png
csalt32z0qt.png



Linaro:
l1p64ph.png
l2284f3.png
l3j12wr.png



Mjolnir:
m1m20gn.png
m2ug44d.png
m3y56zy.png



Average:
Code:
Google		2715
CS		2725
CS-flags	2718
Linaro		2718
Mjolnir		2716

As one can clearly see from the results, the so-called optimized TCs do not perform better than the official TC released by Google in any measurable way. So the improvements that some kernel devs have felt cannot be backed up by these data and are most likely wishful thinking. Also note that the compiler flags '-fgcse-after-reload' and '-ftree-vectorize' do not seem to have any adverse effects on the performance of the kernel.
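
To put the averages above into perspective, here is a small Python sketch that expresses each score as a percentage difference from the Google TC. The numbers are the averages from the table; the function name and the choice of Google as the baseline are just for illustration.

```python
# Average AnTuTu scores from the table above.
averages = {
    "Google": 2715,
    "CS": 2725,
    "CS-flags": 2718,
    "Linaro": 2718,
    "Mjolnir": 2716,
}


def percent_vs_baseline(scores, baseline="Google"):
    """Return each score's difference from the baseline in percent."""
    base = scores[baseline]
    return {name: 100.0 * (score - base) / base for name, score in scores.items()}


for name, delta in percent_vs_baseline(averages).items():
    print(f"{name:10s} {delta:+.2f}%")
```

The largest gap (CS vs. Google) is 10 points, which is under 0.4% of the score.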

busted.jpg



Update:

In case anyone wants to try some benchmark fun for himself, here are the kernels (GLaDOS kernel for ICS).

Google: http://www.multiupload.com/QK5T6SHUFC
CS: http://www.multiupload.com/25021GRFT0
CS-flags: http://www.multiupload.com/98TCZOKLWA
Linaro: http://www.multiupload.com/XZJN2J9DOE
Mjolnir: http://www.multiupload.com/QK4I6R8DGR
 

aic719

Senior Member
Oct 27, 2011
289
33
Chennai
First! How do you think of this stuff?!

I know I'm spamming but still. :p

Sent from my Nexus S using XDA App
 
bedalus

Guest
Let me preface this by saying I haven't compiled since I was a teenager (distant memory) and only built simple MS-DOS apps. My question is, when you compile for Android, does it produce machine code, or something else usable by the VM? Is the kernel also machine code, or something else suitable for the boot loader?

Sent from my SNES
 

FloHimself

Senior Member
May 27, 2010
558
218
Ezekeel, thank you for taking the time to investigate this topic. This thread should be moved to the general "Android Software Development" forum to inform the other devs about this as well.

O.T.: You should really think about a paypal donation button. I know about your reservation about taking money, but you might make a statement about only taking donations for already made development. This way you are free to decide whether you stop working on something or not and users can't complain about abandoned projects they have donated to.
 

DebauchedSloth

Senior Member
Jan 27, 2008
459
76
Awesome, awesome post. I've always figured the TC made little difference to the kernel.

What would be interesting, though, would be to see the effect on the whole OS - in particular, Skia, audio processing, the browser, etc - all the C/C++ stuff. There's not only a lot more of that code, but there's also a lot more floating point and that could maybe see some useful gains.
 

polobunny

Senior Member
Oct 25, 2011
6,223
2,312
Montreal
You can clearly see CS is the winner here! :p

Definitely nothing worth mentioning in terms of gains, if any. On 100 runs I bet they'd all get the same average.

Nevertheless, could it be that one gives better results when meeting specific criteria? I'm thinking more of the hardware doing the compiling than the device it's being compiled for. Maybe older/newer/different hardware?
 

aic719

O.T.: You should really think about a paypal donation button. I know about your reservation about taking money, but you might make a statement about only taking donations for already made development. This way you are free to decide whether you stop working on something or not and users can't complain about abandoned projects they have donated to.

Seconded.

Sent from my Nexus S using XDA App
 

loudaccord

Senior Member
Aug 12, 2010
490
34
This is awesome... I wish more of these threads existed!

Any thoughts on how these affect battery life?
 

Ezekeel

Let me preface this by saying I haven't compiled since I was a teenager (distant memory) and only built simple MS-DOS apps. My question is, when you compile for Android, does it produce machine code, or something else usable by the VM? Is the kernel also machine code, or something else suitable for the boot loader?

Sent from my SNES

Android apps are written in Java and run on the Dalvik VM. The kernel is written in C and runs directly on the hardware.


This is awesome... I wish more of these threads existed!

Any thoughts on how these affect battery life?

Since the performance is virtually the same, I do not expect the battery runtime to be any different either.


Franco has suggested trying the Mjolnir TC with a different set of compiler flags that enables hardware support for floating-point math (http://pastie.org/2951626).

Results:

Test I:

Code:
Google		1:34	1:34
CS		1:34	1:34
CS-flags	1:34	1:34
Linaro		1:34	1:34
Mjolnir		1:34	1:34
Mjolnir-float	1:34	1:34


Test II:
mfloat165dsn.png
mfloat3mqemw.png
mfloat2vgdd8.png



Average:
Code:
Google		2715
CS		2725
CS-flags	2718
Linaro		2718
Mjolnir		2716
Mjolnir-float	2720


As one can see, enabling hardware support for floating-point math in the Mjolnir TC does not yield any measurable improvement in performance. I guess the reason is that the use of floating-point math in the kernel is avoided at all costs, because on most platforms only a very inefficient software emulation is available, so kernel devs have learned to get along with integers only or, if fractional numbers are really necessary, to use fixed-point math.
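
For readers unfamiliar with fixed-point math, here is a minimal sketch of the idea. Kernel code would do this in C; Python is used here only for illustration, and the Q16.16 format (16 integer bits, 16 fractional bits) and the helper names are assumptions picked for the example. Fractional values are stored as scaled integers, so the multiplication itself needs integer operations only.

```python
FRAC_BITS = 16          # Q16.16: 16 integer bits, 16 fractional bits (assumed format)
ONE = 1 << FRAC_BITS    # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to Q16.16 (used only to set up the example)."""
    return int(round(x * ONE))


def fixed_mul(a, b):
    """Multiply two Q16.16 values using integer operations only."""
    return (a * b) >> FRAC_BITS


def to_float(a):
    """Convert a Q16.16 value back to a float for display."""
    return a / ONE


half = to_fixed(0.5)
three = to_fixed(3.0)
print(to_float(fixed_mul(half, three)))  # 1.5
```

Since no floating-point instructions are involved in the multiplication, enabling hardware floating-point support changes nothing for code written this way.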


GLaDOS ICS kernel with Mjolnir TC and floating-point support: http://www.multiupload.com/AM6COIPI6Z
 

Ezekeel

Mostly games use floating point so I'm sure there will be some improvement in that scenario.

That is not relevant since non-kernel programs already have hardware support for floating-point math. The kernel is a special case for which the hardware support is typically not available, so adding these compiler flags just enables it for the kernel. And since the kernel practically does not use any floating-point math, these flags improve a feature which is not used and so the effects are zilch.
 

Ezekeel

Did you copy paste the code sourcery toolchain version over the linaro version?
And if yes, which linaro toolchain did you use?

Oops. You are right. I did use the Linaro android-toolchain-eabi-linaro-4.6-2011.11-4-2011-11-15_12-22-49-linux-x86.
 

morfic

Inactive Recognized Developer
Aug 3, 2008
7,211
12,879
San Antonio
www.derkernel.com
Awesome, awesome post. I've always figured the TC made little difference to the kernel.

What would be interesting, though, would be to see the effect on the whole OS - in particular, Skia, audio processing, the browser, etc - all the C/C++ stuff. There's not only a lot more of that code, but there's also a lot more floating point and that could maybe see some useful gains.

Compiler optimization will indeed be most effective on media apps.
Trying it on CM7 is easily foiled by projects that would require a lot of cleanup to build with 4.6 toolchains.
I would expect gains in libwebcore.
Anything that's big and bulky.

Media apps should benefit from NEON improvements in the TC.


Sent from my HTC Sensation 4G using Tapatalk
 

morfic

Since Linaro's 4.6 is actually not creating code as efficient as 4.5.4 does (but sees active development, while 4.5.4 is in maintenance mode), it would be interesting if you could add another trio of screenshots done with the Linaro 4.5 toolchain.
 

Ezekeel

As Morfic has suggested, I also tried the Linaro 4.5 compiler (android-toolchain-eabi-linaro-4.5-2011.10-1-2011-10-21_15-21-26-linux-x86).

Results:

Test I:

Code:
Google		1:34	1:34
CS		1:34	1:34
CS-flags	1:34	1:34
Linaro		1:34	1:34
Linaro-4.5	1:34	1:34
Mjolnir		1:34	1:34
Mjolnir-float	1:34	1:34


Test II:
lold15gfan.png
lold2pmfhj.png
lold30md3d.png



Average:
Code:
Google		2715
CS		2725
CS-flags	2718
Linaro		2718
Linaro-4.5	2715
Mjolnir		2716
Mjolnir-float	2720

The same results as for all the other TCs.


GLaDOS ICS kernel with Linaro 4.5: http://www.multiupload.com/3SAXXP0LWF
 

    As Morfic has suggested I also tried the Linaro 4.5 compiler with -mcpu=cortex-a8.

    Results:

    Test I:

    Code:
    Google		1:34	1:34
    CS		1:34	1:34
    CS-flags	1:34	1:34
    Linaro		1:34	1:34
    Linaro-4.5	1:34	1:34
    Linaro-a8	1:34	1:34
    Mjolnir		1:34	1:34
    Mjolnir-float	1:34	1:34


    Test II:
    lolda8190d36.png
    lolda82t6czk.png
    lolda83dxemf.png



    Average:
    Code:
    Google		2715
    CS		2725
    CS-flags	2718
    Linaro		2718
    Linaro-4.5	2715
    Linaro-a8	2718
    Mjolnir		2716
    Mjolnir-float	2720

    The same results.


    GLaDOS ICS kernel with Linaro 4.5 and -mcpu=cortex-a8: http://www.multiupload.com/ZMSJXGZ6JT
    2
    A faster system from only a kernel compiled with a different compiler isn't a myth either.

    If someone recompiles a CPU benchmark with these compilers, we will have numbers a little closer to the real impact on a running system.


    I have a ROM compiled with the Linaro toolchain on a Kindle Fire (gedeROM v1.6) and yes, it's a little faster. In Quadrant:
    cm9 memory benchmark: 1943, total: 2743
    cm9 + linaro memory benchmark: 3441, total: 3173
    So the memory benchmark score is about 77% higher.

    Operations in the system are faster and the GUI is faster. But internet, the 3D chip and storage aren't faster.
    1
    I could be wrong but I never thought Linaro claimed to make your phone faster. I could have sworn it only claimed to compile faster. Could be wrong though.

    I believe it is supposed to make the device faster. See link:
    http://www.androidauthority.com/linaro-android-is-up-to-twice-as-fast-as-stock-android-92831/

    See the comment by roger - it could be that this mythbusting is in need of an update.