[KERNEL] Performance can be gained from a patch.


zodttd

Member
Jun 2, 2010
5
46
Hi,

I am the author of psx4droid. It's a PSX emulator that uses a dynarec. Because a dynarec generates code at runtime, I have to invalidate the instruction cache on these Android devices' ARM processors, just as Yong must do in GameBoid.

I noticed a performance loss on the Evo 4G, the MyTouch 4G, and potentially other devices, as have some people running GameBoid (though that emulator leans heavily on frame limiting because it runs faster than 60 FPS, so the loss isn't noticed as much).

Both of these architectures (ARCH_QSD8X50 for the Evo 4G and ARCH_MSM7X30 for the MyTouch 4G) have an oddity when it comes to flushing the icache.

As seen in these platforms' kernel sources at ./arch/arm/mm/cache-v7.S, the *ENTIRE* icache is invalidated on each cacheflush syscall!

The fix for my performance loss, and others', is to clear only the range specified in userland cacheflush calls. This may not help much outside of apps that call cacheflush a lot, such as emulators, but it will help those apps greatly.
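For readers who haven't seen the userland side, here is a minimal sketch of how a dynarec might request a ranged flush. It assumes GCC or Clang, whose `__builtin___clear_cache(begin, end)` builtin typically ends up in the cacheflush syscall on ARM Linux. The `code_buf` buffer and `emit_block` helper are hypothetical names for illustration, not psx4droid code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer standing in for a dynarec's translation cache. */
static uint8_t code_buf[4096];

/* Copy freshly generated code into the buffer and flush only that range.
 * Returns the number of bytes flushed, or 0 if the block does not fit. */
static size_t emit_block(size_t offset, const uint8_t *code, size_t len)
{
    assert(code != NULL);
    if (offset > sizeof code_buf || len > sizeof code_buf - offset)
        return 0;
    char *begin = (char *)code_buf + offset;
    memcpy(begin, code, len);
    /* Ranged flush: only [begin, begin + len) has to become visible to
     * instruction fetch, not the whole icache. */
    __builtin___clear_cache(begin, begin + len);
    return len;
}
```

The caller passes an explicit range, so a kernel that ignores it and invalidates the whole icache throws away far more than the app asked for.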

My hope is someone can release ROM(s) for the Evo 4G and/or the MyTouch 4G with this fixed. Thanks!

Here's the offending code. Note that "mcr p15, 0, r0, c7, c5, 0" invalidates the entire icache, and the branch predictor along with it.


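ENTRY(v7_coherent_user_range)
 UNWIND(.fnstart )
    dcache_line_size r2, r3             @ r2 = D-cache line size
    sub r3, r2, #1
    bic r0, r0, r3                      @ align start address down to a line
1:
 USER( mcr p15, 0, r0, c7, c11, 1 )     @ clean D line to the point of unification
    add r0, r0, r2
2:
    cmp r0, r1
    blo 1b
    dsb
    mov r0, #0
    mcr p15, 0, r0, c7, c5, 0           @ invalidate the ENTIRE I-cache (and BTB)
    dsb
    isb
    mov pc, lr

@ Fault handling: if the address in r0 isn't mapped, skip to the next page.
9001:
    mov r0, r0, lsr #12
    mov r0, r0, lsl #12
    add r0, r0, #4096
    b 2b
 UNWIND(.fnend )
ENDPROC(v7_coherent_kern_range)
ENDPROC(v7_coherent_user_range)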



Here's a faster version of this function from the Samsung Fascinate (Galaxy S) that clears a range as it's supposed to:

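ENTRY(v7_coherent_user_range)
    dcache_line_size r2, r3             @ r2 = D-cache line size
    sub r3, r2, #1
    bic r0, r0, r3                      @ align start address down to a line
1:  mcr p15, 0, r0, c7, c11, 1          @ clean D line to the point of unification
    dsb
    mcr p15, 0, r0, c7, c5, 1           @ invalidate only this I line
    add r0, r0, r2
    cmp r0, r1
    blo 1b
    mov r0, #0
    mcr p15, 0, r0, c7, c5, 6           @ invalidate BTB
    dsb
    isb
    mov pc, lr
ENDPROC(v7_coherent_kern_range)
ENDPROC(v7_coherent_user_range)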
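
To make the cost of the ranged loop concrete, here is a small C model of the address arithmetic in the function above. It is only a sketch: `lines_touched` is a hypothetical name, and the 32-byte line size in the usage note is an assumption (the real value is read from the hardware by dcache_line_size).

```c
#include <assert.h>
#include <stdint.h>

/* Model of the loop in v7_coherent_user_range: returns how many cache
 * lines a flush of [start, end) touches. line_size must be a power of two. */
static uint32_t lines_touched(uint32_t start, uint32_t end, uint32_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    uint32_t mask = line_size - 1;   /* sub r3, r2, #1 */
    uint32_t addr = start & ~mask;   /* bic r0, r0, r3 */
    uint32_t count = 0;
    do {                             /* the asm loop body runs at least once */
        count++;                     /* one clean + one invalidate per line */
        addr += line_size;           /* add r0, r0, r2 */
    } while (addr < end);            /* cmp r0, r1 ; blo 1b */
    return count;
}
```

With the assumed 32-byte line, a 64-byte block that starts 4 bytes into a line (0x1004 to 0x1044) touches 3 lines, so the work scales with the size of the recompiled block rather than with the size of the icache.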
 

whoamanwtf

Senior Member
Nov 15, 2008
909
159
Best Coast
ZodTTD, as a long-time user of your various emus on various platforms, I would just like to say thank you for your contributions. You have gone above and beyond here by hunting down this problem for us Evo users and posting it here, even finding a solution!

You are awesome, man. Keep up the good work; no matter how often you hear the thanks, your work is appreciated!
 

Unknownforce

Retired Recognized Developer
Nov 18, 2008
2,044
4,268
Wow, good find! I'm curious to see how many "daily" apps use this function and that can be sped up by putting this patch in...
 

zodttd

Member
Jun 2, 2010
5
46
Thanks!

To continue this discussion, I found out this kernel might be using a VIPT (Virtually Indexed Physically Tagged) instruction cache. If so, that would explain why it invalidates the entire icache.

Can we disable VIPT icache on this kernel with some amount of ease? :(

Noted on ARM's site:

Example 7.1. A situation where software must be aware of the Instruction cache tagging strategy

Two processes, P1 and P2, share some code and have separate virtual mappings to the same region of instruction memory. P1 changes this region, for example as a result of a JIT, or some other self-modifying code operation. P2 must see the modified code.

As part of its self-modifying code operation, P1 must invalidate the changed locations from the instruction cache. If this invalidation is performed by MVA, and the instruction cache is a VIPT cache, then P2 might continue to see the old code. For more information, see the ARM Architecture Reference Manual.

In this situation, if the instruction cache is a VIPT cache, after the code modification the entire instruction cache must be invalidated to ensure P2 observes the new version of the code.
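
A rough illustration of why invalidation by MVA can miss an alias in a VIPT cache: the set is chosen from virtual address bits, so two virtual mappings of the same physical line can land in different sets when one cache way is larger than a page. The geometry below (32-byte lines, 256 sets per way, 4 KB pages) is assumed for the example and is not taken from the QSD8X50 or MSM7X30.

```c
#include <stdint.h>

/* Assumed cache geometry, for illustration only. */
enum { LINE_SIZE = 32, NUM_SETS = 256, PAGE_SIZE = 4096 };

/* VIPT lookup: the set index comes from the virtual address. */
static uint32_t set_index(uint32_t va)
{
    return (va / LINE_SIZE) % NUM_SETS;
}

/* Aliases of one physical line can index different sets only if a way
 * spans more than one page, so index bits reach above the page offset. */
static int aliases_can_diverge(void)
{
    return LINE_SIZE * NUM_SETS > PAGE_SIZE;
}
```

With these numbers, virtual addresses 0x1040 and 0x2040 share the same page offset but select sets 130 and 2, so invalidating by P1's address would not touch the set that holds P2's copy.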
 

zodttd

Member
Jun 2, 2010
5
46
Unknownforce said: "Wow, good find! I'm curious to see how many "daily" apps use this function and that can be sped up by putting this patch in..."

Besides some emulators such as GameBoid and psx4droid... perhaps not many. Mono uses it, so games written with Unity 3D that use Mono could benefit. Basically any app that uses a JIT / dynarec will benefit. But those are limited by far.
 

khshapiro

Senior Member
Oct 18, 2009
194
10
Florida
If you post this up in the Desire forum, I bet it will have a much greater effect on devs; the Desire seems to have a trickle-down effect on EVOs. Have you posted any of this on MoDaCo.com? Great work!
 

Greenfieldan

Senior Member
Jun 14, 2010
1,844
161
Mason
Zodttd, you're one of my long-time heroes on Android :D (right up there with Cy and Myn haha).
I do hope a lot of developers make use of this patch for their ROMs :)
 

dmaustin

Senior Member
Oct 6, 2009
52
0
Zod, I use your PSX4Droid all the time! On my Samsung Fascinate... I'm guessing that we don't have to worry about this issue. Never really noticed slowdowns since v2. But I do get kicked out to ROM selection randomly in FF7. Slightly annoying if I forget to save. But still... Best app on my Fascinate!
 

hagisbasheruk

Senior Member
Nov 4, 2007
136
16
Clydebank
I just hope this comes to the attention of kernel developers and is pushed upstream.
Thanks for your great work throughout the many years and platforms you have brought your work to.
 

tom62015

Member
Jun 11, 2010
42
3
How could this patch be installed after flashing a rom?

( Is it like running a script... make rich text doc/ place on sd card/ open emulator/ enter code?)

If anyone has the answer could they post the code(s) they used. I didn't have too much success.
 

atticus182

Senior Member
Apr 29, 2010
983
2,040
Lienden
www.antonpost.nl
Hi Zod, long time no see!

I don't know if you remember me, but I'm one of the first beta testers of your SNES & PSX emulator for the iPod touch :).. Good to see you working on Android now!

Cheers, Anton (atticus)
 

Caanon

Senior Member
Aug 1, 2010
101
11
Heya Z

Great find. Couple technical questions for ya:

1. I'm not familiar with this section of the kernel. Is this assembly code autogenerated by the toolchain requiring binary patching after compilation, or does it exist as an asm block in the source?

2. Would it be possible to add a separate function to the kernel that clears only the userspace cache? Something along the lines of v7_coherent_user_range_ZODTTD. This might contribute to kernel bloat though. On the upside, it would retain the original kernel space clears for other programs that use the call so other processes could clear the caches when they need to, but programs like your emulator could clear only the user cache.
 

konistehrad

Member
Jan 8, 2010
9
0
zodttd said: "Besides some emulators such as GameBoid and psx4droid... perhaps not many. Mono uses it, so games written with Unity 3D that use Mono could benefit. Basically any app that uses a JIT / dynarec will benefit. But those are limited by far."
I would be very careful before suggesting a fix from the Fascinate kernel for cache-flush optimizations or fixes. The Galaxy S's 2.1 kernels notoriously could not be used to run Unity games, and while the root cause was never proven, it has been heavily speculated that it was the result of an improperly implemented cache-flush instruction.
 

Top Liked Posts

    2
    huh....please post a evo performance for dummies.

    Sent from my PC36100 using XDA App

    This fix isn't for end users, this is for devs to put into their ROMs for you.
    1
    thanks for the post, i took the liberty of posting it in the kernel forums with a link to your thread
    1
    Zod,

    I've used your emulators over the years and have loved them. Thank you!