to access the VFP reg set look into vfphw.S, there are vfp_get/put_float/double functions. you can get the asm from there.Afte a lot of try & error I finally managed to read the s0 (lower? half of d0) register in assembler, and printed a message when they do not match (and forced reloading of the registers (even when the message is not printed)). I also added printks to the hotplug and cpu_pm notifiers (also merged the two commits I mentioned). Then I ran the FPBug detector and my message got printed far more often than I expected (it also got printed before a few times).
Here are all relevant files: the patches, the kmsg and also a translation from assembler to ("pseudo")-C of the vfp_support_entry procedure, that I manually made (read: may contain errors).
As soon as the screen is turned off, it seems that the FPU register values of CPU0 (!) always get clobbered between the invocation of the vfp_support_entries. Noob question: Do the cores have separate FPU registers?
the code certainly looks like each core has an fpu! and i always suspected smp implied that