[sldev] RFC: Vectorisation control patch
Dzonatas
dzonatas at dzonux.net
Mon Aug 13 09:21:08 PDT 2007
Paul TBBle Hampson wrote:
> In fact, as mentioned I'm getting 10% less speed out of the
> auto-vectorised _vec.cpp than out of the non-vectorised version. I dunno
> if that's the autovectorisation code actually doing a better job of
> vectorising the regular code, or what...
>
We have found that the timing results vary too much from run to run to
be conclusive.
Most of that variation comes from the poor thread schedulers of many
OSs. Recent Linux versions appear to have addressed a few of the
issues, but the main slowdown is that the cache space used by hardware
acceleration is easily polluted on thread switches. Modern OS thread
schedulers currently have no way to detect or govern such cache usage.
The rule of thumb for now: if a core uses HT (Hyper-Threading),
deliberately pair the two tasks that run on it. Don't let both sides of
the core hit the same cache with totally unrelated data, or we get the
problems we are seeing now.
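
As a rough illustration of that rule of thumb, here is a minimal sketch
of keeping a process on the two hardware threads of a single physical
core. It assumes Linux, where Python exposes os.sched_setaffinity and
the kernel lists HT siblings in sysfs under thread_siblings_list. The
helper names (parse_cpu_list, ht_siblings, pin_to_core_of) are made up
for this example and are not part of the viewer or the patch.

```python
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-3,8" into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def ht_siblings(cpu):
    """Return the logical CPUs that share a physical core with `cpu`.

    Assumption: Linux sysfs topology files are available. If they are
    not, the CPU is treated as having no sibling.
    """
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return [cpu]


def pin_to_core_of(cpu, pid=0):
    """Restrict `pid` (0 = the caller) to the hardware threads of one core.

    Only CPUs the caller is already allowed to run on are used.
    Returns the affinity set in effect afterwards.
    """
    allowed = os.sched_getaffinity(pid)
    target = set(ht_siblings(cpu)) & allowed
    if not target:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(pid, target)
    return os.sched_getaffinity(pid)
```

Calling pin_to_core_of(cpu) early in a benchmark run, with cpu taken
from os.sched_getaffinity(0), should keep the scheduler from migrating
the work onto a core whose cache holds unrelated data. It does not stop
other processes from being scheduled onto the same core, so it only
covers half of the problem described above.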
The OS needs to give up the thread scheduler and just take requests for
control of individual cores.
--
Power to Change the Void