
[Draft] Using Eigen's vectorized sin and cos to speed up the calculation of twiddles in ei_kissfft_impl.h

We are trying to use Eigen's vectorized sin and cos to speed up the kissfft twiddle calculation. This patch provides only an initial implementation, so I'm asking for advice from anyone interested in this issue. I opened a new MR rather than an issue so that the discussion can refer to the initial code. It is still a draft because we also need pcos/psin with vectorized double support, which is still missing (#2164 (closed)).
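For readers who want a concrete picture of what is being computed, here is a minimal sketch using only the standard library. It is not the code in this patch: `make_twiddles` is a hypothetical helper, and the sign convention (negative exponent for the forward transform) is assumed. The point is the layout: the phases are written to a contiguous array first, and cos and sin are then applied in separate element-wise passes. Those two passes are the part that array-wise cos and sin from Eigen would presumably replace.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of Eigen or kissfft.
// Computes w_k = exp(s * 2*pi*i * k / nfft) for k = 0..nfft-1,
// with s = -1 for the forward transform and s = +1 for the inverse (assumed convention).
std::vector<std::complex<float>> make_twiddles(std::size_t nfft, bool inverse) {
  const double pi = std::acos(-1.0);
  const float step =
      static_cast<float>((inverse ? 2.0 : -2.0) * pi / static_cast<double>(nfft));

  // Pass 1: all phases in one contiguous array.
  std::vector<float> phase(nfft);
  for (std::size_t k = 0; k < nfft; ++k) phase[k] = step * static_cast<float>(k);

  // Pass 2 and 3: element-wise cos and sin over the whole array.
  // These are the loops a packet-based (SIMD) cos/sin would take over.
  std::vector<float> re(nfft), im(nfft);
  for (std::size_t k = 0; k < nfft; ++k) re[k] = std::cos(phase[k]);
  for (std::size_t k = 0; k < nfft; ++k) im[k] = std::sin(phase[k]);

  // Pass 4: interleave into complex twiddles.
  std::vector<std::complex<float>> tw(nfft);
  for (std::size_t k = 0; k < nfft; ++k) tw[k] = std::complex<float>(re[k], im[k]);
  return tw;
}
```

Note that this sketch computes every phase in float, so accuracy for large nfft may differ from an implementation that reduces the phase in higher precision.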

Benchmarks (float only) on x64, compiled with -O3 -march=native. The Time and CPU columns give the relative change, (new - old) / old:

Benchmark                                   Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------
test_scalar_float/8                      -0.0099         -0.0099           159           158           159           158
test_scalar_float/64                     +0.0083         +0.0083          2195          2213          2195          2213
test_scalar_float/512                    +0.0213         +0.0213         22816         23303         22816         23303
test_scalar_float/4096                   +0.0024         +0.0024        316340        317103        316341        317104
test_scalar_float/32768                  -0.0021         -0.0021       2179072       2174555       2179040       2174560
test_scalar_float/100000                 -0.0036         -0.0036       7190240       7164364       7190201       7164309
test_complex_float/8                     +0.0351         +0.0351           290           300           290           300
test_complex_float/64                    +0.0003         +0.0003          3550          3551          3550          3551
test_complex_float/512                   -0.0033         -0.0032         58419         58229         58418         58229
test_complex_float/4096                  -0.0023         -0.0023        564808        563525        564809        563515
test_complex_float/32768                 -0.0036         -0.0037       4320990       4305331       4320975       4305118
test_complex_float/100000                -0.0114         -0.0114      14692728      14524624      14692754      14524522
Edited by Guoqiang QI
