最近搞了一下android开发,牵扯到一些log pow exp的运算需要优化,考虑到并不需要精确的数值结果,就尝试用一些近似解法代替c库函数,个人目前还没有能力设计这些近似算法,在网上搜集一些资源,跟大家共享一下,兼做笔记备忘。
log2的近似算法:http://www.musicdsp.org/showone.php?id=91
适用于IEEE 32-bit floating 型,误差在0.1左右。
/ Fast logarithm (2-based) approximation
// by Jon Watte
#include <assert.h>
int floorOfLn2( float f ) {
assert( f > 0. );
assert( sizeof(f) == sizeof(int) );
assert( sizeof(f) == 4 );
return (((*(int *)&f)&0x7f800000)>>23)-0x7f;
}
float approxLn2( float f ) {
assert( f > 0. );
assert( sizeof(f) == sizeof(int) );
assert( sizeof(f) == 4 );
int i = (*(int *)&f);
return (((i&0x7f800000)>>23)-0x7f)+(i&0x007fffff)/(float)0×800000;
}
pow, exp的查表近似算法:http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.html
适用于IEEE 32-bit floating 型,可以通过控制查找表的大小来控制精度。相对误差在0.01%~0.02%之间的情形下,查找表大小仅为8KB。
Code
In C (89) (and assuming 32 bit integers), the core is:
const float _2p23 = 8388608.0f;
/**
* Initialize powFast lookup table.
*
* @pTable length must be 2 ^ precision
* @precision number of mantissa bits used, >= 0 and <= 18
*/
void powFastSetTable
(
unsigned int* const pTable,
const unsigned int precision
)
{
/* step along table elements and x-axis positions */
float zeroToOne = 1.0f / ((float)(1 << precision) * 2.0f); /* A */
int i; /* B */
for( i = 0; i < (1 << precision); ++i ) /* C */
{
/* make y-axis value for table element */
const float f = ((float)pow( 2.0f, zeroToOne ) - 1.0f) * _2p23;
pTable[i] = (unsigned int)( f < _2p23 ? f : (_2p23 - 1.0f) );
zeroToOne += 1.0f / (float)(1 << precision);
} /* D */
}
/**
* Get pow (fast!).
*
* @val power to raise radix to
* @ilog2 one over log, to required radix, of two
* @pTable length must be 2 ^ precision
* @precision number of mantissa bits used, >= 0 and <= 18
*/
float powFastLookup
(
const float val,
const float ilog2,
unsigned int* const pTable,
const unsigned int precision
)
{
/* build float bits */
const int i = (int)( (val * (_2p23 * ilog2)) + (127.0f * _2p23) );
/* replace mantissa with lookup */
const int it = (i & 0xFF800000) | pTable[(i & 0x7FFFFF) >> /* E */
(23 - precision)]; /* F */
/* convert bits to float */
return *(const float*)( &it );
}
Setting the ilog2 parameter:
- for
pow( 2, val), ilog2 = 1 / log2(2) = 1
- for
pow( e, val), ilog2 = 1 / log(2) = 1.44269504088896
- for
pow(10, val), ilog2 = 1 / log10(2) = 3.32192809488736
- for
pow( r, val), ilog2 = log(r) / log(2) = log(r) * 1.44269504088896
These two functions can easily be put in a small class/module that manages the storage and wraps calls for different radixes.
pow, ln近似算法:http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.html
适用于IEEE 64-bit double 型。
double pow(double a, double b) {
int tmp = (*(1 + (int *) &a));
int tmp2 = (int)(b * (tmp - 1072632447) + 1072632447);
double p = 0.0;
*(1 + (int * ) &p) = tmp2;
return p;
}
最后附送一坨位运算技巧:http://graphics.stanford.edu/~seander/bithacks.html
最新评论