WebGL ANGLE loops really slow

Last updated: 2023-09-26

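I have some GLSL that works like a charm, except that it takes about 3 minutes to compile. I know this is due to ANGLE, the software that translates OpenGL ES 2.0 code to DirectX 9 for WebGL on Windows systems. If I disable ANGLE, the shader compiles within a second. Does anyone know why ANGLE is so slow with nested loops, and is there a workaround? I can't make everyone wait more than a minute for each shader.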

for ( int b = 0; b < numberOfSplitpoints; b++ ) {
    if ( cameraDepth > splitPoints[b] && cameraDepth < splitPoints[b+1] ) {
        const float numberOfSplitpoints = float( NUMBER_OF_SPLIT_POINTS - 1 );
        vec4 projCoords = v_projTextureCoords[b];
        projCoords /= projCoords.w;
        projCoords = 0.5 * projCoords + 0.5;
        float shadowDepth = projCoords.z;
        projCoords.x /= numberOfSplitpoints;
        projCoords.x += float(b) / numberOfSplitpoints;

        for( int x = 0; x < fullkernelSize; x++ ) {
            for( int y = 0; y < fullkernelSize; y++ ) {
                vec2 pointer = vec2( float(x-kernelsize) / 3072.0, float(y-kernelsize) / 1024.0 );
                float convolution = kernel[x] * kernel[y];
                vec4 color = texture2D(shadowMapSampler, projCoords.xy+pointer);
                if(encodeDepth( color ) + shadowBias > shadowDepth) {
                    light += convolution;
                } else {
                    light += convolution * 0.6;
                }
            }
        } 
    }
}

vec2 random = normalize(texture2D(randomSampler, screenSize * uv / 64.0).xy * 2.0 - 1.0);
float ambiantAmount = 0.0;
const int kernel = 4;
float offset = ssoasampleRad / depth;

for(int x = 0; x<kernel; x++) {
    vec2 a  = reflect(directions[x], random) * offset;
    vec2 b  = vec2( a.x * 0.707 - a.y * 0.707,
                    a.x * 0.707 + a.y * 0.707 );
    ambiantAmount += abientOcclusion(uv, a*0.25, position, normal);
    ambiantAmount += abientOcclusion(uv, b*0.50, position, normal);
    ambiantAmount += abientOcclusion(uv, a*0.75, position, normal);
    ambiantAmount += abientOcclusion(uv, b, position, normal);
}

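GLSL ES does not require implementations to support while loops or for loops with "dynamic" bounds. ANGLE takes advantage of this and unrolls loops extensively: in a loop such as for ( int b = 0; b < numberOfSplitpoints; b++ ), numberOfSplitpoints must be a constant expression, otherwise the shader will not compile.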
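To make the constant-bound rule concrete, here is a minimal fragment shader sketch. The names SAMPLE_COUNT, u_sampleCount and u_weights are hypothetical and do not come from the question; the sketch only contrasts a loop bound that is a constant expression with one that is a runtime uniform.

```glsl
precision mediump float;

// Hypothetical names for illustration; none of these appear in the question.
const int SAMPLE_COUNT = 4;             // compile-time constant: valid loop bound
uniform int u_sampleCount;              // runtime value: not a portable loop bound
uniform float u_weights[SAMPLE_COUNT];

void main() {
    float sum = 0.0;

    // Accepted: the bound is a constant expression, so the trip count is
    // known at compile time and the translator is free to unroll the loop.
    for (int i = 0; i < SAMPLE_COUNT; i++) {
        sum += u_weights[i];
    }

    // Not portable: the bound is a uniform, which the GLSL ES 1.00 minimum
    // requirements do not oblige an implementation to accept.
    // for (int i = 0; i < u_sampleCount; i++) { sum += u_weights[i]; }

    gl_FragColor = vec4(vec3(sum), 1.0);
}
```

Because the trip count is fixed at compile time, a full unroll of the first loop simply produces SAMPLE_COUNT copies of its body.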

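Loop unrolling is supposed to let the native shader optimizer do more optimization and minimize divergence. However, if numberOfSplitpoints and fullkernelSize are large (as they may be in your code), the unrolled code can become very long: the innermost body is repeated numberOfSplitpoints*fullkernelSize*fullkernelSize times, which can cause all sorts of trouble for the optimizer and the compiler.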
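One possible way to shrink the unrolled code, offered as an untested sketch rather than a confirmed fix, is to use the split loop only to select the projected coordinates and to run the kernel loops once afterwards. The constant values, the splitCount name, the uniform and varying declarations, and the body of encodeDepth below are all assumptions made so the example is complete; the question shows none of them.

```glsl
precision mediump float;

// Illustrative values only; the question does not state the real ones.
#define NUMBER_OF_SPLIT_POINTS 4
const int splitCount = NUMBER_OF_SPLIT_POINTS - 1;   // 3 shadow splits
const int kernelsize = 2;
const int fullkernelSize = 2 * kernelsize + 1;       // 5 taps per axis

// Assumed declarations; the question's real ones are not shown.
uniform sampler2D shadowMapSampler;
uniform float splitPoints[NUMBER_OF_SPLIT_POINTS];
uniform float kernel[fullkernelSize];
uniform float shadowBias;
varying vec4 v_projTextureCoords[splitCount];
varying float cameraDepth;

// Placeholder decoder; the question's encodeDepth is not shown.
float encodeDepth(vec4 color) { return color.r; }

void main() {
    vec4 projCoords = vec4(0.0);
    bool inSplit = false;

    // Step 1: a small loop that only selects the matching split.
    for (int b = 0; b < splitCount; b++) {
        if (cameraDepth > splitPoints[b] && cameraDepth < splitPoints[b + 1]) {
            projCoords = v_projTextureCoords[b];
            projCoords /= projCoords.w;
            projCoords = 0.5 * projCoords + 0.5;
            // Same atlas offset as in the question: (x + b) / splitCount.
            projCoords.x = (projCoords.x + float(b)) / float(splitCount);
            inSplit = true;
        }
    }

    // Step 2: the kernel loops now appear once. If every loop is fully
    // unrolled, the sampling body is copied fullkernelSize * fullkernelSize
    // times (25 here) instead of splitCount times that amount (75 here).
    float light = 0.0;
    if (inSplit) {
        float shadowDepth = projCoords.z;
        for (int x = 0; x < fullkernelSize; x++) {
            for (int y = 0; y < fullkernelSize; y++) {
                vec2 pointer = vec2(float(x - kernelsize) / 3072.0,
                                    float(y - kernelsize) / 1024.0);
                float convolution = kernel[x] * kernel[y];
                vec4 color = texture2D(shadowMapSampler, projCoords.xy + pointer);
                light += (encodeDepth(color) + shadowBias > shadowDepth)
                       ? convolution : convolution * 0.6;
            }
        }
    }

    gl_FragColor = vec4(vec3(light), 1.0);
}
```

The result should match the original nesting because the split intervals are disjoint, so at most one iteration of the split loop can match. Whether this actually shortens ANGLE's compile time would have to be measured.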