5 Replies Latest reply on Mar 5, 2010 2:20 AM by phaseblue

    Optimizing Pixel Bender Code

    phaseblue Level 1

      I'm building a 10-track mixer/remixer application in Flash and I'm using a Pixel Bender shader to do all my audio mixing (à la Kevin Goldsmith's example: http://blogs.adobe.com/kevin.goldsmith/2009/08/pixel_bender_au.html). Each of the mp3s is about 7 seconds long, and they are set up to loop in the "processSound(event:SampleDataEvent)" function.

       

      It works fine except that I'm having a little trouble with performance on some machines. On my 3.2 GHz Pentium 4 PC, it will occasionally stutter a bit if another application is opened. On my 2.4 GHz dual-core iMac, it will even do this when I run the mouse over the File menu in Safari! What's strange, though, is that it performed almost perfectly on a 1.6 GHz Pentium laptop running XP and IE 6! I can't figure this out, but the 2,205,000 or so math calculations per second in the shader must be taxing the CPU pretty heavily! As it is, I'm doing my pan calculations outside the kernel, so as to lighten its load. Other than that, I'm not too sure what else to do. Is there any way I can optimize this code?

       

      One other thing I thought about (which might be a little off this forum's topic, so I apologize) would be to preload all 10 of the 7-second mp3 files into their own ByteArrays when the app first loads. Then I could just pull 2048-sample chunks off them, one at a time, to fill the sound buffer while the mixer is playing. I don't know how much of a load the sound.extract() method places on the CPU, but considering I'm using it on 10 mp3s simultaneously, several times a second, I'm sure it isn't light! Info on this is pretty sparse out there! Any thoughts on this?
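
      As a rough sizing check for the preloading idea, the arithmetic can be worked through as below. This is only a sketch: it assumes the decoded audio is 44.1 kHz stereo with 32-bit float samples (8 bytes per stereo sample frame) and that each track is exactly 7 seconds, whereas the post only says "about 7 seconds".

      ```latex
      \begin{align*}
      \text{bytes per sample frame} &= 2\ \text{channels} \times 4\ \text{bytes} = 8 \\
      \text{frames per track} &= 7\ \text{s} \times 44{,}100\ \text{Hz} = 308{,}700 \\
      \text{bytes per track} &= 308{,}700 \times 8 = 2{,}469{,}600 \\
      \text{bytes for 10 tracks} &= 10 \times 2{,}469{,}600 = 24{,}696{,}000 \approx 23.6\ \text{MiB} \\
      \text{bytes per 2048-sample chunk} &= 2048 \times 8 = 16{,}384 \\
      \text{chunk duration} &= 2048 / 44{,}100 \approx 46.4\ \text{ms}
      \end{align*}
      ```

      Under those assumptions, each track needs a new chunk about 21.5 times per second, so ten tracks come to roughly 215 chunk reads per second, whether they are served by sound.extract() or from preloaded ByteArrays.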

       

      I'd really appreciate any help anyone can give!

       

      Here's the code for my shader:

       

      <languageVersion : 1.0;>

      kernel AudioMixerFilter
      <   namespace : "darktrain";
          vendor : "phaseblue";
          version : 1;
          description : "Audio mixer with 10 tracks";
      >
      {
          input image4 track1;
          input image4 track2;
          input image4 track3;
          input image4 track4;
          input image4 track5;
          input image4 track6;
          input image4 track7;
          input image4 track8;
          input image4 track9;
          input image4 track10;

          parameter float vol1;
          parameter float vol2;
          parameter float vol3;
          parameter float vol4;
          parameter float vol5;
          parameter float vol6;
          parameter float vol7;
          parameter float vol8;
          parameter float vol9;
          parameter float vol10;

          parameter float panR1;
          parameter float panR2;
          parameter float panR3;
          parameter float panR4;
          parameter float panR5;
          parameter float panR6;
          parameter float panR7;
          parameter float panR8;
          parameter float panR9;
          parameter float panR10;

          parameter float panL1;
          parameter float panL2;
          parameter float panL3;
          parameter float panL4;
          parameter float panL5;
          parameter float panL6;
          parameter float panL7;
          parameter float panL8;
          parameter float panL9;
          parameter float panL10;

          output pixel4 dst;

          void
          evaluatePixel()
          {
              pixel4 tmp1 = sampleNearest(track1,outCoord());
              tmp1[0] = tmp1[0] * panL1 * vol1;
              tmp1[2] = tmp1[2] * panL1 * vol1;
              tmp1[1] = tmp1[1] * panR1 * vol1;
              tmp1[3] = tmp1[3] * panR1 * vol1;

              pixel4 tmp2 = sampleNearest(track2,outCoord());
              tmp2[0] = tmp2[0] * panL2 * vol2;
              tmp2[2] = tmp2[2] * panL2 * vol2;
              tmp2[1] = tmp2[1] * panR2 * vol2;
              tmp2[3] = tmp2[3] * panR2 * vol2;

              pixel4 tmp3 = sampleNearest(track3,outCoord());
              tmp3[0] = tmp3[0] * panL3 * vol3;
              tmp3[2] = tmp3[2] * panL3 * vol3;
              tmp3[1] = tmp3[1] * panR3 * vol3;
              tmp3[3] = tmp3[3] * panR3 * vol3;

              pixel4 tmp4 = sampleNearest(track4,outCoord());
              tmp4[0] = tmp4[0] * panL4 * vol4;
              tmp4[2] = tmp4[2] * panL4 * vol4;
              tmp4[1] = tmp4[1] * panR4 * vol4;
              tmp4[3] = tmp4[3] * panR4 * vol4;

              pixel4 tmp5 = sampleNearest(track5,outCoord());
              tmp5[0] = tmp5[0] * panL5 * vol5;
              tmp5[2] = tmp5[2] * panL5 * vol5;
              tmp5[1] = tmp5[1] * panR5 * vol5;
              tmp5[3] = tmp5[3] * panR5 * vol5;

              pixel4 tmp6 = sampleNearest(track6,outCoord());
              tmp6[0] = tmp6[0] * panL6 * vol6;
              tmp6[2] = tmp6[2] * panL6 * vol6;
              tmp6[1] = tmp6[1] * panR6 * vol6;
              tmp6[3] = tmp6[3] * panR6 * vol6;

              pixel4 tmp7 = sampleNearest(track7,outCoord());
              tmp7[0] = tmp7[0] * panL7 * vol7;
              tmp7[2] = tmp7[2] * panL7 * vol7;
              tmp7[1] = tmp7[1] * panR7 * vol7;
              tmp7[3] = tmp7[3] * panR7 * vol7;

              pixel4 tmp8 = sampleNearest(track8,outCoord());
              tmp8[0] = tmp8[0] * panL8 * vol8;
              tmp8[2] = tmp8[2] * panL8 * vol8;
              tmp8[1] = tmp8[1] * panR8 * vol8;
              tmp8[3] = tmp8[3] * panR8 * vol8;

              pixel4 tmp9 = sampleNearest(track9,outCoord());
              tmp9[0] = tmp9[0] * panL9 * vol9;
              tmp9[2] = tmp9[2] * panL9 * vol9;
              tmp9[1] = tmp9[1] * panR9 * vol9;
              tmp9[3] = tmp9[3] * panR9 * vol9;

              pixel4 tmp10 = sampleNearest(track10,outCoord());
              tmp10[0] = tmp10[0] * panL10 * vol10;
              tmp10[2] = tmp10[2] * panL10 * vol10;
              tmp10[1] = tmp10[1] * panR10 * vol10;
              tmp10[3] = tmp10[3] * panR10 * vol10;

              pixel4 tmp_out = tmp1 + tmp2 + tmp3 + tmp4 + tmp5 + tmp6 + tmp7 + tmp8 + tmp9 + tmp10;

              dst = tmp_out;
          }
      }

        • 1. Re: Optimizing Pixel Bender Code
          Kevin Goldsmith Level 3

          Well, you can increase the size of the buffers you use, which increases latency but should help a bit.

          You are also doing a lot of redundant calculations. You could try changing:

                  pixel4 tmp1 = sampleNearest(track1,outCoord());
                  tmp1[0] = tmp1[0] * panL1 * vol1;
                  tmp1[2] = tmp1[2] * panL1 * vol1;
                  tmp1[1] = tmp1[1] * panR1 * vol1;
                  tmp1[3] = tmp1[3] * panR1 * vol1;

          to:

                  pixel4 tmp1 = sampleNearest(track1, outCoord());
                  float2 tmp1VolPan = float2(panL1, panR1)*vol1;
                  tmp1 *= float4(tmp1VolPan.x, tmp1VolPan.y, tmp1VolPan.x, tmp1VolPan.y);

          I don't know if that will make a huge difference, but it will save you some multiplies and may help. If you are parsing 10 mp3 files each time, that will definitely hurt: you're adding a lot of disk I/O. It is much better to have them in memory if you can fit them.
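
          For illustration, the same pattern can be sketched as a complete kernel. This is a minimal two-track sketch, not the poster's actual shader: the kernel name TwoTrackMixSketch, the "example" namespace and vendor strings, and the gain1/gain2 variable names are made up for this example, and the remaining eight tracks would repeat the same three lines each.

          ```pixelbender
          <languageVersion : 1.0;>

          kernel TwoTrackMixSketch
          <   namespace : "example";
              vendor : "example";
              version : 1;
              description : "Two-track sketch of the combined volume/pan multiply";
          >
          {
              input image4 track1;
              input image4 track2;

              parameter float vol1;
              parameter float vol2;
              parameter float panL1;
              parameter float panR1;
              parameter float panL2;
              parameter float panR2;

              output pixel4 dst;

              void
              evaluatePixel()
              {
                  // Fold volume into the pan gains once per track: 2 multiplies each.
                  float2 gain1 = float2(panL1, panR1) * vol1;
                  float2 gain2 = float2(panL2, panR2) * vol2;

                  pixel4 tmp1 = sampleNearest(track1, outCoord());
                  pixel4 tmp2 = sampleNearest(track2, outCoord());

                  // Channels 0 and 2 take the left gain, 1 and 3 the right gain,
                  // matching the original per-channel code. This costs 4 more
                  // multiplies per track, so 6 in total instead of 8.
                  tmp1 *= float4(gain1.x, gain1.y, gain1.x, gain1.y);
                  tmp2 *= float4(gain2.x, gain2.y, gain2.x, gain2.y);

                  dst = tmp1 + tmp2;
              }
          }
          ```

          Counting scalar multiplies only, the original code does 8 per track (80 per output pixel for ten tracks), while this form does 6 per track (60 per pixel). Whether that shows up as a measurable speedup depends on how the Flash Player runs the kernel, so it is worth timing rather than assuming.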

          • 2. Re: Optimizing Pixel Bender Code
            phaseblue Level 1

            Thanks a lot, Kevin! I'll give this a go and see if it helps!

            • 3. Re: Optimizing Pixel Bender Code
              phaseblue Level 1

              Kevin,

               

              Thanks again for your help!  The optimizations seem to have improved performance a little!

               

              At least (thanks to Pixel Bender) things are working to a point where I'm pretty satisfied!

               

              I just had one more question:

              Like you suggested, I am going to try to extract all the sounds into memory beforehand, but there is one problem: to do this, I must specify the size of the ByteArray (i.e. the size of the song file) in the sound.extract() method. From what I have seen so far, there is no accurate way to get the exact length/size of the mp3 file in order to do this. I've heard, and discovered firsthand, that the value returned by the sound.bytesTotal property is often very inaccurate. Is there a way to get the length of an mp3 file down to the sample?

               

              Again, I hate to bother you with posts that are somewhat off topic, but to be quite honest, this is such a niche area of sound processing that the only place I've found where people can offer accurate help is in the "Pixel Bender Forum"! The general AS3 forum is great in most situations, but I've tried in the past to ask audio-related questions that aren't even as difficult as this, and I get the feeling that 99.99% of people there don't have the slightest idea of what I'm talking about. The few that do are often in the same boat as I am!

               

              Thanks again for any advice!

               

              Matt

              • 4. Re: Optimizing Pixel Bender Code
                makc3d Level 1

                I always thought bytesTotal refers to the compressed data.

                • 5. Re: Optimizing Pixel Bender Code
                  phaseblue Level 1

                  Oh, you're right! The extract() method returns the uncompressed PCM data.

                  I overlooked that not-so-insignificant detail.

                   

                  Well, if that's the case, then the situation is even more difficult than I imagined.

                  Is there any accurate way of getting the uncompressed size of an imported audio (i.e. mp3) file with ActionScript before it's actually uncompressed?
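
                  One way to get a rough figure ahead of time can be sketched as follows. It is an estimate under stated assumptions, not an exact answer: it assumes the Sound has finished loading so that its length property (in milliseconds) is final, and that the decoded data is 44.1 kHz stereo with 32-bit float samples. The 6,950 ms duration is a hypothetical value chosen only to show the arithmetic.

                  ```latex
                  \begin{align*}
                  N_{\text{frames}} &\approx \frac{\texttt{length}_{\text{ms}}}{1000} \times 44{,}100 \\
                  B_{\text{bytes}} &= N_{\text{frames}} \times 2\ \text{channels} \times 4\ \text{bytes} = 8\,N_{\text{frames}} \\
                  \text{e.g. } \texttt{length}_{\text{ms}} = 6{,}950:\quad
                  N_{\text{frames}} &\approx 6.95 \times 44{,}100 = 306{,}495, \qquad
                  B_{\text{bytes}} = 2{,}451{,}960
                  \end{align*}
                  ```

                  Because mp3 encoders pad the stream, this may be off by some samples. The value returned by extract(), which is the number of samples actually written, gives the decoded count after the fact and can be used to check the estimate.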