What does the input look like? In CPU mode in the Toolkit, all processing is done natively at 32bpp, whereas Flash stores images at 8bpp, so the data is converted on its way into and out of Pixel Bender processing. There are more details in the Flash section of the documentation included with the Pixel Bender Toolkit.
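To illustrate the kind of conversion described here, this is a rough Python sketch, not the actual Flash or Toolkit code. It assumes 8-bit channel values in 0-255 are mapped to floats in 0.0-1.0 on the way in, then clamped and rounded on the way out. The function names `to_float` and `to_byte` are made up for the example.

```python
def to_float(byte_value):
    """Map an 8-bit channel value (0-255) to a float in 0.0-1.0 (assumed mapping)."""
    return byte_value / 255.0


def to_byte(float_value):
    """Clamp a float channel to 0.0-1.0 and quantize it back to 0-255 (assumed mapping)."""
    clamped = min(max(float_value, 0.0), 1.0)
    return int(round(clamped * 255.0))


# Values outside 0.0-1.0 survive in float processing but are clamped when stored.
overbright = to_byte(1.7)
# A plain round trip through float leaves an 8-bit value unchanged.
round_trip = to_byte(to_float(128))
```

If something like this is going on, intermediate values outside 0.0-1.0 would look fine in the CPU preview and get clipped once they are written back to a Flash bitmap.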
This is my input, a simple Render Clouds from Photoshop, scaled from 512 to 128:
It really is just a JPG...
What confuses me most is how the out coordinates work differently. With CPU, I can only work within the confines of the image; with Flash, it scales and I have control over the size of the bitmap?
I can send you the PBK if you want; I'd prefer not to share my source here.
I think I understand what the problem is: it's with the Flash renderer of the Toolkit.
My kernel relies heavily on knowing the dimensions of the source image (this is very annoying compared to HLSL, where coordinates always range from 0 to 1).
Now it seems like the Flash renderer takes the source image and rescales it to fit your screen, making all my wrapping and offsetting code useless, since I can never accurately know the size of the bitmap. The good thing is that in my own Flash application I have the control I want, but I really suggest changing this in the Toolkit!
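To show why the rescaling matters, here is a small Python sketch of size-dependent wrapping. It is not the kernel from this thread (that source was not posted). It assumes the kernel works in pixel coordinates and receives the source size explicitly; `normalize`, `wrap`, and the 512 rescaled size are hypothetical and only there for illustration.

```python
def normalize(coord, size):
    """Convert a pixel coordinate (x, y) to the 0-1 range for an image of the given size."""
    x, y = coord
    w, h = size
    return (x / w, y / h)


def wrap(coord, size):
    """Wrap a pixel coordinate so it stays inside an image of the given size."""
    x, y = coord
    w, h = size
    return (x % w, y % h)


# With the real 128x128 source, a coordinate past the right edge wraps around.
expected = wrap((130.5, 10.5), (128, 128))
# If the renderer silently rescales the image (512 is a made-up size here),
# the same coordinate no longer wraps at all.
rescaled = wrap((130.5, 10.5), (512, 512))
```

The same offset gives a different sample position as soon as the assumed size and the actual size disagree, which matches the behaviour described above.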
I found out the other problem. step returns a different result in Flash than it does on CPU!
A step(x, y) becomes step(y, x).
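For anyone wondering what the swap does in practice, here is a Python sketch. It assumes the GLSL-style definition, where step(edge, x) returns 0.0 when x is below edge and 1.0 otherwise. `step_swapped` is a made-up name that models the behaviour reported above, not a real Pixel Bender function.

```python
def step(edge, x):
    """GLSL-style step (assumed definition): 0.0 if x < edge, otherwise 1.0."""
    return 0.0 if x < edge else 1.0


def step_swapped(edge, x):
    """Model of the reported behaviour: the two arguments are exchanged."""
    return step(x, edge)


# The two versions disagree whenever the arguments differ...
normal = step(0.5, 0.7)
swapped = step_swapped(0.5, 0.7)
# ...and agree only when both arguments are equal.
same = step(0.5, 0.5) == step_swapped(0.5, 0.5)
```

Under that assumption the mask is inverted everywhere except where the two values are exactly equal, so a possible workaround is to replace step with an explicit comparison until this is fixed.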