in this forum you'll find some posts about optimizations involving inline assembly code, i.e.:
inline functions in C, gcc optimization and floating point arithmetic issues
But in general I would avoid optimizations at the inline assembler level. Instead I would use profile your app and use the results to zoom in on specific areas that your optimizations will benefit from. You may have already done that and identified floating point calculations as one of those areas. If floating point calculations are your problem then you might be able to get better performance by using integer math internally (if that's possible). This technique is used by programs like Donald Knuth's TeX. The idea is that you do your math in integer units of floats (i.e. 1.234cm = 1 unit, 2 * 3 = 6 units = 6 * 1.234cm = 7.404cm).
Another performance hog that will probably show up in your profiling results will probably point you to the fact that crossing the border between AS3 and C world (calling from AS3 code into Alchemy-C code and vice versa) is very expensive. You'll get good performance improvements by reducing calls that cross that boundary.
If there is a piece of code in particular that you need to optimize I would post it here in this forum.
I am sure you'll find help here.