SSE intrinsics optimizations in popular compilers

Lately I have been playing a lot with SSE optimizations and I really enjoy it so far. Using functions to tell the compiler which instructions to use makes you feel the power at your fingertips. At first I was naive and thought the compiler would do exactly what it was told, assuming that you know what you’re doing. The SSE intrinsics header file looked like mostly a bunch of calls to internal GCC functions, or ‘extern’ declarations in MSVC, suggesting that the compiler would simply follow your lead.

I assumed wrong. The compiler will take the liberty of optimizing your code even further, in places you wouldn’t even think about, though I have noticed that this is not always the case with MSVC. MSVC is sometimes too trusting of the coder, even when obvious optimizations could be made. After grasping the concept of SSE and what it can do, I quickly realized that MSVC won’t optimize as well as GCC 4.x or ICC.

I read a lot of forum posts from people who want to gain speed by using SSE to optimize their core math types, such as a 4D vector or a 4×4 matrix. While SSE can noticeably boost performance, by about 10-30% depending on usage, there is no magic switch that tells the compiler to rewrite your code with SSE for you. You need to know how to use the intrinsics, optimize along the way, and carefully examine the resulting assembly code.
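To make the idea concrete, here is a minimal sketch of what a 4D vector built on SSE intrinsics might look like. The names `Vec4`, `make_vec4`, `add`, `dot` and `store` are hypothetical and chosen only for illustration; the intrinsics themselves come from `<xmmintrin.h>`. It assumes an x86 target with SSE enabled and a compiler that accepts C++11 brace initialization.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_add_ps, _mm_mul_ps, ...

// Hypothetical 4D vector wrapper around one 128-bit SSE register.
struct Vec4 {
    __m128 v;
};

// _mm_set_ps takes its arguments from the highest lane to the lowest,
// so x ends up in lane 0 and w in lane 3.
inline Vec4 make_vec4(float x, float y, float z, float w) {
    return Vec4{ _mm_set_ps(w, z, y, x) };
}

// Component-wise addition: a single addps instruction.
inline Vec4 add(Vec4 a, Vec4 b) {
    return Vec4{ _mm_add_ps(a.v, b.v) };
}

// Dot product using only SSE1 instructions (no haddps or dpps).
inline float dot(Vec4 a, Vec4 b) {
    __m128 m = _mm_mul_ps(a.v, b.v);                // per-lane products
    __m128 s = _mm_add_ps(m, _mm_movehl_ps(m, m));  // lane 0: m0+m2, lane 1: m1+m3
    s = _mm_add_ss(s, _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 1, 1, 1)));  // lane 0: full sum
    return _mm_cvtss_f32(s);                        // extract lane 0
}

// Unaligned store, so the destination array needs no special alignment.
inline void store(Vec4 a, float out[4]) {
    _mm_storeu_ps(out, a.v);
}
```

Compiling a file like this with something along the lines of `g++ -O2 -msse -S` writes out the generated assembly, which is the place to check whether the compiler kept, reordered or merged the instructions you asked for.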

This article will closely inspect and analyze the assembly output of three major compilers: GCC 4.x targeting Linux (4.3.3 specifically), the latest stable MSVC 2008 (version 9.0.30729.1 SP1) and ICC 11.1.


Implementing EventDispatcher in C++

Call me crazy, but I really like Flash’s EventDispatcher class – it’s simple, powerful and most of all relatively fast.

I felt the need to take EventDispatcher out of my Flash projects and into my more advanced C++ ones. This turned out to be quite an easy task.

Instead of a boring ‘download code’ link, I will write out the steps of implementing it with the C++ STL, just because I feel like writing.
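As a rough idea of the shape such a class can take, here is a minimal sketch built on STL containers. It is an illustration under assumptions, not the implementation from the full post: the `Event` struct, the `ListenerId` handle and the use of C++11’s `std::function` are all choices made for this example, and only the method names mirror Flash’s API.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal event. Flash's Event carries more (target, bubbling, ...).
struct Event {
    std::string type;
};

class EventDispatcher {
public:
    typedef std::function<void(const Event&)> Listener;
    typedef int ListenerId;  // assumed handle type, used to remove a listener later

    // Registers a listener for an event type and returns a handle to it.
    ListenerId addEventListener(const std::string& type, Listener fn) {
        ListenerId id = nextId_++;
        listeners_[type].push_back(Entry{id, std::move(fn)});
        return id;
    }

    // Removes the listener with the given handle; returns false if not found.
    bool removeEventListener(const std::string& type, ListenerId id) {
        auto it = listeners_.find(type);
        if (it == listeners_.end()) return false;
        std::vector<Entry>& entries = it->second;
        for (auto e = entries.begin(); e != entries.end(); ++e) {
            if (e->id == id) {
                entries.erase(e);
                return true;
            }
        }
        return false;
    }

    bool hasEventListener(const std::string& type) const {
        auto it = listeners_.find(type);
        return it != listeners_.end() && !it->second.empty();
    }

    // Calls every listener registered for ev.type, in registration order.
    void dispatchEvent(const Event& ev) const {
        auto it = listeners_.find(ev.type);
        if (it == listeners_.end()) return;
        // Iterate over a copy so listeners may add or remove listeners safely.
        std::vector<Entry> snapshot = it->second;
        for (const Entry& e : snapshot) e.fn(ev);
    }

private:
    struct Entry {
        ListenerId id;
        Listener fn;
    };
    std::map<std::string, std::vector<Entry>> listeners_;
    ListenerId nextId_ = 0;
};
```

Returning a handle from `addEventListener` is a deliberate deviation from Flash, where you remove a listener by passing the same function again; `std::function` objects cannot be compared for equality, so an id is the simplest workaround in this sketch.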


Typesafe assignable enumerations in AS3

Being a huge C/C++ fan, I had a really hard time switching to AS3 and giving up many of C++’s powerful features, such as enumerations.

I was surprised that a programming language that borrows so much from Java and C# doesn’t have enumerations built into the language. After a few Google searches, it turned out that no one seemed to have implemented a type-safe, assignable enumeration design pattern in AS3. The various code examples I’ve seen were either not type safe or not assignable (using the enumerations as ‘global’ consts). Since those were not the behaviors I wanted, I had to write my own.
