I read a good article on why it is difficult to write a decent multi-threaded app that scales well with the number of hardware cores, and wanted to share it with you all:
http://www.javacodegeeks.com/2012/08/what-makes-parallel-programming-hard.html
Optimizing for speed can even introduce unintended, artificial limits on how well the app scales. Something to think about.