Transactional Synchronization Extensions in Haswell
Intel have released details of their new Transactional Synchronization Extensions (TSX) programming model and instruction set, which will be introduced with Haswell. These extensions help with synchronization of shared memory in multithreaded apps that use locks, by letting one large (coarse-grained) lock do the job of many small (fine-grained) locks without the usual loss of concurrency.
Traditionally, programmers would lock just the small area of shared memory that is being worked on, with overheads for each lock/release operation. With TSX, it is the hardware, and not the software, which decides how threads should execute when using lock-protected shared memory, and they will be serialized only when it is needed. Instead of having the software take a separate lock for each small area of shared memory that a thread works on, the program can protect a larger area of memory with a single lock, and then run the threads as normal.
If two or more threads try to operate on the same piece of data within that memory area, the processor detects the conflict, aborts the speculative execution, discards its uncommitted changes, and re-runs the affected code with the lock actually taken, so the threads are serialized as needed. Otherwise all the threads run as normal, without the usual overheads associated with locking memory.
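To make the "serialized only when it is needed" rule concrete, here is a toy software model in C of when two critical sections would count as conflicting. It is only a sketch under stated assumptions: the `txn_footprint` type, `line_bit` and `txn_conflict` are hypothetical names invented for this illustration, and real hardware does this tracking internally at cache-line granularity rather than through any such API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: each bit stands for one cache line of the
   lock-protected memory area (up to 64 lines in this toy version). */
typedef struct {
    uint64_t read_set;   /* lines the critical section has read */
    uint64_t write_set;  /* lines the critical section has written */
} txn_footprint;

/* Helper: the bit that represents a given line number. */
static uint64_t line_bit(unsigned line) {
    assert(line < 64);
    return (uint64_t)1 << line;
}

/* Two speculative critical sections conflict when one of them writes a
   line that the other has read or written. Two readers never conflict. */
static bool txn_conflict(txn_footprint a, txn_footprint b) {
    return (a.write_set & (b.read_set | b.write_set)) != 0
        || (b.write_set & a.read_set) != 0;
}
```

In this model, two threads that update different lines under the same big lock do not conflict and could run side by side, while two threads that write the same line would have to be serialized.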
There are two software interfaces for TSX. First up is Hardware Lock Elision (HLE), a legacy-compatible extension that uses the XACQUIRE and XRELEASE instruction prefixes to mark transactional regions. Software written using this technique will run on processors with and without TSX, because older processors simply ignore the prefixes. The second interface is Restricted Transactional Memory (RTM), a set of new instructions (XBEGIN, XEND and XABORT) that gives the programmer a more flexible way to set up transactional regions. Programmers need to provide an alternate code path that runs when a transaction aborts, and must also cover the case where the code is run on a system without the TSX instructions.
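As a rough illustration of the RTM interface and its alternate code path, the following C sketch first checks CPUID (leaf 7, sub-leaf 0, EBX bit 11 reports RTM; bit 4 reports HLE) and only then tries a transaction, falling back to a plain spin lock otherwise. It assumes GCC or Clang on x86 for `<cpuid.h>` and the `_xbegin`/`_xend`/`_xabort` intrinsics from `<immintrin.h>`; the names `fallback_lock`, `shared_counter`, `cpu_has_rtm`, `try_elided_increment` and `counter_increment` are made up for this example, and a production lock would add retry and back-off logic.

```c
#include <assert.h>
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <immintrin.h>
#define TSX_DEMO_X86 1
#else
#define TSX_DEMO_X86 0
#endif

static atomic_int fallback_lock;  /* 0 = free, 1 = held (hypothetical demo lock) */
static long shared_counter;       /* data protected by fallback_lock */

/* Returns 1 if CPUID reports RTM support, 0 otherwise. */
static int cpu_has_rtm(void) {
#if TSX_DEMO_X86
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_max(0, 0) < 7) return 0;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return (int)((ebx >> 11) & 1u);
#else
    return 0;
#endif
}

#if TSX_DEMO_X86
/* Built with RTM enabled for this function only; it must be called
   only after cpu_has_rtm() has returned 1. */
__attribute__((target("rtm")))
static int try_elided_increment(void) {
    if (_xbegin() == _XBEGIN_STARTED) {
        /* Reading the lock puts it in the read set, so a thread that
           takes the real lock aborts this transaction. */
        if (atomic_load_explicit(&fallback_lock, memory_order_relaxed) != 0)
            _xabort(0xff);
        shared_counter++;
        _xend();
        return 1;  /* committed without ever taking the lock */
    }
    return 0;      /* aborted: speculative changes were discarded */
}
#endif

static void counter_increment(void) {
    if (cpu_has_rtm()) {
#if TSX_DEMO_X86
        if (try_elided_increment()) return;
#endif
    }
    /* Alternate path: no RTM, or the transaction aborted. Lock for real. */
    while (atomic_exchange_explicit(&fallback_lock, 1, memory_order_acquire) != 0) { }
    shared_counter++;
    assert(atomic_load(&fallback_lock) == 1);
    atomic_store_explicit(&fallback_lock, 0, memory_order_release);
}
```

Calling `counter_increment()` gives the same result on any machine: with RTM it may commit without touching the lock, and without RTM it quietly uses the lock every time. Checking CPUID on every call is wasteful, so real code would cache the result once at start-up.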
To summarize, when writing highly parallel applications with a lot of shared memory, TSX can make program development simpler and execution a lot faster. It'll be interesting to see how well it works alongside some GPGPU computing if the iGPU has the power.
Transactional Synchronization in Haswell
Coarse-grained locks and Transactional Synchronization explained