SSE4.1 consists of 47 instructions that improve the performance of media data manipulation:

  • 2 dword multiply instructions
  • 2 single- and double-precision dot product instructions
  • 1 streaming load hint instruction
  • 6 packed blending instructions
  • 8 packed integer MIN/MAX instructions
  • 4 instructions used for rounding scalar and packed single- and double-precision operands
  • 7 instructions used to simplify insertion and extraction of data to/from XMM registers
  • 12 instructions used to convert packed integer data
  • 1 instruction that improves sum of absolute differences computation for 4-byte blocks
  • 1 search instruction that determines the value and location of the minimum unsigned word in a block of 8 packed unsigned words
  • 1 packed test instruction
  • 1 128-bit packed qword equality test
  • 1 instruction used to pack dword to word with unsigned saturation
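
To make two of the entries above more concrete, here is a minimal scalar sketch in C of what the minimum-word search (mnemonic PHMINPOSUW) and the dword-to-word pack with unsigned saturation (mnemonic PACKUSDW) compute. It is only a reference model for reasoning about results, not the hardware implementation: the real instructions work on whole XMM registers at once, and the names min_pos_u16, min_pos_u16x8 and pack_dword_to_word_usat are hypothetical, chosen for this illustration.

```c
#include <stdint.h>

/* Hypothetical result type: smallest word and where it was found. */
typedef struct {
    uint16_t value; /* minimum unsigned word in the block */
    uint16_t index; /* position (0..7) of its first occurrence */
} min_pos_u16;

/* Scalar model of the minimum search over 8 packed unsigned words.
 * The strict '<' comparison keeps the lowest index when values tie. */
static min_pos_u16 min_pos_u16x8(const uint16_t block[8])
{
    min_pos_u16 r = { block[0], 0 };
    for (uint16_t i = 1; i < 8; ++i) {
        if (block[i] < r.value) {
            r.value = block[i];
            r.index = i;
        }
    }
    return r;
}

/* Scalar model of packing one signed dword into an unsigned word:
 * values below 0 clamp to 0, values above 65535 clamp to 65535. */
static uint16_t pack_dword_to_word_usat(int32_t x)
{
    if (x < 0) {
        return 0;
    }
    if (x > 0xFFFF) {
        return 0xFFFF;
    }
    return (uint16_t)x;
}
```

For the block {9, 4, 7, 4, 12, 65535, 5, 8}, min_pos_u16x8 reports the value 4 at index 1, and pack_dword_to_word_usat maps -5 to 0 and 70000 to 65535. In real code the same operations would normally be reached through compiler intrinsics rather than loops like these.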

SSE4.1 is only the first part of the SSE4 instruction set. It was first introduced in the Intel Penryn core in January 2008. The first AMD microprocessors with SSE4.1 support were the Bulldozer-based FX-Series and Opteron 6200 families, released in October and November 2011 respectively.

