A processing unit (CPU, GPU, or anything else) and RAM are usually separate components built on separate chips. But what if they were part of the same chip, interleaved with each other? That is exactly what Samsung did to create what it calls the world’s first High Bandwidth Memory (HBM) with built-in AI processing hardware, called HBM-PIM (PIM stands for processing-in-memory).

Samsung took its HBM2 Aquabolt chips and added Programmable Computing Units (PCUs) between the memory banks. These are relatively simple: they operate on 16-bit floating-point (FP16) values with a limited instruction set, and can move data and perform multiplication and addition.
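To give a feel for what such a restricted unit can do, here is a minimal toy model, not Samsung's actual design. The `ToyPCU` class, its four registers, and the `mov`/`mul`/`add`/`store` method names are all hypothetical; the only things taken from the article are the FP16 data type and the three kinds of operation (move, multiply, add).

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


class ToyPCU:
    """Hypothetical toy PCU: a few registers sitting next to one memory bank."""

    def __init__(self, bank):
        # The bank holds FP16 values; the register count is an assumption.
        self.bank = [to_fp16(v) for v in bank]
        self.regs = [0.0] * 4

    def mov(self, reg, addr):
        """Move a value from the neighbouring bank into a register."""
        self.regs[reg] = self.bank[addr]

    def mul(self, dst, a, b):
        """Multiply two registers, rounding the result to FP16."""
        self.regs[dst] = to_fp16(self.regs[a] * self.regs[b])

    def add(self, dst, a, b):
        """Add two registers, rounding the result to FP16."""
        self.regs[dst] = to_fp16(self.regs[a] + self.regs[b])

    def store(self, addr, reg):
        """Write a register back into the bank."""
        self.bank[addr] = self.regs[reg]


def dot_product(weights, inputs):
    """Dot product built only from move, multiply and add steps."""
    n = len(weights)
    # Layout (assumed): weights, then inputs, then one slot for the result.
    pcu = ToyPCU(list(weights) + list(inputs) + [0.0])
    for i in range(n):
        pcu.mov(0, i)        # load weight i
        pcu.mov(1, n + i)    # load input i
        pcu.mul(2, 0, 1)     # product
        pcu.add(3, 3, 2)     # accumulate in register 3
    pcu.store(2 * n, 3)      # the result never leaves the bank
    return pcu.bank[2 * n]


print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

Even this tiny instruction set is enough for the multiply-accumulate loops that dominate neural-network workloads, which is presumably why such a limited unit is still useful.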


[Images: PCUs interleaved with the memory banks • Each PCU is a very limited FP16 processor]
But there are many PCUs, and they sit right next to the data they work on. Samsung has successfully run the PCUs at 300 MHz, which works out to 1.2 TFLOPS of processing power per chip. And it did so at the same per-chip power consumption while transferring data at 2.4 Gbps per pin.
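As a rough sanity check on those figures, the arithmetic can be sketched as follows. The clock, the TFLOPS figure and the per-pin rate come from the article; the 1024-pin interface width is an assumption based on the standard HBM2 interface, and the function names are illustrative only.

```python
PCU_CLOCK_HZ = 300e6   # from the article: PCUs run at 300 MHz
PEAK_FLOPS = 1.2e12    # from the article: 1.2 TFLOPS per chip
PIN_RATE_GBPS = 2.4    # from the article: 2.4 Gbps per pin

# Assumption: the standard 1024-bit-wide HBM2 data interface.
ASSUMED_DATA_PINS = 1024


def flops_per_cycle(peak_flops, clock_hz):
    """FP16 operations all PCUs together must complete per clock cycle."""
    return peak_flops / clock_hz


def bandwidth_gbytes_per_s(pin_rate_gbps, pins):
    """Aggregate external bandwidth in gigabytes per second."""
    return pin_rate_gbps * pins / 8


print(flops_per_cycle(PEAK_FLOPS, PCU_CLOCK_HZ))                 # 4000.0
print(bandwidth_gbytes_per_s(PIN_RATE_GBPS, ASSUMED_DATA_PINS))  # 307.2
```

In other words, hitting 1.2 TFLOPS at such a modest clock requires about 4,000 floating-point operations per cycle across the chip, which is only plausible because the work is spread over many PCUs running in parallel.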

Power consumption per chip may be unchanged, but overall system power consumption is reduced by 71%. That is because a conventional processor has to move data twice: it reads the input from memory and then writes the result back. With HBM-PIM, the data barely moves at all.

It’s not just about saving energy. Using PIM for machine learning and inference tasks, researchers have seen system performance more than double. It’s a win-win situation.
The HBM-PIM design is backward compatible with conventional HBM2 chips, so no new hardware needs to be developed – the software just needs to tell the PIM system to switch from normal mode to in-memory processing mode.
There is a catch, though: the PCUs take up space previously occupied by memory banks. This cuts the capacity of each die in half, down to 4 gigabits. Samsung decided to split the difference and combine 4-gigabit PIM dies with regular 8-gigabit HBM2 dies. Using four of each, it created 6-gigabyte stacks.
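The 6-gigabyte figure follows from four 4-gigabit PIM dies plus four 8-gigabit regular dies. A short sketch of that arithmetic (the helper name and the all-regular comparison stack are illustrative assumptions, not from the article):

```python
BITS_PER_BYTE = 8


def stack_capacity_gbytes(dies):
    """Total stack capacity in gigabytes.

    `dies` is a list of (die_count, gigabits_per_die) pairs.
    """
    total_gbits = sum(count * gbits for count, gbits in dies)
    return total_gbits / BITS_PER_BYTE


# From the article: four 4 Gb PIM dies plus four 8 Gb regular dies.
hbm_pim_stack = [(4, 4), (4, 8)]

# For comparison (assumption): the same eight-die stack with regular dies only.
all_regular_stack = [(8, 8)]

print(stack_capacity_gbytes(hbm_pim_stack))      # 6.0
print(stack_capacity_gbytes(all_regular_stack))  # 8.0
```

Under that assumption, mixing the two die types costs a quarter of the capacity of an all-regular eight-die stack, rather than the half that an all-PIM stack would give up.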

There’s more bad news – it will be some time before HBM-PIM hits mainstream hardware. For now, Samsung has sent out chips for testing by partners developing artificial intelligence accelerators and expects the design to be validated by July.
HBM-PIM will be presented at the virtual International Solid-State Circuits Conference (ISSCC) this week, so we can expect more details.



