Thursday, January 11, 2018

Exceeding GPU memory limits to allow proteomics data processing!


This isn't even close to the first time someone has set up GPU-based data processing, but it's still really cool!

For people who aren't as obsessed with today's computers, this is how most of them work:
You have a Central Processing Unit (CPU). There is probably a sticker on the case that tells you what it is: i3, i7, or Xeon, which...really...means very little on its own, because there can be a three-order-of-magnitude difference in power and efficiency between different generations of these processor families. Mostly it's an indication of how much the PC cost when new, with i3 being the least and Xeon the most. A crazy huge CPU has around 20 cores (though multi-chip setups might get you near 100 these days).

Your computer also has a Graphics Processing Unit (GPU). Lower-powered ones may be built directly into the motherboard; higher-powered ones are completely separate units attached to it. Modern GPUs have THOUSANDS of cores. Each core is generally tasked with controlling a small number of pixels, and that workload isn't very hard, so there has never been a real reason to give any individual core tons of memory.
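For the curious, here's what that "thousands of tiny jobs" model looks like in practice. This is a minimal CUDA sketch of my own, purely for illustration (it has nothing to do with the paper): every thread brightens exactly one pixel, and the hardware runs huge batches of those threads at once.

```cuda
#include <cuda_runtime.h>

// Each GPU thread handles exactly one pixel -- thousands of tiny,
// independent jobs running at once, the workload GPUs were built for.
__global__ void brighten(unsigned char *pixels, int n, int delta)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's pixel
    if (i < n)
        pixels[i] = (unsigned char)min(255, pixels[i] + delta);
}

int main()
{
    const int n = 1 << 20;                 // ~1 million pixels
    unsigned char *d_pixels;
    cudaMalloc(&d_pixels, n);
    cudaMemset(d_pixels, 100, n);          // mid-gray image

    int threads = 256;                     // threads per block
    int blocks  = (n + threads - 1) / threads;
    brighten<<<blocks, threads>>>(d_pixels, n, 50);
    cudaDeviceSynchronize();

    cudaFree(d_pixels);
    return 0;
}
```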

Side note: who is old enough to remember when you had to purchase a separate "math coprocessor" for your PC so it could handle big data like multiplying 6 digit numbers...? We've come a long way!

My understanding is that one of the big problems with searching spectra on a GPU is memory -- a spectrum is too large to fit in the small amount of fast memory available to each tiny processing core.
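To make that concrete, here's another illustrative CUDA sketch (again my own toy example, not anything from the paper). When a spectrum's intensity array won't fit in the fast on-chip "shared" memory, the standard workaround is to stage it through in small tiles; here each block of threads reduces one tile down to a single number (its tallest peak).

```cuda
#include <cuda_runtime.h>

// Illustrative only: the full intensity array is far too big for the fast
// per-block (shared) memory, so each thread block stages one small tile at
// a time and reduces it to a single value -- here, the tile's max peak.
// Launch with one block per tile:
//   tile_max<<<(n + TILE - 1) / TILE, TILE>>>(d_intensities, d_maxima, n);
#define TILE 256

__global__ void tile_max(const float *intensities, float *tile_maxima, int n)
{
    __shared__ float tile[TILE];               // tiny on-chip buffer
    int g = blockIdx.x * TILE + threadIdx.x;   // global index into spectrum
    tile[threadIdx.x] = (g < n) ? intensities[g] : 0.0f;
    __syncthreads();

    // Standard tree reduction within the tile.
    for (int stride = TILE / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            tile[threadIdx.x] = fmaxf(tile[threadIdx.x],
                                      tile[threadIdx.x + stride]);
        __syncthreads();
    }

    if (threadIdx.x == 0)
        tile_maxima[blockIdx.x] = tile[0];     // one result per tile
}
```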

That's a lot of words to introduce G-MSR, which you can read about (open access) here!



G-MSR reduces the data that each GPU core gets hit with, making sure each little tiny core can handle what it is told to process. The authors hit it with 14 different historic datasets (I think they're all high resolution MS/MS) and it processed all of them.
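To be clear, I haven't reimplemented their algorithm; the sketch below is just one naive flavor of what "reducing the data per core" can mean (the bin width and the keep-the-tallest-peak rule are my inventions, not theirs): collapse a spectrum's raw peaks into a fixed, much smaller set of m/z bins before handing it off for searching.

```cuda
#include <cuda_runtime.h>

// NOT the G-MSR algorithm -- a naive sketch of data reduction in general:
// shrink each spectrum so its working set fits the fast memory available
// per core. Raw peaks are collapsed into fixed m/z bins, keeping only the
// tallest peak per bin. binned[] must be zeroed before launch.
__global__ void bin_peaks(const float *mz, const float *intensity,
                          int n_peaks, float mz_min, float bin_width,
                          int n_bins, float *binned)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per peak
    if (i >= n_peaks) return;

    int b = (int)((mz[i] - mz_min) / bin_width);    // which bin am I in?
    if (b < 0 || b >= n_bins) return;

    // atomicMax only exists for integers, so use the usual trick of
    // comparing float bit patterns as ints (valid here because the
    // intensities are non-negative, where the orderings agree).
    atomicMax((int *)&binned[b], __float_as_int(intensity[i]));
}
```

A spectrum with 100,000 raw peaks binned down to a few thousand values is a very different beast for a memory-starved core.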

I looked up the GPU they used and it's $4,000! It looks like one specifically designed for industrial applications, but it doesn't appear to have more cores or memory than the standard $300-$500 GPU you'd use for 3D rendering, playing Injustice 2, or mining. It would be interesting to see how this algorithm would do with hardware cheap enough that you could easily array 5-10 of them....

The program is open source under the GPL and can be downloaded from GitHub here.

