Sunday, January 21, 2018

North American Mass Spec Summer School!

Have some students of collaborators who could use an intensive summer school course in Mass Spec? Check this out. And look at the lineup!

You can register here.

No joke -- Madison is awesome in August. The students are gone and the weather is great.

Saturday, January 20, 2018

MLL-Rearrangements in Leukemia revealed by MudPIT LC-MS

I'll be honest -- I may have just made fun of using MudPIT in a meeting recently. In my immediate defense, I was only thinking about it in context of our Fusion 1 system because it has been my primary focus the last few days and nights.

You know what, though? We have an LTQ Velos that isn't being used as frequently as other systems and this beautiful recent study from the MudPIT experts at Stowers makes me think that this is exactly what the LTQ should be doing!

This study isn't in Cell because they didn't have anything else to put in the journal that month. This study is in Cell because it's flippin' awesome!

MLL is a gene that has a wild-type form (which is important) but can have weird translocations and produce strange chimeric proteins that are seriously bad news for the patient. Over 70 different chimeric proteins have been identified -- and they all sound like they sucked.

As you can imagine -- this is kind of a moving target. Seventy different protein variants? How do you even start to study this? This group says "oh. that's simple. we'll study it with EVERY analytical technique you've ever heard of"!!

This study has cell sorting, RNA-Seq, induced mutations, purifiable (via FLAG tag) proteins, more cell line combinations than you can shake a stick at, bone marrow transplants in mice that are THEN irradiated -- you name it. They threw it at this problem. Oh yeah! And they did proteomics!

What did they get out of this? Oh -- just the most thorough picture of how MLL translocations lead to the destruction of the important wild-type protein and a darned good picture of how the entire mechanism of MLL leukemia works. You know...nothing special...

An immediate question I had was how did this relatively small number of authors do ALL OF THESE THINGS? I looked and expected it to have 40 names on it.  I'm impressed, for real.

Side note: An LTQ is still an awesome instrument if you give it the right problems to solve!

Friday, January 19, 2018

Informing proteomic biomarker discovery with genomic databases!

Global proteomics is awesome. I LOVE to give the elevator pitch to someone about what proteomics is. I've run through it so many times that I've got it perfected, and I imagine that just about everyone doing this has one that is better than mine.

However -- there are some clear downsides to all of the statistics that are necessary to match every ion the instrument sees and/or fragments to a theoretical database containing somewhere between tens of thousands and hundreds of thousands (millions?) of theoretical sequences. Just a reminder: a 1% false discovery rate (FDR 0.01) on 1,000,000 peptide spectrum matches (PSMs) is 10,000 matches that could have occurred purely by chance.
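To put numbers on that, here is a toy sketch of the arithmetic and of the classic target-decoy estimate -- an illustration only, not any particular search engine's implementation:

```python
# Toy illustration of what a 1% FDR means at scale, plus the classic
# target-decoy estimate (decoy hits approximate the false target hits).

def expected_false_matches(n_psms: int, fdr: float) -> int:
    """How many PSMs could be chance matches at a given FDR."""
    return int(n_psms * fdr)

def decoy_fdr(n_targets: int, n_decoys: int) -> float:
    """Target-decoy FDR estimate: decoy matches / target matches."""
    return n_decoys / n_targets

print(expected_false_matches(1_000_000, 0.01))  # 10000
print(round(decoy_fdr(100_000, 1_000), 3))      # 0.01
```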

On the other extreme end -- you have the targeted proteomics stuff -- where you specifically look at a small set of things you are interested in.  This new study bridges this gap.

This study focuses purely on cancer biomarkers. To go after them, the authors narrow the definition of what a "biomarker" is by interrogating databases to build a list of around 1,000 proteins that have been linked to cancer in some way. I haven't looked at this list yet, but I like the number. If you are searching 1-10 proteins, I do not trust global FDR approaches like target decoy -- or even Percolator/Elucidator. They're great, but I think they need a lot of data to work right. Around 1,000 proteins? I'd use the global tools without hesitation (I hope it goes without saying that I would manually look through the matches, though!). Here, spectra appear to be searched against all of human UniProt/SwissProt, but the downstream analysis is informed by the biomarker list. I'm thinking that I might look at some other datasets and limit the FASTA to just the biomarkers this team has identified.
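If you wanted to try that FASTA-limiting idea yourself, a minimal sketch might look like this. The file contents and the second accession are made up for the demo; P04637 is the real UniProt accession for p53:

```python
# Sketch: filter a UniProt-style FASTA down to a set of biomarker
# accessions. The demo FASTA below is a made-up two-entry example.

def filter_fasta(fasta_text: str, keep_accessions: set) -> str:
    kept, keep_block = [], False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # UniProt headers look like >sp|P04637|P53_HUMAN ...
            parts = line.split("|")
            acc = parts[1] if len(parts) > 1 else ""
            keep_block = acc in keep_accessions
        if keep_block:
            kept.append(line)
    return "\n".join(kept)

fasta = """>sp|P04637|P53_HUMAN Cellular tumor antigen p53
MEEPQSDPSV
>sp|P00000|FAKE_HUMAN Not a biomarker
AAAAAA"""
print(filter_fasta(fasta, {"P04637"}))
```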

The team then develops a kind of extreme phenotype to assess how well this approach works. By arresting cancer cells of different types at different cell cycle checkpoints, they have a really interesting and complex model system to test it on. And it works! An LTQ (yup! a linear ion trap!) can identify and quantify more than 1/3 of the biomarkers from their starting list. Since we know that cancer is almost never just one protein being messed up, and is instead dozens or hundreds of proteins working together -- 300+ quantified proteins is more than enough to point you toward the pathways being affected!

Thursday, January 18, 2018

JUMP and JUMPg for tag based proteogenomics!

It's funny how often I am "introduced" to something and the search bar on this blog shows proof that I read a paper on the topic at some point. JUMP is one such thing!

I have an excuse for forgetting about it. I didn't have a Linux computer to run it on. Yesterday, however, my application for access to a small Linux computer....

...with 72,000 (seventy two THOUSAND!!!) processing cores was approved...and it's time to see what this little guy can do (and...what they bill for it...since I now have a billing account for it as well...)

Of course it has around 50 programs installed for next gen sequencing analysis, but it has 2 programs for proteomics and the first is JUMPg, which was recently described here.

My hopes were dashed a little when I found out it is only enabled to run on one node at a time (so... 28 or so of those 72,000 cores...) but it appears that multiple instances can be started. I'll send it something tough and see how it goes.
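A minimal sketch of that workaround -- splitting your runs across several independent single-node instances. The actual JUMPg command line isn't shown here (check its documentation for that), and the file names are hypothetical; this just shows the round-robin partitioning:

```python
# If each JUMPg instance is limited to one node, one workaround is to
# deal the input files out across several independent instances. The
# file names below are hypothetical examples.

def partition(files, n_instances):
    """Round-robin the files so each instance gets a similar load."""
    buckets = [[] for _ in range(n_instances)]
    for i, f in enumerate(files):
        buckets[i % n_instances].append(f)
    return buckets

runs = [f"run_{i:02d}.mzXML" for i in range(10)]
for bucket in partition(runs, 4):
    print(bucket)
```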

It seems like most institutions have super computing resources these days (they have to in order to support the "next gen" sequencing stuff). Maybe JUMPg is a resource you can sneak onto one near you as well!

Wednesday, January 17, 2018

Avogadro: Chemical editor, visualization and analysis

One of the first things I noticed after being out of the lab for a few years is that the scientists have been getting younger -- or...well...something else I'd rather not consider has occurred.

[Go Go Gadget Denial!]

An upside of this is a running notepad on my telephone of the cool new tools I'm learning about that are being used in school these days that I need to check out. The first one I'm getting to this morning is Avogadro. It was described originally here, but it appears to have evolved a lot since.

In its simplest form it is a really powerful molecule editor -- akin to ChemSketch/ChemDraw but with a simpler and more intuitive interface. If it has something comparable to the "mass spec scissors" I haven't found it yet (but I often just delete a bond anyway).

You can get Avogadro here.

Shoutout to Conor for the cool new icon/tool on my desktop!

Tuesday, January 16, 2018

May Institute Computation and Stats for MS and Proteomics! Deadline looming!

We've come a loooooong way in terms of bioinformatics in mass spectrometry -- I'd argue the biggest gains have all come in the last 5 years or so. However -- we've still got quite a ways to go.

One of the most exciting meetings coming up this year might be this one at Northeastern (in Boston, not to be confused with Northwestern, which is somewhere else).

If you want to go, you have to hurry and register. Deadline is January 31st.

Automatically building big & complex multi-organism FASTA databases in PD

Imagine that you are sitting there minding your own business and someone walks in with some Louisiana crayfish peptides. As cool as proteomics is, you'll have to convince me that there is a more appropriate usage of Crayfish than this....

...but let's assume that this is super important (and we only need a few micrograms of peptides anyway)

What if there isn't a good sequenced crayfish FASTA database? I don't know if there is; I'm working on something else and I only chose this example because I'm hungry. Before you go all out and start de novo sequencing everything, maybe you can start with just a giant FASTA. (I learned today that this is a hard "A": fast-AY, or fast-Eh if you're Canadian. Who knew? Everybody?)

You can start by building a FASTA that has all related organisms. If you have Mascot, you're in luck. You can just choose the taxonomy in your pulldown (assuming the complete database has been loaded). I don't have Mascot access at home so I went to Google and the first link was some terrifying exercise in BLASTP+ from command line where you cross-reference your taxonomy list from UniProt to the complete FASTA....

Then I remembered one of the perks of having PD maintenance -- something about FASTA downloads. Turns out it is pretty cool. If you look up crayfish (p.s. in my state we call them crawdaddies, no idea why) on Wikipedia you can find the entire taxonomy. You can then follow either of the links in the box at the top image (this pops up when you go FASTA Database Utilities --> Download from ProteinCenter). I chose Arthropoda in taxonomy, which gave me that number, and then I just hit Download.

It queues up and does everything, building you a huge database (prepare to wait a while if you choose TrEMBL) that you can then run to see if it actually finds you some hits.
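If you don't have PD maintenance, the same idea can be scripted against UniProt directly. The REST URL format below is my reading of the current UniProt API and worth verifying before you rely on it; the actual download is left commented out, and the entry counting is demonstrated on a two-entry dummy FASTA:

```python
# Sketch: build a UniProt REST query for every protein under a taxonomy
# node (6656 = Arthropoda in the NCBI taxonomy). The URL format is an
# assumption from the current UniProt API docs -- verify it first.
from urllib.parse import urlencode

def uniprot_taxonomy_url(taxon_id: int, reviewed_only: bool = True) -> str:
    query = f"taxonomy_id:{taxon_id}"
    if reviewed_only:
        query += " AND reviewed:true"  # SwissProt only; drop this for TrEMBL too
    params = urlencode({"query": query, "format": "fasta"})
    return f"https://rest.uniprot.org/uniprotkb/stream?{params}"

def count_entries(fasta_text: str) -> int:
    """Count FASTA entries by their '>' header lines."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))

print(uniprot_taxonomy_url(6656))
# To actually fetch (can be a long wait for big taxa):
# with urllib.request.urlopen(uniprot_taxonomy_url(6656)) as r:
#     fasta = r.read().decode()
demo = ">sp|A|A_X demo\nMK\n>sp|B|B_X demo\nMR\n"
print(count_entries(demo))  # 2
```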

Monday, January 15, 2018

ConDuct -- Maximizing ion transmission from the atmosphere into the MS!

I'm just going to leave this here. It's from this study from Andrew Krutchinsky et al. a couple of years ago.

It appears (to this biologist) to be a relatively minor alteration to the front of an instrument that causes an impressive gain in the efficiency of ion transmission from ionization to detection.

The authors conclude, from some really interesting assays, that the mechanism of action is probably something like this.

A mixture of light and heavy peptides looks pretty convincing (to this biologist).

Sunday, January 14, 2018

NeuCode for cell lines and animals -- full protocol!

NeuCode (neutron coding!) has seemed an almost inevitable replacement for a lot of our labeled proteomics techniques for a few years now. However, the fact that you are simply switching neutrons in different atoms has made some of us kind of gasp at how much resolution you need to pull it off.

(There are over 20 theoretical tags that can exist in something like a 0.030 Da space!)
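You can sanity-check that kind of spacing with standard isotope mass differences. A back-of-envelope sketch, using one published NeuCode +8 Da lysine pair (13C6/15N2 vs. 2H8), which sits about 36 mDa apart:

```python
# Back-of-envelope for the resolution NeuCode demands, using standard
# isotope mass differences (values from standard atomic mass tables).

D13C = 1.0033548   # 13C - 12C, Da
D15N = 0.9970349   # 15N - 14N, Da
D2H  = 1.0062767   # 2H - 1H, Da

label_a = 6 * D13C + 2 * D15N   # 13C6,15N2-lysine mass shift
label_b = 8 * D2H               # 2H8-lysine mass shift
delta = abs(label_b - label_a)

print(f"label shifts: {label_a:.4f} and {label_b:.4f} Da")
print(f"mass difference: {delta * 1000:.1f} mDa")  # ~36.0 mDa
# Naive resolving power just to split the pair at m/z 1000:
print(f"m/dm at m/z 1000: {1000 / delta:.0f}")
```

In practice you need far more than that naive m/dm (hence the 500,000 resolution scans), since the peaks have width and you need baseline separation across the whole mass range.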

This new Nature Protocol walks you through the entire thing -- including the really smart instrument method shown above. The trick is using a lower resolution MS1 scan to pick your data dependent ions for fragmentation and then obtaining those fragments in the ion trap simultaneously with your 500,000 resolution scan for quantification.

In case you're wondering how you would possibly write that method... Don't worry! They provide step-by-step instructions for both instruments!

Teaching a mass spec course? Need some awesome cartoons and illustrations?

I have a love/hate relationship with Twitter. It is an AMAZING tool for the rapid dissemination of scientific discovery. If it wasn't for that, I'd probably be happier if it didn't exist.

On the former topic --

Dr. Alex Jones is a researcher at the University of Warwick and she's been creating a bunch of cool animations for describing mass spectrometry that are available to the community (please give her appropriate credit, of course!). I hope that the positive feedback she's received is enough to encourage her to make some more.

MassSpecPro has also put some out there, but he's an ion physics guy so they're a little beyond this biologist and maybe intro courses (but good for physics classes, I bet!). I've seen them on Twitter, but I'm sure they're on his website here.

Saturday, January 13, 2018

Do we have open source chromatography options?

I was looking for something a little different, but this is too good to not share! 

OpenChrom Community Edition is open-source, community-driven software for looking at chromatography and mass spec data from virtually any device -- in the same interface.

There is an Enterprise Division if you're really serious and you like it, but if you meet the requirements to use the Community Edition and you're really tired of looking at output from 6 different vendor instruments, this could be huge.

What I was looking for was something like this:

You can't tell me that, with today's quality of home electronics that can be built with Arduino and controlled with Python and/or a Raspberry Pi, I can't recruit some summer students in mechanical engineering and have them build a functional HPLC.
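For what it's worth, the math a DIY pump controller would have to do isn't scary. A sketch of the core calculation for a stepper-driven syringe pump -- all the hardware numbers below are made-up examples, not any real product's specs:

```python
# Toy calculation at the heart of a DIY (Arduino/Raspberry Pi) syringe
# pump: turn a target flow rate into stepper-motor steps per second.
import math

def steps_per_second(flow_ul_min: float, syringe_diam_mm: float,
                     lead_mm: float, steps_per_rev: int) -> float:
    """flow rate -> plunger linear speed -> leadscrew revs -> steps/s."""
    area_mm2 = math.pi * (syringe_diam_mm / 2) ** 2
    flow_mm3_s = flow_ul_min / 60.0        # 1 uL == 1 mm^3
    plunger_mm_s = flow_mm3_s / area_mm2   # linear plunger speed
    revs_s = plunger_mm_s / lead_mm        # leadscrew revolutions per second
    return revs_s * steps_per_rev

# Example: 300 nL/min nanoflow, 4.61 mm syringe bore, 2 mm leadscrew
# pitch, 3200 microsteps per revolution.
print(f"{steps_per_second(0.3, 4.61, 2.0, 3200):.3f} steps/s")
```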

I have access to a number of new/newish HPLCs now and I was surprised to see that many of the features that were standard on our old stack in grad school (that was controlled with a monochrome Macintosh II...and I wonder if it is still cranking along making beautiful chromatograms -- yup! sure is!) aren't things that we have now. Sure, the pressure is higher on the pumps, and the mixing is supposedly more uniform, and there are cool things like what the print cartridge manufacturers use to make sure I'm using the instrument vendor's columns. There are even cooler things like 1/16th inch screw unions that can't be used together, and there are inconsistent labeling schemes for solvent delivery line diameters -- from the same vendor.

Important side note: Do not use NanoVuper and ZenFut unions interchangeably! They are both 1/16th. They are not interchangeable or compatible! They just look compatible and the consequences of using the wrong one can mean pulling a switching valve and trying to remove the dead volume seal. I didn't make this mistake, I just heard about it ;)

Maybe I'm just mad about pump seals and a fraction collector that doesn't automatically progress to the next 96 well plate so we can't batch fractionations (different vendors, btw, but the latter makes reproducing the ultradeep proteomics methods out of the Olsen lab seem much more difficult). I'm probably not angry enough to actually spend my Saturday morning emailing engineers and chromatographers I know to see if they'd like to pool resources on such a ridiculous after-school endeavor.  No one is that weird, right?

How would you even build an autosampler? Where would you get something in today's world that would have precisely automated and easily programmable recognition of space in X, Y, and Z planes that you could obtain with Prime Shipping by Tuesday? Even if I was that weird, I bet you I'd be stumped by that problem. And I'm sure it would be impossible to use that exact same product to make a fraction collector that would progress to a second 96-well plate for collection of the next sample.

I'm probably only weird enough to ramble on about it on some weird blog...however, that post on CF sure seems to be getting a lot of attention, and I wonder if anyone else is that weird?

EDIT:  WHOA!!! Someone has already done the Autosampler. I didn't have this idea first at all! It's called OpenSampler!!  Check this out!

Holy cow. I'm amazed. Motor screws...Valco valves...Arduino-compatible linear actuators...flow meter might be a challenge...though CMOS looks pretty cool...

EDIT 2 (1/15/2018): What about a UV detector for GC or CE separations?

This team built one with a sophomore engineering class in Ann Arbor. Price? $50.

You can check out this Open Access paper here.

Friday, January 12, 2018

Upgrade the valves on your existing nanoLC!

Have you had some vague and somewhat hard to diagnose issues with pressure consistency on your little self-contained nanoLC?  If you think a minute about the symptoms you are seeing does this product make a ton of sense to you? Especially during high pressure switching....?

Thursday, January 11, 2018

Exceeding GPU memory limits to allow proteomics data processing!

This isn't even close to the first time someone has set up GPU based data processing, but it's still really cool!

For people who aren't as obsessed with today's computers, this is how most of them work:
You have a Central Processing Unit (CPU). There is probably a sticker on the PC that says what it is: i3, i7, or Xeon -- which...really...means nothing, because there can be a 3 order of magnitude difference in power and efficiency between different generations of these processor families. It is mostly an indication of how much the PC cost when new, i3 being the least and Xeon the most. A crazy huge CPU has 20 cores (though multi-chip setups might allow near 100 these days).

Your computer also has a Graphics Processing Unit (GPU). Lower power ones may be built directly into the motherboard. Higher power ones will be completely separate units attached to the motherboard. Modern GPUs have THOUSANDS of cores. These cores are generally taxed with controlling a small number of pixels and their work load isn't very hard. There isn't a real reason to give them tons of memory.

Side note: Who is old enough to remember when you had to purchase a secondary "math processing unit" for your PC so it could handle big data like multiplying 6 digit numbers....?  We've come a long way!

My understanding is that one of the big problems for searching a spectrum with a GPU is the memory issue -- a spectrum is too large to fit in the memory available to each tiny little processing core.

That's a lot of words to introduce G-MSR which you can read about (open access) here!

G-MSR reduces the data that each GPU core gets hit with to make sure that each little tiny core can handle what it is told to process. The authors hit it with 14 different historic datasets (I think they're all high resolution MS/MS) and they can process all of them.
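To be clear, this is not G-MSR's actual algorithm (read the paper for that) -- just a toy example of the general flavor of spectral reduction: keep only the top-N most intense peaks per m/z window so each reduced spectrum fits a small per-core memory budget:

```python
# Toy spectral reduction: within each m/z window, keep only the top-N
# most intense peaks, shrinking the spectrum to a fixed memory budget.

def reduce_spectrum(peaks, window_da=100.0, top_n=3):
    """peaks: list of (mz, intensity) tuples. Returns the reduced list."""
    windows = {}
    for mz, inten in peaks:
        windows.setdefault(int(mz // window_da), []).append((mz, inten))
    kept = []
    for w in sorted(windows):
        best = sorted(windows[w], key=lambda p: p[1], reverse=True)[:top_n]
        kept.extend(sorted(best))  # restore m/z order within the window
    return kept

spec = [(101.1, 5.0), (102.2, 50.0), (103.3, 7.0), (104.4, 1.0),
        (201.5, 80.0), (202.6, 2.0)]
print(reduce_spectrum(spec, window_da=100.0, top_n=2))
```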

I looked up the GPU they are using and it's $4,000! It looks like one specifically designed for industrial applications, but it doesn't look like it has more cores or memory than your standard $300-$500 GPU you'll use for 3D rendering, playing Injustice 2, or mining applications. It would be interesting to see how this algorithm would do with something that it would be easy to array 5-10 of....

The program is GPL open and can be downloaded from GitHub here.

ABRF and ASMS registrations are now open!

It's 2018! And time to start picking conferences. I just got an ABRF announcement this morning. I forgot it was Myrtle Beach this year!

You could be there! (Around all the great talks from core directors and high throughput researchers from around the world, of course!)

You can check it out and register here! 

ASMS, however, decided to do a 180 this year. Instead of picturesque Indiana, Missouri and Baltimore (hey! what's wrong with Baltimore?!?! oh yeah...the conference attendees that got mugged or assaulted... It's with this in mind that I feel I have the freedom to make fun of other conference locations, since mine might not have been everyone's favorite.) This year, however!!


It's a fact. It is the greatest city in the HISTORY of mankind.  Discovered by the Germans in 1904....

(The best part is the blooper at the end where even Will Ferrell can't say that line with a straight face)

REMINDER!!  ASMS abstracts are due FEBRUARY 2nd!!

Register for ASMS 2018 here!!

Wednesday, January 10, 2018

SNaPP proteomics! Online and automated for nanograms of material!

This sample prep methodology is the topic of 2 recent papers out of PNNL. The first can be found here and the second (that the above image was taken from) is available here.

The idea is simple -- minimize (almost negate) sample handling and get tiny amounts of protein digested automatically and shot into an instrument for proteomics.

The execution is not so simple -- but the two studies both show remarkable success with tiny amounts of protein yielding huge numbers of peptides/protein IDs.

I'm looking for any way I can automate and standardize what is going into the instruments around my work and this looks like a concept that is almost what I'm dreaming of.

Almost -- cause the sample prep isn't happening on the next sample while the first one is running -- but the authors are very clear that they are working on it.

To the authors: If you have the misfortune of landing on this silly site -- If you need a beta tester for your finished project I happily volunteer!