Friday, April 20, 2018
i had to disable my caps lock key or it would look like i was crazily shouting about how much i love this new study and software throughout this whole post. you can check out this awesome new tool here!
if you do high ph offline fractionation followed by the same low ph gradient on every one of these fractions and you look at the data coming off, you'll think something like "wow...i really could get more ids if i optimized each fraction separately."
but that's a lot of work (and it will impact your reproducibility if you are doing something like label free quan). plus it would be a lot of work. one run to see the relative elution times and a second with the reoptimized gradient..?...
fraction optimizer can do this for you!!!
100% recommend you check it out. you can get the software directly here.
Wednesday, April 18, 2018
I can clearly remember my excitement when I first saw 2 different "-omics" datasets integrated together (that totally worked) in a great study. It might still be in the 1,000+ entries on this scrambled blog, but who will ever know (the search bar appears to lose power as the entries get older -- which is honestly fine by me, I've said some dumb stuff over the years)
These days -- I'm still impressed -- but it's a lot more common. I haven't pulled one off myself yet, but I'd sure love to....but where do you even start....what about here?
First impression -- wow -- this looks seriously powerful. This team pulls in a bunch of different datasets -- microarrays, proteomics (my heart jumped for just a second because I thought they were also integrating CyTOF data; I don't think they do here, they just discuss the statistics involved) -- and through "t-distributed stochastic neighbor embedding" (t-SNE) --
-- they massively reduce a staggering amount of signals from various sets to very small and shockingly meaningful observations.
Is it a trick?
Honestly, I can't say for sure, but the results seem logical and really impressive. (Two gifs is probably too much for one post).
I start to get nervous as soon as we start talking statistics things -- but if it gets you to a small number of targets that you can validate (and they do) it's a WIN.
I made sure to mention MATLAB in the title of this post. MATLAB is not free software (though trials are available and many, many universities have license deals for the software). There is also a home version that is roughly 1/10 the normal license price, but has some limits on its functionality. If you have access to this software -- and have some huge and intimidating "omics" data sets -- you should definitely check this out!
Tuesday, April 17, 2018
Hello Vienna in the summer time!! (I was there in October on vacation 2 years ago and fled south as fast as a 2 cylinder diesel KIA rental "car" could get us somewhere NOT COLD.)
I plan to spend far more than 8 hours in Vienna this time. I want to learn all the things on that list. Oh. And I'll be there rambling about some stuff that I'm doing in my lab as well, but nothing as cool as the topics listed on the cool picture above.
If you are interested in learning Advanced Practical Proteomics or know someone who might, send them this link (I think the course tops out at 30 students!)
We can try to solve this mystery together...
...why are there rubber ducks everywhere!?!?
Friday, April 13, 2018
I've been sending this great new JPR study to lots of people for weeks now and I forgot to post it here! (If you think this blog seems frantic -- you haven't received many emails from me....I need an embedded exclamation mark filter...)
Honestly, the picture does a great job of describing the workflow, but I love to type!
The phosphopeptides are going to have low relative ion signal on their own -- maybe too low to find in cell lysate even with a PRM, but a TiO2 enrichment will do a great job of enriching your single phosphorylation site. And -- okay -- hard to admit, but some proteins are too low to find with high confidence without a stupid antibody enrichment thing -- particularly in body fluids -- but do you have to run the instrument 3 times? These authors say --
Just combine it back together and schedule the single (per appropriate replicate, of course) LC-PRM assays. If your protein and phospho are high enough in abundance that you don't have to add in the extra variables of TiO2 and IP and can just build your own -- maybe using the amazing resources at PhosphoPedia -- even better, but if you don't have enough sensitivity, this method could save you valuable instrument time!
Thursday, April 12, 2018
I absolutely have to run out the door, but I have got to leave this here so I can read it if I ever get caught up again!
Proteomics is great at assigning MS/MS spectra to peptide identifications (Peptide Spectral Match -- PSMs).
We are...not so great...at figuring out what protein that PSM belongs to.
We try really hard and we've got lots of things that do it for us.
As far as I know we've NEVER had a good standard for testing how well that algorithm builds a protein identification out of a PSM. (Sure, we have years of evidence that the things we use work) but this is a standard specifically engineered to help test the efficacy of algorithms!
Check this out here!
Wednesday, April 11, 2018
Need a reason to get to beautiful San Diego a day early for ASMS 2018?
Check out the ridiculously cool lineup of things going on Saturday at the Skyline User Group Meeting!
Normal Skyline stuff is covered. Then it goes crazy.
Skyline for glycans
Skyline for small molecules
Skyline for drug monitoring stuff.
The final talk is Matt McDonald from the University of Pittsburgh. I've had the distinct pleasure recently of working on a project with his lab where they generated the data and I did the downstream data processing. This experience was humbling because I've never -- in my life -- generated data as good as what this team rolled off of an Orbitrap XL. The study is being written up now by the people who generated the samples so I can't go into it, but I think this lab is quietly re-establishing the boundaries of what we can do with clinical proteomics if we step away from our normal routines and put experimental design and QC as our number one priorities.
Registration is free, but space is limited. Also -- some evil corporations are providing lunch!
Tuesday, April 10, 2018
From the title of this blog post (and this great new paper in press at MCP) this sounds really obvious, right?
A bacterium that is famous because it can make really high quantities of industrial chemicals as it grows anaerobically (some of which are shown in the picture above I stole from this great ASM study) regulates itself primarily by the derivatives of the two things it's most famous for producing. End of story.
However -- what the heck is a butyrylation!?! And just because your metabolic pathways produce weird things, that doesn't mean those are great things to control your metabolism with, right? I don't know, I find this study really cool for these reasons:
1) This is a super weird way of regulating anything.
2) Lots of people are working with this bacterium and trying to make it make more of the chemicals it produces. Virtually all of the world's acetone and butanol is made from propylene (from fossil fuels). Dirty, inefficient systems that are only functional right now because we're pumping these things out of the ground like they aren't in any way finite. At some point our species might take a step back -- consider that we aren't acting any more intelligently in terms of resource utilization/waste management than log phase E. coli in a swirling flask -- and look at alternative ways of doing things. This bacterium could be a great alternative way of getting chemicals we take for granted! What was I saying...?
OH YEAH!! But consider this -- microbiology is done by genetics. Knockouts, overexpressions, but -- this awesome paper shows that this isn't how this bacterium regulates itself in terms of fine control -- it's regulating itself for all the important (industrial type production) things with PTMs that even 100x coverage on a HiSeq is not going to show you. You want to REALLY engineer this bacterium? You need to find out what a butyrylation is and how to monitor and regulate that.
Saturday, April 7, 2018
HCD is awesome. It's super fast. It breaks more-or-less right along the b/y ion backbone that you get from CID and it's super fast. Also -- it's fast.
Unfortunately -- nothing beats CID in terms of predictable fragmentation. HCD fragmentation efficiency can vary quite a bit (look at the actual eV used in your scan header when you're using normalized CE for proof of this) and is dependent on some additional variables like the fragmentation energy calibration, your HCD gas pressure (and -- no proof of this yet -- the quality of the N2 being delivered).
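If you're wondering why the eV in the scan header moves around even though the normalized CE setting never changes, here is a rough sketch. It assumes the commonly cited scaling (normalized CE times precursor m/z over 500, times a charge-dependent factor). Both the 500 reference value and the charge factors below are assumptions on my part, not something from any paper linked here, so check your own instrument's documentation before trusting the numbers.

```python
# Assumed charge factors (commonly cited for HCD; verify for your instrument).
CHARGE_FACTOR = {1: 1.0, 2: 0.9, 3: 0.85, 4: 0.8}
DEFAULT_FACTOR = 0.75  # assumed for charge states of 5 and above


def nce_to_ev(nce, precursor_mz, charge):
    """Approximate absolute collision energy (eV) for a normalized CE setting."""
    factor = CHARGE_FACTOR.get(charge, DEFAULT_FACTOR)
    return nce * precursor_mz / 500.0 * factor


# Same normalized CE, three different (made-up) precursors.
for mz, z in [(400.0, 2), (650.0, 2), (900.0, 3)]:
    print(f"m/z {mz:.0f}, z={z}: {nce_to_ev(27, mz, z):.1f} eV")
```

The point is only that one normalized setting turns into a different absolute energy for every precursor, before you even get to calibration or gas pressure.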
I think I've rambled on here more than once about my love of the PROCAL peptide standard. One reason for that is it gives you the capability of calibrating your HCD CE so that your fragmentation patterns match between instruments (PROCAL paper here).
Okay -- so that is one part of the equation -- I need 29 HCD on the Fusion (today) to match 27 nCE on the HF -- but....what's the ideal CE? This team takes a swing at it here!
It's more complicated than you'd guess, unfortunately. But this team sets up a really nice mechanism for determining it. They use a series of different fragmentation energies versus Mascot scores and other metrics and work out this multiphase ideal shown above. There is some further interesting info in here, including how to strengthen different ion series --- super important if you're thinking about doing something crazy like studying peptides where the y-ions aren't going to be as helpful to you.
Friday, April 6, 2018
It should come as no surprise to you that glycans are super important in all sorts of diseases. What we normally do, however, is say -- "hey! this site used to be glycosylated" and put it on the list (because we got rid of the glycans). Totally valid. Great science comes from this and will continue to. While we're seeing more studies with glycan oxonium ion triggered ETD that can work out peptides with glycosylations and what those chains are, there are some limitations. First off -- 2 fragmentations slow your instrument down and second off (?) there is a finite limit to the length of the glycan chain you can study and third off (? I should restructure this sentence?) many of the stupid sugars have exactly the same stupid mass -- so you can't tell them apart.
As the body of work continues to build that the actual sugars within the glycan chain are of paramount importance -- examples....here...and here...and here.... -- is it fair to ask the question -- are we focusing our powers on the right side of the molecule?
Of course whether the glycosylation occurs or not is important.
Of course it would be great to know whether you get a di -- or branched-penta peptide at this site. But -- what if you treated a cell and the biggest and most profound difference in that cell was something like this --
-- umm -- I give up. I can't figure out how to rotate this. You're going to have to turn your head. What if your drug functions by completely eliminating an important class of glycosylations -- or two -- that have the same mass as one that it doesn't eliminate? Maybe you could find it, but it sure would be great to have a pipeline specifically find stuff like this (P.S. those papers up there aren't weird -- cancer people are talking about glycans all the time and coming up with crazy ideas for how to study them like making them stick to glass arrays and using lasers and stuff -- dedicated glycan analysis workflows could be VERY popular for your lab) --but how would you ever set one up?!?! I sure don't want to think about it....WAIT....CHECK THIS OUT!
What if this team was already setting up a revolutionary new kind of glycan analysis pipeline? What if you already have everything in your lab that you need to set it up? HPLC? Check. Ion trap? Check! Skyline?!?!? YEAH!!
In negative ion mode, diagnostic fragment ions can be produced that can distinguish between the sugar isomers. This team works this out and shows you how to set up high throughput workflows to figure out what these sugars are. And -- it's Skyline -- you know we can quantify them. Time to break out the Accela (...or...umm...something else....) and put that Ion Trap back to work full-time!
Thursday, April 5, 2018
If you are one of the people who read this weird blog, I've probably convinced you to download Morpheus (and METAMORPHEUS, btw, I think the paper just came out for it -- I can't tell you about that today because it's boring stuff Thursday).
Here are some boring tips that will make you like regular Morpheus better. Associate your TSV files!
Your Morpheus results will pop out as a bunch of .TSV files and some XML thing. If you are a smart data person you probably know how to use the XML thing. I'm not -- and I don't. I use the TSV files and if you are a smart data person please stop reading right about 2 lines ago. Because I'm going to use Excel (...trigger groans....)
However, Excel doesn't know what a .TSV file is either, so you have to associate it. To do this (told you this was gonna be boring!) find the folder where your processed Morpheus stuff went:
Pick one (I only open the Protein_Groups one; I mostly use Morpheus just to get a good snapshot of my experiments and to check my QC standards).
Right click on it and go to Properties
In properties you need to find "Open With"
Here it looks like Windows recognized Microsoft Excel. I assure you that it does not -- and will not. You'll need to Browse for Excel. I can't describe to you how to do that without leaving a lot of profanities typed on this page. If you've used Bing much, you won't be surprised to find out that if you do choose Browse and then think that the search bar that pops up in Microsoft Windows can find the program Microsoft Excel....
...but it is in there somewhere. Once you find it, checkmark "always use the selected program". From now on your Morpheus output files will all open in Excel. And you'll be so mad when you go to the LTQ you don't use as often and realize you have to do it all over again. But -- eventually -- they'll all be right...
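And if you are one of those smart data people who kept reading anyway: a TSV is just tab-separated text, so you can skip Excel entirely. Here is a minimal sketch using Python's built-in csv module. The column names and rows are made up as a stand-in for a protein groups file; the real Morpheus headers will be different, so swap in whatever your file actually has.

```python
import csv
import io


def read_tsv(handle):
    """Return the rows of a tab-separated file as a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical stand-in for a protein groups file; real column names will differ.
demo = io.StringIO("Protein\tPSMs\nALBU_HUMAN\t42\nTRYP_PIG\t7\n")
rows = read_tsv(demo)

for row in rows:
    print(row["Protein"], row["PSMs"])
```

For a real file you would pass `open("your_file.tsv", newline="")` instead of the `io.StringIO` demo (that file name is a placeholder too).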
SO...Here I was with evidence it was finally here. I'm insane. For real. Not "Oh, look at the old guy climbing a tree outside the bar, I bet his kids are embarrassed" crazy. The real one. Where you concentrated really hard and on the 5th time you still FAILED TO DO AN IN-GEL DIGESTION. You QC'ed the instrument again. And still nothing. Air bubbles in all the vials? You've checked everything and that help wanted sign you saw on the door at the gas station this morning strangely pops into your mind. What is that doing there? I'm trying to troubleshoot something. No random flashbacks!! Wait. Did I change a nanoLC column at 9pm last night? Why would that even make sense?!?
Wait. What's this? The Biopharma Finder runs you queued up last night finally finished (my gosh -- great software -- but we need to see if we can install it on a cluster or something -- slooooow....) and they say this other protein digested great with trypsin AND LysC, but no Glu-C peptides at all? It's not (exclusively) me!?!? It might (also) be the enzyme!?!?
Okay -- so -- I'd just assumed everybody had the exact same enzyme source and they probably just change the label on the vial. Apparently not true.
Promega (don't sue me!) clearly states -- suitable for in-solution digestion. Doesn't say in-gel. Ask my friend who didn't take 5 years off from science. Of course she already knew this.
Okay -- crisis averted -- I guess. Especially since a friend can loan me one that is compatible tonight!
Pierce's Glu-C specifically states it is in-gel compatible.
NEB says theirs is and recommends a specific buffer they provide and protocol.
Roche Glu-C is in-gel compatible.
Probably others. And maybe Promega's is too, but it doesn't work in my hands. It worked great for an in-solution digestion, though.
Wednesday, April 4, 2018
I bet it wasn't this one....which is still hilarious....
...and better than I could do...and turned out okay in the end (NY Post article)
Maybe you knew you could do this. I didn't.
If you are sitting there thinking "hey...something looks a little off on this replicate.." and wondering whether you should stop the system and run a QC --
---to the rescue!!
I have Morpheus installed on every proteomics acquisition PC in our lab. If you click on the top picture you'll see that I have it set so it can't use more than 2 CPU threads on the PC. It works great for the PCs that have 8 cores -- on the LTQ I only allow it one thread (4 core PC). I don't know that using all cores would push the PC too hard and crash Xcalibur -- but I'd rather be cautious!
What a revelation, though!! 60 minutes into the run -- yes -- something is a little fishy here at the MS2 level. Time to stop the run and do something about it so it doesn't impact tonight's samples!
If you don't have Morpheus, you should. You can get it from GitHub here. And this is the original paper from the Coon lab describing it.
Tuesday, April 3, 2018
KRAS is a little protein that is a big deal. As one of the most commonly mutated genes in cancer -- and one of the very worst -- loads of people (including a lot of friends of mine) are working on assays to figure out things like:
How much is there?
What mutant variants are there?
What is the ratio of normal to mutant?
With a goal being -- rapid -- sensitive -- and hopefully clinically adaptable!
Most of these assays are digestion shotgun based. Should/could we flip the paradigm and do TOOOOP DOOOOWN PROOOOOTEEEEEOOOOMIIIICS!?!? Picture any author of this great new paper yelling that out the window of a car as they drive by you. It's much more fun that way.
There are some very good reasons for looking at these proteoforms from the intact protein level. Seriously. Again, it is a small protein. We're probably looking for a single amino acid substitution in a small protein. When we digest that protein and use standard global approaches identification is complicated by a couple of things. The wild type form is often still present AND there can be large differences in abundance between the WT and variant(s) forms.
From the intact protein level we can get 1) the intact mass shift (super important here) and 2) verification from the MS/MS mass shifts. This is 2 points of evidence of the variant -- compared to one single peptide.
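To put numbers on that intact mass shift, here is a quick sketch. A single amino acid substitution shifts the intact mass by the difference between the two residue masses. The residue masses are standard monoisotopic values; the three glycine substitutions I picked are just illustrative examples, not a list taken from the paper.

```python
# Monoisotopic residue masses in Da (standard values, small subset).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "C": 103.00919,
    "D": 115.02694,
    "R": 156.10111,
}


def substitution_shift(wild_type, variant):
    """Mass shift (Da) when one residue is replaced by another."""
    return RESIDUE_MASS[variant] - RESIDUE_MASS[wild_type]


# Illustrative substitutions only; pick the ones relevant to your samples.
for wt, var in [("G", "D"), ("G", "V"), ("G", "C")]:
    print(f"{wt}->{var}: {substitution_shift(wt, var):+.4f} Da")
```

Shifts of tens of Da on a small intact protein are easy to see at the MS1 level, and the same shift should show up again in every fragment ion that contains the substituted residue.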
Having spent some time looking at this in global data I can tell you that the data processing shortcuts we use in virtually all proteomics software are NOT friendly to these proteins. Protein grouping and strict parsimony. Remember those? These are trying to make your data report as simple and accurate as possible by making assumptions. Have you ever fed 2 virtually identical FASTA isoforms in and looked at data where both protein/peptide isoforms are present?
You'll get a surprising read-out; if the variant peptide is identified with confidence in run #1 it will appear that only the variant form is detected.
In run #2, if that peptide scores, for example, medium confidence (below your filter cutoff -- remember these are looow abundance proteins), it will appear that only the WT is present. Protein grouping, strict parsimony, and the use of razor peptides confound your results.
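Here is a toy version of that flip-flop, in case it helps to see it. This is NOT how any particular search engine does it; it's a bare-bones greedy parsimony sketch. The protein names, peptide labels, scores, cutoff, and the "first entry listed wins a tie" rule are all assumptions I made up for illustration.

```python
# Two nearly identical entries sharing most peptides (placeholder labels).
PROTEINS = {
    "KRAS_WT": {"pepA", "pepB", "pep_wt"},
    "KRAS_variant": {"pepA", "pepB", "pep_var"},
}
CUTOFF = 0.9  # hypothetical confidence filter


def strict_parsimony(proteins, observed):
    """Greedily pick the fewest proteins that explain the observed peptides.

    Ties go to whichever entry is listed first (an assumed tie-break).
    """
    remaining = set(observed)
    reported = []
    while remaining:
        best = max(proteins, key=lambda name: len(proteins[name] & remaining))
        if not proteins[best] & remaining:
            break  # leftover peptides match no entry
        reported.append(best)
        remaining -= proteins[best]
    return reported


def report(scores):
    """Apply the confidence cutoff, then run parsimony on what survives."""
    observed = {pep for pep, score in scores.items() if score >= CUTOFF}
    return strict_parsimony(PROTEINS, observed)


run1 = {"pepA": 0.99, "pepB": 0.98, "pep_var": 0.95}  # variant peptide passes
run2 = {"pepA": 0.99, "pepB": 0.98, "pep_var": 0.70}  # variant peptide filtered
print(report(run1))
print(report(run2))
```

Same sample, same proteins, and the only thing that changed between the two runs is whether one low abundance peptide squeaked past the filter.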
P.S. Quick reminder: If you are still using Proteome Discoverer 1.4 or earlier -- please keep in mind that if you have equivalent information for 2 possible FASTA entries -- PD will always give you the smallest entry (because it is the highest % protein coverage -- this is why you see so many more shortened isoforms of proteins in global data than people using other software). In PD 2.0 or later, the longest peptide gets the nod.
You can avoid all this with top-down!
This team uses a pan-RAS IP (pulls down KRAS, NRAS, HRAS....
....um...thought I'd pull some lyrics from this album to extend the joke....nope. That parental advisory sticker is there for very good reason....yikes....moving on!)
On top of the peptide sequence variation, the terminus of KRAS can be modified in a number of ways --non-mass spec friendly ways.
Now I assume I've convinced you that we'd be better off studying KRAS isoforms/proteoforms with top-down rather than bottom-up. But -- will this work in real samples?
This team optimizes that assay for cell lines in culture -- and then gets characterized material from CPTAC -- and shows it works there! I'm a little unclear on the amount of material it took from the fixed CPTAC tissue to characterize it, but I'm assuming we're looking at the normal amount -- what will fit on a slide -- and they show they can determine the KRAS isoforms present on a precision medicine lab budget friendly Q Exactive. As the title suggests -- this wasn't entirely their aim, I guess; they also reveal more basic info about KRAS mutant biology, but that is beyond me. I'm just psyched to have a simple, straightforward KRAS top-down isoform characterization assay all ready to go!
Well...I wasted some time this morning because I misread the title of this great new study!!
My first thought was "Master of Time!!! Gotta find a picture of Roger Delgado!" (He died in a car accident in his second season of Doctor Who, but he and Anthony Ainley look so similar that people who grew up in places where there were more than 2 TV stations (and one that...strangely...just played Doctor Who reruns all the time?) probably think it was the same guy all those years.)
HOWEVER -- FASTER PERCOLATOR!??! Sign me up!
These authors swap out the primary machine learning algorithm Percolator is based on (L2-SVM-MFN -- which is easy to remember with a somewhat childish mnemonic device I won't share here) for a newer algorithm called TRON. They say it works and it works faster, and that's enough for me!
If you don't want to read the paper and just want FASTER PERCOLATOR -- you can get it from GitHub here!