The Secrets of Switching from Physics to BioInformatics

Stuart Jefferys explains his dramatic tale!

Stuart tells us about the first time he felt that computational biology was a great thing to study and contribute to. Stuart heads up the P.I.E project, also known as the Protein Inference Engine. His work uses stochastic models to infer probable sites of post-translational modification in proteins from tandem mass spectrometry data. Check out the video for the geek factor turned up to 11!

Two new papers for high accuracy protein ID, HMM_Score and PMM

HMM_Score applies machine learning principles, using a hidden Markov model to significantly improve the accuracy of peptide identification from tandem mass spectrometry (MS/MS) data. The Bioinformatics article is available via Open Access (DOI: 10.1093/bioinformatics/btn011). The associated software can be downloaded from our Downloads page, and updates are posted on the HMM_Score page.

The Peptide Markov Model (which, it turns out, doesn’t really need the “Markov” part to work well) also uses machine learning principles to significantly improve the accuracy of protein identification using peptide mass fingerprinting (PMF). While many have come to see the PMF approach as out of date due to the advent of ever-better MS/MS based methods, the paper shows that when peptide ion peak intensities are correlated with the peptide sequence, PMF can rival the accuracy of MS/MS based identification. The paper is available via “Author Choice” (open access), DOI: 10.1021/pr070088g. We are continuing to explore how this may be used to improve results from shotgun searches as well.

Too Many Scientists?

Recently, Brian Martinson had a commentary in Nature about the overproduction of scientists in the US. He argues:

Yet, largely because of the structure of the funding flows between the NIH and the universities, there are few checks in the system to keep competition for grant funding at a healthy level. Thus, calls for further increases in the NIH budget may only make matters worse.

I wholeheartedly agree that this issue needs more attention, since the focus has been solely on the NIH funding pool rather than on the pool of qualified researchers. His post has sparked further discussion. I’m glad there’s discussion about this, because our “production” of graduate students is not often assessed from the standpoint of the market for them when they graduate. It seems to me that the marketability of the PhDs we are granting should be given consideration on occasion.

However, I diverge from the viewpoint espoused by Brian Martinson on one key point. His considerations reflect a snapshot of the current point in time, where there is an oversupply of scientists and an undersupply of funding. Before making any drastic changes, it seems worthwhile to look into the crystal ball and figure out what else is at play that might alter the supply/demand equation. While it would be foolish to try to make accurate predictions, there are some big demographic trends that indicate that we are not necessarily in an equilibrium situation, and that the present oversupply of people may change.

Here are two factors to consider:

  • The Baby Boom. In the 1980s and 1990s there was a huge influx of researchers into the academic system, many of whom are part of the baby boom generation (those born from 1946 to 1964). For those of us who followed this boom, academic jobs and resources have been scarce, simply because this large group of more senior folks already fills the great positions. However, the first wave of the boomers is nearing retirement age: they hit 60 years old last year. In my department alone, I can think of quite a few senior faculty who are within 10 years of retirement. It is my belief that once they retire en masse, filling these positions with qualified replacements will actually be quite challenging, and we may find that the supply/demand ratio has inverted.
  • International Scholars. The second point is that the demand for US positions by international scholars has been very high for quite some time, since the US was seen as a great place to do research. However, I believe that perception is changing, and will continue to change. My assertion rests on two factors: 1) The increasing investment in top-notch infrastructure and facilities in places like China is beginning to attract qualified researchers to positions elsewhere; and 2) In the aftermath of September 11th, visas have been significantly harder to obtain, and though the process has eased somewhat, there is still a perception that coming to the US is difficult. Because of these factors, while we will continue to attract great foreign scientists, the competition for US positions will be reduced. From personal experience trying to recruit capable bioinformatics-related postdocs, the supply in the US is not yet meeting demand.

These considerations make me wary of attempting to tinker blindly with the supply/demand balance, i.e. changing the number of PhDs produced by the system. In fact, I believe there is a third factor at play: the same feedback loops that Brian says are lacking:

There are insufficient ‘feedback loops’ linking the production of biomedical researchers to the availability of resources to support them. Instead, the educational system is replete with incentives to generate ever more PhDs and medical doctors.

I disagree. I think the feedback loops are sufficient, but they take time to play out, on the order of 5-10 years. Here are two ways that these feedback mechanisms play out:

  • Lower success rates in NIH funding mean that individual labs can’t support as many graduate students. If the trend continues for several years in a row, it will become increasingly difficult to place qualified graduate students in labs, and this will put pressure on admissions committees to throttle down admissions. Then, 4-6 years later, there will be fewer PhDs graduating.
  • Graduate students (and potential graduate students) aren’t blind to these issues. All of the graduate students in my department seem acutely aware of the difficulties of grant funding in the present environment. This discourages people from going to graduate school, from completing their PhDs, and from trying to find an academic job. Again, this feedback mechanism takes years to play out, but it does. Just look at the cyclical nature of information technology jobs to see an example of a feedback loop with delay (hence causing oscillation). I have been in the information technology field for long enough to have seen several such oscillations between gloom and doom and hopeless optimism. If one waits long enough, any scenario will come true!
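To make the "feedback loop with delay causes oscillation" point concrete, here is a toy simulation. It is only a sketch under made-up assumptions: the function name simulate_supply, the gain of 0.2, the target of 100 positions, and the starting supply of 150 are all hypothetical and not drawn from any real NIH or enrollment data. Each year the supply of researchers is nudged toward the number of available positions, but the correction is based on the supply as it looked some years earlier.

```python
def simulate_supply(delay_years, gain=0.2, target=100.0, start=150.0, years=60):
    """Toy model of a corrective feedback loop that reacts to stale information.

    All parameters are illustrative assumptions, not measured quantities.
    Each year: supply += gain * (target - supply observed delay_years ago).
    """
    # Pre-fill the history so a delayed observation is always available.
    history = [start] * (delay_years + 1)
    for _ in range(years):
        observed = history[-1 - delay_years]  # what decision makers see
        history.append(history[-1] + gain * (target - observed))
    # Drop the padding; the series begins at the starting supply.
    return history[delay_years:]


if __name__ == "__main__":
    for delay in (0, 5):
        series = simulate_supply(delay_years=delay)
        print(f"delay={delay} years: lowest supply={min(series):.1f}, "
              f"final supply={series[-1]:.1f}")
```

With no delay, the oversupply simply shrinks toward the target and never dips below it. With a five-year delay, the same correction keeps acting on old information, so the supply overshoots into a shortage before swinging back, which is the kind of boom-and-bust cycle described above.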

One last consideration is yet another form of feedback: the difficulty of grant funding should be a motivator for young scientists to join cutting-edge research fields. While the competition for “traditional” research grants is extreme, there has been less competition in a bleeding-edge field like bioinformatics (and proteomics) because there isn’t a large cadre of senior researchers in the field. While that is not the reason I chose to work in this field, it is probably the main reason I am still in the field, in an academic post. In essence, this particular feedback loop encourages people to join fields where there is a need for more people, and discourages them from joining overly crowded research fields.

In fact, I believe this latter issue is a disadvantage for senior researchers, since it is challenging to completely re-focus a whole lab’s efforts in new directions and fields. It is much easier for a young researcher starting out to jump into a new area.

So in sum, I believe that while the problem is worth some thought and attention, a number of feedback loops are acting which, in combination with demographics, may yield a situation in 10-20 years that looks much the opposite of the one we are in today (i.e., not enough qualified people for the jobs available).
– Morgan

We’re on the cover of BMC Evolutionary Biology

The journal BMC Evolutionary Biology features our paper on the evolution of the Dscam gene family in the Research Highlights section of its front page.

This is gratifying, since this paper has been a long time coming. Among other things, the lead author, Mack Crayton, was displaced from his new home in New Orleans while we were in the revision process.

This paper has now been rated as “highly accessed” by BMC Evolutionary Biology.