AlphaFold @ CASP13: “What just happened?”

Update: An updated version of this blogpost was published as a (peer-reviewed) Letter to the Editor at Bioinformatics, sans the “sociology” commentary.

I just came back from CASP13, the biennial assessment of protein structure prediction methods (I previously blogged about CASP10.) I participated in a panel on deep learning methods in protein structure prediction, and also took part as a predictor (more on that later.) If you keep tabs on science news, you may have heard that DeepMind’s debut went rather well. So well, in fact, that they not only took first place, but put a comfortable distance between themselves and the second-place predictor (the Zhang group) in the free modeling (FM) category, which focuses on modeling novel protein folds. Is the news real or overhyped? What is AlphaFold’s key methodological advance, and does it represent a fundamentally new approach? Is DeepMind forthcoming in sharing the details? And what was the community’s reaction? I will summarize my thoughts on these questions and more below. At the end I will also briefly discuss how RGNs, my end-to-end differentiable model for structure prediction, did at CASP13.

“What just happened?” was a question put to me in exactly these words by at least one researcher at CASP, and a sentiment expressed by most academics I spoke with. As one myself, I shared it going in and throughout the meeting. In fact I went into CASP13 feeling melancholy (the raw results were out two days prior), although my mood lifted during the meeting due to the general excitement and quality of discussions, and as my tribal reflexes gave way to a cooler and more rational assessment of the value of scientific progress.

This will be a long post. I will start with the science: the significance of DeepMind’s result, their methodology, and how it relates to existing methods. Then I will discuss the sociology: how people reacted, why we did so, what this means for the academic discipline of protein structure prediction (and life science companies), and how I think we ought to move forward. After what I hope is an exposition of general interest, I will briefly discuss how RGNs performed at CASP13. Spoiler alert: not very well, partly because the value of co-evolutionary information increased substantially in this CASP relative to prior ones, and partly because I could not submit the original submissions unaltered owing to technical problems.

For the sake of making this post easier to navigate, below is a table of contents.

Update: Jinbo Xu kindly wrote a number of thoughtful points in the comments section below, particularly here and here. They are well worth reading.

Table of contents

The science

Significance

Let me get the most important question out of the way: is AlphaFold’s advance really significant, or is it more of the same? I would characterize their advance as roughly two CASPs in one (really ~1.8x). Historically progress in CASP has ebbed and flowed, with a ten year period of almost absolute stagnation, finally broken by the advances seen at CASP11 and 12, which were substantial. What we’ve seen this year is roughly twice as much as the recent average rate of advance (measured in mean ΔGDT_TS from CASP10 to CASP12—GDT_TS is a measure of prediction accuracy ranging from 0 to 100, with 100 being perfect.) As I will explain later, there may actually be a good reason for this “two CASPs” effect, in terms of the underlying methodological breakdown. This can be seen not only in the CASP-over-CASP improvement, but also in terms of the size of the gap between AlphaFold and the second best performer, which is unusually large by CASP standards. Below is a plot that depicts this.
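For readers unfamiliar with the metric, below is a minimal sketch of how GDT_TS (and its stricter sibling GDT_HA, which appears later in this post) can be approximated. Note this is a simplification: the official metric additionally searches over many superpositions to maximize each per-threshold fraction, whereas here a single fixed superposition is assumed.

```python
import numpy as np

def gdt(deviations, thresholds):
    """Approximate GDT: the mean, over distance thresholds, of the fraction
    of residues whose C-alpha deviation falls under that threshold.
    (The official metric searches over many superpositions to maximize
    each fraction; this assumes one fixed superposition.)"""
    d = np.asarray(deviations, dtype=float)
    return 100.0 * np.mean([(d <= t).mean() for t in thresholds])

def gdt_ts(deviations):
    return gdt(deviations, [1.0, 2.0, 4.0, 8.0])  # gross topology

def gdt_ha(deviations):
    return gdt(deviations, [0.5, 1.0, 2.0, 4.0])  # high accuracy

# Toy example: four residues with post-superposition deviations in Angstroms
devs = [0.4, 1.5, 3.0, 9.0]
print(gdt_ts(devs))  # → 56.25
print(gdt_ha(devs))  # → 43.75
```

Because GDT_HA halves every threshold, it penalizes mediocre local accuracy much more severely, which is why the GDT_HA picture further below looks so much less flattering.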

Top two performers at CASP13 (GDT_TS)
Curves show the best and second best predictors at each CASP, while the dashed line shows the expected improvement at CASP13 given the average rate of improvement from CASP10 to 12. Ranking is based on CASP assessor’s formula, and does not always coincide with highest mean GDT_TS (e.g. CASP10.) Error bars correspond to 95% confidence intervals.

Prior to CASP10, for roughly ten years, the curve was basically flat. CASP11 began to show life because of the introduction of co-evolutionary methods, but just barely, because most FM targets had shallow multiple sequence alignments (MSAs), which co-evolutionary methods require. CASP12 was when the power of these methods was finally demonstrated, and CASP13, even when excluding AlphaFold, showed further progress due to the widespread adoption of deep learning in co-evolutionary methods. We see that the second-best method (Zhang server) improved by almost exactly “one expected CASP”, reflecting the field-wide improvement, and AlphaFold added yet another “one CASP”’s worth of improvement on top of that. Note that these “one CASP”s depend on very recent history, really just the past few CASPs (10-12), so please take them with a mountainful of salt. Note also that my use of mean GDT_TS is problematic because the difficulty of FM prediction targets varies from one CASP to another, although it has supposedly been stable recently.

Taken together, the above suggests substantial progress, more so than usual, and hence not only did AlphaFold “win” CASP13, but it did so by an unusual margin. Great! Does this mean the problem is solved, or nearly so? The answer, right now, is no. We are not there yet. However, if the (AlphaFold-adjusted) trend in the above figure were to continue, then perhaps in two CASPs, i.e. four years, we will actually get to a point where the problem can be called solved in terms of gross topology (mean GDT_TS of ~85 or so). Of course, this presupposes that the trendline will continue, and we have no real reason to believe that it will, at least not without new conceptual breakthroughs. Keep in mind that unlike other areas of machine learning, new protein structures are not appearing at an increasing rate, and so waiting things out will not help.

The above graph is misleading in one way, though, because it depends on a specific metric, GDT_TS, which only measures gross topology. If we care about high-resolution topology, which we certainly do for most practical applications, then a more appropriate metric is GDT_HA, and using it the picture looks a bit different:

Still a good trendline, but much further down from a “solution”.

Another caveat is that both of these metrics measure global goodness of fit, which is important in terms of the basic scientific problem, but is often not indicative of functional utility. Local accuracy, for example the coordination of atoms in an active site or the localized change of conformation due to a mutation, is what is often sought when answering broader biological questions. Global metrics hide local discrepancy by diluting it in the sea of generally good agreement between experimental and predicted structures.

Another way of thinking about this is asking whether the same headlines would have been generated had an academic group achieved the same increase in accuracy that DeepMind has. The answer is certainly not, and we have the CASP11 → CASP12 advance to confirm that, as it was about equal in absolute magnitude (and thus arguably harder coming from a lower starting point) but generated few if any headlines. DeepMind’s publicity machine certainly helped shine a bright light on their advance, which is frankly also good for the field as a whole.

None of this is to detract from the AlphaFold advance. It is an anomalous leap, on the order of a doubling of the usual rate of improvement, and portends very favorably for the future. But that future has yet to be realized. (I actually think people may have walked away a bit too optimistic from this CASP—a DeepMind joins the field for the first time only once, and the value added of their excellent engineering may not get repeatedly re-realized, but we’ll see.)

Prior work

Let me now switch gears and talk a bit about the landscape of protein structure methodology before AlphaFold’s arrival. I won’t talk much about RGNs here because in some ways they are much more unusual methodologically than AlphaFold is, and so the two are well separated in algorithm space.

AlphaFold is a co-evolution-based method, building on the groundwork laid over the past ~7 years by several academic groups. The basic idea is to extract so-called evolutionary couplings from protein MSAs by detecting residues that co-evolve, i.e. that have mutated over evolutionary timeframes in response to other mutations, thereby suggesting physical proximity in 3D space. The first batch of such approaches [2, 3, 5] predicted binary contact matrices from MSAs, i.e. whether two residues are “in contact” or not (typically defined as being within 8Å of each other), and fed that information to simple geometric constraint satisfaction methods to fold the protein and return its 3D coordinates. (There is a pre-history to this field, dating back to the 90s, when overly simple statistical models were used to predict such contacts, but I will not cover it, as that generation of approaches was not successful and I am by no means trying to be comprehensive here.) This first generation of methods was a substantial breakthrough, and ushered in a new era of protein structure prediction that finally showed promise of working.
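To make the notion of a binary contact matrix concrete, here is a minimal sketch (not any group’s actual code) of deriving one from per-residue coordinates using the standard 8Å cutoff:

```python
import numpy as np

def contact_map(coords, cutoff=8.0):
    """Binary contact matrix: residues i and j are 'in contact' if their
    representative atoms (typically C-beta) lie within `cutoff` Angstroms."""
    c = np.asarray(coords, dtype=float)
    diff = c[:, None, :] - c[None, :, :]       # pairwise displacement vectors
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    return dist < cutoff

# Toy example: three residues spaced 5 Angstroms apart along a line
cm = contact_map([[0, 0, 0], [5, 0, 0], [10, 0, 0]])
# neighbors (5 A apart) are in contact; the two ends (10 A apart) are not
```

Contact prediction inverts this: infer the matrix on the right-hand side from the MSA alone, then ask a constraint-satisfaction engine to find coordinates consistent with it.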

An important if expected development was the coupling of such binary contacts with more advanced folding pipelines such as Rosetta and I-Tasser, which improved accuracy and defined the state of the art until around the middle of 2016, just before CASP12. The next major advance came from applying convolutional networks and deep architectures (residual networks) to integrate information globally across the entire matrix of raw couplings and turn them into more accurate contacts. Jinbo Xu’s group developed the first major (and experimentally serious) version of this approach, among others [1, 4, 6].

Which brings us to the present and AlphaFold. Only a few weeks before the CASP13 results became public, Xu published a preprint on bioRxiv that predicted inter-residue distances instead of binary contacts [7]. It used the same input (MSAs), and largely the same architecture as his CASP12 approach, but predicted probabilities over a discretized spatial range and then picked the highest probability one for feeding into CNS to fold the protein. Xu’s preprint showed significant promise on a subset of CASP13 targets, and the buzz among some of us was that Xu’s approach would win the competition. As it turns out, this seemingly simple change had a surprisingly profound impact, and forms one of the key ingredients of AlphaFold’s recipe.

AlphaFold

DeepMind has promised to publish a paper on AlphaFold, so the final and definitive description will have to wait for their paper, which I hope will be thorough. They have no plans to release the source code, and are unlikely to put up a public prediction server in the near term, although they appear open to considering it at some point. Having said that, they were generally forthcoming in discussing their method during CASP13, and appeared genuinely interested in sharing the approach with the community and ensuring that people can build on it. The sense I got was that they are in it for the science.

Just like Xu’s approach, AlphaFold uses a softmax over discretized spatial ranges as its output, predicting a probability distribution over distances (the details of the convolutional ResNet architecture are different, but it remains unclear how large a contribution these details made.) Unlike Xu’s approach, which tosses away these probabilities and only uses the most likely distance bin as input to CNS, AlphaFold uses the entire distribution as a (protein-specific) statistical potential function that is directly minimized to generate the protein fold. The key idea of AlphaFold’s approach is that a distribution over pairwise distances between residues corresponds to a potential that can be minimized once it is turned into a continuous function. They initially experimented with more complex approaches, including fragment assembly using a generative variational autoencoder. Remarkably, however, halfway through CASP13 they discovered that simple and direct minimization of their predicted energy function, using gradient descent (L-BFGS), is sufficient to yield a high-accuracy fold. And so they essentially switched to this approach midway through the competition, and it represents the essence of their final model.
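The core recipe can be illustrated with a toy sketch. Everything below is invented for illustration: a single residue pair with a made-up “predicted” histogram stands in for the network’s output, whereas AlphaFold’s actual potential sums over all residue pairs and includes torsion terms, a reference state, and a physics-based component. The mechanics, though, are the same: interpolate the negative log-probability into a smooth function of distance, then hand it to L-BFGS.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize

# Invented stand-in for the network's softmax output for one residue pair:
# a probability distribution over discretized distance bins (2-22 A).
bin_centers = np.linspace(2.0, 22.0, 41)
probs = np.exp(-0.5 * ((bin_centers - 9.0) / 1.5) ** 2)
probs /= probs.sum()

# Turn the histogram into a smooth, differentiable potential by
# spline-interpolating the negative log-probability over distance.
potential = CubicSpline(bin_centers, -np.log(probs + 1e-8))

# Minimize the potential directly with L-BFGS, starting away from the mode;
# gradient-based minimization of a smooth surrogate replaces sampling.
res = minimize(lambda d: float(potential(d[0])), x0=[15.0],
               method="L-BFGS-B", bounds=[(2.0, 22.0)])
# res.x[0] lands near 9.0, the mode of the invented distribution
```

The point of the exercise is that nothing stochastic happens after the network runs: the “folding engine” is just a generic smooth-function minimizer.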

This idea looks deceptively simple but has rather profound implications. I think its simplicity somewhat masks how difficult it was to arrive at. More often than not in science, particularly the physical sciences, a simple change in perspective can lead to surprising changes in outcomes. The paradigm of predict contacts → feed into complex folding algorithm was so entrenched in the field that it was difficult for most to see it as unnecessary (including for DeepMind’s team, which tried more conventional folding approaches before discovering that a simpler approach works just as well.) Much of the pushback I received toward my end-to-end differentiable approach was because it eschewed any sampling and directly folded the protein.

There are some important technical details. The potential is not used as is, but is normalized using a learned “reference state”, harking back to the old days of knowledge-based potentials like DFIRE and the Quasichemical potential (parenthetically, I wrote a couple of papers on the topic, developing what I think was the first ML-based potential for protein-DNA interactions.) This normalization evidently had a large impact. Furthermore, their potential is coupled with a more traditional physics-based potential and the combined energy function is what is actually minimized.
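The reference-state idea can be sketched compactly. Both distributions below are invented for illustration: instead of scoring a distance by its raw predicted probability, one scores it by how much more likely it is under the prediction than under a sequence-independent background, in the spirit of potentials like DFIRE.

```python
import numpy as np

# Invented distributions for one residue pair: the network's prediction,
# sharply peaked at 9 A, and a "reference" background learned without
# sequence information (broad, peaked at 12 A).
bins = np.linspace(2.0, 22.0, 41)
p_pred = np.exp(-0.5 * ((bins - 9.0) / 1.5) ** 2)
p_pred /= p_pred.sum()
p_ref = np.exp(-0.5 * ((bins - 12.0) / 5.0) ** 2)
p_ref /= p_ref.sum()

# Reference-state normalization: score each distance by the log-ratio of
# prediction to background, rather than by raw predicted probability.
potential = -np.log(p_pred + 1e-8) + np.log(p_ref + 1e-8)
best = bins[np.argmin(potential)]  # distance favored after debiasing
```

Note how the debiased minimum (here 8.5Å) is pulled slightly below the raw mode of the prediction (9Å), because the background makes longer distances cheap: subtracting the background sharpens exactly the signal that distinguishes this protein from proteins in general.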

This idea of predicting a protein-specific energy potential brings AlphaFold’s approach into proximity with another approach, called NEMO, which is currently in open review at ICLR. While the submission is anonymous, it is fair to conclude, given this talk, that it was developed by John Ingraham, Adam Riesselman, Chris Sander, and Debora Marks. NEMO too generates a protein-specific energy potential that is then minimized to yield the final protein, but the similarities end there. AlphaFold generates the potential using a neural network, but once done, turns it over to a minimizer that operates independently and is not optimized jointly with the neural network. NEMO, on the other hand, turns the entire folding process into a differentiable Langevin dynamics simulator, and backpropagates from the final predicted structure through a few hundred steps of the simulator into the neural network variables. Additionally, NEMO, like RGNs, only uses raw sequence information and PSSMs.

While the AlphaFold and NEMO approaches do harken back to knowledge-based potentials, they are different in a fundamental way. The knowledge-based potentials of yore (and current physics-based potentials like Rosetta) are universal, in the sense that they at least pretend to be applicable to any protein, and would yield the right result if enough sampling were done to find their minimum. Whether this is true in practice is a different matter. The protein-specific potentials of AlphaFold and NEMO are quite different beasts. They are entirely a consequence of the MSA (or sequence + PSSM) that they depend on. What they do is construct a potential surface, particularly in the case of AlphaFold, that is very smooth for the given protein family, and whose minimum closely matches that of the native protein (-family average) fold. This is fantastic and extremely useful, in that it allows one to make accurate predictions given an MSA (and surprising to some, though I would argue RGNs already showed it is possible by doing so implicitly in the RGN latent space). But it is not an energy potential in the conventional sense.

I should say that this is my characterization and not DeepMind’s. In general I have fairly strong feelings about protein-specific energy potentials, and was planning on writing a more detailed blog post about the topic in connection with the NEMO paper, but have not gotten around to it yet (and unfortunately probably never will.)

Below is a table that summarizes my view of how all the approaches I have discussed so far relate. Adjacent columns in the table indicate methods that are in some sense most similar, but because this is a multi-dimensional space, the relationships are more complex than that. For example, Xu’s approach is similar to AlphaFold in its prediction of distances, NEMO is similar to AlphaFold in its use of protein-specific energy potentials, and NEMO and RGNs are similar in being end-to-end differentiable and not using MSA data, which puts them in a different category altogether. I should point out that NEMO did not participate in CASP13, and neither NEMO nor RGNs are broadly competitive with the other methods (particularly on template-based modeling (TBM) for RGNs), at least in part because they use a lot less information.

|                           | Zhang                          | Xu        | AlphaFold                           | NEMO                                              | RGN                                      |
|---------------------------|--------------------------------|-----------|-------------------------------------|---------------------------------------------------|------------------------------------------|
| Inputs                    | MSA                            | MSA       | MSA                                 | Sequence or PSSM                                  | PSSM                                     |
| Outputs (pre-folding)     | Contacts                       | Distances | Distributions over distances        | Cartesian coordinates (folding internal)          | Cartesian coordinates (folding internal) |
| Folding                   | I-Tasser                       | CNS       | L-BFGS                              | Differentiable Langevin dynamics                  | Implicit                                 |
| Energy function           | Explicit, fixed, and universal | None      | Explicit, learned, and MSA-specific | Explicit, learned, and sequence- or PSSM-specific | Implicit, learned, and PSSM-specific     |
| Uses templates            | Yes                            | No        | No                                  | No                                                | No                                       |
| End-to-end differentiable | No                             | No        | No                                  | Yes                                               | Yes                                      |

The careful reader will note that one column in the above table covers the Zhang group method, which I have not talked about much. Zhang’s approach is interesting for several reasons. First, it came in second at CASP13, and when looking at the overall results (not just FM but also TBM), it is not that far behind AlphaFold’s method. Remarkably, Zhang’s approach does not use predicted distances, but relies on old-style binary contacts. This raises the question of where their improvement is coming from. There are several things going on. While Xu’s approach uses the more informative distances, its folding pipeline is rather simplistic. Zhang’s approach, while using the less informative binary contacts, folds via the sophisticated I-Tasser engine. Since the groups were working independently (and largely in secrecy and in competition), their respective contributions were never combined. If it were not for AlphaFold, this combined “double” effect may not have been seen until CASP14; AlphaFold effectively did both at once. Of course, AlphaFold does not achieve this via a better folding engine, as theirs is very simple too (L-BFGS). Rather, they get around the problem by building a better energy potential using distributional information. But the advantages of having such an energy potential may well be matched by using a stronger folding engine. I-Tasser also uses templates from the PDB, which can substantially help its performance on TBM targets. Perhaps there is further gain to be had by combining AlphaFold’s approach with something like I-Tasser or Rosetta, but AlphaFold’s preliminary results seem to suggest that they have already squeezed out what can be had from a better folding engine.

This sheds some light on AlphaFold’s novelty (more on this next.) If it weren’t for AlphaFold, what the field may have moved towards is combining Xu’s approach with Zhang’s, which would have arguably been less elegant than AlphaFold. But this is highly speculative, and it is likely there’s a “half CASP” waiting to be squeezed out by leveraging these partially complementary approaches.

Fundamental scientific insight or superb engineering?

A question that arose over many conversations at CASP13 is whether AlphaFold represents a triumph of insightful science or of superb engineering. Such questions can often be silly and divisive (with science somehow occupying a higher, more ethereal realm than engineering), but at the heart of the question is whether AlphaFold “only” won because it has a large and well-funded team with inexhaustible compute resources, in which case the academic community has nothing to feel bad about and need not engage in uneasy introspection (you can tell I’m gearing up to shift to the sociology), or whether they have done good science that the academic community missed out on. Insofar as this question merits answering, my own take is that it’s a mixture of the two.

On the fundamental insight front, AlphaFold had a number of good ideas. First, don’t just predict contacts, but also distances; Xu does this as well, but all indications point to the two groups having developed the idea independently. Critically, AlphaFold takes this a step further by predicting a distribution over distances, and then uses that to construct a smooth potential that can be minimized. A second good idea is the use of a reference state, which debiases the predicted potential and demonstrates a solid understanding of knowledge-based potentials that reflects positively on the DeepMind team. The fact that these ideas are “simple”, in the sense that they are unsurprising, does not detract from them in the least (personally, I was actually surprised by the impact the reference potential made, but others appeared less surprised.) The best science is that in which simple ideas have profound consequences, and that very much appears to be the case here. DeepMind is of course also leveraging their deep (no pun intended) expertise in machine learning. For example, the distributional prediction idea seems somewhat similar in spirit to their paper from about a year ago on distributional RL. Whether that insight had any impact on AlphaFold I don’t know, but I think it’s fair to say that the confluence of strong expertise in ML and in proteins helped bring about these advances.

On the engineering front, it’s also clear that the apparently elegant solution we see now is a result of much trial and error, and that much more complex components involving fragment assembly and so on were tried and disposed of. The ability to explore model space rapidly depends heavily on both computational and human resources. So while the final ideas are simple and elegant, they are unlikely to have been discovered if the AlphaFold team wasn’t able to sweep through idea space as rapidly as they did.

If I were to pick, I think about half of the performance improvement we see in AlphaFold comes from the simple ideas above, and about half from the sophisticated engineering of the distance-predicting neural network. If this is true, then academic groups should be able to see substantial improvements in fairly short order.

The sociology

“What just happened?”

Now that the serious and respectable matters are out of the way, I can finally engage in some gossip. This part will be quite the rant. Like I alluded to in the very beginning of this post, there was, in many ways, a broad sense of existential angst felt by most academic researchers at CASP13, including myself. In a delicious twist of irony, we the people who have bet their careers on trying to obsolete crystallographers are now worried about getting obsoleted ourselves.

I think many of us went through the following phases: (i) fearing that the DeepMind team outsmarted us all by some brilliant fundamental insight, combined with virtuoso engineering; (ii) breathing a sigh of relief that the insights were not radically different from what most of the field was thinking; (iii) (slightly) belittling DeepMind’s contribution by noting its seeming incrementality and crediting their success to Alphabet’s resources.

Setting aside the validity of the above sentiments, the underlying concern behind them is whether protein structure prediction as an academic field has a future, or whether like many parts of machine learning, the best research will from here on out get done in industrial labs, with mere breadcrumbs left for academic groups. Truth be told, I don’t know the answer, and I think it’s possible that some version of this will come to pass. What is clear is that the protein structure field has a new, and formidable, research group. For academic scientists, especially the more junior among us, we will have to contend with whether it’s strategically sound for our careers to continue working on structure prediction. Despite the size of the Baker and Zhang groups for example, I never felt intimidated by them, because on the novelty front I always felt I was several steps ahead. But with DeepMind’s entry I will have to reconsider, and from conversations with others this appears to be a nearly universal concern. Just like in machine learning, for some of us it will make sense to go into industrial labs, while for others it will mean staying in academia but shifting to entirely new problems or structure-proximal problems that avoid head-on competition with DeepMind.

So that’s what just happened. What I’d like to turn my attention to now is what this episode says about academic science, particularly as it pertains to protein structure prediction, and the scientific health of pharmaceutical companies (prepare to be roasted!)

An indictment of academic science

I don’t think we would do ourselves a service by not recognizing that what just happened presents a serious indictment of academic science. There are dozens of academic groups, with researchers likely numbering in the (low) hundreds, working on protein structure prediction. We have been working on this problem for decades, with vast expertise built up on both sides of the Atlantic and Pacific, and not insignificant computational resources when measured collectively. For DeepMind’s group of ~10 researchers, with primarily (but certainly not exclusively) ML expertise, to so thoroughly rout everyone surely demonstrates the structural inefficiency of academic science. This is not Go, which had a handful of researchers working on the problem, and which had no direct applications beyond the core problem itself. Protein folding is a central problem of biochemistry, with profound implications for the biological and chemical sciences. How can a problem of such vital importance be so badly neglected?

Part of the problem is the nature of academic research. Marc Kirschner recently framed this beautifully, and I will copy it here verbatim:

“I believe that science, at its most creative, is more akin to a hunter-gatherer society than it is to a highly regimented industrial activity, more like a play group than a corporation.” – Marc Kirschner

I wholeheartedly agree with this, and think it is a good thing. The problem occurs when we take this analogy to mean that each small unit of hunter-gatherers must defend its turf at all costs, as if the acquisition of scientific knowledge were akin to the hoarding of food. Science is, in the final analysis, a collective enterprise, and we all gain the greatest benefit when we cooperate and share our knowledge. An element of competitiveness is unavoidable given the human nature of this activity, but it should not rise to the toxicity that currently characterizes much of academia.

More important, and this is where the protein structure field has a very serious problem, the sharing of information must occur regularly and frequently. Even if individual groups are secretive while carrying out their research, if the frequency of sharing is on the order of months, as is typically the case in machine learning, the field can still progress at a rapid pace. But in part due to the canonicalization of CASP, protein structure prediction effectively has a two-year clock cycle, in which separate research groups guard their discoveries until after CASP results are announced. As I discussed earlier, it is clear that between the Xu and Zhang groups enough was known to develop a system that would have perhaps rivaled AlphaFold. But because of the siloed nature of the field, it only gets a “gradient update” once every two years. Academic groups are thus forced to independently reinvent the wheel over and over. In DeepMind’s case, even though the team was small in comparison to the total headcount of academic groups, they were presumably able to share information on a very regular basis, and this surely contributed to their success.

The reliance on CASP dates back to an era when structure prediction did not work at all, and when best practices around data separation and prevention of information leakage were not broadly understood. We exist in a very different climate today. Most researchers understand the issues, and are perfectly capable of constructing training and test sets that properly assess the performance of their methods. My own work in this effort, the ProteinNet dataset, is one concrete contribution I have made to democratize and speed up progress in the field. There will invariably be papers with poor controls and exaggerated claims, but the paranoia about cheating and ineptitude must be balanced against encouraging rigorous but rapidly evolving method development.

CASP serves a crucial purpose, and must continue to do so. DeepMind’s results would not have been nearly as convincing had they not taken place as part of CASP. But we must find a middle ground between this gold standard and a more iterative approach to publication and information exchange. CAMEO helps in this regard, but its targets are often not difficult enough. ProteinNet or something like it, such as the NEMO authors’ approach of using CATH-based purging, should be encouraged as a means of providing acceptable assessment of model quality, especially when coupled with release of source code that enables transparent reproduction of the training process.

To be sure, the above will not close the gap between academic and industrial research. There are other, more fundamental problems. For example, competitively compensated research engineers with software and computer science expertise are almost entirely absent from academic labs, despite the critical role they play in industrial research labs. Much of AlphaFold’s success likely stems from the team’s ability to scale up model training to large systems, which in many ways is primarily a software engineering challenge. While academic labs do not need to perform at the level of Google, they must perform at an adequate level to support the core scientific mission of their institutions, and in my opinion this is not currently happening.

An indictment of pharma

What is worse than academic groups getting scooped by DeepMind? The fact that the collective powers of Novartis, Pfizer, etc., with their hundreds of thousands (~million?) of employees, let an industrial lab that is a complete outsider to the field, with virtually no prior molecular sciences experience, come in and thoroughly beat them on a problem that is, quite frankly, of far greater importance to pharmaceuticals than it is to Alphabet. It is an indictment of the laughable “basic research” groups of these companies, which pay lip service to fundamental science but focus so myopically on target-driven research that they managed to embarrass themselves badly in this episode.

If you think I’m being overly dramatic, consider this counterfactual scenario. Take a problem proximal to tech companies’ bottom line, e.g. image recognition or speech, and imagine that no tech company was investing research money into the problem. (IBM alone has been working on speech for decades.) Then imagine that a pharmaceutical company suddenly enters ImageNet and blows the competition out of the water, leaving the academics scratching their heads at what just happened and the tech companies almost unaware it even happened. Does this seem like a realistic scenario? Of course not. It would be absurd. That’s because tech companies have broad research agendas spanning the basic to the applied, while pharmas maintain anemic research groups on their seemingly ever-continuing mission to downsize internal research labs while building up sales armies numbering in the tens of thousands of employees.

If you think that image recognition is closer to tech’s bottom line than protein structure is to pharma’s, consider the fact that some pharmaceuticals have internal crystallographic databases that rival or exceed the PDB in size for some protein families.

And if you counter with the argument that machine learning is not pharma’s core expertise, then you only prove my point: why isn’t it? While drug companies wrangle over self-titillating questions like “is AI real?” and “how is deep learning any different than the QSAR we did in the 80s”, Alphabet swoops in and sets up camp right in their backyard. As a result the smartest and most ambitious researchers wanting to work on protein structure will look to DeepMind for opportunities instead of Roche or GSK. This fact should send chills down the spines of pharma executives, but it won’t, because they’re clueless, rudderless, and asleep at the helm.

I am being harsh because this has long been a pet peeve of mine. While companies like Alphabet, Facebook, Microsoft, Intel, and IBM have real research groups with billions of dollars spent on fundamental R&D that has led to Nobel- or Turing-grade research, pharmaceuticals engage in “research” so narrowly defined that it rarely contributes to our understanding of basic biology. There is perhaps no better example of this than protein structure prediction, a problem that is very close to these companies’ core interest (along with docking), but on which they have spent virtually no resources. The little research on these problems done at pharmas is almost never methodological in nature, instead being narrowly focused on individual drug discovery programs. While the latter is important and obviously contributes to their bottom line, much like similar research done at tech companies, the lack of broadly minded basic research may have robbed biology of decades of progress, and contributed to the ossification of these companies’ software and machine learning expertise (there is a reason most newly minted ML PhDs run from pharmas like the plague—they have not cultivated a culture that attracts the world’s best ML talent, in part because of their lack of engagement in basic science.) The AlphaFold episode is only one example; several other problems have been similarly neglected. It is of course possible that these companies have some newfangled protein structure prediction technology internally, but I’m well networked in these circles and I have seen no indication whatsoever that this is the case.

Smaller and newer companies like AtomWise have done better, focusing more seriously on methodological research, and it will likely take a Silicon Valley-like disruptor to finally turn things around.

The way forward

So what now? Should academics fold up their protein structure research programs and move on to greener and less competitive pastures? And will the space see new entrants from other companies, possibly life science ones? I am still digesting CASP13 and by no means have a definitive recipe, but here are my thoughts so far.

First and foremost, we should recognize what an unqualified good thing just happened. We, meaning the entire scientific community, have made a major advance on one of the most important problems in biochemistry. Who made the advance is less important than the fact that the advance was made, and we should unselfishly rejoice in this fact. I say this cognizant of the fact that my own emotions do not entirely coincide with the sentiment I just espoused, but also cognizant of the fact that we are all adults, and that we must and ought to assess this rationally without letting our tribal affiliations cloud our judgement.

DeepMind’s entry also brings several unintended benefits. We have a new, world-class research team in the field, competitive with the very best existing teams. This has happened maybe once a decade, if that. We should welcome them with open arms as, first and foremost, new colleagues with shared purpose. We should encourage them to be as open as the academic teams have been in sharing their research, which they appear to be, and learn from them how to improve our engineering practices, and perhaps more importantly, use their example to cultivate a better and more open culture of exchange of ideas, instead of the secretive and siloed behavior that characterizes the field.

DeepMind’s entry also raises the profile of the protein structure problem, likely motivating new students and researchers to work on it, inside academia and out. Perhaps DeepMind’s entry will also wake pharmas from their deep slumber, so that they too begin to stir with new ideas and resources.

Second, regarding the question of how academic groups should respond scientifically to DeepMind’s entry, I suspect the right answer comes from evolution: adapt. Focus on problems that are less resource-intensive, and that hinge on key conceptual breakthroughs rather than engineering. Solving protein structure is really multiple problems in one. There is what I would characterize as the canonical problem, the prediction of the overall fold of the native state, and it is the one that most people, including DeepMind, have focused on. This problem remains unsolved, but it’s clear that for perhaps ~30% of such predictions we can do very well, and for another ~20% reasonably well. If the trend continues, and there are compelling reasons to argue either way, then something like a solution to this problem is conceivable within ~5 years. That solution may come primarily from better engineering, and so it perhaps represents a less favorable strategic landscape for competition.

Most approaches to the above problem are co-evolution-based and are therefore, by construction, “family-level”. They can say much less about an individual protein sequence, such as a mutated or de novo designed protein. This is why I have focused on single-sequence prediction with RGNs, as it is a new frontier. It is unclear whether we are even marginally closer to solving this problem after CASP13; I think there is no indication of any real progress here. And so we could just as well be 20 years out.

Even for MSA / family-level predictions, there is the question of desired accuracy, which hinges on the biological application. If one is predicting protein structures to ascertain their general fold for function classification, then high accuracy is unnecessary. If on the other hand the objective is to design small-molecule drugs that bind proteins, which requires ~1Å accuracy in the local pocket, it is unclear whether we have made any detectable progress.

Finally there is the full realization of the protein folding problem, which concerns not only the final native state but also the dynamical trajectory the protein takes to get there, as well as the relative energetics of the near-native state ensembles. This is arguably the most important problem for protein function prediction, and it remains very far from being solved.

So let us find and learn the important lessons of CASP13; use it to improve our models and our culture; and, recognizing that we are imperfect, competitive humans, rise above our pettiness to celebrate an important milestone for science.

Post-mortem: RGN @ CASP13

While not the primary subject of this blog post, I would be remiss to not comment on my own participation in CASP13 as a predictor for the very first time! The experience was interesting and informative, but in the end RGNs did not perform well, for reasons I will explain briefly. I should say that it is not possible to know definitively without a thorough analysis, so these are only my best guesses for the moment.

First, given the overall improvement of all co-evolution-based methods in CASP13 (even ones that have not changed since CASP12), it appears that the increased availability of protein sequences has widened the information advantage of methods that use co-evolution over those that do not (like RGNs.) I suspect this was the biggest factor in lowering the RGN’s relative ranking. Parenthetically, this suggests a new ultra-hard FM category, single-sequence targets without detectable homologs, an idea brought up during CASP13.

Another problem, which I only discovered at the beginning of the CASP13 prediction season, is that all my raw predictions got immediately rejected by CASP’s automatic processing pipeline. This is due to RGN-predicted structures having non-physical torsion angles. In some ways it is unsurprising since, as a machine learning model, RGNs only optimize for what they are trained for, in this case dRMSD. So while the overall global topology of the structures can be quite good, locally the structures are often poor (a point I mention in the paper), and this prevented submissions from going through.
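To make this failure mode concrete, here is a minimal sketch (plain Python, with toy coordinates of my own invention, not actual RGN code) of the dRMSD loss that RGNs optimize. Because it compares only internal pairwise distances, it is invariant to rotation and translation, but it is also blind to chirality and to local torsion quality: a structure can have the right global topology and a near-zero dRMSD while its backbone torsions are non-physical.

```python
import math

def drmsd(coords_a, coords_b):
    """Distance-based RMSD: the RMS difference between the two
    structures' internal pairwise-distance matrices. Invariant to
    rotation/translation; blind to chirality and torsion quality."""
    assert len(coords_a) == len(coords_b)
    n = len(coords_a)
    sq_diffs = []
    for i in range(n):
        for j in range(i + 1, n):
            da = math.dist(coords_a[i], coords_a[j])
            db = math.dist(coords_b[i], coords_b[j])
            sq_diffs.append((da - db) ** 2)
    return math.sqrt(sum(sq_diffs) / len(sq_diffs))

# Toy illustration: a mirror image has identical internal distances,
# so its dRMSD to the original is exactly 0 even though every
# backbone torsion has flipped sign.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0), (3.0, 1.5, 1.1)]
mirror = [(x, y, -z) for (x, y, z) in native]
print(drmsd(native, mirror))  # 0.0
```

This is why a post-hoc step (FastRelax, in my case) was needed to repair local geometry before submission.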

Overlaid backbone traces of experimental (pink) and RGN-predicted (blue) structures for CASP13 FM target T0957s2-D1. While global alignment is good, local alignment is poor, resulting in a low GDT_TS score of 34.

To get around this problem I fed my predicted structures through the Rosetta FastRelax pipeline, which partly defeats the purpose of my method, but my aim was to obtain structures with sufficiently acceptable local geometry to get past the CASP processing pipeline. This worked most of the time, in the sense that I was able to submit structures, but it altered them in ways that hurt their accuracy.

It is hard to say yet how much this contributed to reducing RGN performance, and there were other contributing factors. For example, I used an old RGN model trained on the ProteinNet12 dataset, i.e. a couple of years out of date, because I did not have time to retrain for CASP13. I doubt this made a major difference, but it was likely a contributor.

All in all it was a good learning experience, and will give me much to think about over winter break.

Acknowledgments

Thanks to David Baker, Alex Bridgland, Jianlin Cheng, Richard Evans, Tim Green, John Jumper, Daisuke Kihara, John Moult, Sergey Ovchinnikov, Andrew Senior, Jinbo Xu, and Augustin Zidek for lively discussions during CASP13 that formed the basis for much of the content of this post.

References

  1. Golkov, V. et al. Protein Contact Prediction from Amino Acid Co-Evolution Using Convolutional Networks for Graph-Valued Images. in Annual Conference on Neural Information Processing Systems (NIPS) (2016).
  2. Jones, D. T., Buchan, D. W. A., Cozzetto, D. & Pontil, M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28, 184–190 (2012).
  3. Kamisetty, H., Ovchinnikov, S. & Baker, D. Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era. Proc. Natl. Acad. Sci. U.S.A. (2013). doi:10.1073/pnas.1314045110
  4. Liu, Y., Palmedo, P., Ye, Q., Berger, B. & Peng, J. Enhancing Evolutionary Couplings with Deep Convolutional Neural Networks. Cell Systems 6, 65-74.e3 (2018).
  5. Marks, D. S. et al. Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 6, e28766 (2011).
  6. Wang, S., Sun, S., Li, Z., Zhang, R. & Xu, J. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model. PLOS Computational Biology 13, e1005324 (2017).
  7. Xu, J. Distance-based Protein Folding Powered by Deep Learning. bioRxiv 465955 (2018). doi:10.1101/465955


Comments

  1. Great post, but I’m genuinely surprised that the code doesn’t have to be shared in some form for the competition. Is this just true of CASP, or is it the normal standard?

    • It is true of CASP and of the field as a whole unfortunately. Several academic groups have not open-sourced their code. Declining to allow them to participate would result in a detectable drop in the number of CASP participants, which I think the CASP organizers are keen on avoiding.

  7. Biology is facing a crisis that derives from the freezing of the field. Resources are limited, so scientists have to keep secrets, which goes against the development of the field.

  9. I’m not sure why this is an indictment of pharma? The major issue for drug discovery is target validation, not protein structure prediction. I’ve worked on several successful projects that led to marketed drugs that had no protein structure information.

    • Agreed. IMO, there’s a proper division of labor between academia and pharma, with CASP-type work falling on the academia (and software vendor) side. Protein structure determination just does not appear to be a bottleneck in drug discovery, and there are enough other challenges related to medchem and biology that pharma has no choice but to address, because they are too “applied” for most academics.

  11. This is a great post. However, I think you are missing one important factor. Academic groups from computer science departments could probably have produced the same improvement as DeepMind. The problem is that the big funding is in biomedicine and, guess what, the experts sitting on panels and reviewing proposals are those that: a) don’t really understand machine learning (this is what their biology-trained postdocs are for…), b) consider that an extensive knowledge of biology is needed to be competitive (clearly not the case, just a self-fulfilling prophecy), and c) have repeatedly used their dominant position to anonymously reject grant proposals that could have challenged that position. Meanwhile, data and methodology were improving over the years. The tipping point has been triggered by AI companies financing these health-related applications. DeepMind did not invent machine learning; it simply had the right combination of expertise and funding to deliver the last blow to this dam, which biochemists had created to protect their dominant position and which was nearly overtopped already. It will happen again; there are many biomedicine problems in a similar situation. The solutions you propose are very appropriate to change this selfish attitude that academic funding institutions allow.

  12. Regarding your parenthetical comment about statistical protein-DNA classifiers, there were groups several years earlier who published in the same area. Zhang published on the application of their DFIRE method to protein-DNA complexes in 2005:

    https://www.ncbi.nlm.nih.gov/pubmed/15801826

    I later published on application of a pure bayesian classifier to the problem in 2007:

    https://www.ncbi.nlm.nih.gov/pubmed/17078093

    There were a number of others, and of course, both of these papers were based on work from the Ram Samudrala and John Moult labs nearly a decade before that:

    https://www.ncbi.nlm.nih.gov/pubmed/9480776

    • Thanks for your comment. Weren’t these papers based on statistical potentials? I.e. based on counting statistics of PDB atomic contacts and correcting using a fixed reference state?

      I was referring to machine-learned potentials and reference states, similar in spirit to the ones AlphaFold learns. Please correct me if I’m wrong. Thank you!

  14. Do you have a more detailed source or description of the technical work behind the protein-family specific reference state that seemed to make a big difference?

      • Thank you for the response. I was referring to the “learned reference state” quoted in the blog post. Looking at it again, it does seem like the method learns a universal reference state and not a protein-specific one. Is that interpretation correct?

    • Potentials of mean force (PMFs) as used in protein structure prediction are not really PMFs as defined by physics. Rather, they are an approximation of a Bayesian inference method called Jeffrey’s conditioning or probability kinematics. This results in updating an approximate pdf q(x) with a correcting ratio of two pdfs p(y)/r(y), where x is a “fine grained” variable and y is a “coarse grained” variable, with y=f(x) (for example the radius of gyration or a set of pairwise distances). For example, q(x) can come from a fragment library. r(y) corresponds to the reference state and is defined by q(x).

      See:
      https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0013714

      https://onlinelibrary.wiley.com/doi/full/10.1002/prot.24386

      Hamelryck, T., Boomsma, W., Ferkinghoff-Borg, J., Foldager, J., Frellsen, J., Haslett, J., Theobald, D. (2015). Proteins, physics and probability kinematics: a Bayesian formulation of the protein folding problem. In Geometry Driven Statistics, Wiley.
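[A toy numerical illustration of the update described in this comment; the numbers and names are my own, not from the cited papers. A fine-grained distribution q(x) is reweighted by p(y)/r(y), where y = f(x) is a coarse-grained feature and r(y) is the reference marginal that q itself induces on y. After the update, the marginal over y matches p exactly, while the conditional structure of q within each y-bin is preserved.]

```python
from collections import defaultdict

def jeffrey_update(q, f, p):
    """Probability kinematics: update fine-grained pdf q(x) so that
    its marginal over y = f(x) becomes p(y). The reference state
    r(y) is the marginal induced on y by q itself."""
    r = defaultdict(float)
    for x, qx in q.items():
        r[f(x)] += qx
    return {x: qx * p[f(x)] / r[f(x)] for x, qx in q.items()}

# Toy setup: x ranges over four conformations; y = f(x) is a binary
# coarse-grained feature ("compact" vs "extended").
q = {"a": 0.1, "b": 0.3, "c": 0.4, "d": 0.2}
f = lambda x: "compact" if x in ("a", "b") else "extended"
p = {"compact": 0.7, "extended": 0.3}  # target coarse-grained marginal

q_new = jeffrey_update(q, f, p)
print(round(q_new["a"] + q_new["b"], 6))  # 0.7 -- marginal now matches p
```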

  18. Thanks Mohammed for such an excellent write-up about CASP and the field of protein structure prediction. Below are my minor comments.
    1. When comparing AlphaFold to the 2nd-ranked method, maybe it is better to compare AlphaFold to Zhang-human instead of Zhang-server since AlphaFold is also a human group. Compared to Zhang-human, I think AlphaFold speeds up the progress by about 50% instead of 80%.

    2. For evolutionary coupling analysis, the following old paper talked about direct and indirect couplings and proposed the maximum-entropy method to resolve this issue: https://www.jstor.org/stable/4356049?seq=1#page_scan_tab_contents . However, this paper was kind of ignored for a few years maybe because of lack of protein sequences back to 1999.

    3. I am not the first one that applied deep learning to contact prediction. In 2012, Cheng group introduced Deep Belief Networks to contact prediction, but this work did not draw much attention from the community. I guess this is because the accuracy improvement is not significant and in CASP11 (2014) a shallow neural network method MetaPSICOV did the best in contact prediction.

    4. There are a couple of reasons why deep ResNet works much better than previous methods. Besides the use of a very deep convolutional neural network, one major reason is that I formulated the contact prediction problem very differently than previous methods such as MetaPSICOV and Cheng’s DNCON. These methods formulated the problem as image recognition, while I treated it as image semantic segmentation. With this new formulation, we can make use of contact patterns or structure motifs to significantly improve prediction accuracy. Since structure information is used in current deep learning methods for contact prediction, maybe we shall not simply say that these new deep learning methods are co-evolution-based (especially considering they sometimes work fine on proteins without any sequence homologs).

    5. Among the three references (1, 4, and 6) you cited for convolutional neural networks for contact prediction, Ref 1 is quite different from Refs 4 and 6. In Ref 1, convolution is only applied to the 21*21 matrix generated by DCA for an individual residue pair instead of the whole contact matrix. That is, since Ref 1 did not take contact patterns into consideration, it is unlikely to have very good performance. Ref 4 is close to Ref 6, but Ref 4 did not use the residual module, so it cannot go very deep and may not yield the best performance.

    6. In CASP13 I did not use the distance bin with the maximum probability to build 3D models. Instead I calculated the mean and variance from the predicted distance probability distribution and then used them to estimate the lower and upper bounds of inter-atom distance. This is because CNS only accepts this kind of distance information. I do not know how to input distance-based statistical potential to CNS.

    7. It was quite natural for me to extend my work from contact prediction to distance prediction. I not only mentioned this in the Conclusion and Discussion section of my 2017 PLoS CB paper (i.e., Ref 6), but also worked on distance prediction before. In 2012, my group used a traditional neural network to predict distance probability distributions and then converted them into protein- and position-specific distance-based statistical potentials for decoy ranking. Please see my paper at https://www.sciencedirect.com/science/article/pii/S0969212612001451 . Another student of mine studied this kind of statistical potential for folding simulation in his PhD thesis (2016), but he left for an industry job in a hurry. In Summer 2018, I published one threading paper that used deep ResNet to derive such a protein-specific distance-based statistical potential and then applied it to improve protein threading. See https://www.ncbi.nlm.nih.gov/pubmed/29949980 for this paper.

    8. Some CASP participants are not surprised at the gradient descent method for energy minimization maybe because Rosetta has this (this is purely my guess).

    9. The boundary between template-based and template-free methods is blurring since a deep learning model may encode information of all the training proteins (as long as it is deep enough). So maybe we shall not classify methods based upon if they have explicitly used templates or not.

    10. I agree with you that academic research is not very efficient in some sense. However, I think our field has made substantial progress even without the DeepMind team. The key ideas (fragment assembly, co-evolutionary analysis and deep ResNet for contact/distance prediction) and the powerful Rosetta are all developed by academia. It took time (and also some random walk) to develop these ideas before we know they actually work. Without these prior studies, I don’t think any team can make substantial progress within a couple of years. By the way, the DeepMind team indeed has very good scientists with strong background in biochemistry/biophysics. To speed up idea exchange, maybe CASP shall de-emphasize ranking and pay more attention to method development. Most top human groups add little value over the best model quality assessment method. Not sure why CASP does not merge human tertiary structure prediction and model quality assessment into a single category.

    • Thank you very much Jinbo for your detailed comments. I added a link to them near the table of contents. With regards to your point 1, I am actually comparing to Zhang (human), not Zhang-server, but looking only at FM predictions.

    • Great comments. For the comment 6: “I do not know how to input distance-based statistical potential to CNS. ” , Maybe you can convert the probabilities to the constraint weights?
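[An illustrative aside on point 6 above: collapsing a predicted per-residue-pair distance histogram into mean/variance and then into lower/upper bounds of the kind CNS accepts might look like the sketch below. This is my own toy version, not Xu’s actual pipeline; the bin values and probabilities are invented, and the mean ± 2σ rule is one plausible choice of bounds.]

```python
import math

def distance_bounds(bin_centers, probs, n_sigma=2.0):
    """Collapse a predicted distance histogram into (lower, upper)
    bounds: mean +/- n_sigma standard deviations, floored at 0."""
    total = sum(probs)
    probs = [p / total for p in probs]  # normalize to a distribution
    mean = sum(c * p for c, p in zip(bin_centers, probs))
    var = sum(p * (c - mean) ** 2 for c, p in zip(bin_centers, probs))
    sd = math.sqrt(var)
    return max(0.0, mean - n_sigma * sd), mean + n_sigma * sd

# Toy histogram: distance bins (in Angstroms) with a predicted
# distribution peaked around 8 A.
centers = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
probs = [0.02, 0.08, 0.25, 0.40, 0.20, 0.05]
lo, hi = distance_bounds(centers, probs)
print(round(lo, 2), round(hi, 2))  # 5.71 9.95
```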

  19. Thank you for this blog post, it was very informative and well written.

    Perhaps for your own RGN+Rosetta modeling, you could take the models that came out of your RGN and harvest them for CA distances, turn those into restraints, and then feed them into Rosetta’s abrelax protocol? FastRelax assumes its inputs have OK torsions, but bad collisions, and so it ramps only the repulsive component of the energy function — but if the torsions are also bad, it might break the structure at the gates trying to correct it. With distance restraints and fragment insertion (Rosetta’s ab initio protocol), you will get structures with the topology you want.

    Just a thought.

    • Thank you very much for the suggestion. That sounds like something I should try! I’m not much of a Rosetta expert at all and tbh this was all done in a rush because I hadn’t realized the structures will get rejected outright. It’s not surprising that FastRelax would have trouble with it because the torsions were all very funky and obviously Rosetta is not parameterized to deal with strange torsions. I suspect there’s a lot of room for improvement here, including making the output from the RGN less torsionally offensive 🙂

  24. Sorry that in the past days I did not get much time to check out comments. Based upon feedback I have received from different channels, I would like to add the following comments:
    1) About progress. In CASP12, the average TMscore of the top 1 models by the best human group is 0.392. In CASP13, the average TMscore of the top 1 models by AlphaFold and Zhang-human is about 0.58 and 0.52, respectively. This is why I said that the speedup made by AlphaFold is about 50% instead of 80%. Nevertheless, the exact number is not that important. Although CASP likes to use GDT, I use TMscore here because it is length-independent. For a protein with >100 AAs, GDT is stricter, while for a protein with <80 AAs, TMscore is stricter.
    2) About distance constraints for CNS. Yes, it is possible to treat predicted distance probability as weight in CNS. However, compared to Rosetta, CNS does not have a good energy function for folding, which may help when predicted distance distribution is not that accurate. Further, Rosetta can also refine a coarse-grained 3D model very well (and thus, greatly improve accuracy) by using a well-developed full-atom energy function. Therefore, it is better to use Rosetta instead of CNS to build 3D models from predicted distance information.
    3) Conceptually, distance prediction is not much different from contact prediction since contact is the binary representation of distance although distance prediction is much more useful to folding than contact prediction.
    4) Protein-specific distance-based energy function is not a new concept. As pointed out by me before, I have published two papers about this: one in 2012 and the other in 2018.
    5) Learned reference state is new, but personally I am not convinced that it is a game changer. When two residues are far away from each other along the primary sequence, you may estimate their reference state using the methods described in DOPE (https://en.wikipedia.org/wiki/Discrete_optimized_protein_energy) or DFIRE. When two residues are close, the DOPE or DFIRE methods may not work well, but I guess some simulation methods may work fine (see Zhang’s RW statistical potential).
    6) Gradient-based energy minimization is not a new concept. Rosetta supports this very well.
    7) My server did not perform as well as AlphaFold, so some people suspect that I may overfit my deep learning model (i.e., I overclaimed the results in my papers). This is simply not true because my server RaptorX-Contact did very well in contact prediction (officially ranked No. 1). RaptorX-Contact also did very well in predicting 3D models, considering that it did not use any energy function or Rosetta to do refinement. In CASP13, RaptorX-Contact predicted correct folds for 17 out of 32 FM targets, consistent with our self-test result (21 out of 37) on the CASP12 FM targets. Please see https://www.biorxiv.org/content/early/2018/11/08/465955 for more details. My two servers RaptorX-DeepModeller and RaptorX-Contact also rank very well among all CASP13 servers.
    8) To understand the CASP ranking well, it is important to know that CASP has two types of predictors: human groups and server groups. For each target, a server group has only 3 days and cannot see the results of other servers, while a human group not only has 3 weeks for a target, but also can make use of all the server results and any available information and resources. Because of this, a human group can easily beat most of the server groups by simply doing a consensus analysis (i.e., majority voting) on all the server predictions or just copying the results of the small set of best servers. That is, it does not make much sense to rank human groups and server groups together.
    9) To get a good ranking in CASP, you do not need new ideas. Instead, you may adopt the following strategies:
    A) Learn how to do a consensus analysis on all the server predictions, since many human groups win this way. This is possibly the simplest and most time-saving strategy.
    B) Download as many existing software packages as possible, learn how to use them, and study how to integrate them (by consensus) for structure prediction. Reimplement some techniques if no good software is available for download. This will take a lot more effort than A), but you may set up an in-house server that ranks well.
    However, none of the above strategies can really move this field forward. To advance this field, we need more people to study the problem from different aspects and using very different methods (e.g., RGN and NEMO), at the risk of a bad ranking in CASP. Although the new methods alone may not rank well in CASP, some of them might have a long-lasting impact on this field when coupled with other techniques.

    • Thanks for your comments Jinbo, and I agree with all of them. You’re quite right in that the final scores can be very metric dependent, and so the gap etc can change depending on what you’re measuring (TM score vs. GDT). That’s why I said take the comparison about the gap with a grain of salt, because while it’s clear AlphaFold did quite well, the true size of the gap is unclear without more rigorous statistics.

  28. Thanks for the interesting and insightful post. Looks like I have a lecture course to update. I think the independent rediscovery is a necessary flipside of giving clever people time and space to think about a problem fundamentally, without being too constrained by others’ methods.

