Theory’s Death is Highly Exaggerated…

Raj Vedam
Jan 9, 2022

Years ago, while working on my doctoral degree in Nonlinear Systems, I encountered the fascinating protein folding problem.

Roughly, it has to do with the final 3D form a long protein chain will assume in response to various stimuli, which in turn determines how that protein behaves in biochemical reactions.

Protein Folding (from Wiki)

The computational problem presents itself as finding the lowest-energy state: the optimization of a nonlinear, multi-timescale dynamical system with an exponentially large number of possible conformations (NP-hard), constrained by “meta-stable” physical states (which makes the problem amenable to heuristics).

Native state at lowest Energy (wiki)
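To make the combinatorics concrete, here is a minimal toy sketch in Python (my own illustration, not the model referenced above): the classic 2D hydrophobic-polar (HP) lattice model, where the “energy” is simply minus the number of hydrophobic contacts, and finding the lowest-energy fold by brute force already means searching roughly 4^n self-avoiding walks for an n-residue chain.

```python
# Toy HP lattice model: a chain of Hydrophobic (H) / Polar (P) residues on a
# 2D grid; energy is -1 per pair of non-adjacent H residues that touch.
from itertools import product

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def energy(coords, seq):
    """Count H-H contacts between residues that are not chain neighbours."""
    pos = {c: i for i, c in enumerate(coords)}
    e = 0
    for (x, y), i in pos.items():
        if seq[i] != "H":
            continue
        for dx, dy in MOVES.values():
            j = pos.get((x + dx, y + dy))
            if j is not None and j > i + 1 and seq[j] == "H":
                e -= 1                      # each contact counted once (j > i+1)
    return e

def best_fold(seq):
    """Brute-force every self-avoiding walk: ~4^(n-1) conformations."""
    best = (0, None)
    for moves in product(MOVES, repeat=len(seq) - 1):
        coords, (x, y) = [(0, 0)], (0, 0)
        ok = True
        for m in moves:
            dx, dy = MOVES[m]
            x, y = x + dx, y + dy
            if (x, y) in coords:            # the chain may not cross itself
                ok = False
                break
            coords.append((x, y))
        if ok:
            e = energy(coords, seq)
            if e < best[0]:
                best = (e, coords)
    return best

# A tiny 9-residue chain already means 4^8 = 65,536 candidate walks; real
# proteins have hundreds of residues, hence the need for heuristics or ML.
print(best_fold("HPHPPHHPH"))
```

Real energy functions are continuous and multi-timescale rather than simple contact counts, but the exponential growth of the conformation space is already visible in this caricature.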

It turned out to be an incredibly difficult exercise, especially since Nature sometimes halts in suboptimal states. With the computational power of those days, prediction was beyond reach except for trivial cases, let alone for the 20,000+ proteins expressed in the human body.

Sample Stages in Protein Folding

The state of the art in the 90s, therefore, was to synthesize the proteins in the lab and use their final forms in biological research.

Protein Synthesis (http://www.msrblog.com/science/medical/describe-about-protein-synthesis.html)

With the availability of high-performance computing and machine learning, AI packages such as AlphaFold have been developed that can predict protein folding with high accuracy, sharply cutting the time and expense needed for lab work. This is an incredible application of machine learning to computationally intractable problems.

This exuberance has led some to predict the “death of Theory” in favor of “knowledge” gained from domain-agnostic Big Data methods and AI. Here are a few reasons why that conclusion is premature:

1. The physics of why something works is as important as “here is a solution that works”. We cannot expect AI, with the inferential engines of today, to answer the “why”; theoretical modeling is singularly needed.

2. A good model is one that not only interpolates, but is also capable of extrapolation. Data-based models are good at interpolating between known data constraints, but questionable at predicting sufficiently beyond the “trained” data ranges. (See the sketch after this list, and the Elephant in the Room example in: https://www.facebook.com/raj.vedam.1/posts/10217236690013799)

3. There are many real-world problems that cannot be cast as ML decision-modeling problems or data-based abstractions, and need hard modeling and computational machinery to solve them.

4. Non-stationarity defies computational modeling, let alone ML inferencing, in a great many biological and econometric processes, necessitating lab work or direct experience. Under this category falls “human understanding / knowing / consciousness”, the subject of Vedanta.
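On point 2, the interpolation/extrapolation gap is easy to demonstrate with a few lines of Python and entirely made-up data (the function, sampling range, and polynomial degree below are illustrative assumptions, not from any real study):

```python
# Fit a flexible, purely data-driven model to samples of a simple "law"
# observed only on x in [0, 3], then ask it about x values it never saw.
import numpy as np

rng = np.random.default_rng(0)

def true_law(x):
    # The underlying theory we pretend not to know.
    return np.sin(x) + 0.1 * x

x_train = np.linspace(0.0, 3.0, 40)
y_train = true_law(x_train) + rng.normal(0.0, 0.02, x_train.size)

coeffs = np.polyfit(x_train, y_train, deg=7)   # data-driven curve fit
model = np.poly1d(coeffs)

for x in (1.5, 2.9, 4.0, 6.0):                 # inside, at the edge, beyond
    print(f"x={x:.1f}  model={model(x):+10.3f}  truth={true_law(x):+10.3f}")

# Typical outcome: near-perfect agreement inside [0, 3], with predictions
# drifting further from the truth the farther we move past the training
# range -- the fit interpolates well, but only the underlying law extrapolates.
```

Any sufficiently flexible data-driven model behaves this way unless the physics is built into it; the same caveat applies, more subtly, to models trained on big data.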

Theory is not dead, and physical modeling is not passé. Data-based inferencing has unquestionable utility, but it cannot replace the theory that drives lab work or computational work in the modeling, simulation, control, identification, and optimization of real-world phenomena. The world will need theory and models based on stochastic theory, calculus, and trigonometry, AND lab work, indefinitely, to sustain human activity.

See also:

https://www.wired.com/2008/06/pb-theory/

https://www.science.org/doi/10.1126/science.abn5795

