Seattle, April 30, 2019: Last week Arivale, the direct-to-consumer “scientific wellness” start-up, closed its doors. But the end of Arivale is not the end of “scientific wellness”. The attribute “scientific” is more than a marketing label, more than the expression of a sentiment. However, to see its substance we have to reach beyond “data science”, which is not science but is what Arivale relied upon. Deep sequencing and deep learning will not do it; we also need deep thinking. If theory without data is useless, so is data without theory. An epistemic consideration for big data medicine.
In view of the rapid cycles of booms and busts in the arena of big data medicine, the sudden end of the health-tech startup Arivale should not have been big news. Yet it sent shock waves through the land of a promising new genre of health service. Because Arivale spearheaded the selling of “scientific wellness” as a novel type of service, its demise carries the weight of an indicator for the future of an entire field. But beyond all the usual Monday-morning quarterbacking about the absence of a market or the failure of marketing or of management, etc., there is a much deeper issue that the closing of Arivale should prompt us to reflect upon: What actually is “scientific wellness”, a term that Arivale has trademarked but that is now used ubiquitously? I have not found a single onomasiologically solid definition on the internet. A broader epistemological consideration is hence in order.
The hype around the wellness industry, especially when tethered to the big data boom, has been noted. But is all this just smoke without the roast, or is there substance hidden in the smoke? Was adding the attribute ‘scientific’ to ‘wellness’ meant to suggest the latter, or did it just blow more hot air into the balloon, helping its rapid rise? Everybody knows what ‘wellness’ is (even if you cannot define it, “you know it when you see it”). But what makes it “scientific”? Isn’t modern medicine already based on scientific methods? I will contend that there is concrete substance behind a new endeavor in the wellness industry that deserves the attribute “scientific” in a sense that is fundamental, not sentimental. But it is nuanced and hard to articulate. Even Arivale’s scientists and marketing team, and certainly its competitors, have not bothered to think in detail and depth about the “scientific” part in ‘scientific wellness’.
The idea of “scientific wellness” is not a marketing ploy. It has its origin in the vision of Leroy Hood, one of the founders of Arivale, which is a spin-off of the Institute for Systems Biology in Seattle that Dr. Hood founded almost 20 years ago to promote systems approaches to biology. The idea of “scientific wellness” can only be understood in the context of a systems approach and of the question of how the achievements of “omics” biology (from genomics to metabolomics) can be leveraged to improve health. “Scientific wellness”, despite its struggle to find a cogent definition, captures an idea that is profound in ways unbeknownst to many, including the data scientists at Arivale. ‘Data science’ is not science, and most of the current analysis done in the spirit of “scientific wellness” is merely data science. That’s the problem. It is “science”, more precisely “systems science”, which makes the new approach to wellness “scientific”. But in what specific way? This is hard to explain, but let me try.
DATA SCIENCE IS NOT SCIENCE
The brute-force collecting, correlating and categorizing of data and then interrogating medical knowledge bases for “actionables” that would improve wellness is anything but a scientific method. Such an approach to big data is not the recipe for a home run in medicine; it is the desperation behind a Hail Mary pass. Yes, data is necessary in science. But data alone is of little use without science, and may even stifle scientific progress. The blind algorithmic sieving of vast amounts of multi-omic, longitudinal data with ever more sophisticated statistical tools, in the hope of finding gold nuggets that may be monetized, is not science.
Sure, the activity that the misnomer ‘data science’ describes can be most useful in many domains of life: for instance, to figure out what your client’s likely next purchase is, which movie she is likely to watch next, whom she is likely to vote for, etc. All this has practical implications that translate into revenue. Such profitable use of data indeed does not require science and is not science. And conversely, good science does not need to be useful. In these commercial applications, the findings produced by data scientists are per se the “actionables”. But in medicine as a biological science there is one big additional step between data and their application that few data scientists see: it is called ‘science’, or more precisely, a scientific theory that deals with general, fundamental principles. Data (or observations, in old speak) is needed both to erect the general principles of the theory and to apply them to the specific instance.
Since data science is not science, and data scientists are not necessarily trained as scientists (few are), it is excusable for them not to be aware of the very existence of science and the need for it in some (but not all) domains where data is being produced and analyzed. To think that data analytics will somehow, magically and consistently, unearth actionable information comes close to what the Swiss psychiatrist Eugen Bleuler in 1919 called “autistic-undisciplined thinking” in medicine, now in the cloak of big data and informatics.
As the late molecular biologist and Nobel laureate Sidney Brenner suggested, the most useful “omics” is economics. Thus, while big data fuels revenues at Amazon or Netflix, omics in medicine is still very hard to monetize. Figuring out that the blood level of metabolite X in your omics profile is too high does not automatically mean that you need to lower it to improve your personal wellness. Medicine is more complicated than that: actionable information cannot be derived by the sort of hand-waving, causality-based rational thinking that works like a charm in commercial data analysis and in so many domains of life. Knowing that Vitamin D promotes calcium absorption and mineralization does not automatically mean that taking Vitamin D supplements will prevent osteoporosis in healthy people (it mostly doesn’t). Knowing that Vitamin C promotes the activity of immune cells does not automatically mean that Vitamin C can prevent cancer or the common cold (it doesn’t, as we now all know). Knowing that magnesium reduces muscle contractility does not automatically mean that it will prevent cramps in your next marathon (it won’t).
PLAUSIBLE CAUSAL MECHANISMS ARE NEAR USELESS IN MEDICINE
The kind of highly plausible, mechanistic explanations exemplified above just don’t work in medicine (most of the time). Few researchers without long exposure to medical research, including the data scientists and bioinformaticians who now enter biomedicine in droves because of their indispensable skills, appreciate this fact. Clever “engineering hacks”, logically deduced from a causal mechanism, however solid, and no matter whether identified by human reasoning or by sophisticated statistical inference algorithms, are not suited for informing care decisions in medicine. The most extreme modern form of such a primitive mode of thought is epitomized by “precision medicine”, which naively treats the human body as a complicated Rube Goldberg machine of sequential molecular mechanisms amenable to linear thinking about causation. This is not how diseases work.
The human organism is a complex non-linear (stochastic) dynamical system. The kind of simplistic linear reasoning that data scientists bring into medicine may have utility only on the few islands of linearity in a vast sea of non-linearity. Such islands, on which the principle of linear causation works, are certain monogenic diseases, infectious diseases or some cancers driven by a single dominant oncogene; that is all. (This will disappoint the advocates of precision medicine, but humbling warning signs of its immanent limitations due to organismal complexity are accumulating and have been described by many observers, notably Dr. Michael Joyner of the Mayo Clinic.)
It is the very fact that plausible mechanistic rationales are not reliable in the real world of medicine that has given birth to evidence-based medicine (EBM). Using a narrower meaning of the term ‘evidence’, EBM espouses empirical knowledge and eschews “expert opinion”, which is often rooted in the afore-criticized ad hoc mechanistic causal reasoning. EBM extracts clinically actionable information from large, rigorous clinical trials and epidemiological studies that establish best-practice guidelines. EBM has been vastly successful in many ways, as best seen in cardiovascular medicine, saving millions of lives. But EBM has clearly reached its limits. Its descriptive, empirical approach has little patience for logical inference on the grounds of molecular or physiological mechanisms, because they cannot be trusted. But we know more and more about molecular pathways and organismal pathophysiology; they fill vast databases.
While care based on EBM is still orders of magnitude more reliable than care based solely on plausible ad hoc molecular rationales, one problem with EBM is that its insights, since they are derived from clinical trials, by necessity pertain to aggregates of large patient cohorts and often do not apply to the individual. The idiosyncrasy of the individual person in need of medical care poses a fundamental problem to EBM and is what personalized medicine, precision medicine and big data medicine seek to address. One possible solution is the so-called “N-of-1” trial. But this scheme of obtaining evidence for the efficacy of a treatment for one particular patient through a series of treatment/observation cycles in the same patient is logistically challenging. In the long run, and epistemologically, there is no way around mechanistic reasoning that utilizes instance-specific information when it comes to improving the condition of an individual patient (the instance), because the general EBM guidelines apply only to an imaginary “average patient”.
Our knowledge of the human body, even if one could search the exploding medical databases for all molecular pathways and physiological laws, is not yet ready to permit reliable deductive reasoning to “figure out”, as engineers or car mechanics or detectives do, what is wrong with a given instance (of a class of systems), what the cause of it is and how to fix it. The doctor cannot deduce, by combining a given patient’s multi-omic profile information with existing medical knowledge, how to treat the complex clinical condition of that specific multi-morbid patient. She still needs to “look it up” in (or memorize it from) some handbook or paper with general guidelines determined by empirical clinical studies. In medicine, unlike in physics, chemistry, engineering, meteorology (see below), etc., we lack a formal theory for logical deduction of what to do in a specific case without relying on explicit empirical guidelines (“if blood pressure is higher than 130/80 mmHg, and there are other risk factors x, then start drug treatment”).
So, if both traditional evidence-based medicine and the new approaches based on ad hoc causal mechanisms derived from personal data have their limitations, what can we do in the era of big data medicine?
BEYOND EVIDENCE-BASED MEDICINE AND LINEAR MECHANISTIC REASONING
Enter “scientific wellness”. I propose that the scientific principle warranting the attribute ‘scientific’ is the application of burgeoning formal concepts from systems biology, which is what originally fostered the idea of “scientific wellness”. The term, coined by Dr. Leroy Hood, is far from a tacit suggestion that current medicine is “non-scientific”; of course modern medicine is based on science. But in Hood’s view, the term ‘scientific wellness’ implies an emphasis on the optimization of wellness, informed by scientific principles, without explicit reference to ‘disease’. There is a subtle but fundamental, clear-cut difference between “optimizing wellness” and “preventing disease”, beyond the sentimental half-full/half-empty glass analogy. The focus on optimizing wellness is motivated by insights into the very nature of transitions from the regime of ‘being well’ into that of ‘being sick’, if we imagine, following a formalism from the theory of dynamical systems, that these two regimes of health are complementary domains in the space of all possible health states (the green and red zones in the figure below).
Applying the theoretical framework of complex dynamical systems to medicine will obviate the habit of assuming linear causal mechanisms that has become the default mode of operation of data scientists and statisticians analyzing the big data produced by multi-omics profiling of consumers and patients: “If A increases before disease B is manifest, then A could be a biomarker or even a therapeutic target for disease B”. Such linear thinking rests on plausible logical relationships, but too often its predictions fail to materialize, which is what led to systems biology in the first place: the millions of measurable variables (genes, proteins, metabolites, symptoms, clinical signs, etc.) are all interconnected. They influence each other, forming one integrated system that exhibits a collective, emergent behavior.
SYSTEMS DYNAMICS OF WELLNESS-TO-DISEASE TRANSITION (AND HOW TO DELAY IT)
Why is a systems approach different from figuring out individual causal mechanisms in understanding loss of wellness? This gets very technical, but very briefly:
As a consequence of the inevitable degradation of organismal function with age, molecular regulatory feedback circuits lose their full functionality. In the theoretical view of said health state space, this process shifts the organism gradually from the regime of homeostasis (self-stabilization against perturbations) towards the regime of disease, until it enters that regime and is then trapped in it. This is not a metaphor but can be reduced to crisp mathematical principles of systems dynamics applied to physiological regulation.
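To make the notion of being “trapped” tangible, here is a minimal sketch of a toy system with two stable regimes. Everything in it is an illustrative assumption and not a physiological model: the single state variable x (positive values standing in for the wellness regime, negative values for the disease regime), the drift function x - x^3 - s, and the “degradation” parameter s are all invented for this purpose. The sketch slowly raises s and then lowers it back to where it started.

```python
# Toy bistable system (illustrative assumption, not a model of physiology).
# x > 0 stands for the "wellness" regime, x < 0 for the "disease" regime.
# s is a hypothetical degradation level that tilts the landscape.

def drift(x, s):
    """Rate of change of the state: two stable regimes for small |s|."""
    return x - x**3 - s

def relax(x, s, dt=0.01, steps=5000):
    """Let the state settle at a fixed degradation level (Euler steps)."""
    for _ in range(steps):
        x += dt * drift(x, s)
    return x

def sweep(levels, x0=1.0):
    """Step through degradation levels, carrying the state along."""
    x = x0
    visited = []
    for s in levels:
        x = relax(x, s)
        visited.append((s, x))
    return visited

ramp_up = [i * 0.05 for i in range(13)]      # s from 0.0 up to 0.6
ramp_down = list(reversed(ramp_up))[1:]      # and back down to 0.0
trajectory = sweep(ramp_up + ramp_down)

for s, x in trajectory:
    regime = "wellness" if x > 0 else "disease"
    print(f"s = {s:4.2f}   x = {x:6.3f}   {regime}")
```

In this toy model the wellness regime ceases to exist once s passes a threshold, and the state drops into the disease regime. Bringing s back to its starting value does not bring the state back: the system begins and ends at the same degradation level but in different regimes, which is the sense of “trapped” used above.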
We can frame this process in terms of a unifying principle, not individual molecular mechanisms, which may differ from individual to individual. Reduced self-stabilization is prosaically manifest, as many of us sadly notice, in the longer times it takes to recover from injury as one gets older. The restoring forces that underlie the “wellness attractor” become weaker, and the health state fluctuates with larger amplitudes around the optimal wellness state. This is why Dr. Hood attaches so much importance to optimizing wellness, which becomes the “treatment goal” in its own right, independent of any notion of specific pathogenetic pathways that one would seek to block in the conventional idea of disease prevention. ‘Disease’ is not part of the equation, hence the term ‘wellness’! And this is also why we need to obtain data points over time while the individual is still in the healthy regime in order to optimize the stability of wellness. (OK, “treating the healthy” may sound like a luxury and a waste of money, but we are talking science here, not social economy. And look at LED lamps, smartphones, electric cars: every new technology, initially affordable only to the wealthiest, will one day become cheaper and accessible to all.)
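The link between weaker restoring forces, slower recovery and larger fluctuations can be sketched with a noisy, linearly restored variable (an Ornstein-Uhlenbeck process). This is a hedged toy example: the restoring strengths 2.0 and 0.2, the labels “young” and “old”, the noise level and the function names are assumptions chosen for illustration, not measured quantities.

```python
import math
import random
import statistics

def simulate_health_state(restoring_strength, noise=1.0, dt=0.01,
                          steps=200_000, seed=0):
    """Noisy state pulled back towards the set-point 0 (Euler-Maruyama)."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for _ in range(steps):
        pull_back = -restoring_strength * x * dt
        kick = noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += pull_back + kick
        path.append(x)
    return path

def recovery_half_time(restoring_strength):
    """Time for a deviation to shrink by half when noise is ignored."""
    return math.log(2) / restoring_strength

# Hypothetical restoring strengths: strong ("young") versus weak ("old").
spread_young = statistics.pstdev(simulate_health_state(2.0))
spread_old = statistics.pstdev(simulate_health_state(0.2))

print("fluctuation spread, strong restoring force:", round(spread_young, 2))
print("fluctuation spread, weak restoring force:  ", round(spread_old, 2))
print("recovery half-time, strong:", round(recovery_half_time(2.0), 2))
print("recovery half-time, weak:  ", round(recovery_half_time(0.2), 2))
```

With identical noise, the weakly restored state wanders further from the set-point and takes longer to return after a perturbation. This is why, in this picture, repeated measurements over time in a still healthy individual carry information that a single snapshot does not.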
The above rather abstract formal concept can also be illustrated (in a somewhat stretched but permissible way) by a loss of system control akin to what happened in the recent tragic crashes of the Boeing 737 MAX: a faulty feedback system results in undulating flight due to excess correction and counter-correction, leading to increasing departure from the optimal set-point (the cruising altitude entered in the autopilot, the equivalent of homeostasis). The declining accuracy of control loops in the body exposes more and more primordial, once critical and hence powerful regulatory processes. They become newly connected into inappropriate self-reinforcing feedback loops. The faulty feedback control can trap the system in a fateful self-propelling trajectory of increasing dysfunction, which in early stages is manifest in an ever larger diversity and fluctuation of system states.
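The pattern of correction and counter-correction can be reproduced in a few lines. The following is a deliberately crude sketch and not a model of any aircraft system or physiological circuit: the controller, its gain of 1.2, and the one-step sensor delay are assumptions made only to show how the same corrective rule that stabilizes a system with accurate signals destabilizes it once the signal it acts on is stale.

```python
def simulate(gain, delay, steps=60, x0=1.0):
    """Deviation from the set-point under a proportional correction.

    Each step subtracts gain * (the deviation as sensed `delay` steps ago).
    delay = 0 means the controller sees the current deviation.
    """
    history = [x0] * (delay + 1)           # pad so the delayed reading exists
    for _ in range(steps):
        sensed = history[-1 - delay]       # possibly outdated sensor reading
        history.append(history[-1] - gain * sensed)
    return history[delay:]

def late_amplitude(trajectory, window=10):
    """Largest deviation during the final `window` steps."""
    return max(abs(value) for value in trajectory[-window:])

accurate = simulate(gain=1.2, delay=0)     # controller sees the true state
stale = simulate(gain=1.2, delay=1)        # same controller, outdated signal

print("late deviation with accurate sensing:", late_amplitude(accurate))
print("late deviation with stale sensing:   ", late_amplitude(stale))
```

With accurate sensing the deviation dies out; with the stale signal each correction arrives too late, overshoots, and the swings grow from cycle to cycle. Nothing in the controller itself has changed, only the quality of the information it receives.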
If sensors send inaccurate signals, the organism as a complex multi-layer system will alternate between under- and overcompensation because it fails to terminate corrective actions as needed. Maladaptation ensues: wound healing and regeneration fuel cancer growth, hemostasis and vessel repair build atherosclerotic plaques, the life-saving anti-shock system pushes up blood pressure, immune cell checkpoints permit autoimmunity or weaken immune defense, etc. In each individual the molecular defects that drive these maladaptive vicious cycles can be different; only the clinical outcome is similar. Such behaviors, as systems theorists will readily appreciate, are characterized by the existence of multiple alternative ways to arrive at a behavior with common properties: a point of no return and an irreversible, one-directional course after that. This is then recognized as a chronic disease. This is why stabilizing wellness eo ipso is a form of prevention.
Also, this convergence of distinct trajectories in a high-dimensional multi-omics space towards these common endpoints is one reason why it is so challenging to find biomarkers for this and that — because individuals may take slightly distinct trajectories into the disease regime. Thus, identifying and interpreting the nature of the above degradation and mis-wiring of control loops when analysing omics data demands not only impeccable knowledge of biochemistry and pathophysiology and of medicine but also a robust command of dynamical systems theory that deals with the non-linear dynamics that is at the core of any self-propelling departure from homeostasis. The statistical associations offered by data science won’t do it.
This formalism may still be too abstract to be actionable. Sure, it still needs to be developed, but the idea already offers a formal framework to unify the individual-specific molecular changes that differ from person to person and that have therefore defied the population averaging of EBM. The systems dynamics concept puts us on a path towards case-specific deductive assessment of the nature of molecular dysfunction in the health state of an individual patient. With a general theory, the specific data collected for an instance can be explanatory and predictive without the need for statistical associations based on generic biomarkers. Such analysis would then be more like solving a mystery case, akin to the work of a detective or investigator, and less like the current clinicians’ routine of comparing patient data to some reference for diagnosis and then applying standard-of-care treatment. This would forever change the intellectual task of the care provider.
It is this new systems perspective that underpins Hood’s concept of optimizing wellness within the broader vision of scientific wellness. This theory-informed approach is far more than listing correlations and inferring causation from omics data. Understanding non-linear systems behaviors will require both tons of data and a solid theory. We have in the past years emphasized the former but neglected the latter. Much as in hurricane prediction, a task at which we are stunningly good, both longitudinal high-dimensional data and theory (classical mechanics, thermodynamics, fluid mechanics, etc.) are required. In science, data without a formal theory is useless, unless you are a data scientist working outside of science, namely on mundane but commercially important tasks, such as figuring out what ad you should push onto a website when a particular client visits it.
OUTLOOK: LESSONS (TO BE) LEARNED
Sadly, in current big data medicine, the data scientists do not ask profound scientific questions; they do not formulate hypotheses, let alone frame theories. In a culture of infatuation with big data without theory, to which Arivale may have succumbed over the past few years, we have not only neglected systems theory but also tossed out the classical medical disciplines that embrace theoretical principles and scholarly pursuit in explaining the path to disease, such as biochemistry and pathophysiology. It is all too tempting to skip all the steps of scholarly discipline, rigorous theory and science in the era of big data. Deep sequencing and deep learning are replacing deep thinking.
Arivale, small, innovative and nimble, would have been in a position to be a leader in opposing this unfortunate trend away from science and towards brute-force, theory-free data analysis, because of the historical roots of “scientific wellness” in the very discipline of systems biology. This home-field advantage over the Googles, Microsofts and IBMs, and whoever else has entered health care thinking that their data-crunching machinery and AI capacities alone will suffice, has too readily been given away.
Arivale was part of the new wave of health-tech companies which, in view of the weakness of the scientific basis for their services, must rely on two arms for potential revenue streams: one arm provides a direct-to-customer health-related service (despite at times still sketchy science); the other collects the customer data and, so the idea goes, converts them into new intellectual property to establish the missing scientific basis, but also to sell new knowledge to big pharma. This two-arm strategy poses a leadership challenge but makes sense. The first arm should ethically be based on solid science, but we are simply not there yet, and the second arm can generate new knowledge of diseases and even contribute to the missing science for the former. But this would require rigorous scientific thinking and systems theory in the above-described sense.
The two-arm business model of funding knowledge acquisition by companies is critical for advancing medical knowledge because governmental agencies have long ceased to fund true innovation. The peer-review system for grant proposals is structurally destined to stifle big, outside-the-box innovative ideas. In fact, for years now the National Institutes of Health has mostly funded projects in data collection and management, as well as in the technology for doing so, but not projects that erect brand new biological concepts. In brief, the NIH, too, has succumbed to the autistic-undisciplined “data-only” mentality. Thus, the main source of paradigm-shifting new ideas in our understanding of the human body, beyond the current molecular stamp-collecting and bean-counting, will be bold innovative start-ups that bypass academic peer review and yet embrace some basic science.
If the costs of assays for personal data acquisition are too high, thereby slowing expansion of the first arm, as seems to have been the case with Arivale, then obviously priority should be put on the second arm of revenue to subsidize the first arm through partnerships with big pharma. But this would require more than the diffuse promise that “somehow” big data mining will deliver new IP. One needs a concrete research program that lays out a logical path to answer a biological question driven by a hypothesis that is rooted in a theory (e.g., of systems dynamics) and in rigorous medical knowledge. The problem is that few data scientists and big data entrepreneurs know how to do this.
* * *
Thus, to conclude: In medicine (not in commerce) there is an important, often overlooked step that lies between data acquisition and actionable information, and this step is called “science”, as in “scientific wellness”. We can only hope that all the competitors that may now jump into the space freed by Arivale appreciate this and learn to overcome the autistic-undisciplined thinking of data science, because “scientific wellness”, with the fundamental and not sentimental meaning of “scientific”, is the future.
AT THE END, ALL WILL BE WELL(NESS). IF IT IS NOT WELL(NESS), IT IS NOT THE END.