Sui Huang, MD, PhD
2 min read · Dec 2, 2020

--

Thanks for the clear summary and the optimistic outlook on AI.

I have to disagree, though, with the rather lazy, Hail Mary arguments of simply saying “AI will get smarter” and promising that AI may have a “huge impact on curing intractable diseases”.

I share the optimism. But one cannot express such an attitude without pointing to what is missing in today’s AI, which has been successful only in solving a particular class of low-hanging-fruit problems.

All the amazing success stories you mentioned that have fueled the awe of AI, from protein folding to beating humans at chess and Go, to predicting what movie you would love to see, to giving directions, belong to one class of problems that is distinct from those in, say, medical research. AI is still not able to manage problems in fields that, to paraphrase Herbert Simon (the father of the very idea of AI), are “semantically rich” – like biomedical research.

All the success stories have been in “semantically poor” domains, where the rules (of a game, or those that govern protein folding) and the targeted solution (categories, “cat or not?”, “good or bad move”, shortest route, most similar customer) are easily defined in a few sentences. They can typically be comprehended by a teenager or layperson. Moreover, the big data needed to feed the deep neural networks that solve these problems are quite benign and homogeneous.

By contrast, in “semantically rich” domains you have to learn a lot of knowledge (content) just to define and “understand” a problem, as is typically the case in biomedical research – even after parsing. To master the next pandemic, you need to know a lot about viruses, their evolutionary histories, the biological principles of viral replication at the molecular level, etc.: knowledge that fills endless volumes of textbooks that no NLP algorithm can master. In brief, an AI that cures diseases will need to ingest and in some way “understand” the content of entire medical libraries: highly heterogeneous information without obvious structure. Only then, after acquiring an exhaustive set of content knowledge, can an AI system be fed with data (which, unlike in games, protein folding, and image recognition, are astronomically high-dimensional) and trained.

In summary, in the not too distant future, the paradigm of brute-force deep learning will have to change fundamentally if, as Lance says, AI is to save us: AI will have to shift from the regime of simple rules + big data to a new regime of complex, semantically rich rules + big data. Those complex, semantically rich rules are what one could call “Big Knowledge”. How AI is going to deal with Big Knowledge has hardly been discussed, since AlphaGo, AlphaFold, vehicle navigation, and all the other celebrated success stories belong to the realm of “small knowledge” – despite the need for big data. Data is not knowledge. https://medium.com/@Cancerwarrior/the-science-behind-scientific-wellness-e2a677d391c5

And thus, Big Data is not Big Knowledge.

So, “getting smarter” means not more powerful deep learning, but dealing with Big Knowledge. (How??)
