On the Field – Defined Terms

It has been a little over a month since I started this blog about “information, computation, causality, and learning.” I figure it’s well past time to explain what this is all about, how these topics connect to each other, and why someone would leave a good job, while still carrying grad school debt and living in one of the most expensive cities in the U.S., to spend their time reading (and now writing) about them.

As a first cut, my aim is to survey and learn, as much as I can over a still to-be-determined period of time, the science, mathematics and engineering of the computational basis of abstract knowledge representation, reasoning, and learning. Of course this is an extremely tall order. This problem touches on a wide swath of fields including computational linguistics, computational cognition, and machine learning, and people spend lifetimes studying just one small slice of the problem. As formulated, it’s effectively AI-complete and its solution would be equivalent to solving artificial general intelligence.

So, that should immediately set off some alarm bells. A lawyer doing natural language understanding might well be the strongest possible indicator of peak AI hype. And in any event, the pursuit of a vague, almost grandiose goal runs the danger that one will mistake furious activity for understanding, or make illusory connections among ill-defined concepts, or mistake the trivial, obvious and tautological for the true, new and important. The risk is increased when starting from a field (say, private equity mergers and acquisitions) that is not closely linked to the existing academic and commercial AI networks.

You can at least take heart that I’m fully heeding Cosma Shalizi’s comments on cranks in his infamous review of Wolfram’s A New Kind of Science. In the upcoming months on this blog you can expect eccentric exploration of varied topics, but explicitly situated within the existing literature and coupled with an effort to develop some level of rigorous understanding.

The goal is not to become a data scientist. I submit that in contrast to the almost uniformly negative public perception, biglaw is surprisingly underrated by the career zeitgeist. And while the commercial applications or social consequences of AI are obvious and much discussed – still too prematurely in my view – the goal here is the more basic one of increased understanding. Since I have the financial resources to do this for some time (“tenured unemployment” as my friends call it), the educational background (physics and math undergraduate) to tackle the subject at some reasonable level of depth, and a wife who is supportive of crazy ideas, I might as well do it now and see what happens.

The motivating intuition is that large parts of our higher-order thought seem to involve constructing and manipulating various complex causal models. I say causal as opposed to statistical because causal models provide answers to interventional and counterfactual questions in addition to predictive inquiries (i.e., not just “will that stack of blocks fall down?” but “what would have happened if they were arranged differently?” and “if I reach my hand out, what will occur?”). Current AI methods have shown that we can build programs that exceed human performance on tasks when the rules are explicit and pre-specified and we have complete knowledge of all of the pieces and players – i.e., if they’re given the game engine, they can probably learn to beat any human at playing the game. If we could build a game engine through which a program could play the game of “corporate law”, I’m sure I would have many more friends in the land of tenured unemployment. But of course in real life (and expanding on Feynman’s famous analogy) you need to learn to see the chess pieces and the board at the same time you’re learning the rules of the game at the same time you’re learning the names of the pieces at the same time you’re playing the game – and the game keeps changing! I’m interested in seeing how we construct and manipulate those models, and represent them computationally, when the “pieces” are purely abstract ideas.
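To make the predictive / interventional / counterfactual distinction a bit more concrete, here is a minimal sketch in Python of a made-up block-tower world. Everything in it is an assumption for illustration only: a hypothetical "careless builder" who both misaligns the tower and bumps the table, the 50/50 chance of carelessness, and all of the function names. It is a toy, not a claim about how any real system represents causal models.

```python
from fractions import Fraction

# Hypothetical toy world (all names and numbers are illustrative assumptions):
# a careless builder both misaligns the tower AND bumps the table.
P_CARELESS = Fraction(1, 2)


def run_model(careless, do_misaligned=None):
    """Evaluate the structural equations for one value of the hidden cause.

    Passing do_misaligned overrides the tower's alignment by fiat,
    cutting its usual dependence on the builder.
    """
    misaligned = careless if do_misaligned is None else do_misaligned
    bump = careless
    falls = misaligned or bump
    return {"careless": careless, "misaligned": misaligned,
            "bump": bump, "falls": falls}


def worlds():
    """Enumerate each value of the hidden cause with its probability."""
    yield True, P_CARELESS
    yield False, 1 - P_CARELESS


def p_falls_given_misaligned(value):
    """Prediction: P(falls | misaligned == value), from passive observation."""
    num = den = Fraction(0)
    for careless, p in worlds():
        world = run_model(careless)
        if world["misaligned"] == value:
            den += p
            if world["falls"]:
                num += p
    return num / den


def p_falls_do_misaligned(value):
    """Intervention: P(falls | do(misaligned = value)), reaching in to fix the tower."""
    total = Fraction(0)
    for careless, p in worlds():
        if run_model(careless, do_misaligned=value)["falls"]:
            total += p
    return total


def counterfactual_falls(observed, do_misaligned):
    """Counterfactual: keep the hidden causes consistent with what was
    actually observed, change one thing, and replay the model."""
    consistent = [c for c, _ in worlds() if run_model(c) == observed]
    return [run_model(c, do_misaligned=do_misaligned)["falls"] for c in consistent]


observed = run_model(careless=True)  # the tower we actually saw fall
print(p_falls_given_misaligned(False))         # prediction: 0
print(p_falls_do_misaligned(False))            # intervention: 1/2
print(counterfactual_falls(observed, False))   # counterfactual: [True]
```

In this toy world, well-aligned towers never fall when we merely watch, because only careful builders produce them. Straightening the tower ourselves leaves a one-in-two chance of a fall, since the careless builder still bumps the table. And the tower we actually saw fall would have fallen even if it had been aligned. A purely statistical summary of the observations gives only the first answer.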

On deck next week:

thought vectors and distributed representations of words and phrases;
algorithmic fairness and computational social science.
