We use machine learning as a tool to study decision making, focusing specifically on how physicians diagnose heart attack. An algorithmic model of a patient’s probability of heart attack allows us to identify cases where physicians’ testing decisions deviate from predicted risk. We then use actual health outcomes to evaluate whether those deviations represent mistakes or physicians’ superior knowledge. This approach reveals two inefficiencies. Physicians overtest: predictably low-risk patients are tested but do not benefit. At the same time, physicians undertest: predictably high-risk patients are left untested and then go on to suffer adverse health events, including death. A natural experiment using shift-to-shift variation in testing confirms these findings. Simultaneous over- and undertesting cannot easily be explained by incentives alone; it instead points to systematic errors in judgment. We provide suggestive evidence on the psychology underlying these errors. First, physicians use too simple a model of risk. Second, they overweight factors that are salient or representative of heart attack, such as chest pain. We argue that health care models must incorporate physician error, and we illustrate how policies focused solely on incentive problems can produce large inefficiencies.
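The following is a minimal sketch of the kind of deviation analysis the abstract describes: fit a risk model on tested patients, predict risk for everyone, and flag tested low-risk and untested high-risk cases. All names, thresholds, the model choice, and the synthetic data are illustrative assumptions, not the paper's actual specification, and the sketch omits the outcome-based validation and the shift-to-shift natural experiment.

```python
# Hypothetical sketch; variable names, cutoffs, and model are assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins: patient features, the physician's testing decision,
# and (observed only for tested patients) whether the test was positive.
n = 5000
X = rng.normal(size=(n, 10))                      # patient risk factors
tested = rng.random(n) < 0.3                      # physician chose to test
true_risk = 1 / (1 + np.exp(-X[:, 0]))            # unobserved "true" risk
positive = (rng.random(n) < true_risk) & tested   # outcome seen only if tested

# 1. Fit an algorithmic risk model on tested patients, whose outcomes we observe.
model = GradientBoostingClassifier().fit(X[tested], positive[tested])

# 2. Predict heart-attack risk for all patients, tested or not.
risk = model.predict_proba(X)[:, 1]

# 3. Flag deviations between predicted risk and the testing decision.
LOW, HIGH = 0.05, 0.40                            # illustrative cutoffs
overtested = tested & (risk < LOW)                # tested despite low predicted risk
undertested = ~tested & (risk > HIGH)             # untested despite high predicted risk

print(f"share tested at low risk:    {overtested.mean():.3f}")
print(f"share untested at high risk: {undertested.mean():.3f}")
```

In the paper's setting, the flagged groups would then be checked against realized outcomes (e.g., adverse events among untested high-risk patients) to distinguish physician error from private information the model lacks.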