People enjoy taking pictures of food because they appreciate food itself. Behind each dish, however, lies a recipe that tells its story, and a photograph alone does not reveal how the dish was prepared. In this paper, we present an inverse cooking system that generates cooking recipes from food images. Our system predicts ingredients as sets, using a novel architecture that models ingredient dependencies without imposing any order among them. It then generates cooking instructions by attending jointly to the image and the predicted ingredients. We evaluate the full system on the large-scale Recipe1M dataset and show that: (1) our method improves performance over previous baselines for ingredient prediction; (2) conditioning on both images and ingredients yields high-quality recipes; and (3) according to human judgment, our system produces more compelling recipes than retrieval-based approaches. We release code and models to the public.
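The idea of predicting ingredients as an unordered set, rather than a sequence, can be illustrated with a minimal sketch. All names below are illustrative, and the simple linear sigmoid head stands in for the paper's actual dependency-aware set decoder; it only shows why per-ingredient scores yield a set with no imposed order:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ingredient vocabulary (the real system uses a much larger one).
VOCAB = ["flour", "sugar", "egg", "butter", "tomato", "basil"]

# Toy stand-in for an image encoder's output: one feature vector per image.
image_feat = rng.normal(size=8)

# A single linear layer as a placeholder set-prediction head: one independent
# sigmoid score per ingredient, so the output is an unordered set, not a
# sequence (no order is imposed among the predicted ingredients).
W = rng.normal(size=(len(VOCAB), 8))
b = rng.normal(size=len(VOCAB))

def predict_ingredient_set(feat, threshold=0.5):
    """Return the set of ingredients whose sigmoid score exceeds the threshold."""
    scores = 1.0 / (1.0 + np.exp(-(W @ feat + b)))
    return {ing for ing, s in zip(VOCAB, scores) if s > threshold}

ingredients = predict_ingredient_set(image_feat)
```

In the full system, this predicted set (together with the image features) would then condition an instruction decoder; here the sketch stops at the set itself.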