Images and words: mechanics of automated captioning with neural networks
Image captioning is the process of generating a textual description of an image. It combines Natural Language Processing and Computer Vision to produce the captions. As in the proverbial “finger pointing to the moon”, automated image captioning requires the ability to discern what is really going on in a scene and to generate a fluent description of the action taking place. In this talk we present the underlying mechanics of object detection and language generation using Convolutional and Recurrent Neural Networks.
Computer engineer with 11 years of experience, specialized in mission-critical, high-traffic, highly available Linux architectures and infrastructures (since before the cloud existed), with substantial experience in the development and management of web services. He has served as Infrastructure Lead at 4 companies (Translated, N26, Wanderio, Klar) and participated in 2 EU-funded, multimillion-budget NLP research projects (MateCAT, ModernMT). Alberto has a varied background that ranges from DevOps to machine learning, and from corporate banking to the ever-changing startup world.