Talking, drawing and editing with visual foundation models