Connecting Vision and Language with Video Localized Narratives | IEEE Conference Publication | IEEE Xplore