Learning Structured Appearance Models from Captioned Images of Cluttered Scenes | IEEE Conference Publication | IEEE Xplore