Converting input text into moving facial images and speech has gained attention as a basic technique for realizing new types of communication services and human interfaces. This paper focuses on the control method for image synthesis. The proposed control method accepts multiple input sentences and a larger set of control words, and can therefore synthesize longer and more varied facial image sequences easily and automatically.