Abstract
Designing and generating new data with targeted properties underpins critical applications such as molecule design, image editing, and speech synthesis. Traditional hand-crafted approaches rely heavily on expert experience and intensive human effort, yet still suffer from insufficient scientific knowledge and low throughput, which limits effective and efficient data generation. Recently, advances in deep learning have created the opportunity for expressive methods that learn the underlying representation and properties of data. This capability provides new ways of determining the mutual relationship between the structural patterns and functional properties of the data, and of leveraging that relationship to generate structural data given the desired properties. This article is a systematic review of this promising research area, commonly known as controllable deep data generation. First, the article raises the potential challenges and provides preliminaries. Then it formally defines controllable deep data generation, proposes a taxonomy of the various techniques, and summarizes the evaluation metrics in this specific domain. After that, it introduces exciting applications of controllable deep data generation and experimentally analyzes and compares existing works. Finally, the article highlights promising future directions of controllable deep data generation and identifies five potential challenges.
- [194] . 2019. Hierarchical cross-modal talking face generation with dynamic pixel-wise loss. In Conference on Computer Vision and Pattern Recognition. 7832–7841.Google ScholarCross Ref
- [195] . 2020. CIAGAN: Conditional identity anonymization generative adversarial networks. In Conference on Computer Vision and Pattern Recognition. 5447–5456.Google ScholarCross Ref
- [196] . 2021. Controllable emotion transfer for end-to-end speech synthesis. In International Symposium on Chinese Spoken Language Processing. IEEE, 1–5.Google ScholarCross Ref
- [197] . 2021. Generating syntactically controlled paraphrases without using annotated parallel pairs. In Conference of the European Chapter of the Association for Computational Linguistics.Google ScholarCross Ref
- [198] . 2020. C-flow: Conditional generative flow models for images and 3D point clouds. In Conference on Computer Vision and Pattern Recognition. 7949–7958.Google ScholarCross Ref
- [199] . 2018. Unsupervised text style transfer using language models as discriminators. Conference on Neural Information Processing Systems 31 (2018).Google Scholar
- [200] . 2020. Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 6189–6193.Google ScholarCross Ref
- [201] . 2021. Prosodic features control by symbols as input of sequence-to-sequence acoustic modeling for neural TTS. IEICE Transactions on Information and Systems 104, 2 (2021), 302–311.Google ScholarCross Ref
- [202] . 2021. Expressive text-to-speech using style tag. In Interspeech.Google Scholar
- [203] . 2019. Multi-reference tacotron by intercross training for style disentangling, transfer and control in speech synthesis. arXiv preprint arXiv:1904.02373 (2019).Google Scholar
- [204] . 2021. Model architectures to extrapolate emotional expressions in DNN-based text-to-speech. Speech Communication 126 (2021), 35–43.Google ScholarCross Ref
- [205] . 2021. Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 5734–5738.Google ScholarCross Ref
- [206] . 2020. GeDi: Generative discriminator guided sequence generation. In Conference on Empirical Methods in Natural Language Processing.Google Scholar
- [207] . 2019. Talking face generation by conditional recurrent adversarial network. In International Joint Conferences on Artificial Intelligence. 919–925.Google ScholarCross Ref
- [208] . 2019. An effective style token weight control technique for end-to-end emotional speech synthesis. IEEE Signal Processing Letters 26, 9 (2019), 1383–1387.Google ScholarCross Ref
- [209] . 2019. A methodology for controlling the emotional expressiveness in synthetic speech-a deep learning approach. In International Conference on Affective Computing and Intelligent Interaction Workshops and Demos. IEEE, 1–5.Google ScholarCross Ref
- [210] . 2020. Introducing prosodic speaker identity for a better expressive speech synthesis control. In 10th International Conference on Speech Prosody 2020. ISCA, 935–939.Google ScholarCross Ref
- [211] . 2018. Delete, retrieve, generate: A simple approach to sentiment and style transfer. In Annual Conference of the North American Chapter of the Association for Computational Linguistics.Google ScholarCross Ref
- [212] . 2020. Speech synthesis and control using differentiable DSP. arXiv preprint arXiv:2010.15084 (2020).Google Scholar
- [213] . 2019. Disentangled representation learning for non-parallel text style transfer. In Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 424–434.
DOI: Google ScholarCross Ref - [214] . 2021. StyleFlow: Attribute-conditioned exploration of StyleGAN-generated images using conditional continuous normalizing flows. TOG 40, 3 (2021), 1–21.Google ScholarDigital Library
- [215] . 2021. Low-rank subspaces in GANs. Conference on Neural Information Processing Systems 34 (2021).Google Scholar
- [216] . 2022. GAN inversion: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence01 (
Jun. 2022), 1–17.DOI: Google ScholarCross Ref - [217] . 2021. Controllable cardiac synthesis via disentangled anatomy arithmetic. In International Conference on Medical Image Computing and Computer Assisted Intervention. Springer, 160–170.Google ScholarDigital Library
- [218] . 2021. Changing the mind of transformers for topically-controllable language generation. In Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 2601–2611.
DOI: Google ScholarCross Ref - [219] . 2020. Interpreting the latent space of GANs for semantic face editing. In Conference on Computer Vision and Pattern Recognition. 9243–9252.Google ScholarCross Ref
- [220] . 2020. A sentiment-controllable topic-to-essay generator with topic knowledge graph. In Findings of Conference on Empirical Methods in Natural Language Processing.Google ScholarCross Ref
- [221] . 2018. Generating sentences by editing prototypes. Transactions of the Association for Computational Linguistics 6 (2018), 437–450.Google ScholarCross Ref
- [222] . 2019. FastSpeech: Fast, robust and controllable text to speech. Conference on Neural Information Processing Systems 32 (2019).Google Scholar
- [223] . 2021. Controllable text-to-speech synthesis using prosodic features and emotion soft-label. Sython.org.Google Scholar
- [224] . 2020. Controllable neural text-to-speech synthesis using intuitive prosodic features. In INTERSPEECH.Google Scholar
- [225] . 2021. EMOVIE: A Mandarin emotion speech dataset with a simple emotional text-to-speech model. In Proc. Interspeech 2021. 2766–2770.
DOI: Google ScholarCross Ref - [226] . 2020. DDSP: Differentiable digital signal processing. In International Conference on Learning Representations. https://openreview.net/forum?id=B1x1ma4tDrGoogle Scholar
- [227] . 2017. Generative models and model criticism via optimized maximum mean discrepancy. In International Conference on Learning Representations. https://openreview.net/forum?id=HJWHIKqglGoogle Scholar
- [228] . 2021. TG-GAN: Continuous-time temporal graph deep generative models with time-validity constraints. In International World Wide Web Conference. 2104–2116.Google ScholarDigital Library
- [229] . 2019. Generative modeling by estimating gradients of the data distribution. Conference on Neural Information Processing Systems 32 (2019).Google Scholar
- [230] . 2021. Molecular design in drug discovery: A comprehensive review of deep generative models. Briefings in Bioinformatics 22, 6 (2021), bbab344.Google ScholarCross Ref
- [231] . 2022. How can graph neural networks help document retrieval: A case study on cord19 with concept map generation. In European Conference on Information Retrieval. Springer, 75–83.Google ScholarDigital Library
- [232] . 2020. Probabilistic extension of precision, recall, and F1 score for more thorough evaluation of classification models. In Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems. 79–91.Google ScholarCross Ref
- [233] . 2020. Controllable meaning representation to text generation: Linearization and data augmentation strategies. In Conference on Empirical Methods in Natural Language Processing. 5160–5185.Google ScholarCross Ref
- [234] . 2017. Bias and statistical significance in evaluating speech synthesis with mean opinion scores. In Interspeech. 3976–3980.Google ScholarCross Ref
- [235] . 2022. Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598 (2022).Google Scholar
- [236] . 2020. Constrained Bayesian optimization for automatic chemical design using variational autoencoders. Chemical Science 11, 2 (2020), 577–586.Google ScholarCross Ref
- [237] . 2020. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456 (2020).Google Scholar
- [238] . 2021. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems 34 (2021), 8780–8794.Google Scholar
- [239] . 2022. Guided-TTS: A diffusion model for text-to-speech via classifier guidance. In International Conference on Machine Learning. PMLR, 11119–11133.Google Scholar
- [240] . 2022. Enhancing diffusion-based image synthesis with robust classifier guidance. arXiv preprint arXiv:2208.08664 (2022).Google Scholar
- [241] . 2020. A comprehensive survey on transfer learning. Proc. IEEE 109, 1 (2020), 43–76.Google ScholarCross Ref
- [242] . 2017. Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine 34, 6 (2017), 26–38.Google ScholarCross Ref
- [243] . 2010. Posterior regularization for structured latent variable models. Journal of Machine Learning Research 11 (2010), 2001–2049.Google ScholarDigital Library
- [244] . 2017. Optimizing distributions over molecular space. An objective-reinforced generative adversarial network for inverse-design chemistry (ORGANIC). Chemrxiv.org.Google Scholar
- [245] . 2021. UFC-BERT: Unifying multi-modal controls for conditional image synthesis. Conference on Neural Information Processing Systems 34 (2021).Google Scholar
- [246] . 1987. SMILES, a Line Notation and Computerized Interpreter for Chemical Structures. US Environmental Protection Agency, Environmental Research Laboratory.Google Scholar
- [247] . 2020. Applications of deep learning in molecule generation and molecular property prediction. Accounts of Chemical Research 54, 2 (2020), 263–270.Google ScholarCross Ref
- [248] . 2019. Deep learning for molecular design—a review of the state of the art. Molecular Systems Design & Engineering 4, 4 (2019), 828–849.Google ScholarCross Ref
- [249] . 2021. De novo molecular design and generative models. Drug Discovery Today 26, 11 (2021), 2707–2715.Google ScholarCross Ref
- [250] . GAUCHE: A library for Gaussian processes in chemistry. In International Conference on Machine Learning 2022 2nd AI for Science Workshop.Google Scholar
- [251] . 2019. Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. Journal of Medicinal Chemistry 63, 16 (2019), 8749–8760.Google ScholarCross Ref
- [252] . 2021. Out-of-the-box deep learning prediction of pharmaceutical properties by broadly learned knowledge-based molecular representations. Nature Machine Intelligence 3, 4 (2021), 334–343.Google ScholarCross Ref
- [253] . 2018. GraphVAE: Towards generation of small graphs using variational autoencoders. In International Conference on Artificial Neural Networks. Springer, 412–422.Google ScholarCross Ref
- [254] . 2018. GraphRNN: Generating realistic graphs with deep auto-regressive models. In International Conference on Machine Learning. PMLR, 5708–5717.Google Scholar
- [255] . 2018. Exploring deep recurrent models with reinforcement learning for molecule design. In (ICLR’18). Workshop track.Google Scholar
- [256] . 2020. Guiding deep molecular optimization with genetic exploration. Conference on Neural Information Processing Systems 33 (2020), 12008–12021.Google Scholar
- [257] . 2019. Optimization of molecules via deep reinforcement learning. Scientific Reports 9, 1 (2019), 1–10.Google Scholar
- [258] . 2019. Efficient multi-objective molecular optimization in a continuous latent space. Chemical Science 10, 34 (2019), 8016–8024.Google ScholarCross Ref
- [259] . 2018. Latent molecular optimization for targeted therapeutic design. arXiv preprint arXiv:1809.02032 (2018).Google Scholar
- [260] . 2021. Mimosa: Multi-constraint molecule sampling for molecule optimization. In AAAI Conference on Artificial Intelligence, Vol. 35. 125–133.Google ScholarCross Ref
- [261] . 1999. From protein structure to function. Current Opinion in Structural Biology 9, 3 (1999), 374–382.Google ScholarCross Ref
- [262] . 2021. Highly accurate protein structure prediction with AlphaFold. Nature 596, 7873 (2021), 583–589.Google ScholarCross Ref
- [263] . 2018. Computational protein design with deep learning neural networks. Scientific Reports 8, 1 (2018), 1–9.Google Scholar
- [264] . 2022. Protein design via deep learning. Briefings in Bioinformatics 23, 3 (2022), bbac102.Google ScholarCross Ref
- [265] . 2021. Multimodal image synthesis and editing: A survey. arXiv preprint arXiv:2112.13592 (2021).Google Scholar
- [266] . 2009. Text-to-Speech Synthesis. Cambridge University Press.Google ScholarCross Ref
- [267] . 2019. A review of deep learning based speech synthesis. Applied Sciences 9, 19 (2019), 4050.Google ScholarCross Ref
- [268] . 2018. Deep learning for chemical reaction prediction. Molecular Systems Design & Engineering 3, 3 (2018), 442–452.Google ScholarCross Ref
- [269] . 2021. Prediction of chemical reaction yields using deep learning. Machine Learning: Science and Technology 2, 1 (2021), 015016.Google ScholarCross Ref
- [270] . 2019. Deep learning for deep chemistry: Optimizing the prediction of chemical patterns. Frontiers in Chemistry 7 (2019), 809.Google ScholarCross Ref
Index Terms
- Controllable Data Generation by Deep Learning: A Review