[1] Sachin Dhawan and Rashmi Gupta. Analysis of various data security techniques of steganography: A survey. Information Security Journal: A Global Perspective, 30(2):63–87, 2021.
[2] Pratap Chandra Mandal, Imon Mukherjee, Goutam Paul, and B.N. Chatterji. Digital image steganography: A literature survey. Information Sciences, 609:1451–1488, 2022.
[3] Hamza Kheddar, Mustapha Hemis, Yassine Himeur, David Megías, and Abbes Amira. Deep learning for steganalysis of diverse data types: A review of methods, taxonomy, challenges and future directions. Neurocomputing, 581:127528, 2024.
[4] Ahmed A. AlSabhany, Ahmed Hussain Ali, Farida Ridzuan, A.H. Azni, and Mohd Rosmadi Mokhtar. Digital audio steganography: Systematic review, classification, and analysis of the current state of the art. Computer Science Review, 38:100316, 2020.
[5] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
[6] Zaher Hamid Al-Tairi, Rahmita Wirza Rahmat, M Iqbal Saripan, and Puteri Suhaiza Sulaiman. Skin segmentation using YUV and RGB color spaces. Journal of information processing systems, 10(2):283–299, 2014.
[7] Hamzeh Ghasemzadeh and Meisam Khalil Arjmandi. Universal audio steganalysis based on calibration and reversed frequency resolution of human auditory system. IET Signal Processing, 11(8):916–922, 2017.
[8] Yuzhen Lin, Rangding Wang, Diqun Yan, Li Dong, and Xueyuan Zhang. Audio steganalysis with improved convolutional neural network. In Proceedings of the ACM workshop on information hiding and multimedia security, pages 210–215, 2019.
[9] Zhong-Liang Yang, Xiao-Qing Guo, Zi-Ming Chen, Yong-Feng Huang, and Yu-Jin Zhang. RNN-Stega: Linguistic steganography based on recurrent neural networks. IEEE Transactions on Information Forensics and Security, 14(5):1280–1295, 2018.
[10] Lang Chen, Rangding Wang, Li Dong, and Diqun Yan. Imperceptible adversarial audio steganography based on psychoacoustic model. Multimedia Tools and Applications, 82(17):26451–26463, 2023.
[11] Yanzhen Ren, Dengkai Liu, Chenyu Liu, Qiaochu Xiong, Jianming Fu, and Lina Wang. A universal audio steganalysis scheme based on multiscale spectrograms and DeepResNet. IEEE Transactions on Dependable and Secure Computing, 20(1):665–679, 2022.
[12] Wenxue Cui, Shaohui Liu, Feng Jiang, Yongliang Liu, and Debin Zhao. Multi-stage residual hiding for image-into-audio steganography. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2832–2836. IEEE, 2020.
[13] Ru Zhang, Hao Dong, Zhen Yang, Wenbo Ying, and Jianyi Liu. A CNN based visual audio steganography model. In International Conference on Adaptive and Intelligent Systems, pages 431–442. Springer, 2022.
[14] Shivam Agarwal and Siddarth Venkatraman. Deep residual neural networks for image in speech steganography. arXiv preprint arXiv:2003.13217, 2020.
[15] Subhajit Paul and Deepak Mishra. Hiding images within audio using deep generative model. Multimedia Tools and Applications, 82(4):5049–5072, 2023.
[16] Margarita Geleta, Cristina Punti, Kevin McGuinness, Jordi Pons, Cristian Canton, and Xavier Giro-i-Nieto. PixInWav: Residual steganography for hiding pixels in audio. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2485–2489. IEEE, 2022.
[17] Sercan Ö Arık, Heewoo Jun, and Gregory Diamos. Fast spectrogram inversion using multihead convolutional neural networks. IEEE Signal Processing Letters, 26(1):94–98, 2018.
[18] Marco Cuturi and Mathieu Blondel. Soft-DTW: a differentiable loss function for time-series. In International conference on machine learning, pages 894–903. PMLR, 2017.
[19] Keith Ito and Linda Johnson. The LJ Speech dataset. https://keithito.com/LJ-Speech-Dataset/, 2017.
[20] John Garofolo, Lori Lamel, William Fisher, Jonathan Fiscus, David Pallett, Nancy Dahlgren, and Victor Zue. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 1993.
[21] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html.
[22] Anish Shah, Eashan Kadam, Hena Shah, Sameer Shinde, and Sandip Shingade. Deep residual networks with exponential linear unit. In Proceedings of the third international symposium on computer vision and the internet, pages 59–65, 2016.
[23] Chunling Han, Rui Xue, Rui Zhang, and Xueqing Wang. A new audio steganalysis method based on linear prediction. Multimedia Tools and Applications, 77:15431–15455, 2018.