Cipher text only attack on speech time scrambling systems using correction of audio spectrogram



1 Department of Communicative Sciences and Disorders, Michigan State University, Michigan, US

2 Department of Electrical and Computer Engineering, Tabriz University, Tabriz, Iran

3 Department of Electrical Engineering, Khajeh Nasir Toosi University of Technology, Tehran, Iran


Permutation-based multimedia ciphers were recently broken in a chosen-plaintext scenario. That attack assumes a very resourceful adversary, an assumption that does not always hold. To further demonstrate the insecurity of these ciphers, we present a ciphertext-only attack on speech permutation ciphers. We show that the inherent redundancies of speech pave the way for a successful ciphertext-only attack. To that end, regularities of speech are extracted in time and frequency using the short-time Fourier transform. We show that spectrograms of ciphertexts are in fact scrambled puzzles. Techniques from estimation, image processing, and graph theory are then combined to create and solve these puzzles. Tests show that the proposed method achieves an accuracy of 87.8% and an intelligibility of 92.9%. These scores are 50.9% and 34.6% higher, respectively, than those of the previous method. Finally, a novel method based on moving spectrogram distance is proposed that accurately estimates the segment length of the scrambler system.
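The puzzle view of a scrambled spectrogram can be illustrated with a minimal sketch. It is not the method of this paper: it only shows, under simplifying assumptions, why the short-time spectra on either side of a segment boundary carry information about the original order. A synthetic chirp stands in for speech, the segment length is assumed known, and all function names (`scramble`, `descramble`, `segment_spectrogram`, `boundary_costs`, `demo`) are hypothetical.

```python
import numpy as np

def scramble(signal, seg_len, rng):
    """Toy time scrambler: permute fixed-length segments of the signal."""
    n_seg = len(signal) // seg_len
    segments = signal[: n_seg * seg_len].reshape(n_seg, seg_len)
    perm = rng.permutation(n_seg)
    # Cipher segment i holds original segment perm[i].
    return segments[perm].reshape(-1), perm

def descramble(cipher, seg_len, perm):
    """Invert scramble() when the permutation is known."""
    segments = cipher.reshape(-1, seg_len)
    plain = np.empty_like(segments)
    plain[perm] = segments
    return plain.reshape(-1)

def segment_spectrogram(segment, frame_len=256):
    """Magnitude STFT of one segment (Hann window, non-overlapping frames)."""
    n_frames = len(segment) // frame_len
    frames = segment[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))

def boundary_costs(cipher, seg_len, frame_len=256):
    """cost[i, j]: spectral mismatch if cipher segment j follows segment i."""
    segments = cipher.reshape(-1, seg_len)
    specs = [segment_spectrogram(s, frame_len) for s in segments]
    last = np.stack([s[-1] for s in specs])   # final frame of each segment
    first = np.stack([s[0] for s in specs])   # initial frame of each segment
    cost = np.linalg.norm(last[:, None, :] - first[None, :, :], axis=2)
    np.fill_diagonal(cost, np.inf)            # a segment cannot follow itself
    return cost

def demo():
    # Assumed toy parameters: 8 kHz sampling, 2 s chirp, 0.2 s segments.
    fs, duration, seg_len = 8000, 2.0, 1600
    t = np.arange(int(fs * duration)) / fs
    f0, f1 = 300.0, 2000.0
    plain = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) * t**2 / duration))
    rng = np.random.default_rng(0)
    cipher, perm = scramble(plain, seg_len, rng)
    cost = boundary_costs(cipher, seg_len)
    # For each cipher segment, the original index of its best-matching successor.
    guessed_next = perm[np.argmin(cost, axis=1)]
    return plain, cipher, perm, guessed_next
```

Calling `demo()` returns the plaintext, the scrambled signal, the permutation, and the guessed successor of each segment. Real speech is far less regular than a chirp, so a greedy nearest-neighbour choice like this one is only a starting point; the pairwise costs are the kind of input a puzzle solver would work from.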

