Hemant Patil

PhD (Computer Science), IIT Kharagpur

  • 079-68261650, Lab: 079-68261587
  • Office: # 4103, FB-4, DA-IICT, Gandhinagar, Gujarat, India – 382007
  • Lab: CEP 006, Speech Lab, DA-IICT, Gandhinagar, Gujarat, India – 382007
  • hemant_patil@daiict.ac.in, hemant_patil1977@yahoo.com
  • https://sites.google.com/site/hemantpatildaiict/

Biography

Hemant A. Patil received the Ph.D. degree from the Indian Institute of Technology (IIT) Kharagpur, India, in July 2006. Since 2007, he has been a faculty member at DA-IICT, Gandhinagar, India, where he developed the Speech Research Lab, which is recognized as an ISCA speech lab. Dr. Patil is a member of IEEE, the IEEE Signal Processing Society, the IEEE Circuits and Systems Society, the International Speech Communication Association (ISCA), and EURASIP, and an affiliate member of the IEEE SLTC. He is a regular reviewer for ICASSP and INTERSPEECH, and for the journals Speech Communication (Elsevier), Computer Speech and Language (Elsevier), Int. J. Speech Tech. (Springer), and Circuits, Systems and Signal Processing (Springer). He has around 226 research publications in national and international conferences, journals, and book chapters. He visited the Department of ECE, University of Minnesota, Minneapolis, USA (May-July 2009) as a short-term scholar. He has been associated (as PI) with three MeitY-sponsored projects in ASR, TTS, and QbE-STD, and was co-PI for a DST-sponsored project on India-Digital Heritage (IDH)-Hampi. His research interests include speech and speaker recognition, TTS, and infant cry analysis. He received the DST Fast Track Award for Young Scientists for infant cry analysis. He coedited a book on Forensic Speaker Recognition with Dr. Amy Neustein (EIC, IJST, Springer). Presently, he is coediting two books on speech technology for the medical domain. Dr. Patil has taken a lead role in organizing several ISCA-supported events, such as summer/winter schools and CEP workshops (on speaker and language recognition, speech source modeling, text-to-speech synthesis, the speech production-perception link, and advances in speech processing), as well as progress review meetings for two MeitY consortium projects, all at DA-IICT Gandhinagar. Dr. Patil has supervised 4 doctoral theses (including doctoral thesis supervision on spoofing attacks) and 42 M.Tech. theses. Presently, he is supervising 3 doctoral students.
Recently, he offered a joint tutorial with Prof. Haizhou Li at the Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2017 and at INTERSPEECH 2018. He will be offering a joint tutorial with Prof. H. Kawahara at APSIPA ASC 2018, Honolulu, USA, Nov. 12-15, 2018. He has been selected as an APSIPA Distinguished Lecturer (DL) for 2018-2019. He has delivered 11 APSIPA DLs in three countries: India, China, and Canada.

Specializations

Speech Signal Processing, Speech and Speaker Recognition (Voice Biometrics), Development of Countermeasures for Spoofing Attacks on Automatic Speaker Verification, Voice Conversion

Publications

  • Hardik B. Sailor and Hemant A. Patil, “Novel unsupervised auditory filterbank learning using convolutional RBM for speech recognition,” in IEEE/ACM Trans. Audio, Speech and Language Processing, vol. 24, no. 12, pp. 2341-2353, Dec. 2016.
  • H. B. Sailor and H. A. Patil, “Auditory feature representation using convolutional restricted Boltzmann machine and Teager energy operator for speech recognition,” Journal of Acoust. Soc. of America (JASA) Express Letters, vol. 141, no. 6, pp. 1–7, June 2017.
  • M. C. Madhavi and H. A. Patil, “Partial matching and search space reduction for QbE-STD,” in Computer Speech & Language, Elsevier, vol. 45, pp. 58-82, Sept. 2017.
  • H. A. Patil and M. C. Madhavi, “Combining evidences from magnitude and phase information using VTEO for person recognition using humming,” in special issue on Recent advances in speaker and language recognition and characterization, Computer Speech and Language, Elsevier, In Press, Sept. 2017.
  • Tanvina B. Patel and Hemant A. Patil, “Cochlear filter and instantaneous frequency based features for spoofed speech detection,” in IEEE Journal of Selected Topics in Signal Processing (JSTSP), Special Issue on Spoofing and Countermeasures for Automatic Speaker Verification, vol. 11, no. 4, pp. 618-631, June 2017.

Teaching

  • Signals and Systems (BTech Sem III Core Course)
  • Speech Technology (BTech Sem VI elective and Open to MTech and PhD)
  • Speech Communication
  • Advanced Digital Signal Processing