Title: | Automatic classification of audio sequences: application to Algerian dialects | Document type: | manuscript text | Authors: | Soumia Bougrine, Author; Hadda Cherroun, Thesis supervisor | Publisher: | Laghouat: Université Amar Telidji - Département d'informatique | Year of publication: | 2019 | Extent: | 155 p. | Format: | 30 cm | Accompanying material: | 1 optical disc (CD-ROM) | Languages: | English | Categories: | THESES: 10 Computer science
| Keywords: | Algerian dialects; Deep Neural Networks; Dialect Identification; Prosody; Speech Corpus | Abstract: | Dialect IDentification (DID) is a challenging task that becomes more complicated when dealing with under-resourced and inter-country dialects. Speech databases, or corpora, are crucial for both developing and evaluating accurate DID systems. The aims of our work are, on the one hand, to build spoken Algerian Arabic corpora, given that Algerian dialects are among the most complex Arabic dialects and have received very little attention, and, on the other hand, to develop a DID system that identifies Arabic intra-country dialects. Our three contributions address the field of Algerian Arabic Natural Language Processing. The first concerns language resources: we have built two spoken Algerian Arabic corpora. The first is ALG-DARIDJAH, a parallel corpus dedicated especially to linguistic studies. It was collected using a direct recording method, which allows more control over the quality of the collected data. It is a medium-sized corpus that encompasses some sub-dialects spoken in 17 departments, with 109 speakers and more than 6K utterances. The second corpus is KALAM'DZ, a large web-based corpus dedicated to training machine learning systems. Its collection and processing recipe is generic and relies on many open source tools. The KALAM'DZ corpus encompasses the 8 major dialects, with 4881 speakers and more than 104.4 hours of speech. Our second contribution concerns corpus annotation. We investigate altruistic crowdsourcing to validate our semi-automatic dialect annotation of the KALAM'DZ corpus. From these experiments, we have derived a list of best practices for altruistic crowdsourcing in corpus annotation. Finally, we have investigated two solutions for dialect identification in an intra-country context, relying on prosodic features and recent machine learning techniques.
For these systems, we have focused on measuring the discriminative power of prosodic information and of deep learning modeling, especially the Feed-Forward Neural Network. The first system is based on flat classification, while the second uses hierarchical classification that exploits the hierarchical structure of Algerian dialects. The performance of both systems is comparable to that reported in the literature, even though the systems in the literature deal with inter-country Arabic dialects or Algerian dialect areas. The results show that prosodic features are less discriminative than acoustic ones, but that they perform better when test utterances are short, which is a requirement of DID. In addition, the ADID system based on a combination of prosodic and acoustic information has improved dialect detection. Moreover, the hierarchical system has significantly improved ADID. | Thesis note: | Doctoral thesis in computer science |