Catalogue of works, Université de Laghouat
Author details
Author: Soumia Bougrine
Available documents written by this author



Automatic classification of audio sequences / Soumia Bougrine
Title: Automatic classification of audio sequences : application to algerian dialects
Document type: manuscript text
Authors: Soumia Bougrine, author; Hadda Cherroun, thesis supervisor
Publisher: Laghouat: Université Amar Telidji - Département d'informatique
Year of publication: 2019
Extent: 155 p.
Format: 30 cm
Accompanying material: 1 optical disc (CD-ROM)
Language: English
Categories: THESES: 10 Computer science
Keywords: Algerian dialects; Deep Neural Networks; Dialect Identification; Prosody; Speech Corpus
Abstract: Dialect IDentification (DID) is a challenging task that becomes more complicated when dealing with under-resourced and inter-country dialects. Speech databases, or corpora, are crucial for both developing and evaluating accurate DID systems. The aims of our work are, on the one hand, to build spoken Algerian Arabic corpora, knowing that Algerian dialects are among the most complex Arabic dialects and that their study has received very little attention, and, on the other hand, to develop a DID system to identify Arabic intra-country dialects. Our three contributions are dedicated to the field of Algerian Arabic Natural Language Processing. First, concerning language resources, we have built two spoken Algerian Arabic corpora. The first is a parallel corpus, ALG-DARIDJAH, dedicated especially to linguistic studies. It was collected using a direct recording method, which allows more control over the quality of the collected data. It is a medium-sized corpus that encompasses some sub-dialects spoken in 17 departments, with 109 speakers and more than 6K utterances. The second corpus is KALAM'DZ, a large web-based corpus dedicated to training machine learning systems. Its collection and processing recipe is generic and relies on many open source tools. The KALAM'DZ corpus encompasses the 8 major dialects, with 4881 speakers and more than 104.4 hours. Our second contribution concerns corpus annotation. We investigate altruistic crowdsourcing to validate our semi-automatic dialect annotation of the KALAM'DZ corpus. From these experiments, we have determined a list of best practices for altruistic crowdsourcing for corpus annotation. Finally, we have investigated two solutions [...] country context, relying on prosodic features and new machine learning techniques. For these systems, we have focused on measuring the discriminative power of prosodic information and of deep learning modeling, especially Feed-Forward Neural Networks. The first system is based on flat classification, while the second uses hierarchical classification that exploits the hierarchical structure of Algerian dialects. The performance of both systems is comparable to that reported in the literature, even though the latter deals with inter-country Arabic dialects or Algerian dialect areas. The results show that prosodic features are less discriminative than acoustic ones, but that they perform better when the test utterances are short, which is a DID requirement. Moreover, the ADID system based on a combination of prosodic and acoustic information has improved dialect detection. However, the hierarchical system has significantly improved the ADID.
Thesis note: Doctoral thesis in computer science
Copies
Barcode | Call number | Medium | Location | Section | Availability
Thd 10-38 | Thd 10-38 | Thesis | BIBLIOTHEQUE DE FACULTE DES SCIENCES | theses (sci) | Available
thed 10-12 | thed 10-12 | Thesis | SALLE DES THESES bibliothèque centrale | theses in computer science | Available

Automatic classification of audio sequences / Soumia Bougrine
Title: Automatic classification of audio sequences : application to algerian dialects
Original title: Classification automatique des séquences Audio : Application aux dialectes algériennes
Document type: multimedia document
Authors: Soumia Bougrine, author; Hadda Cherroun, research supervisor
Publisher: Laghouat: Université Amar Telidji - Département d'informatique
Year of publication: 2019
Extent: 155 p.
Format: 30 cm
ISBN/ISSN/EAN: 978-2-35113-065-0
General note: Thd 10-38: BIBLIOTHEQUE DE FACULTE DES SCIENCES; hed 10-12: bibliothèque centrale
Language: English
Categories: THESES: 20 English language and literature
Keywords: Algerian dialects; Deep Neural Networks; Dialect Identification; Prosody; Speech Corpus
Abstract: Dialect IDentification (DID) is a challenging task that becomes more complicated when dealing with under-resourced and inter-country dialects. Speech databases, or corpora, are crucial for both developing and evaluating accurate DID systems. The aims of our work are, on the one hand, to build spoken Algerian Arabic corpora, knowing that Algerian dialects are among the most complex Arabic dialects and that their study has received very little attention, and, on the other hand, to develop a DID system to identify Arabic intra-country dialects. Our three contributions are dedicated to the field of Algerian Arabic Natural Language Processing. First, concerning language resources, we have built two spoken Algerian Arabic corpora. The first is a parallel corpus, ALG-DARIDJAH, dedicated especially to linguistic studies. It was collected using a direct recording method, which allows more control over the quality of the collected data. It is a medium-sized corpus that encompasses some sub-dialects spoken in 17 departments, with 109 speakers and more than 6K utterances. The second corpus is KALAM'DZ, a large web-based corpus dedicated to training machine learning systems. Its collection and processing recipe is generic and relies on many open source tools. The KALAM'DZ corpus encompasses the 8 major dialects, with 4881 speakers and more than 104.4 hours. Our second contribution concerns corpus annotation. We investigate altruistic crowdsourcing to validate our semi-automatic dialect annotation of the KALAM'DZ corpus. From these experiments, we have determined a list of best practices for altruistic crowdsourcing for corpus annotation. Finally, we have investigated two solutions [...] country context, relying on prosodic features and new machine learning techniques. For these systems, we have focused on measuring the discriminative power of prosodic information and of deep learning modeling, especially Feed-Forward Neural Networks. The first system is based on flat classification, while the second uses hierarchical classification that exploits the hierarchical structure of Algerian dialects. The performance of both systems is comparable to that reported in the literature, even though the latter deals with inter-country Arabic dialects or Algerian dialect areas. The results show that prosodic features are less discriminative than acoustic ones, but that they perform better when the test utterances are short, which is a DID requirement. Moreover, the ADID system based on a combination of prosodic and acoustic information has improved dialect detection. However, the hierarchical system has significantly improved the ADID.
Thesis note: Doctoral thesis in computer science
Copies
Barcode | Call number | Medium | Location | Section | Availability
Thlg 20.514 | Thlg 20.514 | Thesis | BIBLIOTHEQUE DES LITTERATURES ET LANGUES | English literature and language (bll) | Available

Étude et simulation d’un algorithme de point de reprise avec plusieurs initiateurs / Soumia Bougrine
Title: Étude et simulation d’un algorithme de point de reprise avec plusieurs initiateurs
Document type: multimedia document
Authors: Soumia Bougrine, author; Fatima Zahra Abdelhafidi, thesis supervisor
Publisher: Laghouat: Université Amar Telidji - Département d'informatique
Year of publication: 2012
Extent: 54 p.
Accompanying material: 1 optical disc (CD-ROM)
General note: Specialization: Networks, systems and distributed applications
Language: French
Keywords: Distributed systems; NS2; Checkpointing with multiple initiators
Abstract: Three classes of protocols for computing global checkpoints have been proposed in the literature. The first is the coordinated class, the second is the uncoordinated class, and the last is the CIC (communication-induced checkpointing) class. This thesis focuses on the coordinated class with multiple initiators. To this end, two solutions are proposed that compute a consistent global checkpoint while minimizing the number of messages used. A performance evaluation is carried out by simulation with NS2. The results show that the second solution outperforms the first.
Thesis note: Master's thesis in computer science