Catalogue of Works, Université de Laghouat
Author details
Available documents written by this author

Title: | Free crowdsourcing-based corpus annotation |
Document type: | manuscript text |
Authors: | Fatna Bougrine, Author; Soumia Djellikh, Author; Hadda Cherroun, Thesis supervisor |
Publisher: | Laghouat: Université Amar Telidji - Département d'informatique |
Publication year: | 2017 |
Extent: | 67 p. |
Format: | 30 cm |
Accompanying material: | 1 digital optical disc (CD-ROM) |
General note: | Option: Networks, systems and distributed applications (Réseaux, systèmes et applications réparties) |
Languages: | English |
Keywords: | Algerian dialects ANLP Corpus Annotation Crowdcrafting Crowdsourcing |
Abstract: | Large corpora are very useful for developing and validating Natural Language Processing (NLP) systems. However, these corpora are generally collected and annotated automatically. Two solutions are possible for validating such annotation: relying on experts, which can be costly and time-consuming, or using crowdsourcing. Crowdsourcing can be defined as the act of attracting many non-experts to complete a given task through a dedicated paid or unpaid platform. In this work, we aim to validate a semi-automatic dialect annotation of the Kalam’DZ corpus. Our approach relies on free crowdsourcing using the Crowdcrafting platform. The validation is performed on 10% (11 hours) of the total size of Kalam’DZ. Quality control of this validation is ensured by comparing it with expert annotation, which shows that more than 80% of annotations are similar. Our results confirm that free crowdsourcing is effective for speech dialect annotation. |
Thesis note: | Master's thesis in computer science |
Free crowdsourcing-based corpus annotation [manuscript text] / Fatna Bougrine, Author; Soumia Djellikh, Author; Hadda Cherroun, Thesis supervisor. - Laghouat: Université Amar Telidji - Département d'informatique, 2017. - 67 p.; 30 cm + 1 digital optical disc (CD-ROM).
Copies
| | | | | | Availability |
---|---|---|---|---|---|
MF 01-22 | MF 01-22 | Thesis | BIBLIOTHEQUE DE FACULTE DES SCIENCES | Theses (sci) | Available |