Self-Supervised Contrastive Learning for Low-Resource Classification Tasks

Authors

  • Chukwuemeka Okonkwo, Fatima Bello

DOI:

Keywords:

Self-supervised learning, contrastive learning, low-resource NLP, African languages, Nigerian healthcare AI, few-shot learning, data augmentation, prototype networks

Abstract

The scarcity of annotated training data is one of the most pressing bottlenecks to deploying artificial intelligence systems in Nigeria and across sub-Saharan Africa. The challenge is particularly acute in Nigerian indigenous language processing (covering Yoruba, Igbo, Hausa, Efik, Tiv, and over 500 other languages), healthcare diagnostics at resource-limited facilities, and agricultural disease detection in smallholder farming communities. In this paper, we present SSCL-LR (Self-Supervised Contrastive Learning for Low-Resource), a unified framework that combines momentum-based contrastive representation learning with task-adaptive prototype alignment to achieve strong classification performance under severely constrained annotation budgets. Our approach introduces two technical innovations designed for the Nigerian and broader African low-resource context: (1) a Hierarchical Augmentation Strategy (HAS) that generates semantically consistent positive training pairs across linguistic, visual, and medical modalities; and (2) a Dynamic Prototype Alignment (DPA) mechanism that progressively refines class decision boundaries using a small number of labeled examples together with abundant unlabeled data. We evaluate SSCL-LR on eight benchmark datasets, including four Nigerian-specific corpora: the Hausa-NLP News Classification dataset, the Yoruba Sentiment Corpus (YorubaSenti), the Nigeria Centre for Disease Control (NCDC) clinical notes dataset, and the Cassava Leaf Disease dataset sourced from smallholder farms in Benue and Nasarawa States. Across all benchmarks, SSCL-LR achieves improvements of 4.1% to 9.3% over state-of-the-art baselines under 1-shot, 5-shot, and 10-shot settings. Our code, pre-trained models, and the newly curated NaijaLowRes benchmark suite are publicly released at https://github.com/acair-unilag/sscl-lr.
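
The abstract names two generic building blocks, momentum-based contrastive learning and prototype-based classification, without giving formulas. The sketch below illustrates the standard versions of those ideas (a MoCo-style momentum update with an InfoNCE loss, and nearest-prototype classification as in prototypical networks). It is an assumption-laden illustration, not the SSCL-LR implementation: it does not reproduce HAS or DPA, and every function name, parameter, and default value (momentum 0.999, temperature 0.07) is hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def momentum_update(key_params, query_params, m=0.999):
    """Exponential moving average of query-encoder weights into the key encoder.

    Both arguments are lists of arrays with matching shapes (hypothetical layout).
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def info_nce_loss(q, k_pos, negatives, temperature=0.07):
    """InfoNCE loss for queries q (B, D), positives k_pos (B, D), negatives (K, D)."""
    q, k_pos, negatives = l2_normalize(q), l2_normalize(k_pos), l2_normalize(negatives)
    l_pos = np.sum(q * k_pos, axis=1, keepdims=True)   # (B, 1) positive similarity
    l_neg = q @ negatives.T                            # (B, K) negative similarities
    logits = np.concatenate([l_pos, l_neg], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    # Cross-entropy with the positive always at index 0.
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())


def class_prototypes(support_emb, support_labels, n_classes):
    """Mean embedding of the labeled (support) examples of each class."""
    return np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in range(n_classes)]
    )


def predict(query_emb, prototypes):
    """Assign each query embedding to the nearest prototype (squared Euclidean)."""
    diffs = query_emb[:, None, :] - prototypes[None, :, :]
    return (diffs ** 2).sum(axis=-1).argmin(axis=1)


if __name__ == "__main__":
    # Toy 2-way, 2-shot episode with hand-made 2-D "embeddings".
    support = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
    labels = np.array([0, 0, 1, 1])
    protos = class_prototypes(support, labels, n_classes=2)
    print(predict(np.array([[1.0, 1.0], [9.0, 9.0]]), protos))
```

In a k-shot setting such as the 1-, 5-, and 10-shot evaluations mentioned above, `support_emb` would hold k embeddings per class produced by the contrastively pre-trained encoder; how SSCL-LR actually refines the prototypes with unlabeled data is not specified in this abstract.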

Published

2026-04-16

How to Cite

Chukwuemeka Okonkwo, Fatima Bello. (2024). Self-Supervised Contrastive Learning for Low-Resource Classification Tasks. International Journal of Management, Engineering and Social Sciences, 1(1), 23-28.

Issue

Section

Articles