Automated Humanitarian Data Classification
Year: 2021
Section: Natural Language Processing
Status: Complete
Organization: Humanitarian Data Exchange
NLP classification pipeline to auto-tag humanitarian datasets and reduce manual triage in emergency data workflows.
Outcome: Reached 91.8% average classification accuracy across key humanitarian data categories used in rapid assessments.
Methods: Python, FastText embeddings, Multi-layer perceptron, Supervised classification
Problem
Large volumes of crisis data arrive with inconsistent metadata, making category assignment and prioritization slow for response teams.
Work
I built a labeled training dataset from resources scraped from HDX, represented each resource's text as FastText embeddings, and trained a multi-layer perceptron to predict its category label from the content.
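The embed-then-classify idea can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the project's actual code. A randomly initialized, hashed character n-gram table stands in for pretrained FastText vectors. The example titles, category names, and hyperparameters (DIM, BUCKETS, HIDDEN, epochs, learning rate) are all hypothetical.

```python
import math
import random
import zlib

DIM, BUCKETS, HIDDEN = 16, 2048, 8  # illustrative sizes, not the project's
CATEGORIES = ["health", "food_security", "displacement"]  # hypothetical labels
EXAMPLES = [  # hypothetical dataset titles, not real HDX records
    ("cholera outbreak case counts by district", 0),
    ("health facility locations and clinic capacity", 0),
    ("food security phase classification", 1),
    ("market food prices for staple crops", 1),
    ("internally displaced persons site assessment", 2),
    ("refugee and displacement population movement tracking", 2),
]

def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of a word wrapped in FastText-style boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(w) - n + 1)]

def make_table(rng):
    """Random subword vectors; a stand-in for pretrained FastText embeddings."""
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(BUCKETS)]

def embed(text, table):
    """Average the hashed n-gram vectors of every word in the text."""
    grams = [g for word in text.lower().split() for g in char_ngrams(word)]
    vec = [0.0] * DIM
    for g in grams:
        row = table[zlib.crc32(g.encode("utf-8")) % BUCKETS]  # stable hash
        for j in range(DIM):
            vec[j] += row[j]
    return [v / max(len(grams), 1) for v in vec]

class MLP:
    """One hidden ReLU layer with a softmax output, trained by plain SGD."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        m = max(z)  # subtract the max for a numerically stable softmax
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        return h, [v / s for v in e]

    def train_step(self, x, y, lr):
        h, p = self.forward(x)
        dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
        # Backpropagate to the hidden layer before the output weights change.
        dh = [sum(dz[k] * self.w2[k][j] for k in range(len(dz))) if h[j] > 0 else 0.0
              for j in range(len(h))]
        for k, g in enumerate(dz):
            for j in range(len(h)):
                self.w2[k][j] -= lr * g * h[j]
            self.b2[k] -= lr * g
        for j, g in enumerate(dh):
            for i in range(len(x)):
                self.w1[j][i] -= lr * g * x[i]
            self.b1[j] -= lr * g

def mean_loss(model, table, examples):
    """Mean cross-entropy of the model over (text, label) pairs."""
    total = 0.0
    for text, y in examples:
        _, p = model.forward(embed(text, table))
        total -= math.log(p[y])
    return total / len(examples)

def train(examples, epochs=300, lr=0.5, seed=0):
    """Fit the MLP and return it with the table and the loss before and after."""
    rng = random.Random(seed)
    table = make_table(rng)
    model = MLP(DIM, HIDDEN, len(CATEGORIES), rng)
    data = [(embed(text, table), y) for text, y in examples]
    before = mean_loss(model, table, examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            model.train_step(x, y, lr)
    return model, table, before, mean_loss(model, table, examples)

def classify(text, model, table):
    """Return the most probable category name for a piece of text."""
    _, p = model.forward(embed(text, table))
    return CATEGORIES[p.index(max(p))]
```

Calling `model, table, before, after = train(EXAMPLES)` and then `classify("refugee camp population", model, table)` returns one of the category names. In a real pipeline the random table would be replaced by pretrained embeddings, and accuracy would be measured on held-out data rather than on the training titles.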
Delivery
At 91.8% average accuracy, the resulting workflow showed that automated category tagging is a viable way to reduce manual triage in humanitarian information management pipelines.