Brain Drain Reasons Analysis: Text Mining in Sociology

This project studies brain drain in Tunisia. A survey was conducted among Tunisian people to collect answers to several questions on the topic. As a team, we focus our analysis on two of the questions that were answered:

✦ What are the reasons that would push you to leave Tunisia?
✦ Socially, what’s the difference between Tunisia and abroad in social life?

The text mining pipeline of the project is as follows:

  1. Read questionnaire data from a csv file
  2. Initialize raw corpus
  3. Text cleaning
  4. Text representation
  5. Topic modeling
  6. Clustering
  7. Predictive modeling
  8. Visualization and interpretation of final results
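
As an illustration only, steps 1 to 4 of the pipeline above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not the project's actual code: the sample answers, the column name `reasons_to_leave`, the stop-word list, and the helper names `read_answers`, `clean`, and `bag_of_words` are all hypothetical, and the real survey data may differ in language and layout. Steps 5 to 8 are left out.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample standing in for the questionnaire CSV; the column
# name "reasons_to_leave" is an assumption, not the survey's real header.
SAMPLE_CSV = """respondent_id,reasons_to_leave
1,"Low salaries and few job opportunities!"
2,"Better research funding abroad; low salaries."
3,""
"""

# Tiny illustrative stop-word list (assumed; a real project would use a fuller one).
STOP_WORDS = {"and", "the", "a", "of", "few", "to"}


def read_answers(csv_file, column):
    """Steps 1-2: read the CSV and keep non-empty answers as the raw corpus."""
    reader = csv.DictReader(csv_file)
    return [row[column] for row in reader if row[column].strip()]


def clean(text):
    """Step 3: lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(corpus_tokens):
    """Step 4: build a sorted vocabulary and one term-count vector per answer."""
    vocabulary = sorted({t for doc in corpus_tokens for t in doc})
    vectors = []
    for doc in corpus_tokens:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


raw_corpus = read_answers(io.StringIO(SAMPLE_CSV), "reasons_to_leave")
cleaned = [clean(answer) for answer in raw_corpus]
vocabulary, vectors = bag_of_words(cleaned)
```

Each row of `vectors` is a term-count representation of one answer, which is the kind of input that topic modeling and clustering steps typically consume. A simple bag-of-words is used here only because it needs no external library; the project may rely on a different representation.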

The project is implemented in Python.

Mouna Belaid
Data Consultant | R-Ladies Global Team Member | Community Lead of R-Ladies Paris