# churn_prediction **Repository Path**: jfdwd/churn_prediction ## Basic Information - **Project Name**: churn_prediction - **Description**: The probabilities calculated by the model BG-NBD are used to define the target variable to predict customer churn. - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2022-01-05 - **Last Updated**: 2022-01-05 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # The importance of customer churn prediction According to the Leo Vegas CEO, Gustaf Hagman (see the picture below), in 2018 the overall revenue of the company was around €360 million. Considering that online gambling is only 10% of the total gambling and that there is an increasing trend in everything going online the perspective of growing is promising. ![cassino revenue source H2CG](assets/revenues.jpg) In this growing market keeping customers satisfied is paramount so they continue using the product. In this scenario, customer churn prediction is a very reasonable alternative to approach customer retention based on its past behavior. # Task From a dataset containing daily aggregations for around 10k customers who signed up during a calendar year define the prediction target and create a predictive model that you would be comfortable sharing with a hypothetical stakeholder. # Solution After an initial exploratory data analysis and feature engineering, the probabilities generated by the model BG-NBD were used to define the target variable (churn/not churn). To predict customer churn 5 models were build, compared and evaluated. The final criteria to choose the model was a trade-off between performance and explainability. # Setup ## Setup a conda environment Create and activate a conda environment of your choice, here I call it churn_lv: ```python conda create --name churn_lv python==3.8 conda activate churn_lv ``` ## Install the packages through the command ```python pip install -r requirements.txt ``` # Execution sequence The notebooks should be executed in the following order: 1. _eda.ipynb_: - performs an exploratory data analysis 2. _feature_eng.ipynb_: - performs feature engineering 3. _bg-nbd_model.ipynb_: - implements the model BG-NBD 4. _target_definition.ipynb_: - uses the results from the model BG-NBD to define the target variable 5. _model_building.ipynb_: - build, select and evaluate the model 6. _model_building_reduced_features.ipynb_: - remove some features, build, select and evaluate the model # Directory structure or the repository ```bash . ├── assets ├── data ├── .ipynb_checkpoints ├── models ├── notebooks ├── README.md ├── references ├── reports ├── requirements.txt └── src ```