Random Forest with R-Programming

The following packages are required for the random forest:

    if(!require(tidyverse)){install.packages("tidyverse");library(tidyverse)}
    if(!require(janitor)){install.packages("janitor");library(janitor)} # for `clean_names` (renaming columns)
    if(!require(randomForest)){install.packages("randomForest");library(randomForest)}
    if(!require(caret)){install.packages("caret");library(caret)} # for `confusionMatrix`

A random forest is made of many randomized decision trees.

    Data <- read_csv(file = here::here("content/post/2022-06-26-random-forest", "german_credit.csv"))

Exploring the dataset

    Data <- clean_names(Data)
    Data$creditability <- as.factor(Data$creditability)
    glimpse(Data)
    ## Rows: 1,000
    ## Columns: 21
    ## $ creditability <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
    ## $ account_balance <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 4, 2, 1, 1, …
    ## $ duration_of_credit_month <dbl> 18, 9, 12, 12, 12, 10, 8, 6, 18, 24,…
    ## $ payment_status_of_previous_credit <dbl> 4, 4, 2, 4, 4, 4, 4, 4, 4, 2, 4, 4, …
    ## $ purpose <dbl> 2, 0, 9, 0, 0, 0, 0, 0, 3, 3, 0, 1, …
    ## $ credit_amount <dbl> 1049, 2799, 841, 2122, 2171, 2241, 3…
    ## $ value_savings_stocks <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 1, 3, 1, 2, …
    ## $ length_of_current_employment <dbl> 2, 3, 4, 3, 3, 2, 4, 2, 1, 1, 3, 4, …
    ## $ instalment_per_cent <dbl> 4, 2, 2, 3, 4, 1, 1, 2, 4, 1, 2, 1, …
    ## $ sex_marital_status <dbl> 2, 3, 2, 3, 3, 3, 3, 3, 2, 2, 3, 4, …
    ## $ guarantors <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
    ## $ duration_in_current_address <dbl> 4, 2, 4, 2, 4, 3, 4, 4, 4, 4, 2, 4, …
    ## $ most_valuable_available_asset <dbl> 2, 1, 1, 1, 2, 1, 1, 1, 3, 4, 1, 3, …
    ## $ age_years <dbl> 21, 36, 23, 39, 38, 48, 39, 40, 65, …
    ## $ concurrent_credits <dbl> 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, …
    ## $ type_of_apartment <dbl> 1, 1, 1, 1, 2, 1, 2, 2, 2, 1, 1, 1, …
    ## $ no_of_credits_at_this_bank <dbl> 1, 2, 1, 2, 2, 2, 2, 1, 2, 1, 2, 2, …
    ## $ occupation <dbl> 3, 3, 2, 2, 2, 2, 2, 2, 1, 1, 3, 3, …
    ## $ no_of_dependents <dbl> 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, …
    ## $ telephone <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
    ## $ foreign_worker <dbl> 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, …

Assess the creditability with the help of other variables

    # code -------------------------------------------------------------------------
    ggplot(data = Data, aes(x = age_years, color = creditability, fill = creditability)) +
      geom_histogram(binwidth = 5, position = "identity", alpha = 0....
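To make the idea that a forest is built from many trees concrete, here is a minimal sketch of how the individual tree votes combine into one prediction. It is not the post's own analysis: it assumes the built-in `iris` data instead of `german_credit.csv`, and the seed, the 70/30 split, and `ntree = 100` are illustrative choices.

```r
# Sketch only: uses the built-in iris data, not the german_credit.csv file.
library(randomForest)

set.seed(42)  # arbitrary seed so the split and the forest are reproducible

# Hold out 30% of the rows for testing (illustrative split)
n <- nrow(iris)
test_idx <- sample(n, size = round(0.3 * n))
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Grow a forest of 100 classification trees (ntree is an assumed value)
fit <- randomForest(Species ~ ., data = train, ntree = 100)

# predict.all = TRUE keeps every tree's prediction next to the aggregated vote
pred <- predict(fit, newdata = test, predict.all = TRUE)

# One row per test observation, one column per tree
dim(pred$individual)

# Tally the votes of all trees for the first test row
votes_row1 <- table(pred$individual[1, ])
print(votes_row1)

# The forest's answer for that row is the class with the most votes
print(as.character(pred$aggregate[1]))

# Cross-tabulate predicted against actual classes on the held-out rows
print(table(predicted = pred$aggregate, actual = test$Species))

# Share of held-out rows classified correctly
accuracy <- mean(pred$aggregate == test$Species)
print(accuracy)
```

The same pattern should carry over to the credit data by swapping in `creditability ~ .` and the cleaned `Data` frame, and `caret::confusionMatrix` could replace the plain `table()` call for a fuller summary.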

June 26, 2022 · 6 min · Dr. Ankit Deshmukh