# Exploring Loan Data from Prosper

## I. Dataset:

Based on Prosper website: Prosper is America’s first marketplace lending platform, with over $7 billion in funded loans. Prosper allows people to invest in each other in a way that is financially and socially rewarding. On Prosper, borrowers list loan requests between$2,000 and $35,000 and individual investors invest as little as$25 in each loan listing they select. Prosper handles the servicing of the loan on behalf of the matched borrowers and investors.

Data Set contains information about Prosper loans:

> loan %>% dim
[1] 113937     82
• There are 113937 different loans which are categorized with 82 different variables.
> loan %>% str
'data.frame':   113937 obs. of  82 variables:
$ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...$ ListingNumber                      : int  193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
$ListingCreationDate : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...$ CreditGrade                        : Ord.factor w/ 9 levels ""<"NC"<"HR"<"E"<..: 6 1 3 1 1 1 1 1 1 1 ...
$Term : int 36 36 36 36 36 60 36 36 36 36 ...$ LoanStatus                         : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
$ClosedDate : Factor w/ 2803 levels "","2005-11-25 00:00:00",..: 1138 1 1263 1 1 1 1 1 1 1 ...$ BorrowerAPR                        : num  0.165 0.12 0.283 0.125 0.246 ...
$BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...$ LenderYield                        : num  0.138 0.082 0.24 0.0874 0.1985 ...
$EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...$ EstimatedLoss                      : num  NA 0.0249 NA 0.0249 0.0925 ...
$EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...$ ProsperRating..numeric.            : int  NA 6 NA 6 3 5 2 4 7 7 ...
$ProsperRating..Alpha. : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...$ ProsperScore                       : num  NA 7 NA 9 4 10 2 4 9 11 ...
$ListingCategory..numeric. : int 0 2 0 16 2 1 1 2 7 7 ...$ BorrowerState                      : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
$Occupation : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...$ EmploymentStatus                   : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
$EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...$ IsBorrowerHomeowner                : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
$CurrentlyInGroup : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...$ GroupKey                           : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
$DateCreditPulled : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...$ CreditScoreRangeLower              : int  640 680 480 800 680 740 680 700 820 820 ...
$CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...$ FirstRecordedCreditLine            : Factor w/ 11586 levels "","1947-08-24 00:00:00",..: 8639 6617 8927 2247 9498 497 8265 7685 5543 5543 ...
$CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...$ OpenCreditLines                    : int  4 14 NA 5 19 17 7 6 16 16 ...
$TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...$ OpenRevolvingAccounts              : int  1 13 0 7 6 13 6 5 12 12 ...
$OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...$ InquiriesLast6Months               : int  3 3 0 0 1 0 0 3 1 1 ...
$TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...$ CurrentDelinquencies               : int  2 0 1 4 0 0 0 0 0 0 ...
$AmountDelinquent : num 472 0 NA 10056 0 ...$ DelinquenciesLast7Years            : int  4 0 0 14 0 0 0 0 0 0 ...
$PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...$ PublicRecordsLast12Months          : int  0 0 NA 0 0 0 0 0 0 0 ...
$RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...$ BankcardUtilization                : num  0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
$AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...$ TotalTrades                        : num  11 29 NA 26 39 47 16 10 29 29 ...
$TradesNeverDelinquent..percentage. : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...$ TradesOpenedLast6Months            : num  0 2 NA 0 2 0 0 0 1 1 ...
$DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...$ IncomeRange                        : Ord.factor w/ 8 levels "Not displayed"<..: 5 6 1 5 8 8 5 5 5 5 ...
$IncomeVerifiable : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...$ StatedMonthlyIncome                : num  3083 6125 2083 2875 9583 ...
$LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...$ TotalProsperLoans                  : int  NA NA NA NA 1 NA NA NA NA NA ...
$TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...$ OnTimeProsperPayments              : int  NA NA NA NA 11 NA NA NA NA NA ...
$ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...$ ProsperPaymentsOneMonthPlusLate    : int  NA NA NA NA 0 NA NA NA NA NA ...
$ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...$ ProsperPrincipalOutstanding        : num  NA NA NA NA 9948 ...
$ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...$ LoanCurrentDaysDelinquent          : int  0 0 0 0 0 0 0 0 0 0 ...
$LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...$ LoanMonthsSinceOrigination         : int  78 0 86 16 6 3 11 10 3 3 ...
$LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...$ LoanOriginalAmount                 : int  9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
$LoanOriginationDate : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...$ LoanOriginationQuarter             : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
$MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...$ MonthlyLoanPayment                 : num  330 319 123 321 564 ...
$LP_CustomerPayments : num 11396 0 4187 5143 2820 ...$ LP_CustomerPrincipalPayments       : num  9425 0 3001 4091 1563 ...
$LP_InterestandFees : num 1971 0 1186 1052 1257 ...$ LP_ServiceFees                     : num  -133.2 0 -24.2 -108 -60.3 ...
$LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...$ LP_GrossPrincipalLoss              : num  0 0 0 0 0 0 0 0 0 0 ...
$LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...$ LP_NonPrincipalRecoverypayments    : num  0 0 0 0 0 0 0 0 0 0 ...
$PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...$ Recommendations                    : int  0 0 0 0 0 0 0 0 0 0 ...
$InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...$ InvestmentFromFriendsAmount        : num  0 0 0 0 0 0 0 0 0 0 ...
$Investors : int 258 1 41 158 20 1 1 1 1 1 ...$ ListingCategory.factor             : Ord.factor w/ 21 levels "Not Available"<..: 1 3 1 17 3 2 2 3 8 8 ...

## II-A. Univariate Plot Session:

### 1. Original Loan:

• LoanOriginalAmount

[Figure 1]

> loan $LoanOriginalAmount %>% summary Min. 1st Qu. Median Mean 3rd Qu. Max. 1000 4000 6500 8337 12000 35000 • Loan Origination Quarter [Figure 2] • Loan Term The length of the loan expressed in months. [Figure 3] • Loan Status: The current status of the loan: Cancelled, Chargedoff, Completed, Current, Defaulted, FinalPaymentInProgress, PastDue. The PastDue status will be accompanied by a delinquency bucket. [Figure 4] ### 2. Interest Rate: [Figure 5] > loan[,c("LenderYield", "EstimatedEffectiveYield", "BorrowerAPR","BorrowerRate")] %>% summary LenderYield EstimatedEffectiveYield BorrowerAPR BorrowerRate Min. :-0.0100 Min. :-0.183 Min. :0.00653 Min. :0.0000 1st Qu.: 0.1242 1st Qu.: 0.116 1st Qu.:0.15629 1st Qu.:0.1340 Median : 0.1730 Median : 0.162 Median :0.20976 Median :0.1840 Mean : 0.1827 Mean : 0.169 Mean :0.21883 Mean :0.1928 3rd Qu.: 0.2400 3rd Qu.: 0.224 3rd Qu.:0.28381 3rd Qu.:0.2500 Max. : 0.4925 Max. : 0.320 Max. :0.51229 Max. :0.4975 NA's :29084 NA's :25  ### 3. Loan Assessment By Credit & Rating: • Credit Grade: The Credit rating that was assigned at the time the listing went live. Applicable for listings pre-2009 period and will only be populated for those listings. [Figure 6] To generate a more meaningful chart, I have to reorganize the levels of CreditGrade in terms of credit risk rating. [Figure 7] • Prosper Rating: The Prosper Rating assigned at the time the listing was created: 0 - N/A, 1 - HR, 2 - E, 3 - D, 4 - C, 5 - B, 6 - A, 7 - AA. Applicable for loans originated after July 2009. > loan$ProsperRating..Alpha. %>% table
.
A    AA     B     C     D     E    HR
29084 14551  5372 15581 18345 14274  9795  6935

[Figure 8]

### 4. Loan Assessment By Income & Asset:

• Income Factor:

[Figure 9]

• Homeowner Factor:

[Figure 10]

### 5. Other Information related to Loan:

[Figure 11]

[Figure 12]

[Figure 13]

Notice that:

• Most loans are made from people in big states.
• Prosper need to improve about the quality of the Occupation data because two highgest percentage of loan makers are listed as Other and Professional. And we cannot know what they exactly do for a living.
• It seems that most of the borrowers have problem with their debts since the purpose of 51.2% of borrowers is debt consolidation.

## II-B. Univariate Analysis

### What is the structure of your dataset?

The dataset contains 113937 observations. Each observation is described by 82 variables.

I choose with 17 features out of 81, and classify them into 4 groups:

• Original Loan: LoanOriginalAmount,LoanOriginationQuarter, Term, LoanStatus
• Interest Rate: LenderYield, EstimatedEffectiveYield, BorrowerAPR, BorrowerRate
• Rating or Loan Assessment: Credit Grade, Prosper Rating, EmploymentStatus, IncomeRange, IsBorrowerHomeowner.
• Supporting Variables: BorrowerState, Occupation, ListingCategory, EstimatedReturn.

### What is/are the main feature(s) of interest in your dataset?

My main interest in this dataset is LenderYield which is the lender yield on the loan because I want to analyze this dataset at a glance of an investor. So there are some questions that I try to answer:

• Which are factors that influence or have relationship with LenderYield ?
• Is there any risk of investing money?
• Which is a good strategy of investing money into Prosper Loan Platform?

### What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

• Original Loan:

• LoanOriginalAmount tells us the distribution of all the amounts of loan. We can see that most of the loans are under $10000. This density plot shows the frequency of the loan amount are peaked at multiple of$5000. [Figure 1]
• Loan Origination Quarter tells us which quarter has more loan. And how prosper operates each year. [Figure 2]
• Loan Term indicates that majority of loan terms are 36-months. Only a small fraction of loans are 12-months loans. [Figure 3]
• Loan Status : Defaulted and Chargedoff are accounted for nearly 15% of the number of loans. It means the risk of investing in the bad loan is pretty high. Also, I believe this variable provide a lot of information about the loan. Especially, I would like to focus on Chargedoff, Defaulted and Completed because it tells you how past loans performed. [Figure 4]
• Interest Rate: We cannot find the specific interest rate in the dataset. However, I believe Lender Yield, Estimated Effective Yield, Borrower APR and Borrower Rate can represent as the interest rate. Even though they are used for different purpose, their distribution are nigh identical (Based on [Figure 5]) => My question is: Are they correlated among each other?

• Loan Assessment/ Loan Rating:

a) By Credit & Rating:

• Since Credit Grade Rating were used before 2009, Credit Grade are not applied to 74.6% of current loan. [Figure 6 & 7]
• We can see that Prosper Rating Numeric and Prosper Rating Alpha are similar. Three most common loans are A, B and C. [Figure 8]

b) By Income & Asset:

• Income Factor: Employed and full-time tend to borrow more than the rest. Among those employed people who have income ranging from $25000 to$75000 take out more loans. And most of them have proof for their income. [Figure 9]
• Half of the loans are made by homeowners. [Figure 10]

### Did you create any new variables from existing variables in the dataset?

I did not create any new variable. However, I have to reorder levels of CreditGrade and ProsperRating..Alpha. in term of risk rating.

• One of interesting observation is the multi-modal distribtuion of LoanOriginal Amount. It seems that Loans are made in the multiple of $5000. • Since I have used alot of bar plots in univariate plot session, I have created a function as called as drawbarplot. In this function, it has already clean the data such as removing NA and tidy them by generating factors for categorical variables. ## III-A. Bivariate Plot Session: ### 1. Loan Status: [Figure 14] [Figure 15] ### 2. Interest Rate [Figure 16] > loan[,c("LenderYield", "EstimatedEffectiveYield", "BorrowerAPR","BorrowerRate")] %>% subset(!is.na(LenderYield) & !is.na(EstimatedEffectiveYield) & !is.na(BorrowerAPR) & !is.na(BorrowerRate)) %>% cor LenderYield EstimatedEffectiveYield BorrowerAPR BorrowerRate LenderYield 1.0000000 0.8953425 0.9933345 0.9999958 EstimatedEffectiveYield 0.8953425 1.0000000 0.8956348 0.8952825 BorrowerAPR 0.9933345 0.8956348 1.0000000 0.9933333 BorrowerRate 0.9999958 0.8952825 0.9933333 1.0000000 [Figure 17] ### 3. Prosper Rating: [Figure 18] ## III-B. Bivariate Analysis ### Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset? • Loan Status: I focus strongly on LoanStatus variable, especially Chargedoff, Defaulted and Completed. It indicates the health status of all the past loans. • The proportion of ChargedOff loan is higher when Prosper Rating is lower whereas The proportion of Completed loan is lower when Prosper Rating is lower => It means the risk of being charged off is higher with lower rating loan. [Figure 14] However, borrowers who have higher risk take larger loans. [Figure 15] • As an investor, it is wise to ignore loans from unemployed persons. It is hard for them to repay the loan ( Based on IncomeVerifiable vs Loan Status chart). So I mainly focus on the loans of the employed. Income are splitted into 5 groups:$0-$24999,$25000-$49999,$50000-$74999,$74999-$100000,$100000+. We also see that proportion of Chargedoff loan is higher when the income is lower. However the difference in proportion is not large. [Figure 14]
• The proportion of Chargedoff loan is higher when the term of loan is higher. [Figure 14] However, bigger loans are made under longer loan terms. [Figure 15]
• Interest Rate:

• Clearly, LenderYield, EstimatedEffectiveYield, BorrowerAPR, BorrowerRate are strongly positively correlated. It means we only need to analyze the behavior of one of them. And I chose LenderYield to represent as interest rate.[Figure 16]
• There is a linear relationship between LenderYield and ProspeRating..numeric. And the slope of this relationship is downward. [Figure 17] => It means higher risk loans tends to yeild more.
• Prosper Rating: There is a linear trend between Prosper Rating and

• Credit Score Lower Range
• Credit Score Upper Range
• Number of Public Inquires in the last 6 months

### Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?

I thought Homeowner is one of the factor contributing to the difference in loan status. However, it shows that the distribution of loan status between homeowner and non-homeowner are almost similar. It means the proportion of charged loans are in-different whether the borrowers own a house or not. [Figure 14]

### What was the strongest relationship you found?

The strongest relationship is between LenderYield and BorrowerRate. The correlation coefficient is nearly 1.

## IV-A. Multivariate Plots Section

Based on the bivariate analysis, I want to check the linear relationship between ProsperRatingNumeric and predictors such as CreditScoreRangeLower, CreditScoreRangeUpper, TradesNeverDelinquent..percentage., InquiriesLast6Months. Also I add 2 more predictors: PublicRecordsLast12Months, DebtToIncomeRatio to this model.

Call:
lm(formula = ProsperRating..numeric. ~ CreditScoreRangeLower +
InquiriesLast6Months + PublicRecordsLast12Months + DebtToIncomeRatio -
1, data = .)

Residuals:
Min      1Q  Median      3Q     Max
-5.3206 -0.8063  0.1158  0.8861  9.3861

Coefficients:
Estimate Std. Error  t value Pr(>|t|)
CreditScoreRangeLower               0.499747   0.003779  132.239  < 2e-16 ***
CreditScoreRangeUpper              -0.481626   0.003688 -130.602  < 2e-16 ***
TradesNeverDelinquent..percentage.  1.202335   0.040808   29.463  < 2e-16 ***
InquiriesLast6Months               -0.278500   0.003306  -84.235  < 2e-16 ***
PublicRecordsLast12Months          -0.154405   0.044775   -3.448 0.000564 ***
DebtToIncomeRatio                  -0.766687   0.014584  -52.572  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.283 on 77551 degrees of freedom
(36380 observations deleted due to missingness)
Multiple R-squared:  0.9169,    Adjusted R-squared:  0.9169
F-statistic: 1.427e+05 on 6 and 77551 DF,  p-value: < 2.2e-16

[Figure 19]

[Figure 20]

Call:
lm(formula = LenderYield ~ factor(ProsperRating..numeric.) +
factor(Term) + factor(LoanStatus), data = .)

Residuals:
Min        1Q    Median        3Q       Max
-0.156886 -0.010819  0.000239  0.012114  0.189556

Coefficients:
Estimate Std. Error  t value Pr(>|t|)
(Intercept)                       0.2714997  0.0007577  358.316   <2e-16 ***
factor(ProsperRating..numeric.)2 -0.0126750  0.0005205  -24.352   <2e-16 ***
factor(ProsperRating..numeric.)3 -0.0643999  0.0004772 -134.955   <2e-16 ***
factor(ProsperRating..numeric.)4 -0.1175471  0.0005344 -219.966   <2e-16 ***
factor(ProsperRating..numeric.)5 -0.1617421  0.0005506 -293.737   <2e-16 ***
factor(ProsperRating..numeric.)6 -0.2097962  0.0005376 -390.248   <2e-16 ***
factor(ProsperRating..numeric.)7 -0.2356068  0.0006597 -357.161   <2e-16 ***
factor(Term)36                    0.0406862  0.0006017   67.615   <2e-16 ***
factor(Term)60                    0.0635214  0.0006921   91.775   <2e-16 ***
factor(LoanStatus)Completed      -0.0047244  0.0003593  -13.149   <2e-16 ***
factor(LoanStatus)Defaulted       0.0008353  0.0007717    1.082    0.279
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02244 on 25994 degrees of freedom
(29079 observations deleted due to missingness)
Multiple R-squared:  0.929, Adjusted R-squared:  0.9289
F-statistic: 3.4e+04 on 10 and 25994 DF,  p-value: < 2.2e-16

###############################################################################

Analysis of Variance Table

Response: LenderYield
Df  Sum Sq Mean Sq  F value    Pr(>F)
factor(ProsperRating..numeric.)     6 166.464 27.7440 55111.56 < 2.2e-16 ***
factor(Term)                        2   4.578  2.2889  4546.72 < 2.2e-16 ***
factor(LoanStatus)                  2   0.105  0.0525   104.32 < 2.2e-16 ***
Residuals                       25994  13.086  0.0005
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

## IV-B. Multivariate Analysis

### Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

• Based on the coefficient table and QQ-Plot [Figure 19], it shows that there is a strong linear relationship and goodness of fit of the model between Prosper Rating and following predictors:

• Credit Scores Range (Lower and Upper): It means that Prosper carefully review credit score of the borrowers and add credit scores into their rating assessment.
• Deliquency status: borrowers who take good care of their payment history will be assess as less risk and high rating.
• The number of public records on the borrower’s credit report. Public records include bankruptcies, liens and judgments.
• the ratio between borrowers' debt and income.
• For each loan term, there is linear relationship between Prosper Rating and LoanStatus.

### Were there any interesting or surprising interactions between features?

Since we have the relationship betwen Prosper Rating and other predictors, we can save time upon analyzing loans by looking at Prosper Rating. Prosper Rating is a good measure of risk.

## V. Final Plots and Summary

### Plot One

At the begining, we have too many variables. Now, we can narrow down to the most three important variables: LoanStatus, ProsperRating(Numeric) and LenderYield.

[Figure 21]

### Description One

The above plot shows that higher risk loans yield more. As an investor, I think the yield for Prosper Rating 7 (AA-lowest risk) is too low. If we invest in these loan, the average yielding is about 6% to 7% annually. So we should focus on loans with rating 4, 5(which is C and B) since the risk of being charged off is less than 25% but the yielding is about 15% to 20%. It means instead of investing in a large amount of money into 1 loan, we limit our risk by divesify into different small loans with rating B and C, the expected return rate could be about 16% to 18%.

[Figure 22]

### Description Two

I have created ListingYear variable to see whether the risk of bad loans are gradually exposure when the lending platform is more mature. And this is proven by the above plot. We can see that in 2009 and 2010, the number of blue dots are still noticable, but the risk of blue dots are dissolved by time when the number of loans increases.

[Figure 23]

### Description Three

There are some risks that cannot be accounted for in this dataset are the health of our economy and business management of Prosper:

• If the economy went bad, the number of bad loans would increase.
• Because loan terms create some barriers to liquidity of the loans for investors. If Prosper poorly managed their business and cannot compete with other competitors, the risk of its bankcruptcy is still there, but we cannot control it.

Therefore the plot above can be used as an indicator to estimate how Prosper operates. If the proportion of Prosper loans of new quarter drops compared to the same quarter of last year, we can think of stopping our investment into new loans.

# Reflection

The dataset of Prosper loan had nearly 114000 loans from Nov 2005 to March 2014. Over the year the number of loans has been increasing. Based on my analysis, I can conclude that Prosper Rating is reliable. It is good indicator of risks and bad loans. Also, we find out that higher risk loans tend to yield more. By March 2014, the lending platform has matured enough to exposure the risk of bad loans. In other words, if we divesify our investments, we can expect higher return with higher risk loans on average. However, there are some unaccountable risks such as economy recession or Prosper’s pooly management. Morever, it is hard to find short term loans as 12-months with higher yeilding rate. Therefore we tend to invest to 36-months or 60-months loans.