# 1 Summary

To see the code used in this post, visit my GitHub repository for this site.

• Objectives: To visualise a dataset and understand its main trends.
• Challenge: Largest dataset worked with so far.
• Data points: 13,424,100 (671,205 observations × 20 variables)
• Language: R

# 2 Purpose of this post

Muhammad Yunus and the Grameen Bank won the Nobel Peace Prize in 2006 for “their efforts through microcredit to create economic and social development from below.” Back in 1976, Yunus, at the time a professor at the University of Chittagong (Bangladesh), noticed that small amounts of money could make a substantial difference to people living in poverty. He started lending money to people who didn’t meet the requirements of the mainstream banking system. These types of loans were reported to be effective in helping borrowers “emerge” from poverty, with default rates reported at 2%, lower than those of commercial banks. Eventually, in October 1983, Muhammad Yunus founded Grameen Bank, considered to be the first microfinance institution.

Founded in 2005, Kiva has the same mission as Grameen Bank, except that anyone can become a Kiva banker. This online platform enables microcredit lending to help low-income entrepreneurs around the world with a couple of clicks. Pretty neat, huh? In this post, I unpack a large dataset published by Kiva on the Kaggle platform and explore these microloans.

# 3 The data

The dataset was published on the Kaggle platform. The complete dataset was a 232.7 MB zip file containing four files: kiva_loans.csv, kiva_mpi_region_locations.csv, loan_theme_ids.csv, and loan_themes_by_region.csv. After looking at the contents, I chose to work with the first one, kiva_loans.csv, which has 671,205 observations and 20 variables.

# 4 Data cleaning

## 4.1 Borrower gender

From the variable descriptions, I expected borrower_genders to have only two levels, male or female. Instead, I see many more levels, 11,298 to be precise. That is hard to interpret, so I fix it first by collapsing the values into five levels:

[1] "mixed_genders" "mult_females"  "mult_males"    "single_female"
[5] "single_male"  
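The recoding above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: it assumes the raw values are comma-separated strings such as "female, female, male" (one entry per borrower), and `classify_genders` is a hypothetical helper name. It uses base R only so that it runs without extra packages.

```r
# Collapse a comma-separated gender string into one of five levels.
# Assumption: raw values look like "female", "male, male", "female, male".
classify_genders <- function(x) {
  parts <- strsplit(as.character(x), ",")   # one character vector per loan
  vapply(parts, function(p) {
    p <- trimws(p)
    p <- p[!is.na(p) & nzchar(p)]           # drop missing or empty entries
    if (length(p) == 0) return(NA_character_)
    if (length(unique(p)) > 1) return("mixed_genders")
    if (length(p) == 1) return(paste0("single_", p))
    paste0("mult_", p[1], "s")              # e.g. "mult_females"
  }, character(1), USE.NAMES = FALSE)
}

# Made-up example values, not rows from the Kiva data.
raw <- c("female", "male", "female, female", "male, male, male", "female, male", NA)
gender_group <- factor(classify_genders(raw))
levels(gender_group)
```

Splitting first and then looking only at the number of borrowers and the number of distinct genders keeps the rule independent of how many people share a loan, which is what reduces thousands of raw combinations to five groups.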

## 4.2 Loan amounts

Since I’m trying to make sense of loans around the world, it’s easier to compare them if they are all in the same currency. I use the quantmod package to convert all loans into a single currency (US dollars). Exchange rates for two currencies were unavailable.
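As a rough sketch of the conversion step, the example below multiplies each loan by a per-currency rate taken from a lookup vector. Everything here is an assumption for illustration: the column names `amount_local` and `currency`, the made-up loan values, and the exchange rates, which are placeholders rather than real market rates. It deliberately avoids fetching live rates so that it runs offline.

```r
# Illustrative loans in local currency (made-up values).
loans <- data.frame(
  id           = 1:4,
  amount_local = c(30000, 15000, 250, 1000),
  currency     = c("KES", "PHP", "USD", "XXX"),
  stringsAsFactors = FALSE
)

# Hypothetical USD value of one unit of each currency; NOT real market rates.
usd_per_unit <- c(USD = 1, KES = 0.01, PHP = 0.02)

# Look up each loan's rate by currency code; codes without a rate become NA.
loans$rate       <- unname(usd_per_unit[loans$currency])
loans$amount_usd <- loans$amount_local * loans$rate

# Loans that could not be converted because no rate was available.
unconverted <- loans[is.na(loans$amount_usd), ]
```

Leaving unmatched currencies as `NA` rather than dropping the rows makes it easy to see how many loans are affected when a rate is unavailable.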

## 4.3 Country codes

Finally, with 86 countries there are 86 levels. It would be useful to create another category called region, which has fewer levels and makes it easier to see how loans are distributed across regions.

Country codes are in the ISO-3166 format, so I used the associated region codes found here and created five regions: Africa, Americas, Asia, Europe, and Oceania.

[1] "Africa"   "Americas" "Asia"     "Europe"   "Oceania" 
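One way to attach a region to each loan is a lookup table keyed on the country code. The sketch below assumes two-letter ISO-3166 codes in a column called `country_code`; the five-row `region_lookup` table is a tiny hand-written stand-in for the full list of region codes.

```r
# Made-up loans with two-letter country codes.
loans <- data.frame(
  country_code = c("KE", "PH", "PE", "AL", "WS", "KE"),
  stringsAsFactors = FALSE
)

# Small stand-in for the full ISO-3166 country-to-region table.
region_lookup <- data.frame(
  country_code = c("KE", "PH", "PE", "AL", "WS"),
  region       = c("Africa", "Asia", "Americas", "Europe", "Oceania"),
  stringsAsFactors = FALSE
)

# match() returns the row of region_lookup for each loan (NA if the code is unknown).
idx <- match(loans$country_code, region_lookup$country_code)
loans$region <- factor(region_lookup$region[idx])

levels(loans$region)
```

Using `match()` keeps the loans in their original order and leaves unknown codes as `NA`, so unmapped countries are easy to spot.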

## 4.4 Dates

I calculate two lengths of time that I think are interesting. First, how much time passes between posting the loan and its disbursement (total_time)? Second, how long does a loan take to get funded (giving_time)?
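The two durations can be computed as differences between timestamps. The sketch below assumes timestamp columns named `posted_time`, `funded_time`, and `disbursed_time` stored as "YYYY-MM-DD HH:MM:SS" strings; the rows are made up, and `to_time` is a hypothetical helper.

```r
# Made-up timestamps, stored as text the way a CSV import would give them.
loans <- data.frame(
  posted_time    = c("2014-01-01 06:00:00", "2014-01-02 09:00:00"),
  funded_time    = c("2014-01-02 18:00:00", "2014-01-09 09:00:00"),
  disbursed_time = c("2014-01-11 06:00:00", "2014-01-20 21:00:00"),
  stringsAsFactors = FALSE
)

# Parse text into date-times; a fixed time zone avoids daylight-saving surprises.
to_time <- function(x) as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

posted <- to_time(loans$posted_time)

# Days from posting until the loan is fully funded.
loans$giving_time <- as.numeric(difftime(to_time(loans$funded_time), posted, units = "days"))

# Days from posting until the money is disbursed.
loans$total_time <- as.numeric(difftime(to_time(loans$disbursed_time), posted, units = "days"))
```

Setting `units = "days"` explicitly matters because `difftime()` otherwise picks a unit automatically, which can mix hours and days across datasets.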

# 5 Exploratory data analysis

## 5.4 Which are the top ten countries for Kiva loans?

Here I create three different top ten rankings, ordering countries by total number of Kiva loans, by loans per capita, and by internet users. For the internet users ranking, I used this ranking, which is based on numbers published by the International Telecommunications Union.
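A ranking by total loans and a ranking by loans per capita can be sketched as below. The loan rows and the population figures are invented for illustration (they are not real statistics), and the `country` column name is an assumption. The same pattern would apply to a ranking normalised by internet users.

```r
# Made-up loans: one row per loan.
loans <- data.frame(
  country = c("Philippines", "Philippines", "Philippines", "Kenya", "Kenya", "Samoa"),
  stringsAsFactors = FALSE
)

# Hypothetical populations in millions; NOT real figures.
population <- c(Philippines = 100, Kenya = 50, Samoa = 0.2)

# Count loans per country as a plain named vector.
tab     <- table(loans$country)
n_loans <- setNames(as.vector(tab), names(tab))

# Top ten by total number of loans.
top_total <- head(sort(n_loans, decreasing = TRUE), 10)

# Top ten by loans per million inhabitants.
per_capita     <- n_loans / population[names(n_loans)]
top_per_capita <- head(sort(per_capita, decreasing = TRUE), 10)
```

The two rankings can differ sharply: a small country with few loans in total may still lead once the counts are divided by population.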

# 6 Questions

• Why is retail and not food (as in other regions) the second most common use for loans in Asia?
• Who are the givers? Where are they? Does proximity of the lender to the borrower have anything to do with funding times?
• Does the Kiva website have anything to do with funding times? For example, giving_time, the time between posting the loan and the loan being fully funded, has two peaks, at around one week and one month. Is this due to the platform and the promotion of loans that have been posted for a certain amount of time?

# 7 Conclusion

In this post I analyse 671,205 Kiva loans from around the world. Most loans are requested by single females, Europe requests the fewest loans, weekly repayment is an unpopular way of paying back loans, and entertainment, wholesale, manufacturing, and construction account for less than 2.2% of loans. The main uses for Kiva loans are agriculture, retail, and food, with some variation among regions. Half of the loans are 4.22 USD or less and are funded by 12 or fewer lenders. The median time between posting a loan on the Kiva platform and disbursing it to the borrower is 16.89 days. I mainly use the tidyverse, stringr, and quantmod packages.