- 1 Summary
- 2 Purpose of this post
- 3 The data
- 4 Data cleaning
- 5 Exploratory data analysis
- 5.1 Which gender borrows the most by region?
- 5.2 What type of repayment interval is used in each region?
- 5.3 What are Kiva loans used for across regions?
- 5.4 Which are the top ten countries for Kiva loans?
- 5.5 How many lenders fund a single Kiva microloan per country?
- 5.6 How much, how long, how many lenders, how long to fund and disburse
- 6 Questions
- 7 Conclusion
To see the code used in this post, visit my GitHub repository for this site.
1 Summary
- Objectives: To visualise a dataset and understand its main trends.
- Challenge: Largest dataset worked with so far.
- Data points: 13,424,100
- Language: R
2 Purpose of this post
Muhammad Yunus and the Grameen Bank won the Nobel Peace Prize in 2006 for “their efforts through microcredit to create economic and social development from below.” Back in 1976, Yunus, at the time a professor at the University of Chittagong (Bangladesh), noticed that small amounts of money could make a substantial difference to people living in poverty. He started to loan money to people who didn’t meet the requirements of the mainstream banking system. These loans were reported to help borrowers “emerge” from poverty, with default rates, reported at 2%, lower than those of commercial banks. Eventually, in October 1983, Muhammad Yunus founded Grameen Bank, considered to be the first microfinance institution.
Founded in 2005, Kiva has the same mission as Grameen Bank, except that anyone can become a Kiva banker. This online platform enables microcredit lending to help low-income entrepreneurs around the world with a couple of clicks. Pretty neat, huh? In this post, I unpack a large dataset published by Kiva on the Kaggle platform and explore these microloans.
3 The data
The dataset was published on the Kaggle platform. The complete dataset was a 232.7 MB zip file containing four files, including loan_themes_by_region.csv. After looking at the contents, I chose to work with the first one, kiva_loans.csv, which has 671,205 observations and 20 variables.
4 Data cleaning
4.1 Borrower gender
From the variable descriptions, I expected borrower_genders to have only two levels, male or female. Instead I see many more levels, 11,298 to be precise. This isn’t very clear, so I fix that first by creating five levels:
 "mixed_genders" "mult_females" "mult_males" "single_female"  "single_male"
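One way to collapse the raw values could look like the sketch below. It assumes each borrower_genders entry is a comma-separated list with one "female" or "male" per borrower. collapse_genders is a hypothetical helper written for illustration, not the code used in this post, and it sticks to base R so it runs without extra packages.

```r
# Hypothetical helper: map a raw borrower_genders string to one of five levels.
# Assumes entries look like "female", "male" or "female, female, male".
collapse_genders <- function(x) {
  vapply(x, function(s) {
    # Missing or empty entries stay missing
    if (is.na(s) || !nzchar(trimws(s))) return(NA_character_)
    parts <- trimws(strsplit(s, ",", fixed = TRUE)[[1]])
    n_f <- sum(parts == "female")
    n_m <- sum(parts == "male")
    if (n_f > 0 && n_m > 0) {
      "mixed_genders"
    } else if (n_f > 1) {
      "mult_females"
    } else if (n_m > 1) {
      "mult_males"
    } else if (n_f == 1) {
      "single_female"
    } else if (n_m == 1) {
      "single_male"
    } else {
      NA_character_
    }
  }, character(1), USE.NAMES = FALSE)
}

# Toy input, not taken from the dataset
raw <- c("female", "male", "female, female", "male, male", "female, male", NA)
gender_levels <- factor(collapse_genders(raw))
levels(gender_levels)
```

Counting the two genders rather than matching on the full string keeps the helper independent of how many borrowers share a loan, which is what produces the thousands of raw levels in the first place.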
4.2 Loan amounts
Now, since I’m trying to make sense of loans around the world, it’s better if all loans are in the same currency for reference. I use the quantmod package to convert all loans into a single currency (US dollars). Two currencies were unavailable.
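A minimal sketch of the conversion step, assuming a lookup table of exchange rates is already in hand. The rates below are made-up placeholders rather than real quotes, the toy loans are not from the dataset, and the names rates_to_usd and loan_usd are illustrative. Currencies missing from the table come out as NA, which mirrors the case of unavailable currencies.

```r
# Toy loans; "XXX" stands in for a currency with no available rate
loans <- data.frame(
  loan_amount = c(1000, 5000, 200),
  currency    = c("PHP", "KES", "XXX"),
  stringsAsFactors = FALSE
)

# Placeholder rates (units of USD per one unit of local currency), NOT real quotes
rates_to_usd <- c(PHP = 0.02, KES = 0.01)

# Indexing a named vector by a missing name returns NA, so unknown currencies stay NA
loans$rate     <- unname(rates_to_usd[loans$currency])
loans$loan_usd <- loans$loan_amount * loans$rate

# Currencies that could not be converted
unavailable <- unique(loans$currency[is.na(loans$rate)])
```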
4.3 Country codes
Finally, with 86 countries there are 86 levels. It would perhaps be more informative to create another category called region, which has fewer levels and gives a better view of how loans are distributed across regions.
Country codes are in the ISO-3166 format, so I used the associated region codes found here and created five regions: Africa, Americas, Asia, Europe, and Oceania.
 "Africa" "Americas" "Asia" "Europe" "Oceania"
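The mapping from country code to region can be sketched as a lookup join. The table below is a tiny hand-written subset used only for illustration (the full table would cover all 86 countries), and region_lookup is a hypothetical name. Codes without a match, such as the invented "ZZ", become NA.

```r
# Small illustrative subset of an ISO 3166 country-code-to-region table
region_lookup <- data.frame(
  country_code = c("KE", "PH", "PE", "AL", "WS"),
  region       = c("Africa", "Asia", "Americas", "Europe", "Oceania"),
  stringsAsFactors = FALSE
)

# Toy loans; "ZZ" is a deliberately unknown code
loans <- data.frame(
  country_code = c("PH", "KE", "KE", "WS", "ZZ"),
  stringsAsFactors = FALSE
)

# match() returns the row of the lookup table for each loan (NA if absent)
idx <- match(loans$country_code, region_lookup$country_code)
loans$region <- factor(region_lookup$region[idx])

table(loans$region, useNA = "ifany")
```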
I calculate two lengths of time that I think are interesting. First, how much time passes between posting the loan and disbursement (total_time). Second, how long a loan takes to get funded (giving_time).
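Both durations can be sketched with difftime. This assumes timestamp columns named posted_time, funded_time and disbursed_time stored as "YYYY-MM-DD HH:MM:SS" strings in UTC, and it assumes both durations are measured forward from the posting time. The values are toy data, and days_between is a hypothetical helper.

```r
# Toy timestamps, not taken from the dataset
loans <- data.frame(
  posted_time    = c("2014-01-01 00:00:00", "2014-02-01 00:00:00"),
  funded_time    = c("2014-01-08 12:00:00", NA),
  disbursed_time = c("2014-01-15 00:00:00", "2014-02-11 00:00:00"),
  stringsAsFactors = FALSE
)

# Parse a character timestamp into POSIXct (UTC assumed)
to_time <- function(x) as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hypothetical helper: elapsed days from start to end
days_between <- function(start, end) {
  as.numeric(difftime(to_time(end), to_time(start), units = "days"))
}

loans$giving_time <- days_between(loans$posted_time, loans$funded_time)    # posting to fully funded
loans$total_time  <- days_between(loans$posted_time, loans$disbursed_time) # posting to disbursement
```

Loans that were never fully funded have no funded_time, so their giving_time stays NA instead of being silently dropped.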
5 Exploratory data analysis
5.1 Which gender borrows the most by region?
5.2 What type of repayment interval is used in each region?
5.3 What are Kiva loans used for across regions?
5.4 Which are the top ten countries for Kiva loans?
Here I create three different top ten rankings that order countries by total number of Kiva loans, by loans per capita, and by internet users. For the internet users ranking, I used this ranking, which is based on numbers published by the International Telecommunication Union.
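The first two rankings can be sketched as a count per country joined to a population table. Everything below is toy data: the country names "A", "B", "C" and the population figures are placeholders, and the column names n_loans and loans_per_capita are illustrative.

```r
# Toy data: one row per loan, plus a placeholder population table
loans <- data.frame(country = c("A", "A", "A", "B", "B", "C"),
                    stringsAsFactors = FALSE)
population <- data.frame(country = c("A", "B", "C"),
                         population = c(300, 100, 10),
                         stringsAsFactors = FALSE)

# Number of loans per country
counts <- as.data.frame(table(country = loans$country),
                        responseName = "n_loans",
                        stringsAsFactors = FALSE)

# Join with population and compute loans per capita
ranking <- merge(counts, population, by = "country")
ranking$loans_per_capita <- ranking$n_loans / ranking$population

# Top ten by each criterion (the toy data only has three countries)
top_by_total  <- head(ranking[order(-ranking$n_loans), ], 10)
top_by_capita <- head(ranking[order(-ranking$loans_per_capita), ], 10)
```

In the toy data the two orderings are reversed, which shows why a per-capita ranking can tell a different story from raw loan counts.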
5.5 How many lenders fund a single Kiva microloan per country?
5.6 How much, how long, how many lenders, how long to fund and disburse
6 Questions
- Why is retail and not food (as in other regions) the second most common use for loans in Asia?
- Who are the givers? Where are they? Does proximity of the lender to the borrower have anything to do with funding times?
- Does the Kiva website have anything to do with funding times? For example, giving_time, the time between posting the loan and the loan being fully funded, has two peaks, at around one week and one month. Is this due to the platform and the promotion of loans that have been posted for a certain amount of time?
7 Conclusion
In this post I analyse 671,205 Kiva loans from around the world. Most loans are requested by single females; Europe requests the fewest loans; weekly repayment is an unpopular way of paying back loans; and entertainment, wholesale, manufacturing, and construction together account for less than 2.2% of sectors. The main uses for Kiva loans are agriculture, retail, and food, with some variation amongst regions. Half of the loans are 4.22 USD or less and are funded by 12 or fewer lenders. The median time between posting a loan on the Kiva platform and disbursing it to the borrower is 16.89 days. I mainly use the tidyverse, stringr, and quantmod packages.