R v Python Series: Part 01: Data Imports

 


Introduction

Data imports are very easy to do in both R and Python.  The examples were inspired by Matt Dancho and his incredible library of training videos on Business Science University.

In the R examples, we will be using the RStudio IDE.  It is assumed that you have a project set up in the folder where the data files exist.  For more information about setting up a project in RStudio and a Python environment in VS Code, click the links below.

In both examples, we will be importing and joining three different Excel data sets.  Do not worry if you do not have the exact data sets; you can use a data set of your own to follow along.


R: Data Import

The three Excel files that we will ingest are located in the following path on my PC.


In the R script below, we first load the tidyverse and readxl libraries.  The three lines of code below the library calls read the Excel files into R.

library(tidyverse)
library(readxl)

bikes      <- read_excel("00_Data_Files/bikes.xlsx")
bikeshops  <- read_excel("00_Data_Files/bikeshops.xlsx")
orderlines <- read_excel("00_Data_Files/orderlines.xlsx")

To check that the data was imported correctly, type the name of each object and press Ctrl + Enter after each line.

bikes
bikeshops
orderlines

Printing the three objects gives the results listed below.











Python: Data Import

In the Python scripts below, we assume that VS Code is installed and running.  We will be using a different set of files and folders for the Python environment.  The Excel files we loaded in the R examples above are located in this path for the Python example.





To open the folder with the Excel files, click on File > Open Folder and find the folder containing your Excel files.





If you successfully loaded the files into your environment, you will see them as shown below.




Next, we will load some popular Python libraries.

# 1.0 Load Libraries ----

# Core Python Data Analysis
# Note: pd.read_excel needs the openpyxl package installed to read .xlsx files
import pandas as pd
import numpy as np
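
Before running the imports, it can help to confirm that the packages are actually installed in the active environment.  The sketch below is one possible check using only the standard library.  The helper name `missing_packages` is made up for illustration, and `openpyxl` is included because pandas relies on it to read .xlsx files.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# An empty list means every package is available in this environment
print(missing_packages(["pandas", "numpy", "openpyxl"]))
```

Any name that comes back in the list can be installed with pip or conda before continuing.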

The code to load the .xlsx files is very similar to the R version, as shown below.

bikes_df      = pd.read_excel("00_data_raw/bikes.xlsx")
bikeshops_df  = pd.read_excel("00_data_raw/bikeshops.xlsx")
orderlines_df = pd.read_excel("00_data_raw/orderlines.xlsx")
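
A relative path like `00_data_raw/bikes.xlsx` only resolves if the script runs from the project folder, so a missing-file error is a common first stumbling block.  The following is a minimal sketch, not part of the original workflow, that checks for the files before reading them.  The helpers `excel_paths` and `missing_files` are hypothetical names; the folder and file names are the ones used above.

```python
from pathlib import Path

DATA_DIR = Path("00_data_raw")                     # folder name used in this post
FILE_STEMS = ["bikes", "bikeshops", "orderlines"]  # file names without the extension


def excel_paths(data_dir, stems):
    """Map each file stem to the .xlsx path where it is expected to live."""
    return {stem: Path(data_dir) / f"{stem}.xlsx" for stem in stems}


def missing_files(paths):
    """Return the stems whose .xlsx file does not exist on disk."""
    return [stem for stem, path in paths.items() if not path.is_file()]


if __name__ == "__main__":
    paths = excel_paths(DATA_DIR, FILE_STEMS)
    missing = missing_files(paths)
    if missing:
        print(f"Missing files: {missing}")
    else:
        import pandas as pd

        # Read every workbook into a dictionary keyed by file stem
        frames = {stem: pd.read_excel(path) for stem, path in paths.items()}
        for stem, df in frames.items():
            print(stem, df.shape)
```

Keeping the data frames in a dictionary is a design choice for the sketch; separate variables, as in the three lines above, work just as well for three files.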

To check that the data was imported correctly, type the name of each object and press Shift + Enter after each line.

bikes_df
bikeshops_df
orderlines_df
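
Printing a whole data frame is fine for a quick look.  For a more compact check, the sketch below reports the row count, column count, and number of missing values.  It uses a small stand-in data frame so that it runs without the Excel files; the column names and values are invented for illustration and are not the contents of bikes.xlsx, and `summarize` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical stand-in for bikes_df; columns and values are invented
bikes_df = pd.DataFrame(
    {
        "bike_id": [1, 2, 3],
        "model": ["Model A", "Model B", "Model C"],
        "price": [1000, 2500, 4000],
    }
)


def summarize(df):
    """Return the row count, column count, and total number of missing values."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "missing": int(df.isna().sum().sum()),
    }


print(bikes_df.head(2))     # first two rows
print(bikes_df.dtypes)      # column types inferred on import
print(summarize(bikes_df))
```

The same calls can be applied to bikeshops_df and orderlines_df once they have been read in.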

Printing the three objects gives the results listed below.









Conclusion

The data import process was almost identical in R and Python.  Both platforms make it incredibly easy to read in .xlsx files.

For the complete list of R v Python topics, click on the links below.


