How do I read a CSV file from Google Drive using Python Colab?

Seven steps to read a CSV file using PyDrive

Tired of the same old story: download the CSV file, upload it to Colab, load the data frame, and after a while repeat everything because the file was not stored there?

Don’t worry, your problems are over!

I will show you a very useful technique that I have used in my Data Science projects on Google Colab (Python 3). Just as you are doing now, I turned to the community and found many colleagues sharing their knowledge, so I decided to do the same.

In this article, I will show you how to use PyDrive to read a CSV file directly from your Google Drive using Python 3 in the Google Colab environment.

Let’s go step by step.

1) Install PyDrive

The first step is to install PyDrive and import the modules we need. Because we are in a notebook environment, the pip command is a shell command and must be prefixed with an exclamation mark (!).

!pip install -U -q PyDrive

from pydrive.auth import GoogleAuth

from pydrive.drive import GoogleDrive

from google.colab import auth

from oauth2client.client import GoogleCredentials

2) Authenticate

The second step is to authenticate and create a PyDrive client:

auth.authenticate_user()

gauth = GoogleAuth()

gauth.credentials = GoogleCredentials.get_application_default()

drive = GoogleDrive(gauth)

3) Authorizing

As soon as you run this code, the authenticator asks you to click on a link that appears in your notebook. This is the third step: click on the link, sign in with your Google account, and copy the generated code. Return to your notebook, paste the code into the requested field, and press Enter. That’s it, you’re authenticated!

4) Generating a shareable link

Now comes a trickier part, the fourth step. Go to your Google Drive, find your file, and share it the usual way to generate a shareable link:

1) find your file and click on it;
2) click on the “share” button;
3) generate a shareable link with “Get link”.

5) Getting the file_id

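For the fifth step, look at the full URL of the shareable link. Extract only the long code in the middle of it; in links of the form https://drive.google.com/file/d/<file_id>/view?usp=sharing, it is the part between /d/ and /view. This is your file_id. Use it in place of the placeholder below: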

fileDownloaded = drive.CreateFile({'id': 'XXXXXXXXXXXXXX'})
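
If you would rather not pick the file_id out of the link by hand, a small helper can do it for you. This is only a sketch: the function name extract_file_id is my own, and it assumes the link has one of two common shapes (.../file/d/<file_id>/view or ...?id=<file_id>). Other link formats will simply return None.

```python
import re

# Assumed link shapes; Google Drive may produce others.
_FILE_ID_PATTERNS = (
    re.compile(r"/d/([A-Za-z0-9_-]+)"),      # .../file/d/<file_id>/view?...
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),  # .../open?id=<file_id>
)


def extract_file_id(share_url):
    """Return the file_id found in a Drive share link, or None if no pattern matches."""
    for pattern in _FILE_ID_PATTERNS:
        match = pattern.search(share_url)
        if match:
            return match.group(1)
    return None


# Made-up link for illustration; the ID is a placeholder, not a real file.
fake_url = "https://drive.google.com/file/d/XXXXXXXXXXXXXX/view?usp=sharing"
print(extract_file_id(fake_url))  # XXXXXXXXXXXXXX
```

The returned string is what goes into the 'id' field passed to drive.CreateFile.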

6) Load the CSV

For the sixth step, tell your notebook the name to give the CSV file. GetContentFile downloads the file from Drive and saves it under that name in the Colab working directory.

fileDownloaded.GetContentFile('example.csv')
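
Before loading the file, it can help to confirm that the download really landed on disk. The sketch below is an optional check using only the standard library; describe_download is a hypothetical helper name of mine, not part of PyDrive. It returns the file size and the first few lines, which also lets you see which separator the file uses.

```python
import os


def describe_download(path, preview_lines=3):
    """Return (size_in_bytes, first_lines) for a local file.

    Raises if the file is missing or empty, which usually means the
    download step did not run or the file_id was wrong.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"{path} was not found; did the download step run?")
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError(f"{path} is empty")
    with open(path, encoding="utf-8", errors="replace") as f:
        # zip stops after preview_lines lines, so large files are not read fully.
        lines = [line.rstrip("\n") for _, line in zip(range(preview_lines), f)]
    return size, lines
```

In the notebook you would call it as describe_download('example.csv') right after the download and print the result.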

7) Showing the Results

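The seventh and last step: use good old Pandas to turn the file into a DataFrame and display its first rows. This example file uses semicolons as separators, so adjust the delimiter argument to match your own file. o/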

import pandas as pd

df = pd.read_csv('example.csv', delimiter=';')

df.head()
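
If you are not sure which separator your file uses, here is a rough way to guess it from the header line. It is a simple heuristic of my own (the name guess_delimiter and the list of candidates are assumptions): it counts each candidate character in the header and picks the most frequent one. It can be fooled by quoted fields that contain a separator, so treat the result as a hint.

```python
def guess_delimiter(header_line, candidates=(",", ";", "\t", "|")):
    """Guess the separator by counting candidate characters in the header line.

    Falls back to a comma when none of the candidates appear.
    """
    counts = {c: header_line.count(c) for c in candidates}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else ","


# Example header, made up for illustration.
print(repr(guess_delimiter("id;name;price")))  # ';'
```

You could read the first line of example.csv, pass it to guess_delimiter, and hand the result to pd.read_csv through the delimiter argument.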

That’s it, folks!!!
I hope this helped.

If you need anything, leave a comment.

Sigmundo Preissler Jr, PhD
@sigmundojr

Data Scientist | Machine Learning | Advanced Analytics