How to Import Kaggle Datasets Directly into Google Colab

In this article, we will look at the process of importing Kaggle datasets into Google Colab.

Getting Going:

In this section, we will go through two distinct ways to get a Kaggle dataset into Colab. In the first, we download the dataset via the Kaggle API, after which it is ready to use. In the second, we manually download the dataset from the Kaggle website and upload it to Colab for analysis. To begin, sign in to your Google account and then visit https://colab.research.google.com.

Kaggle is widely used by aspiring data scientists. It hosts datasets for a very wide range of domains, including the medical field, e-commerce, and even astrophysics. Users demonstrate their data science and machine learning expertise by practising on diverse datasets.

Kaggle datasets come in a variety of sizes, ranging from less than 1 MB to around 100 GB. Additionally, certain Deep Learning techniques need GPU support, because training time increases considerably without it. Google Colab can help newcomers here by letting them test their programmes in a cloud setting.

1. Choosing a Kaggle Dataset to Download into a Jupyter Notebook

The first step is to select a dataset from Kaggle. You can also choose datasets from competitions. I have selected two datasets for this article: one at random and one from a current competition.

2. Install the Essential Packages
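
The article does not list the packages, so the following is a minimal sketch that assumes two of them: `kaggle` (the official Kaggle command-line client) and `opendatasets` (the Open Datasets library mentioned in step 4). The helper names `pip_install_command` and `install_packages` are hypothetical and only illustrate the idea.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command that installs into the interpreter running the notebook."""
    if not packages:
        raise ValueError("at least one package name is required")
    return [sys.executable, "-m", "pip", "install", "--quiet", *packages]


def install_packages(packages):
    """Run the pip command; raises CalledProcessError if the installation fails."""
    subprocess.check_call(pip_install_command(packages))


# Assumed package set for this article (not stated in the original text).
REQUIRED_PACKAGES = ["kaggle", "opendatasets"]
```

Calling `install_packages(REQUIRED_PACKAGES)` in a Colab cell performs the installation. Using `sys.executable -m pip` ties the install to the same Python interpreter that the notebook is running.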

3. Download API Credentials

We must be authenticated with Kaggle to download data from it, and this requires an API token. You can quickly generate this token from your Kaggle account's profile page: go to your Kaggle profile, select the Account tab, and scroll down to the API section.

There you will see a "Create New API Token" button. When you click on it, a kaggle.json file containing your username and API key will be downloaded. We will use this username and key in the following stage.

You only need to do this step once; you do not need to create new credentials each time you download a dataset.
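
If you plan to use the Kaggle command-line client, the token has to be placed where the client looks for it. The sketch below assumes the conventional location `~/.kaggle/kaggle.json` and assumes the file holds the two fields `username` and `key`; the function name `install_kaggle_credentials` is hypothetical.

```python
import json
import os
from pathlib import Path


def install_kaggle_credentials(token_path, kaggle_dir=None):
    """Validate a kaggle.json token and copy it into the Kaggle config directory."""
    token_path = Path(token_path)
    # Assumption: the client reads its token from ~/.kaggle unless told otherwise.
    kaggle_dir = Path(kaggle_dir) if kaggle_dir else Path.home() / ".kaggle"

    credentials = json.loads(token_path.read_text(encoding="utf-8"))
    missing = {"username", "key"} - credentials.keys()
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")

    kaggle_dir.mkdir(parents=True, exist_ok=True)
    target = kaggle_dir / "kaggle.json"
    target.write_text(json.dumps(credentials), encoding="utf-8")
    # The file holds a secret key, so make it readable by the owner only.
    os.chmod(target, 0o600)
    return target
```

For example, `install_kaggle_credentials("kaggle.json")` copies a token from the current working directory. Restricting the permissions matters because anyone who can read the file can act on Kaggle with your account.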

4. Copy the link to your Kaggle Dataset and Paste it into the Open Datasets Library to Download it.

Launch Google Colab and connect to the hosted runtime (in other words, start the notebook interface). Following that, upload the "kaggle.json" file you just obtained from Kaggle.
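
With the token uploaded, the download itself can be sketched as follows. This assumes the third-party `opendatasets` package is installed and that it picks up `kaggle.json` from the working directory (otherwise it is expected to prompt for the username and key). The helper names and the URL shape `https://www.kaggle.com/datasets/<owner>/<dataset>` are assumptions for illustration; competition links are not handled by this sketch.

```python
from urllib.parse import urlparse


def kaggle_dataset_slug(url):
    """Extract 'owner/dataset' from a URL like https://www.kaggle.com/datasets/owner/dataset."""
    parsed = urlparse(url)
    if parsed.netloc not in ("kaggle.com", "www.kaggle.com"):
        raise ValueError(f"not a Kaggle URL: {url}")
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Kaggle dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def download_with_opendatasets(url, data_dir="."):
    """Validate the link first, then hand it to opendatasets for the download."""
    slug = kaggle_dataset_slug(url)
    # Imported lazily so the URL check works even without the package installed.
    import opendatasets as od

    od.download(url, data_dir=data_dir)
    return slug
```

Checking the link before downloading catches a mistyped or non-dataset URL early, before any credentials are requested.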


We just learned how to use Google Colab to import datasets from Kaggle. Often, however, we only need a single file from a dataset. In that case, we can pass the "-f" flag with the filename, which downloads only that file. Both the competitions and datasets commands support the "-f" flag.
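
As a rough illustration of the "-f" flag, the sketch below builds the command line for the Kaggle client and runs it from Python. It assumes the `kaggle` client is installed and authenticated; the helper names, and the dataset and file names in the usage note, are placeholders rather than real resources.

```python
import subprocess


def build_kaggle_download_command(kind, name, file_name=None, path="."):
    """Build a `kaggle ... download` command, optionally limited to one file."""
    if kind == "datasets":
        command = ["kaggle", "datasets", "download", "-d", name]
    elif kind == "competitions":
        command = ["kaggle", "competitions", "download", "-c", name]
    else:
        raise ValueError("kind must be 'datasets' or 'competitions'")
    if file_name:
        # -f restricts the download to a single file.
        command += ["-f", file_name]
    # -p sets the folder the download is written to.
    command += ["-p", path]
    return command


def run_kaggle_download(kind, name, file_name=None, path="."):
    """Run the command; raises CalledProcessError if the client reports failure."""
    subprocess.run(build_kaggle_download_command(kind, name, file_name, path), check=True)
```

For instance, `run_kaggle_download("datasets", "some-owner/some-dataset", file_name="train.csv")` would fetch only `train.csv`. In a Colab cell, the same command can also be typed directly with a leading `!`.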

5. Now that we have our Dataset, we can use it.

  • Reading an Excel file
  • Reading a CSV file
  • Reading a text file
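
The three cases above could be handled along these lines. This is a sketch that assumes pandas is available (it is normally preinstalled in Colab) and that the file extension reflects the real format; `read_dataset_file` is a hypothetical helper name.

```python
from pathlib import Path


def read_dataset_file(path):
    """Load a file based on its extension: Excel and CSV via pandas, text as a string."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix in (".xlsx", ".xls"):
        # pandas needs an Excel engine such as openpyxl for .xlsx files.
        import pandas as pd
        return pd.read_excel(path)
    if suffix == ".csv":
        import pandas as pd
        return pd.read_csv(path)
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    raise ValueError(f"unsupported file type: {suffix or '(none)'}")
```

Excel and CSV files come back as pandas DataFrames, while a text file comes back as a single string. Other formats raise an error instead of being guessed at.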


Second Method: Manually Download the Kaggle Dataset

  1. Open the Datasets tab on the Kaggle website.
  2. Choose any Dataset and press the Download button.
  3. Unzip the downloaded file (if it is in Zip format).
  4. Upload your dataset to the Google Colab notebook as a file or folder. After selecting the upload option, you will be prompted to choose the file or folder to upload.
  5. Our dataset has now been successfully uploaded to Google Colab Notebook.
  6. Our Kaggle dataset is now available for use.
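
If you prefer to upload the Zip archive as-is, step 3 can be done inside the notebook instead. The sketch below uses Python's standard `zipfile` module; the function name and the paths in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_dataset_archive(zip_path, dest_dir):
    """Extract every member of a Zip archive into dest_dir and return their names."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    return sorted(names)
```

For example, `extract_dataset_archive("archive.zip", "data")` unpacks an uploaded `archive.zip` into a `data` folder next to the notebook.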

Advantages of Google Colab

Google Colab is a wonderful tool for practising data science problems. Free GPU support is one of Colab's key advantages. Because data science aspirants are often limited in computational resources at first, Colab helps them work around their hardware constraints. And because Colab notebooks run on Linux instances, you can easily interact with the underlying system and run all the standard Linux commands.