Working With CSV Files in Python

Image by Danni Liu adapted from DONOT6/ Canva

After much effort, I finally managed to scrape the data I wanted to use to build my rare disease dashboard, a project inspired by my trip to Germany last year. You can read about the backstory here in this blog: What is Web Scraping and What is it Used for?

Although I understood conceptually what I needed to do to scrape the data, I still struggled. I headbutted several walls and wanted to give up on many occasions. I spent hours tweaking code and experimenting, mostly to no avail. But boy, it was exhilarating when I finally got the code to work! I was left with one more hurdle after that: I didn't yet know how to save the data to a file in Python.

I wanted to work with the CSV file type, and I'll explain why later. So, having learnt one approach to working with CSV files in Python, I would like to share what I learnt with you.

In this blog, I will cover the following:

  • What is CSV
  • CSV in Python
  • How to Read CSV Files in Python
  • How to Write CSV Files in Python

Before we delve into CSV, have you ever wondered why we have so many file types? For image files alone, we have the likes of JPEG, GIF, PNG, TIFF and more.

Professor David Malan explained this in his CS50 week 0 class. He is my idol. 😍 For those who don't know who he is, I would say he is a celebrity professor and teaches at Harvard. He is one of the best university teachers I've ever come across. The guy exudes so much passion when he teaches. Check out the link to the week 0 class of CS50. Guaranteed, you will fall in love with him. If you have two hours to spare, I highly encourage you to watch this lecture. This introductory lecture focuses on computational thinking, a mental set of skills for solving complex problems derived from computing and computer science. It is a valuable skill relevant to not only computer science but also our daily lives.

Enough digression; let's bring our attention back to CSV.

What is CSV

CSV stands for Comma Separated Values. It is a simple file format used to store tabular data. It is similar to an Excel table, minus all the pretty formatting. A CSV file contains just plain text, with the values on each line separated by commas.

CSV is a widely supported file type for storing tabular data. Its main advantages over other file types are flexibility and compatibility. CSV files can be easily opened in various programs, including spreadsheet programs like Microsoft Excel, Google Sheets, and Numbers, and text editors like Notepad. CSV files are also easy to import into databases, making them a convenient format for storing and analyzing large amounts of data.

CSV files are simple. They store data in a tabular format, with each row representing a new record and each column representing a field. CSV files do not include any formatting, images, or other complex features, which makes them relatively small in size and quick to transfer.
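
To make the row-and-column idea concrete, here is a minimal sketch of what a tiny CSV looks like as plain text, and how it splits into a header and records. The names and ages are made up for illustration, and the csv module used here is explained later in this blog.

```python
import csv
import io

# Hypothetical sample data, made up for illustration.
# Each line is one record; commas separate the fields.
raw_text = "name,age\nAlice,30\nBob,25\n"

# csv.reader accepts any iterable of lines, so an in-memory text buffer works.
rows = list(csv.reader(io.StringIO(raw_text)))

header, records = rows[0], rows[1:]
print(header)   # ['name', 'age']
print(records)  # [['Alice', '30'], ['Bob', '25']]
```

Notice that every value comes back as a string, even the ages. A CSV file stores no type information, so any conversion to numbers is up to you.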

These are the reasons I want to save the scraped data as CSV.

CSV in Python

There are a few approaches to working with CSV files in Python. I've come across a couple. One uses the pandas library, and the other uses Python's built-in csv module. The approach I'll cover in this blog is the csv module.

The csv module documentation lists everything the module offers. I don't understand all of it yet. That said, I do know the two most relevant and commonly used functions. You guessed it: they are the ones for reading and writing, csv.reader() and csv.writer(), which I'll focus on below.

How to Read CSV Files in Python

Here is a screenshot of the CSV file we will read:
Blog-31--Rare-disease-csv-file

To read a CSV file, we can use the csv.reader() function. This function returns a reader object that can be used to read the rows of the CSV file. Here's an example of how to use it:
Blog-31--csv_reader-code

The reader object returned by csv.reader() yields each row of the CSV file as a list of strings. The values are in the same order as the columns in the CSV file. Here is the output:
Blog-31--csv_reader-code-output
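
For readers who want something to copy and paste, here is a minimal sketch of the reading step in text form. It is not the exact code from my screenshot: the file name sample_table.csv and its contents are hypothetical, created inside the example only so that it runs on its own.

```python
import csv
from pathlib import Path

# Hypothetical file name and contents, made up so the example is self-contained.
csv_path = Path("sample_table.csv")
csv_path.write_text(
    "column_a,column_b\nvalue_1,value_2\nvalue_3,value_4\n",
    encoding="utf-8",
)

rows = []
# newline="" lets the csv module handle line endings itself,
# which is what the csv documentation recommends when opening files.
with open(csv_path, newline="", encoding="utf-8") as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        rows.append(row)  # each row is a list of strings
        print(row)
```

The first row printed is the header. If you want to skip it, one option is to call next(reader) once before the loop.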

How to Write CSV Files in Python

To write a CSV file in Python, we can use the csv.writer() function. Here is an example of how we can use this function to write a list of rows to a CSV file:
Blog-31--csv_writer-code

This will create a CSV file called "name_age.csv" with the following content:
Blog-31--csv_writer-code-output
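
Here is a minimal sketch of the writing step in text form. The file name name_age.csv matches the one above, but the rows themselves are made up for illustration and may differ from the ones in my screenshot.

```python
import csv

# Hypothetical rows, made up for illustration; the first row is the header.
rows = [
    ["name", "age"],
    ["Alice", 30],
    ["Bob", 25],
]

# newline="" stops extra blank lines from appearing between rows on Windows.
with open("name_age.csv", "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerows(rows)  # writes every row in one call
```

If you are collecting rows one at a time, for example inside a scraping loop, writer.writerow(row) writes a single row per call instead.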

You can find more information about the csv module in the Python documentation: https://docs.python.org/3/library/csv.html#.
If you prefer to learn more about the csv module through a video, this is a good tutorial by Corey Schafer to watch: Python Tutorial: CSV Module - How to Read, Parse, and Write CSV Files.