Now that you can create your own line charts, it's time to learn about more chart types!
By the way, if this is your first experience writing code in Python, you should be very proud of all that you have accomplished so far, because it's never easy to learn a completely new skill! If you stick with the course, you'll notice that everything gets easier (while the charts you build get more impressive!), since the code is pretty similar for all of the charts. Like any skill, coding becomes natural over time and with repetition.
In this tutorial, you'll learn about bar charts and heatmaps.
As always, we begin by setting up the coding environment.
import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
print("Setup Complete")
Setup Complete
In this tutorial, we'll work with a dataset from the US Department of Transportation that tracks flight delays.
Opening this CSV file in Excel shows a row for each month (where 1 = January, 2 = February, etc.) and a column for each airline code.
Each entry shows the average arrival delay (in minutes) for a different airline and month (all in year 2015). Negative entries denote flights that (on average) tended to arrive early. For instance, the average American Airlines flight (airline code: AA) in January arrived roughly 7 minutes late, and the average Alaska Airlines flight (airline code: AS) in April arrived roughly 3 minutes early.
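To make this layout concrete, here is a minimal sketch that builds a tiny table of the same shape and reads two entries from it. The numbers are made up for illustration (they are not the real dataset), and the names toy_delays and describe are hypothetical helpers introduced only for this example.

```python
import pandas as pd

# Made-up values for illustration only (NOT the real flight delay data).
# Rows are months (1 = January, 2 = February, ...); columns are airline codes.
toy_delays = pd.DataFrame(
    {"AA": [7.0, 1.5, 6.0, 2.0], "AS": [-0.5, -1.0, -2.0, -3.0]},
    index=pd.Index([1, 2, 3, 4], name="Month"),
)

# Look up a single entry by row label (month) and column label (airline code).
aa_january = toy_delays.loc[1, "AA"]
as_april = toy_delays.loc[4, "AS"]


def describe(delay):
    """Turn a signed average delay (in minutes) into a short description."""
    if delay < 0:
        # Negative values mean the flights arrived early on average.
        return f"{abs(delay):.0f} minutes early"
    return f"{delay:.0f} minutes late"


print("AA in January:", describe(aa_january))
print("AS in April:", describe(as_april))
```

The sign of each entry carries the meaning: positive values are late arrivals, and negative values are early arrivals.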
As before, we load the dataset using the pd.read_csv command.
# Path of the file to read
flight_filepath = "../input/flight_delays.csv"
# Read the file into a variable flight_data
flight_data = pd.read_csv(flight_filepath, index_col="Month")
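To see what index_col does in isolation, here is a small self-contained sketch. It reads made-up CSV text from a string instead of a file on disk, so csv_text, default_index, and month_index are hypothetical names used only for this illustration.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file on disk (illustration only).
csv_text = "Month,AA,AS\n1,7.0,-0.5\n2,1.5,-1.0\n"

# Without index_col, pandas adds a default 0, 1, 2, ... row index
# and keeps "Month" as an ordinary column.
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col="Month", the "Month" column becomes the row labels.
month_index = pd.read_csv(io.StringIO(csv_text), index_col="Month")

print(list(default_index.columns))
print(list(month_index.columns))
print(month_index.index.tolist())

# The row labels are plain integers rather than dates, so there is
# nothing here for parse_dates=True to convert.
print(pd.api.types.is_integer_dtype(month_index.index))
```

With the month as the index, a row can be selected by its month number, for example month_index.loc[1].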
You may notice that the code that loads flight_data is slightly shorter than what we used in the previous tutorial. In this case, since the row labels (from the 'Month' column) don't correspond to dates, we don't add parse_dates=True in the parentheses. But we keep the first two pieces of text as before, to provide both the filepath for the dataset (in this case, flight_filepath), and the name of the column that will be used to index the rows (in this case, index_col="Month").
Since the dataset is small, we can easily print all of its contents. This is done by writing a single line of code with just the name of the dataset.