Starting with Kaggle Tutorials (https://www.kaggle.com/learn/data-visualization-from-non-coder-to-coder)


1. Introduction to Seaborn


Seaborn is a Python package built on Matplotlib, one of Python's most powerful visualization libraries. It provides a higher-level interface on top of Matplotlib, which makes it easier to create attractive and informative statistical graphics. As always, our coding environment is a Jupyter notebook that includes both text and code.

Before we write actual code, we need to import several modules. The first is pandas, which provides dataframe objects that can store datasets. The others are matplotlib and seaborn, both visualization modules in Python. To display our plots (the results of data visualization) directly in the notebook without calling plt.show(), we include the command %matplotlib inline. The full import script looks like the following.

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

The first example we will work through deals with the historical FIFA rankings of six countries. The dataset is stored as a CSV file, short for comma-separated values. We first read the data into a dataframe object using the pandas library.

file_path = '../input/fifa.csv'
fifa_data = pd.read_csv(file_path, index_col="Date", parse_dates=True)

The argument index_col="Date" makes the 'Date' column the row index, so it works somewhat like a primary key that distinguishes the rows of the dataframe. The argument parse_dates=True tells pandas to interpret each row label as a date, as opposed to another data type such as an integer or a string. Now, to take a quick look at the data, we use the head() method to print out the first five rows (or any other number of rows, passed as a parameter).
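
To make the effect of index_col and parse_dates concrete, here is a minimal sketch that runs without the real fifa.csv. The small CSV text, its column names (TeamA, TeamB) and its values are made up for illustration and are not taken from the FIFA dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for fifa.csv: made-up rows and column names,
# used only so the example runs without the real file.
csv_text = """Date,TeamA,TeamB
2020-01-01,1,2
2020-02-01,2,1
2020-03-01,1,3
"""

demo_data = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# The "Date" column is now the row index, parsed as dates rather than strings.
print(type(demo_data.index))
print(demo_data.index.dtype)

# head() shows the first five rows by default; pass a number to change that.
print(demo_data.head(2))
```

Because the index holds real dates, a plot can place them on a time axis instead of treating them as plain text labels.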

https://prod-files-secure.s3.us-west-2.amazonaws.com/dbc17468-f66c-482b-9eff-0a465bb7b090/84ab4341-2343-4149-9c02-be7073bee019/fifa_head.png

Now we plot this data as a line chart, a chart type well suited to showing trends or changes in data over time. Only one line of code is required to draw the chart, but we will include one more line to set the size of the figure. Before we proceed, this is a good time to get a clear understanding of the terminology we are using.

First of all, a 'figure' is the top-level element that contains all the plots we create. A 'plot' is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables. Thus, we are creating several 'plots' in a single 'figure'. Enough with the terms; now let's put our hands on the keyboard.

plt.figure(figsize=(16, 4))
sns.lineplot(data=fifa_data)
<matplotlib.axes._subplots.AxesSubplot at 0x7f2b4195f6a0>

https://prod-files-secure.s3.us-west-2.amazonaws.com/dbc17468-f66c-482b-9eff-0a465bb7b090/e7103840-a60a-4177-be74-c24fd1a97bc0/fifa_line.png


2. HandsOn - lineplots with FIFA