Exercise 6: Correlation Analysis

This exercise walks through calculating a correlation matrix with the corr() method in pandas and visualizing it as a Seaborn heatmap to identify strongly correlated features in the Titanic dataset.

Step 1: Load the Titanic Dataset

Import the required libraries and read the CSV file into a DataFrame:

   import pandas as pd
   import seaborn as sns
   import matplotlib.pyplot as plt

   url = 'https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv'
   titanic = pd.read_csv(url)

Step 2: Calculate the Correlation Matrix

Select only numeric columns for correlation calculation:

   numeric_cols = titanic.select_dtypes(include=['number']).columns
   corr_matrix = titanic[numeric_cols].corr()
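
By default, corr() computes Pearson's correlation coefficient and ignores missing values pair by pair. As a rough illustration of what a single cell of the matrix holds, here is a minimal sketch in plain Python. The helper name pearson_r and the sample values are made up for illustration and are not taken from the dataset.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Pairs where either value is None are skipped, which mirrors the
    pairwise handling of missing values described above.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of co-deviations and squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values that only mimic the Pclass and Fare columns
pclass = [3, 1, 3, 1, 2, 3]
fare = [7.25, 71.28, 7.92, 53.10, None, 8.05]

r = pearson_r(pclass, fare)
print(f"Pearson r between the sample columns: {r:.3f}")
```

The result always lies between -1 and 1. A negative value here means that, in this made-up sample, higher class numbers go with lower fares.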

Step 3: Visualize the Correlation Matrix Using a Heatmap

Create the heatmap:

   plt.figure(figsize=(10, 8))
   sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0)
   plt.title('Correlation Matrix of Titanic Dataset')
   plt.show()
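
A heatmap is easy to scan, but a ranked list can make the strongest relationships explicit. The sketch below is one possible way to do that, not part of the original exercise. It uses a small made-up DataFrame (named sample here) instead of the downloaded file so that it runs without network access; with the real data, corr_matrix from Step 2 could be used in its place.

```python
import numpy as np
import pandas as pd

# Made-up rows that only mimic Titanic column names; not real passenger data
sample = pd.DataFrame({
    'Survived': [0, 1, 1, 1, 0, 0],
    'Pclass':   [3, 1, 3, 1, 3, 2],
    'Fare':     [7.25, 71.28, 7.92, 53.10, 8.05, 13.00],
    'SibSp':    [1, 1, 0, 1, 0, 0],
})
corr_matrix = sample.corr()

# Boolean mask for the upper triangle; k=1 excludes the diagonal,
# so each pair of columns appears once and self-correlations are dropped
upper = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)

# Flatten the masked matrix into a Series indexed by column pairs
pairs = corr_matrix.where(upper).unstack().dropna()

# Order by absolute strength, strongest first, while keeping the sign
ranked = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(ranked)
```

Sorting on the absolute value keeps strong negative correlations near the top alongside strong positive ones.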

Full Code

Here is the complete code, ready to run in a single notebook cell:

   import pandas as pd
   import seaborn as sns
   import matplotlib.pyplot as plt

   # Load the Titanic dataset from the provided URL
   url = 'https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv'
   titanic = pd.read_csv(url)

   # Select only numeric columns for correlation calculation
   numeric_cols = titanic.select_dtypes(include=['number']).columns
   corr_matrix = titanic[numeric_cols].corr()

   # Create a heatmap to visualize the correlation matrix
   plt.figure(figsize=(10, 8))
   sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0)
   plt.title('Correlation Matrix of Titanic Dataset')
   plt.show()

Contribution 🛠️

Please create an Issue for any improvements, suggestions, or errors in the content.

You can also contact me on LinkedIn with any other queries or feedback.
