
Exercise 6: Correlation Analysis

This exercise walks through calculating the correlation matrix with the corr() method in pandas and visualizing it as a heatmap in Seaborn to identify strongly correlated features in the Titanic dataset.

Step 1: Load the Titanic Dataset

  1. Load the dataset:
       import pandas as pd
       import seaborn as sns
       import matplotlib.pyplot as plt
    
       url = 'https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv'
       titanic = pd.read_csv(url)
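Before computing correlations, it can help to check which columns are numeric and where values are missing, since corr() skips missing values pairwise. The following is a minimal sketch under stated assumptions: the `sample` frame is a handful of hypothetical rows standing in for the downloaded CSV, and `numeric_names` and `missing_counts` are illustrative names that do not come from the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the real Titanic CSV (assumption, not real data)
sample = pd.DataFrame({
    'Survived': [0, 1, 1, 0],
    'Age': [22.0, 38.0, np.nan, 35.0],
    'Fare': [7.25, 71.28, 7.92, 8.05],
    'Sex': ['male', 'female', 'female', 'male'],
})

# Names of the columns that corr() would be able to use
numeric_names = sample.select_dtypes(include=['number']).columns.tolist()

# Number of missing values per column
missing_counts = sample.isna().sum()

print(numeric_names)
print(missing_counts[missing_counts > 0])
```

With the real data, the same two checks can be run on `titanic` directly after `pd.read_csv(url)`.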

Step 2: Calculate the Correlation Matrix

  1. Select only numeric columns for correlation calculation:
       numeric_cols = titanic.select_dtypes(include=['number']).columns
       corr_matrix = titanic[numeric_cols].corr()
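Since the goal is to identify strongly correlated features, one possible follow-up is to list the feature pairs whose absolute correlation exceeds a cutoff. This is only a sketch: `top_correlated_pairs`, the 0.5 threshold, and the `sample` values are hypothetical and chosen for illustration, not taken from the exercise.

```python
import numpy as np
import pandas as pd

def top_correlated_pairs(corr_matrix, threshold=0.5):
    """Return feature pairs whose absolute correlation is at least `threshold`."""
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal (always 1.0) is excluded
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    pairs = corr_matrix.where(mask).stack().dropna()
    strong = pairs[pairs.abs() >= threshold]
    # Sort by strength, regardless of sign
    return strong.sort_values(key=lambda s: s.abs(), ascending=False)

# Hypothetical values standing in for Titanic columns (assumption, not real data)
sample = pd.DataFrame({
    'Pclass': [3, 1, 3, 1, 3, 2],
    'Fare': [7.25, 71.28, 7.92, 53.10, 8.05, 13.00],
    'SibSp': [1, 1, 0, 1, 0, 0],
})

strong_pairs = top_correlated_pairs(sample.corr(), threshold=0.5)
print(strong_pairs)
```

With the real data, passing `corr_matrix` from the step above in place of `sample.corr()` should give the equivalent ranking for the Titanic features.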

Step 3: Visualize the Correlation Matrix Using a Heatmap

  1. Create the heatmap:
       plt.figure(figsize=(10, 8))
       sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0)
       plt.title('Correlation Matrix of Titanic Dataset')
       plt.show()
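Because a correlation matrix is symmetric, the upper triangle of the heatmap repeats the lower one. A variation that hides the duplicate half might look like the sketch below; `plot_lower_triangle_heatmap` and the `sample` values are hypothetical, while `mask`, `vmin`, `vmax`, and `fmt` are existing `sns.heatmap` parameters.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

def plot_lower_triangle_heatmap(corr_matrix, title):
    """Draw a heatmap showing only the diagonal and lower triangle."""
    # True marks cells to hide: everything strictly above the diagonal
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    fig, ax = plt.subplots(figsize=(10, 8))
    sns.heatmap(corr_matrix, mask=mask, annot=True, fmt='.2f',
                cmap='coolwarm', center=0, vmin=-1, vmax=1, ax=ax)
    ax.set_title(title)
    return ax

# Hypothetical values standing in for Titanic columns (assumption, not real data)
sample = pd.DataFrame({
    'Pclass': [3, 1, 3, 1, 3, 2],
    'Fare': [7.25, 71.28, 7.92, 53.10, 8.05, 13.00],
    'SibSp': [1, 1, 0, 1, 0, 0],
})

ax = plot_lower_triangle_heatmap(sample.corr(), 'Correlation Matrix (lower triangle)')
```

Calling `plt.show()` afterwards displays the figure, as in the step above. Fixing `vmin=-1` and `vmax=1` keeps the color scale comparable across different datasets.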

Full Code

Here's the complete code, ready to run in a single notebook cell:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset from the provided URL
url = 'https://raw.githubusercontent.com/drshahizan/dataset/main/titanic/train.csv'
titanic = pd.read_csv(url)

# Select only numeric columns for correlation calculation
numeric_cols = titanic.select_dtypes(include=['number']).columns
corr_matrix = titanic[numeric_cols].corr()

# Create a heatmap to visualize the correlation matrix
plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0)
plt.title('Correlation Matrix of Titanic Dataset')
plt.show()

Contribution 🛠️

Please create an Issue for any improvements, suggestions or errors in the content.

You can also contact me on LinkedIn with any other queries or feedback.
