In this session, we'll explore more advanced topics, including data visualization with Matplotlib and Seaborn, advanced machine learning techniques with scikit-learn, and an introduction to web scraping with BeautifulSoup.
Data Visualization
Data visualization is crucial for understanding and communicating data insights. Python offers powerful libraries like Matplotlib and Seaborn for creating a wide range of visualizations.
Matplotlib
Matplotlib is a versatile library for creating static, interactive, and animated visualizations.
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
# Create a line plot
plt.plot(x, y)
plt.title('Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
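For figures with more than one panel, Matplotlib's object-oriented interface (working with Figure and Axes objects directly) is generally easier to manage than the pyplot state machine. A minimal sketch, reusing the same sample data:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Create a figure with two side-by-side panels
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(x, y, marker='o')
ax1.set_title('Line Plot')
ax2.bar(x, y)
ax2.set_title('Bar Plot')
fig.tight_layout()
plt.show()
```

Each Axes object owns its own title, labels, and ticks, so panels can be styled independently without affecting one another.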
Seaborn
Seaborn is built on top of Matplotlib and provides a high-level interface for drawing attractive statistical graphics.
import seaborn as sns
import pandas as pd
# Sample data
data = {'Category': ['A', 'B', 'C', 'D'],
        'Values': [4, 7, 1, 8]}
df = pd.DataFrame(data)
# Create a bar plot
sns.barplot(x='Category', y='Values', data=df)
plt.title('Bar Plot')
plt.show()
Advanced Machine Learning with Scikit-learn
Beyond basic models like linear regression, scikit-learn provides tools for more complex machine learning tasks.
Support Vector Machines (SVM)
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create an SVM classifier
clf = SVC(kernel='linear')
clf.fit(X_train, y_train)
# Evaluate the model
accuracy = clf.score(X_test, y_test)
print(f'Accuracy: {accuracy:.2f}')
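A single train/test split can give an optimistic or pessimistic accuracy by chance. Cross-validation averages over several splits for a more stable estimate; a minimal sketch on the same Iris data:

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

iris = datasets.load_iris()

# Evaluate the SVM on 5 different train/test splits
scores = cross_val_score(SVC(kernel='linear'), iris.data, iris.target, cv=5)
print(f'Mean CV accuracy: {scores.mean():.3f}')
```

Each entry in `scores` is the accuracy on one held-out fold; reporting the mean (and optionally the standard deviation) is more informative than a single split.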
Web Scraping with BeautifulSoup
Web scraping involves extracting data from websites. BeautifulSoup is a popular library for parsing HTML and XML documents.
Basic Web Scraping Example
import requests
from bs4 import BeautifulSoup
# Send a GET request to the webpage
url = 'https://example.com'
response = requests.get(url, timeout=10)
response.raise_for_status()  # Fail early on HTTP errors
# Parse the HTML content
soup = BeautifulSoup(response.content, 'html.parser')
# Extract specific data (e.g., all paragraph texts)
paragraphs = soup.find_all('p')
for p in paragraphs:
    print(p.get_text())
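BeautifulSoup can also extract tag attributes, not just text. A minimal sketch that parses an inline HTML snippet (so it runs without network access) and collects every link's `href`:

```python
from bs4 import BeautifulSoup

# Inline HTML stand-in for a fetched page
html = """
<html><body>
  <h1>Sample Page</h1>
  <p class="intro">Welcome!</p>
  <a href="/about">About</a>
  <a href="/contact">Contact</a>
</body></html>
"""
soup = BeautifulSoup(html, 'html.parser')

# Tag attributes are accessed like dictionary keys
links = [a['href'] for a in soup.find_all('a')]
print(links)  # → ['/about', '/contact']
```

The same pattern works on `response.content` from a live request; `find_all` accepts a tag name plus optional attribute filters (e.g. `soup.find_all('p', class_='intro')`).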
These topics will help you visualize data effectively, apply advanced machine learning techniques, and extract data from the web.