To illustrate a simple linear regression example in Python, we can use synthetic data. Let's create a small dataset that simulates the relationship between engine size (in liters) and fuel efficiency (in miles per gallon) for a set of cars. We'll use the `scikit-learn` library for the regression analysis and `matplotlib` for plotting.

Here's a step-by-step guide along with the Python code:

- Generate Synthetic Data: Create a dataset of engine sizes and corresponding fuel efficiencies.
- Create a Linear Regression Model: Use `scikit-learn` to fit a linear regression model.
- Predict and Plot: Predict fuel efficiency for a range of engine sizes and plot the results.

__Python Code__

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error, r2_score

    # Generate Synthetic Data
    np.random.seed(0)  # for reproducibility
    engine_sizes = np.random.rand(100, 1) * 5  # Engine sizes between 0 and 5 liters
    fuel_efficiency = 30 - 3 * engine_sizes + np.random.randn(100, 1) * 2  # MPG

    # Create a Linear Regression Model
    model = LinearRegression()
    model.fit(engine_sizes, fuel_efficiency)

    # Predict and Plot
    engine_sizes_range = np.linspace(0, 5, 100).reshape(-1, 1)
    predicted_efficiency = model.predict(engine_sizes_range)

    # Plotting the results
    plt.scatter(engine_sizes, fuel_efficiency, color='blue', label='Data Points')
    plt.plot(engine_sizes_range, predicted_efficiency, color='red', label='Regression Line')
    plt.xlabel('Engine Size (Liters)')
    plt.ylabel('Fuel Efficiency (MPG)')
    plt.title('Simple Linear Regression Example')
    plt.legend()
    plt.show()

    # Coefficients and Performance Metrics
    slope = model.coef_[0][0]
    intercept = model.intercept_[0]
    mse = mean_squared_error(fuel_efficiency, model.predict(engine_sizes))
    r2 = r2_score(fuel_efficiency, model.predict(engine_sizes))
    print(f"Slope: {slope}")
    print(f"Intercept: {intercept}")
    print(f"Mean Squared Error: {mse}")
    print(f"R-squared: {r2}")

The Python code successfully generated and analyzed a synthetic dataset using simple linear regression. Here's a summary of the results:

- Scatter Plot: The blue dots represent the synthetic data points, indicating the relationship between engine size (in liters) and fuel efficiency (in miles per gallon).
- Regression Line: The red line is the best-fit linear regression line through the data points, showing the predicted relationship.

__Regression Equation__

From the regression model, we obtained:

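- Slope (b): $-3.03$. This means that for each additional liter of engine size, the predicted fuel efficiency decreases by approximately 3.03 MPG. The estimate is close to the true slope of $-3$ used to generate the data.
- Intercept (a): $30.44$. This is the predicted fuel efficiency when the engine size is 0 liters. It anchors the line rather than describing a real car, and it is close to the true intercept of $30$ used to generate the data.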
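
To make the equation concrete, here is a minimal sketch that plugs the rounded coefficients reported above into the line $y = a + bx$. The helper name `predict_mpg` and the example engine sizes are assumptions chosen for illustration; they are not part of the original code.

```python
# Rounded coefficients reported above (not re-estimated here).
SLOPE = -3.03      # change in MPG per additional liter of engine size
INTERCEPT = 30.44  # predicted MPG at an engine size of 0 liters


def predict_mpg(engine_size_liters: float) -> float:
    """Predict fuel efficiency from the fitted line y = a + b * x.

    `predict_mpg` is a hypothetical helper name used only for this sketch.
    """
    return INTERCEPT + SLOPE * engine_size_liters


# Example engine sizes (illustrative values inside the 0-5 liter data range).
for size in (1.5, 2.0, 3.5):
    print(f"{size:.1f} L -> {predict_mpg(size):.2f} MPG")
```

Because the coefficients are rounded to two decimals, these predictions can differ slightly from what `model.predict` returns for the same inputs.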

__Performance Metrics__

- Mean Squared Error (MSE): $3.97$. This is the average squared difference between the observed values and the values predicted by the model, so it is expressed in squared MPG. Its square root is about 1.99 MPG, close to the noise standard deviation of 2 used to generate the data.
- R-squared (R²): $0.83$. This value indicates that 83% of the variance in fuel efficiency is explained by engine size.
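
To show what these two metrics measure, the sketch below recomputes them from their definitions using only NumPy. It assumes the same seed and data-generating formula as the code above, but uses one-dimensional arrays and the closed-form least-squares formulas in place of `scikit-learn`; the variable names are chosen for this illustration.

```python
import numpy as np

# Recreate the synthetic data with the same seed and formula as above,
# as 1-D arrays (an assumption of this sketch, for simpler arithmetic).
np.random.seed(0)
x = np.random.rand(100) * 5                 # engine sizes in liters
y = 30 - 3 * x + np.random.randn(100) * 2   # fuel efficiency in MPG

# Closed-form ordinary least squares for a single predictor.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# Residuals: observed values minus fitted values.
residuals = y - (intercept + slope * x)

# MSE is the mean of the squared residuals.
mse = np.mean(residuals ** 2)

# R-squared is 1 minus (residual sum of squares / total sum of squares).
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"Slope: {slope:.2f}, Intercept: {intercept:.2f}")
print(f"MSE: {mse:.2f}, RMSE: {np.sqrt(mse):.2f}, R-squared: {r2:.2f}")
```

Writing the metrics out this way makes the link between them visible: R² equals one minus the MSE divided by the variance of the observed fuel efficiencies.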
