Project Overview: This project analyzes used-car prices in order to predict market value. Using a dataset of car attributes and their corresponding prices, the goal is to build a regression model that estimates the price of a used car as accurately as the data allows. Exploring how individual features relate to the final price also shows which factors influence car pricing in the market.
- Data Exploration:
  - Imported Pandas, NumPy, Matplotlib, Seaborn, and StatsModels.
  - Read and explored the 'cars.csv' dataset, examining column values and addressing data issues such as those in the kilometer column.
- Data Preprocessing:
  - Engineered the 'KM_Adj' feature from the 'Kilometers' column.
  - Prepared the data for analysis by dropping irrelevant columns and encoding categorical variables.
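The preprocessing steps above could look roughly like the sketch below. The post does not say how 'KM_Adj' was derived, so the cleanup here (stripping units and separators from a text column) is an assumption. The sample rows and the 'Name', 'Year', and 'Fuel_Type' columns are hypothetical stand-ins for the real 'cars.csv' contents.

```python
import pandas as pd

# Hypothetical rows standing in for cars.csv; only 'Kilometers', 'KM_Adj',
# 'Year', and 'Price' are names taken from the post, the rest are assumed.
raw = pd.DataFrame({
    "Name": ["Car A", "Car B", "Car C"],
    "Year": [2012, 2015, 2018],
    "Kilometers": ["72,000 km", "41,500 km", "18,250 km"],
    "Fuel_Type": ["Petrol", "Diesel", "Petrol"],
    "Price": [3.5, 6.2, 9.8],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a modeling-ready copy of the raw frame."""
    out = df.copy()
    # Assumed definition of KM_Adj: keep digits and dots, then cast to float.
    out["KM_Adj"] = (
        out["Kilometers"].str.replace(r"[^\d.]", "", regex=True).astype(float)
    )
    # Drop columns assumed to be irrelevant once KM_Adj exists.
    out = out.drop(columns=["Name", "Kilometers"])
    # One-hot encode the categorical column; drop_first avoids a redundant dummy.
    return pd.get_dummies(out, columns=["Fuel_Type"], drop_first=True)


clean = preprocess(raw)
print(clean)
```

Dropping the first dummy level keeps the encoded columns from being perfectly collinear with the intercept, which matters for the OLS and VIF checks later on.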
- Exploratory Data Analysis (EDA):
  - Explored the relationship between 'Price' and 'Year' using scatter plots.
  - Ran an OLS (Ordinary Least Squares) regression to quantify the relationships and assess model fit.
- Testing and Model Development:
  - Tested the OLS assumptions and checked for multicollinearity.
  - Performed feature selection using F-statistics and p-values.
  - Standardized the features and applied Linear Regression to model used-car prices.
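One way to screen features by F-statistic and p-value is scikit-learn's `f_regression`, which runs a univariate test per feature. Whether the original analysis used this function is not stated, so treat it as an assumption. The data is synthetic, and 'Noise' is a hypothetical column with no real effect on price.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import f_regression

# Synthetic features: two that drive the price and one pure-noise column.
rng = np.random.default_rng(1)
n = 300
X = pd.DataFrame({
    "Year": rng.integers(2005, 2021, size=n).astype(float),
    "KM_Adj": rng.uniform(5_000, 150_000, size=n),
    "Noise": rng.normal(size=n),
})
y = 0.5 * (X["Year"] - 2005) - 0.00003 * X["KM_Adj"] + rng.normal(0, 0.5, size=n)

# f_regression returns one F-statistic and one p-value per feature.
f_stats, p_values = f_regression(X, y)
report = pd.DataFrame({"feature": X.columns, "F": f_stats, "p_value": p_values})

# Keep features whose p-value falls below an assumed 0.05 threshold.
selected = report.loc[report["p_value"] < 0.05, "feature"].tolist()
print(report)
print("Selected:", selected)
```

Because each feature is tested on its own, this screen ignores interactions between features, so it works best as a first filter rather than a final decision.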
- Model Evaluation:
  - Evaluated multicollinearity using the variance inflation factor (VIF).
  - Fitted a Linear Regression model on the transformed data to predict prices.
- Further Steps:
  - Extracted the target and input variables for the model.
  - Used StandardScaler to preprocess the inputs and tested the model.
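The final steps (separating targets from inputs, scaling, and testing) might be wired together as in the sketch below. The train/test split and its 80/20 ratio are assumptions, since the post only says the model was tested. The data is again synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed dataset.
rng = np.random.default_rng(3)
n = 400
data = pd.DataFrame({
    "Year": rng.integers(2005, 2021, size=n).astype(float),
    "KM_Adj": rng.uniform(5_000, 150_000, size=n),
})
data["Price"] = (
    0.5 * (data["Year"] - 2005)
    - 0.00003 * data["KM_Adj"]
    + 6.0
    + rng.normal(0, 0.5, size=n)
)

# Separate the target from the input variables.
targets = data["Price"]
inputs = data.drop(columns=["Price"])

# Hold out 20% of the rows for testing (assumed split).
x_train, x_test, y_train, y_test = train_test_split(
    inputs, targets, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then reuse it for the test rows.
scaler = StandardScaler().fit(x_train)
model = LinearRegression().fit(scaler.transform(x_train), y_train)

# R-squared on the held-out rows.
test_r2 = model.score(scaler.transform(x_test), y_test)
print(f"Test R^2: {test_r2:.3f}")
```

Fitting the scaler on the training rows alone keeps information about the test set from leaking into the model, which would otherwise make the test score look better than it should.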