
Tech Stack
Description
A machine learning project that predicts laptop prices from specifications such as CPU series, RAM, storage, and brand. The data was scraped from the Tokopedia e-commerce platform through its GraphQL API.
The workflow covered data cleaning, feature engineering, Exploratory Data Analysis (EDA), training several regression models (Linear Regression, Random Forest, XGBoost), and selecting the best model after hyperparameter tuning.
The XGBoost model yielded the best performance and was subsequently deployed via Streamlit Cloud with custom preprocessing and CPU series encoding.
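As a rough illustration of how candidate models can be compared, the sketch below computes the mean absolute error (MAE) for each model's predictions and picks the lowest. It assumes MAE was the selection criterion, which is consistent with the reported metric but not confirmed as the only one used. The function names, model labels, and prices are hypothetical and are not the project's results.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_best_model(actual, predictions_by_model):
    """Return (best_model_name, scores), where lower MAE is better."""
    scores = {
        name: mean_absolute_error(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Made-up prices in IDR, for illustration only.
actual_prices = [10_000_000, 15_000_000, 8_000_000]
candidate_predictions = {
    "model_a": [12_000_000, 14_000_000, 9_000_000],
    "model_b": [10_500_000, 15_500_000, 7_500_000],
}
best_name, mae_scores = select_best_model(actual_prices, candidate_predictions)
```

In practice the predictions would come from a held-out test set, so that the comparison reflects performance on unseen listings.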
- Scraped data from Tokopedia using the GraphQL API.
- Cleaned dataset (RAM/storage parsing, CPU parsing, brand normalization).
- Conducted extensive EDA (heatmap, price distribution, CPU vs Price, RAM vs Price, etc.).
- Performed feature engineering and hyperparameter tuning.
- Achieved the best performance with the XGBoost model (mean absolute error, MAE, below IDR 2.5 million).
- Deployed the model using Streamlit Cloud with an interactive user interface (UI).
- Documented the project with a detailed README and Mermaid flow diagrams.
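The cleaning step (RAM/storage parsing and brand normalization) could look roughly like the sketch below. It is a minimal example and not the project's actual code: the regular expressions, the `BRAND_ALIASES` table, and the function names are assumptions, and real Tokopedia listing titles are likely messier than the patterns handled here.

```python
import re

# Hypothetical alias table; the project's real brand list is not shown.
BRAND_ALIASES = {
    "asus": "ASUS",
    "lenovo": "Lenovo",
    "thinkpad": "Lenovo",
    "hp": "HP",
    "acer": "Acer",
}

# Matches "16GB RAM", "8GB DDR4", or "RAM 16GB".
RAM_RE = re.compile(
    r"(\d+)\s*GB\s*(?:RAM|LPDDR\d*|DDR\d*)|RAM\s*(\d+)\s*GB", re.IGNORECASE
)
# Matches "512GB SSD", "1TB HDD", or "SSD 512GB".
STORAGE_RE = re.compile(
    r"(\d+)\s*(GB|TB)\s*(?:SSD|HDD|NVME)|(?:SSD|HDD|NVME)\s*(\d+)\s*(GB|TB)",
    re.IGNORECASE,
)


def normalize_brand(title):
    """Map the first known alias found in the title to a canonical brand."""
    lowered = title.lower()
    for alias, canonical in BRAND_ALIASES.items():
        if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
            return canonical
    return "Other"


def parse_specs(title):
    """Extract brand, RAM (GB), and storage (GB) from a listing title."""
    ram_gb = None
    rest = title
    ram_match = RAM_RE.search(title)
    if ram_match:
        ram_gb = int(ram_match.group(1) or ram_match.group(2))
        # Remove the RAM text so "RAM 16GB SSD 512GB" is not read as 16GB storage.
        rest = title[:ram_match.start()] + " " + title[ram_match.end():]

    storage_gb = None
    storage_match = STORAGE_RE.search(rest)
    if storage_match:
        size = storage_match.group(1) or storage_match.group(3)
        unit = storage_match.group(2) or storage_match.group(4)
        storage_gb = int(size) * (1024 if unit.upper() == "TB" else 1)

    return {
        "brand": normalize_brand(title),
        "ram_gb": ram_gb,
        "storage_gb": storage_gb,
    }
```

Fields that cannot be parsed are returned as `None` rather than guessed, so they can be dropped or imputed explicitly in a later cleaning step.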
Page Info
Dashboard & Prediction UI
The main Streamlit page for inputting laptop specifications and receiving price predictions.

EDA
EDA diagrams, correlation heatmaps, and model training visuals.


Model Development
Model training visuals.
