🤖 ML Mavericks – Machine Learning Coursework

Course: Machine Learning / Data Mining
Group Name: ML Mavericks
University: University of Surrey
Term: Spring 2025


📌 Project Overview

This project implements a complete Machine Learning / Data Mining pipeline using Python and Prolog. It covers multiple learning tasks across regression, classification, clustering, logic-based learning (ILP), and reinforcement learning.

We apply various machine learning algorithms to four different datasets and evaluate their performance using appropriate metrics and visualisations.


🧪 Datasets Used

Dataset Type          Purpose
Regression (Small)    Used for SVM & Perceptron
Regression (Large)    Used for Decision Trees & MLP
Classification        Used for SVM & Neural Networks
Clustering            Used for KNN & Hierarchical

🧠 Algorithms Implemented

Category                  Algorithms
Regression                SVM, Decision Tree, Perceptron, MLP
Classification            SVM, Neural Network
Clustering                KNN Clustering, Hierarchical Clustering
Logic-Based Learning      Aleph ILP, FOIL ILP (in Prolog)
Reinforcement Learning    Q-Learning, Deep Q-Learning (DQL)
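
For readers unfamiliar with the Q-Learning entry above, the core of the tabular algorithm is a single update rule. The following is a minimal sketch using only the standard library. It is not taken from src/reinforcement_q_learning.py; the function name q_update, the state and action labels, and the default alpha and gamma values are all illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    q maps (state, action) pairs to estimated returns. All names here are
    hypothetical and chosen for illustration only.
    """
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q_table = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    value = q_update(q_table, "s0", "right", 1.0, "s1", ["left", "right"])
    print(value)
```

Deep Q-Learning replaces the lookup table with a neural network that approximates the same values, but the target it is trained towards has the same form.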

🗂️ Project Structure

ml_mavericks/
├── data/                  # Contains all datasets (.csv)
├── notebooks/             # EDA and initial experiments in Jupyter
├── models/                # Saved ML models (optional)
├── outputs/               # Metrics, plots, ILP & RL outputs
├── src/                   # Modular Python code
│   ├── preprocess.py
│   ├── visualize.py
│   ├── regression_models.py
│   ├── classification_models.py
│   ├── clustering_models.py
│   ├── logic_learning_aleph.pl
│   ├── logic_learning_foil.pl
│   ├── reinforcement_q_learning.py
│   ├── reinforcement_dql.py
│   └── evaluate.py
├── main.py                # Main pipeline controller
└── requirements.txt       # Python dependencies

🚀 How to Run

1. Install Dependencies

pip install -r requirements.txt

2. Run the Pipeline

Example for regression task using SVM:

python main.py --task regression --model svm --dataset regression_small
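
The command above implies that main.py accepts --task, --model and --dataset flags. As a rough sketch of how such a command line could be parsed with the standard library, assuming nothing about the actual main.py beyond those three flag names (build_parser and parse_args are hypothetical helper names):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags shown in the README example."""
    parser = argparse.ArgumentParser(description="Run one pipeline task.")
    # Only the flag names come from the README; the help text is illustrative.
    parser.add_argument("--task", required=True, help="learning task, e.g. regression")
    parser.add_argument("--model", required=True, help="model name, e.g. svm")
    parser.add_argument("--dataset", required=True, help="dataset name, e.g. regression_small")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:] when None)."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"task={args.task} model={args.model} dataset={args.dataset}")
```

Making all three flags required is a design assumption; the real controller may supply defaults or restrict the accepted values.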

📤 View Outputs

  • Evaluation metrics: outputs/evaluation_results.csv
  • Visualisations: outputs/plots/
  • ILP rule output: outputs/logic_output/
  • Reinforcement learning logs: outputs/reinforcement_logs/
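
To inspect the evaluation metrics without opening the file by hand, a short script along these lines may help. It is a sketch that assumes the CSV has a header row; the column names are not documented here, so the code simply prints whatever columns it finds. The helper name load_results is hypothetical.

```python
import csv
from pathlib import Path


def load_results(path):
    """Return the rows of a metrics CSV as a list of dicts keyed by the header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # Path relative to the repository root, as listed above.
    for row in load_results("outputs/evaluation_results.csv"):
        print(row)
```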

🖼️ Visualisations

The pipeline outputs various visual aids:

  • Confusion matrices
  • Regression error plots
  • Clustering dendrograms
  • RL training reward curves

📘 Tools & Technologies

  • Python 3.x
  • Libraries: scikit-learn, matplotlib, seaborn, gym, stable-baselines3
  • SWI-Prolog for ILP (Aleph & FOIL)

👨‍💻 Team Members

Group Name: ML Mavericks

  • Ritwik Mishra
  • Shivasmi Sharma
  • Ishwari Niphade
  • Arpit Mahapatra
  • Suraj Borude

📌 Project Status

✅ In development – submitted as coursework for ML module Spring 2025.


📄 License

This project is for educational use only.