🤖 ML Mavericks – Machine Learning Coursework
Course: Machine Learning / Data Mining
Group Name: ML Mavericks
University: University of Surrey
Term: Spring 2025
📌 Project Overview
This project implements a complete Machine Learning / Data Mining pipeline using Python and Prolog. It covers multiple learning tasks across regression, classification, clustering, logic-based learning (ILP), and reinforcement learning.
We apply various machine learning algorithms to four different datasets and evaluate their performance using appropriate metrics and visualisations.
🧪 Datasets Used
Dataset Type | Purpose |
---|---|
Regression (Small) | Used for SVM & Perceptron |
Regression (Large) | Used for Decision Trees & MLP |
Classification | Used for SVM & Neural Networks |
Clustering | Used for KNN & Hierarchical |
🧠 Algorithms Implemented
Category | Algorithms |
---|---|
Regression | SVM, Decision Tree, Perceptron, MLP |
Classification | SVM, Neural Network |
Clustering | KNN Clustering, Hierarchical Clustering |
Logic-Based Learning | Aleph ILP, FOIL ILP (in Prolog) |
Reinforcement Learning | Q-Learning, Deep Q-Learning (DQL) |
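As a rough illustration of the Q-Learning entry above, here is a minimal tabular update step using only the standard library. The function name `q_update`, the state and action labels, and the `alpha` and `gamma` defaults are assumptions made for this sketch. They are not taken from `reinforcement_q_learning.py`, which may be organised quite differently.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    q maps (state, action) pairs to estimated returns.
    alpha is the learning rate and gamma the discount factor (assumed defaults).
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem with an all-zero initial table.
q = defaultdict(float)
actions = [0, 1]
new_value = q_update(q, state="s0", action=1, reward=1.0, next_state="s1", actions=actions)
```

This sketch does not treat terminal states specially. A fuller version would typically drop the discounted term when `next_state` ends the episode.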
🗂️ Project Structure
ml_mavericks/
├── data/ # Contains all datasets (.csv)
├── notebooks/ # EDA and initial experiments in Jupyter
├── models/ # Saved ML models (optional)
├── outputs/ # Metrics, plots, ILP & RL outputs
├── src/ # Modular Python code
│ ├── preprocess.py
│ ├── visualize.py
│ ├── regression_models.py
│ ├── classification_models.py
│ ├── clustering_models.py
│ ├── logic_learning_aleph.pl
│ ├── logic_learning_foil.pl
│ ├── reinforcement_q_learning.py
│ ├── reinforcement_dql.py
│ └── evaluate.py
├── main.py # Main pipeline controller
└── requirements.txt # Python dependencies
🚀 How to Run
1. Install Dependencies
pip install -r requirements.txt
2. Run the Pipeline
Example for a regression task using SVM on the small regression dataset:
python main.py --task regression --model svm --dataset regression_small
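For readers wondering how the three flags might be handled, the sketch below parses them with `argparse`. It is not the contents of `main.py`. Only the flag names and the values `regression`, `svm`, and `regression_small` come from the command above. The other task names are guesses based on the algorithm categories, and `build_parser` and `parse_args` are hypothetical helpers.

```python
import argparse

# Assumed task names: only "regression" appears in the documented command.
TASKS = ["regression", "classification", "clustering", "logic", "reinforcement"]


def build_parser():
    """Create a parser for the three flags used in the example command."""
    parser = argparse.ArgumentParser(description="Run one task of the pipeline.")
    parser.add_argument("--task", required=True, choices=TASKS,
                        help="learning task to run")
    parser.add_argument("--model", required=True,
                        help="model name, e.g. svm")
    parser.add_argument("--dataset", required=True,
                        help="dataset name, e.g. regression_small")
    return parser


def parse_args(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) into a namespace."""
    return build_parser().parse_args(argv)


# Equivalent to the documented command line.
args = parse_args(["--task", "regression", "--model", "svm",
                   "--dataset", "regression_small"])
```

Restricting `--task` with `choices` makes a mistyped task fail early with a usage message rather than partway through a run.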
📤 View Outputs
- Evaluation metrics: /outputs/evaluation_results.csv
- Visualisations: /outputs/plots/
- ILP rule output: /outputs/logic_output/
- Reinforcement learning logs: /outputs/reinforcement_logs/
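One way to inspect the evaluation metrics file is with the standard `csv` module, as sketched below. The column layout of `evaluation_results.csv` is not documented here, so the sample header and row are made-up placeholders, and `load_results` is a hypothetical helper.

```python
import csv
import io


def load_results(stream):
    """Return each CSV row as a dict keyed by the header row."""
    return list(csv.DictReader(stream))


# Placeholder content standing in for the real file; the actual columns may differ.
sample = io.StringIO("model,metric,value\nsvm,metric_name,0.0\n")
rows = load_results(sample)

# With the real file, something like this should work from the project root:
# with open("outputs/evaluation_results.csv", newline="") as f:
#     rows = load_results(f)
```

`DictReader` takes its keys from whatever header the file has, so the sketch does not depend on specific column names.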
🖼️ Visualisations
The pipeline outputs various visual aids:
- Confusion matrices
- Regression error plots
- Clustering dendrograms
- RL training reward curves
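To make the first item concrete, the sketch below tallies a confusion matrix in plain Python. It only illustrates what such a matrix contains. The project may well rely on scikit-learn's `confusion_matrix` and a plotting library instead, and the labels and predictions here are invented toy values.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true labels, columns predicted."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Toy labels for illustration only, not project results.
y_true = ["cat", "cat", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog"]
matrix = confusion_matrix(y_true, y_pred, labels=["cat", "dog"])
# matrix == [[1, 1], [0, 2]]
```

Each diagonal entry counts correct predictions for one class, and off-diagonal entries show which classes are confused with each other.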
📘 Tools & Technologies
- Python 3.x
- Libraries: scikit-learn, matplotlib, seaborn, gym, stable-baselines3
- SWI-Prolog for ILP (Aleph & FOIL)
👨‍💻 Team Members
- Ritwik Mishra
- Shivasmi Sharma
- Ishwari Niphade
- Arpit Mahapatra
- Suraj Borude
📌 Project Status
✅ In development. Submitted as coursework for the Machine Learning module, Spring 2025.
📄 License
This project is for educational use only.