COMM061 - NATURAL LANGUAGE PROCESSING - GROUP 43
Overview
This project deploys a Flask application that uses the SciBERT model for various NLP tasks. It includes a Jupyter notebook, app.ipynb, which is converted to a Python script during the CI/CD pipeline. The application is deployed through a CI/CD pipeline configured in GitLab.
Project Structure
.
├── app.ipynb # Jupyter notebook for the application
├── requirements.txt # List of dependencies
├── pipeline.sh # Shell script for building the project
├── .gitlab-ci.yml # GitLab CI/CD pipeline configuration
└── README.md # Project documentation
Requirements
- Python 3.8+
- Flask
- Gunicorn
- Jupyter
- nbconvert
- PyTorch
- NumPy
- Transformers
- Datasets
- Evaluate
- Seqeval
You can find all required packages listed in requirements.txt.
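As a quick sanity check after installation, the list above can be compared against what is actually installed. The following is a minimal sketch, not part of the project: the distribution names in REQUIRED are assumed PyPI names for the packages listed above (for example, torch for PyTorch), and requirements.txt remains the authoritative list.

```python
from importlib import metadata

# Assumed PyPI distribution names for the packages listed above;
# the project's requirements.txt is the authoritative source.
REQUIRED = [
    "flask", "gunicorn", "jupyter", "nbconvert", "torch",
    "numpy", "transformers", "datasets", "evaluate", "seqeval",
]


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are installed.")
```

Running the script inside the activated virtual environment prints any package that still needs to be installed.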
Setup
- Clone the repository:
    git clone <repository-url>
    cd <repository-directory>
- Set up a virtual environment and install dependencies:
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
- Run the application in the background (app.py is generated from app.ipynb; see Usage):
    nohup python3 app.py &
Usage
- Converting the Jupyter Notebook: Convert app.ipynb to a Python script:
    jupyter nbconvert --to script app.ipynb
- Running the Flask Application: Ensure that app.py is present in the project directory after conversion. Then, start the Flask application:
    python3 app.py
- Access the Application: By default, the Flask application runs on http://127.0.0.1:8080. Open this URL in your web browser to access the application.
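Besides opening the URL in a browser, reachability can be checked from a script. The sketch below uses only the standard library; the helper name is_app_up is hypothetical, and the address is the one given above, so adjust it if the application is configured differently.

```python
import urllib.error
import urllib.request

# Address taken from the text above; adjust if the app is configured differently.
APP_URL = "http://127.0.0.1:8080"


def is_app_up(url=APP_URL, timeout=2.0):
    """Return True if the server at `url` answers an HTTP request at all."""
    # Ignore any system proxy settings, since the target is a local address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just with an error status (e.g. 404 on "/").
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False
```

An HTTP error status still counts as "up" here because the sketch only tests whether something is listening, not whether a particular route exists.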
CI/CD Pipeline
The project includes a CI/CD pipeline configured in .gitlab-ci.yml to automate the build and deployment process. The pipeline has the following stages:
- Install:
  - Installs dependencies from requirements.txt.
- Test:
  - Runs tests using pytest.
- Build:
  - Converts the Jupyter notebook to a Python script.
  - Ensures the conversion was successful.
- Deploy:
  - Deploys the Flask application.
  - Ensures the application runs in the background.
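The Build stage's convert-and-verify step can also be reproduced locally from Python. This is a rough sketch rather than the pipeline's actual code: the function names are hypothetical, and it assumes the notebook uses a Python kernel, so that nbconvert writes app.py next to app.ipynb.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook):
    """Build the nbconvert invocation used to turn a notebook into a script."""
    return ["jupyter", "nbconvert", "--to", "script", str(notebook)]


def expected_script(notebook):
    """Path expected for a Python notebook's script (app.ipynb -> app.py)."""
    return Path(notebook).with_suffix(".py")


def convert_notebook(notebook="app.ipynb"):
    """Convert the notebook and fail loudly if the script did not appear."""
    # check=True raises CalledProcessError if nbconvert exits with an error.
    subprocess.run(nbconvert_command(notebook), check=True)
    script = expected_script(notebook)
    if not script.is_file():
        raise FileNotFoundError(f"Conversion failed: {script} was not created")
    return script
```

Calling convert_notebook() from the project directory returns the path of the generated script, or raises if either the conversion or the file check fails.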
Pipeline Configuration
Here is a brief overview of the .gitlab-ci.yml configuration:
stages:
  - install
  - test
  - build
  - deploy

variables:
  VIRTUAL_ENV: ".venv"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - .cache/pip

# Runs before every job: create and activate the virtual environment.
before_script:
  - python3 -m venv $VIRTUAL_ENV
  - source $VIRTUAL_ENV/bin/activate
  - pip install -r requirements.txt

install:
  stage: install
  script:
    - echo "Installing dependencies..."
    - pip install -r requirements.txt
  artifacts:
    paths:
      - $VIRTUAL_ENV

test:
  stage: test
  script:
    - echo "Running tests..."
    # Write the JUnit report that the artifacts section below expects.
    - pytest tests/ --junitxml=junit.xml
  artifacts:
    when: always
    reports:
      junit: junit.xml

build:
  stage: build
  script:
    - echo "Building the project..."
    - jupyter nbconvert --to script app.ipynb
    # Fail the job if nbconvert did not produce app.py.
    - if [ ! -f app.py ]; then echo "Conversion failed, exiting pipeline"; exit 1; fi
    - echo "Build completed successfully."
  artifacts:
    paths:
      - app.py

deploy:
  stage: deploy
  script:
    - echo "Deploying the application..."
    - nohup python3 app.py &
    - sleep 5
    - echo "Application should have started in the background."
  # Deploy only from the main branch.
  only:
    - main
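The deploy job starts the application and then sleeps for a fixed five seconds, which does not confirm that the server actually came up. One possible alternative, sketched here under assumptions, is to poll the port until it accepts connections. The function names are hypothetical, and the host and port are taken from the Usage section.

```python
import socket
import subprocess
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet; wait briefly and try again.
            time.sleep(interval)
    return False


def start_app(command, host="127.0.0.1", port=8080, timeout=30.0):
    """Start `command` in the background and wait for it to accept connections."""
    process = subprocess.Popen(command)
    if not wait_for_port(host, port, timeout=timeout):
        process.terminate()
        raise RuntimeError(f"Nothing is listening on {host}:{port} after {timeout}s")
    return process
```

For example, start_app(["python3", "app.py"]) would return the running process once port 8080 is reachable and raise otherwise. The sketch does not detect a process that exits early; it simply waits until the timeout.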
License
This project is licensed under the MIT License - see the LICENSE file for details.