COMM061 - NATURAL LANGUAGE PROCESSING - GROUP 43

Overview

This project deploys a Flask application that uses the SciBERT model for NLP tasks. The application code lives in a Jupyter notebook, app.ipynb, which the GitLab CI/CD pipeline converts to a Python script (app.py) and then deploys.

Project Structure

.
├── app.ipynb                # Jupyter notebook for the application
├── requirements.txt         # List of dependencies
├── pipeline.sh              # Shell script for building the project
├── .gitlab-ci.yml           # GitLab CI/CD pipeline configuration
└── README.md                # Project documentation

Requirements

  • Python 3.8+
  • Flask
  • Gunicorn
  • Jupyter
  • nbconvert
  • PyTorch
  • NumPy
  • Transformers
  • Datasets
  • Evaluate
  • Seqeval

All required packages are listed in requirements.txt.
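
After installing, it can help to confirm that the packages are visible to the active interpreter. The following is a minimal sketch, not part of the project: the function name `missing_modules` is hypothetical, and the import names are assumptions based on the list above (PyTorch is imported as `torch`). Jupyter is left out because it is used here as a command-line tool, and nbconvert covers the conversion step.

```python
import importlib.util

# Assumed import names for the packages in the Requirements list.
REQUIRED_MODULES = [
    "flask", "gunicorn", "nbconvert", "torch", "numpy",
    "transformers", "datasets", "evaluate", "seqeval",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as PyTorch.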

Setup

  1. Clone the repository:

    git clone <repository-url>
    cd <repository-directory>
  2. Set up a virtual environment and install dependencies:

    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
  3. Run the application in the background (app.py is generated from app.ipynb; see Usage, step 1):

    nohup python3 app.py &
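
For environments where launching from Python is more convenient than the shell, the background start in step 3 can be approximated as follows. This is a sketch under assumptions: `start_in_background` and the default log file name `app.log` are hypothetical, and `start_new_session` is only available on POSIX systems.

```python
import subprocess


def start_in_background(command, log_path="app.log"):
    """Start `command` in its own session and append its output to `log_path`."""
    log_file = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdout=log_file,
        stderr=subprocess.STDOUT,   # keep errors in the same log
        stdin=subprocess.DEVNULL,
        start_new_session=True,     # POSIX only: detach from the controlling terminal
    )
    log_file.close()  # the child process keeps its own copy of the descriptor
    return process
```

Calling `start_in_background(["python3", "app.py"])` returns a `Popen` object whose `pid` can be recorded so that the process can be stopped later.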

Usage

  1. Converting the Jupyter Notebook:

    Convert app.ipynb to a Python script:

    jupyter nbconvert --to script app.ipynb
  2. Running the Flask Application:

    Ensure that app.py is present in the project directory after conversion. Then, start the Flask application:

    python3 app.py
  3. Access the Application:

    By default, the Flask application runs on http://127.0.0.1:8080. Open this URL in your web browser to access the application.
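
To illustrate what the conversion in step 1 produces, the sketch below extracts the code cells of a notebook using only the standard library. It is not how nbconvert is implemented and is not a replacement for it; the function name `notebook_to_script` is hypothetical, and the notebook is assumed to use the nbformat 4 layout (a top-level `cells` list).

```python
import json


def notebook_to_script(notebook_json):
    """Join the source of every code cell in a notebook into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The source is stored either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    # Separate cells with a blank line and end the file with a newline.
    return "\n\n".join(chunks) + "\n"
```

The point of the sketch is that app.py is simply the notebook's code cells in order, which is why any cell that only makes sense interactively also ends up in the deployed script.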

CI/CD Pipeline

The project includes a CI/CD pipeline configured in .gitlab-ci.yml to automate the build and deployment process. The pipeline has the following stages:

  1. Install:

    • Installs dependencies from requirements.txt.
  2. Test:

    • Runs tests using pytest.
  3. Build:

    • Converts the Jupyter notebook to a Python script.
    • Fails the pipeline if app.py was not produced.
  4. Deploy:

    • Deploys the Flask application.
    • Starts the application in the background with nohup.
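
The Build stage only checks that app.py exists. A slightly stricter check, sketched below, also confirms that the generated file parses as Python. The helper name `check_converted_script` is hypothetical and the pipeline does not currently include it.

```python
import ast
from pathlib import Path


def check_converted_script(path="app.py"):
    """Return (ok, message) for whether `path` exists and parses as Python."""
    script = Path(path)
    if not script.is_file():
        return False, f"{path} not found: conversion failed"
    try:
        # Parse only; nothing in the script is executed.
        ast.parse(script.read_text(encoding="utf-8"), filename=str(script))
    except SyntaxError as exc:
        return False, f"{path} has a syntax error on line {exc.lineno}: {exc.msg}"
    return True, f"{path} exists and parses cleanly"
```

A build job could call this function and exit with a non-zero status when the first element of the result is `False`, so that a broken conversion stops the pipeline before the Deploy stage.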

Pipeline Configuration

The .gitlab-ci.yml configuration is shown below:

stages:
  - install
  - test
  - build
  - deploy

variables:
  VIRTUAL_ENV: ".venv"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - .cache/pip

before_script:
  - python3 -m venv $VIRTUAL_ENV
  - source $VIRTUAL_ENV/bin/activate
  - pip install -r requirements.txt

install:
  stage: install
  script:
    - echo "Installing dependencies..."
    - pip install -r requirements.txt
  artifacts:
    paths:
      - $VIRTUAL_ENV

test:
  stage: test
  script:
    - echo "Running tests..."
    - pytest tests/ --junitxml=junit.xml
  artifacts:
    when: always
    reports:
      junit: junit.xml

build:
  stage: build
  script:
    - echo "Building the project..."
    - jupyter nbconvert --to script app.ipynb
    - if [ ! -f app.py ]; then echo "Conversion failed, exiting pipeline"; exit 1; fi
    - echo "Build completed successfully."
  artifacts:
    paths:
      - app.py

deploy:
  stage: deploy
  script:
    - echo "Deploying the application..."
    - nohup python3 app.py &
    - sleep 5
    - echo "Application should have started in the background."
  only:
    - main
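
The deploy job above waits a fixed five seconds and then assumes the application has started. A hedged alternative is to poll the application's URL until it answers. The sketch below is an assumption about how such a check could look, not part of the existing pipeline; `wait_until_ready` is a hypothetical name, and the default URL is the address given in the Usage section.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://127.0.0.1:8080", timeout=30.0, interval=1.0):
    """Poll `url` until the server answers or `timeout` seconds elapse."""
    # Bypass any configured proxy, since the target is a local address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=interval):
                return True  # the server returned a successful response
        except urllib.error.HTTPError:
            return True  # the server is up, even if this route returns an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # not accepting connections yet; retry
    return False
```

A deploy script could call `wait_until_ready()` after starting the application and fail the job when it returns `False`, which reports a failed start instead of printing that the application "should have started".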

License

This project is licensed under the MIT License - see the LICENSE file for details.