Commit c00d6590 authored by Wu, Jiantao (PG/R - Comp Sci & Elec Eng)

change batches_ahead for Loader

parent 4dfeb51d
## Classification
### Finetune Classification
To evaluate the fine-tuning stage of a pretrained model, run:
```
torchrun --master_port=29501 --nproc_per_node=2 evaluation/eval_cls.py --pretrained_weights <weight> --data_location <data_path> --data_set <data_set> --output_dir <output_dir> --epochs=1000 --arch vit_base --batch_size 64 --layer_decay 0
```
### Available datasets
The argument `--data_set` can be one of the following:
- IN1K
- ominiglot
- STL
- CIFAR10
- CIFAR100
- Cars
- Pets
- Aircraft
- Flowers
- Folder
Use one of the following settings:
- data_set: Cars, Pets, Aircraft, CIFAR10/100, ImageFolder (train/validation).
- epochs: 200 for IN100, 100 for IN1K, 1000 for other small datasets.
- head_type: 0 for CLS token only (DINO, iBOT), 1 for mean patch tokens (MAE, BEiT), 2 for concatenating the CLS token and mean patch tokens (dSiT)
- layer_decay: 0 for small datasets, 0.75 for large datasets (IN1K).
- arch: vit_small, vit_base, ...
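
The three `head_type` options can be illustrated with a small dependency-free sketch. It assumes that the first output token is the CLS token and the remaining tokens are patch tokens; the function name `pool_tokens` is hypothetical and is not part of the repository.

```python
from typing import List

Vector = List[float]


def pool_tokens(tokens: List[Vector], head_type: int) -> Vector:
    """Pool ViT output tokens into a single feature vector.

    Assumption: tokens[0] is the CLS token, tokens[1:] are patch tokens.
    """
    cls_token = list(tokens[0])
    if head_type == 0:
        # CLS token only (e.g. DINO, iBOT)
        return cls_token

    patches = tokens[1:]
    if not patches:
        raise ValueError("mean pooling needs at least one patch token")
    dim = len(cls_token)
    # Element-wise mean over all patch tokens
    mean_patch = [sum(p[i] for p in patches) / len(patches) for i in range(dim)]

    if head_type == 1:
        # Mean of patch tokens (e.g. MAE, BEiT)
        return mean_patch
    if head_type == 2:
        # Concatenation of CLS token and mean patch token
        return cls_token + mean_patch
    raise ValueError(f"unknown head_type: {head_type}")
```

Note that with `head_type=2` the feature dimension doubles, so a classifier head built on top of it would need twice as many input features.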
### Finetune: `eval_cls.py` `eval_cls_ffcv.py`
Example:
```
WANDB_PROJECT=SiT WANDB_NAME=threeaug torchrun --master_port=29501 --nproc_per_node=2 evaluation/eval_cls.py --pretrained_weights ../related_work/SiT/outputs/imagenet/vit_base/checkpoint.pth --data_location ../data --data_set Cars --output_dir ../related_work/SiT/outputs/imagenet/vit_base/eval/finetune_ibot2-cars --epochs=1000 --arch vit_base --batch_size 64 --layer_decay 0 --ThreeAugment --weight_decay 0.02
```
We reproduced the results of MAE on ImageNet. The results are as follows:
| **ImageNet Accuracy** | ViT-Base | ViT-Large | ViT-Huge |
|-----------------------|----------|-----------|----------|
| MAE repo | 83.664 | 85.952 | 86.928 |
| Our repo | | | |
| Our repo (dres) | 83.302 | | |
ImageNet on 4 GPUs:
```
--batch_size 64 --layer_decay 0.75 --weight_decay 0.05 --head_type=2
```
Training time is ~7h11m on 32 V100 GPUs for the MAE repo.
### k-NN Classification
To evaluate k-NN classification on the frozen features, run:
To launch the evaluation, use `vitrun` or `submitit`. For example, to finetune a pre-trained model on ImageNet, run:
```
python -m torch.distributed.launch --master_port=29501 --nproc_per_node=2 evaluation/eval_knn.py --pretrained_weights <weight> --data_location <data_path> --data_set <data_set> --output_dir <output_dir> --head_type <> --dis_fn <cosine/euclidean>
submitit --module vitookit.evaluation.eval_cls_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_100.ffcv --fast_dir /raid/local_scratch/jxw30-hxc19/ --gin VisionTransformer.global_pool='"avg"' --blr 5e-4 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.1 --checkpoint_key=model -w ~/models/mae_pretrain_vit_base.pth
```
Use one of the following settings:
- head_type: 0 for CLS token only (DINO, iBOT), 1 for mean patch tokens (MAE, BEiT), 2 for concatenating the CLS token and mean patch tokens (dSiT)
- dis_fn: cosine is always better than euclidean
Here the effective batch size is 128 (batch_size per gpu) * 8 (gpus per node) = 1024.
### Linear Probing on ImageNet
To train a single classifier on frozen weights with customized learning rate, run:
```
torchrun evaluation/eval_linear.py --batch_size 512 --blr 0.1 --weight_decay 0.0 --accum_iter=4 --arch vit_small --data_location=<>
**dres** Dynamic Resolution for Efficient Supervised Learning.
```bash
vitrun --nproc_per_node=8 eval_cls_ffcv.py --train_path <> --val_path <> -w ~/models/mae_pretrain_vit_base.pth --checkpoint_key=model --layer_decay=0.65 --gin VisionTransformer.global_pool='"avg"' DynamicResolution.start_ramp=0 DynamicResolution.end_ramp=60 DynamicResolution.scheme=1 --dynamic_resolution
```
We follow the MAE recipe to train the linear classifier. Note that:
### Linear Probing
We follow the MAE recipe to train the linear classifier. Note that:
- The effective batch size is 16384 = 512 (batch_size per gpu) * 1 (nodes) * 8 (gpus per node) * 4 (accum_iter).
- The actual `lr` is computed as `lr` = `blr` * effective batch size / 256.
- Training time is ~2h20m for 90 epochs on 32 V100 GPUs.
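
The arithmetic in the notes above can be checked with a short sketch. The helper names `effective_batch_size` and `scaled_lr` are hypothetical and only illustrate the scaling rule; they are not functions from the repository.

```python
def effective_batch_size(batch_size: int, nodes: int, gpus_per_node: int, accum_iter: int) -> int:
    """Per-GPU batch size multiplied by the number of GPUs and accumulation steps."""
    return batch_size * nodes * gpus_per_node * accum_iter


def scaled_lr(blr: float, eff_batch_size: int) -> float:
    """Linear scaling rule: lr = blr * effective batch size / 256."""
    return blr * eff_batch_size / 256


# Values taken from the linear probing command and notes above.
eff = effective_batch_size(batch_size=512, nodes=1, gpus_per_node=8, accum_iter=4)
lr = scaled_lr(blr=0.1, eff_batch_size=eff)
print(eff, lr)  # effective batch size 16384, lr of about 6.4
```

Assuming a single node with 8 GPUs, `--batch_size=128 --accum_iter=16` gives the same effective batch size of 16384, so the resulting `lr` is unchanged.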
......@@ -48,37 +47,19 @@ Reference results for [MAE in linear probing](https://github.com/facebookresearc
| | ViT-Base | ViT-Large | ViT-Huge |
|:------------------:|:--------:|:---------:|:--------:|
| paper (TF/TPU) | 68.0 | 75.8 | 76.6 |
| this repo (PT/GPU) | 67.8 | 76.0 | 77.2 |
| MAE repo (PT/GPU) | 67.8 | 76.0 | 77.2 |
| Our repo (PT/GPU) | 67.8 | 76.0 | 77.2 |
### Fine-Tuning on ImageNet
To fine-tune the pre-trained model, we apply layerwise decay and sweep the learning rate.
To train ViT-S/16 with 200 epochs, run:
```
./run.sh imagenet_cls $JOB_NAME vit_small teacher 8 \
--epochs 200 \
--drop_path 0.1 \
--layer_decay 0.75
```
To train ViT-B/16 with 100 epochs, run:
To train a single classifier on frozen weights, run:
```
./run.sh imagenet_cls $JOB_NAME vit_base teacher 8 \
--epochs 100 \
--drop_path 0.2 \
--layer_decay 0.65
submitit --module vitookit.evaluation.eval_linear_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv -w ~/models/mae_pretrain_vit_base.pth --checkpoint_key=model --gin VisionTransformer.global_pool='"avg"' --fast_dir /raid/local_scratch/jxw30-hxc19/ --batch_size=128 --accum_iter=16 --blr=0.1
```
To train ViT-L/16 with 50 epochs, run:
### k-NN Classification
To evaluate k-NN classification on the frozen features, run:
```
./run.sh imagenet_cls $JOB_NAME vit_large teacher 8 \
--epochs 50 \
--drop_path 0.4 \
--layer_decay 0.75 \
--batch_size 64 \
--enable_deepspeed \
--warmup_epochs 5 \
--update_freq 2
python -m torch.distributed.launch --master_port=29501 --nproc_per_node=2 evaluation/eval_knn.py --pretrained_weights <weight> --data_location <data_path> --data_set <data_set> --output_dir <output_dir> --head_type <> --dis_fn <cosine/euclidean>
```
### Unsupervised Classification on ImageNet
......@@ -227,4 +208,5 @@ for m in model_dir:
cmd = f"python evaluation/eval_fewshot_cls.py --data_location ../data --arch=vit_base -w {m}/checkpoint.pth --output_dir={m}/eval/fewshot --data_set={ds}"
print(cmd)
os.system(cmd)
```
\ No newline at end of file
```
......@@ -7,7 +7,49 @@ Install the package by
pip install git+https://gitlab.surrey.ac.uk/jw02425/vitoolkit.git
```
## Evaluation
# Run on HPC
See the available evaluations in [evaluation protocols](EVALUATION.md).
## commands
```bash
vitrun train_cls.py --data_location=../data/IMNET --gin VisionTransformer.global_pool='"avg"' -w wandb:dlib/EfficientSSL/lsx2qmys
```
## condor
```bash
condor_submit condor/eval_weka_cls.submit model_dir=outputs/dinosara/base ARCH=vit_base
```
## Slurm
```text
usage: submitit for evaluation [-h] [--module MODULE] [--ngpus NGPUS] [--nodes NODES] [-t TIMEOUT] [--mem MEM] [--partition PARTITION] [--comment COMMENT] [--job_dir JOB_DIR] [--fast_dir FAST_DIR]
options:
-h, --help show this help message and exit
--module MODULE Module to run
--ngpus NGPUS Number of gpus to request on each node
--nodes NODES Number of nodes to request
-t TIMEOUT, --timeout TIMEOUT
Duration of the job
--mem MEM Memory to request
--partition PARTITION
Partition where to submit
--comment COMMENT Comment to pass to scheduler
--job_dir JOB_DIR
--fast_dir FAST_DIR The dictory of fast disk to load the datasets
```
Dataset files are moved to **FAST_DIR**, the fast-disk directory from which the datasets are loaded. For example, to finetune a pre-trained model on ImageNet, run:
```bash
submitit --module vitookit.evaluation.eval_cls_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv --gin VisionTransformer.global_pool='"avg"' -w wandb:dlib/EfficientSSL/lsx2qmys
```
# Evaluation
There are many protocols for evaluating the performance of a model. We provide a set of evaluation scripts for different tasks. Use `vitrun` to launch the evaluation.
......@@ -36,32 +78,6 @@ The pretrained weights can be one of the following:
You can further specify the *key* and *prefix* to extract the weights from a checkpoint file. For example, `--pretrained_weights=ckpt.pth --checkpoint_key model --prefix module.` will extract the state dict from the key "model" in the checkpoint file and remove its prefix "module." in the keys.
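
The key and prefix handling described above can be sketched as follows. This is a minimal illustration operating on plain dictionaries, not the repository's loading code; the function name `extract_state_dict` is hypothetical, and in practice the checkpoint would typically come from something like `torch.load(path, map_location="cpu")`.

```python
from typing import Any, Dict, Optional


def extract_state_dict(
    checkpoint: Dict[str, Any],
    checkpoint_key: Optional[str] = None,
    prefix: Optional[str] = None,
) -> Dict[str, Any]:
    """Select a nested state dict by key and strip a leading prefix from its keys."""
    # Pick the nested state dict, e.g. checkpoint["model"]; otherwise use the whole dict.
    state = checkpoint[checkpoint_key] if checkpoint_key else checkpoint
    if not prefix:
        return dict(state)
    # Remove the prefix (e.g. "module.") only from keys that start with it.
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state.items()
    }


# Toy checkpoint standing in for a real file; the values would normally be tensors.
ckpt = {"model": {"module.patch_embed.weight": 1, "module.head.bias": 2}, "epoch": 10}
print(extract_state_dict(ckpt, checkpoint_key="model", prefix="module."))
# {'patch_embed.weight': 1, 'head.bias': 2}
```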
# HPC
## commands
[training commands](evaluation/README.md)
## condor
```bash
condor_submit condor/eval_weka_cls.submit model_dir=outputs/dinosara/base ARCH=vit_base
condor_submit condor/eval_weka_seg.submit model_dir=outputs/dinosara/base
```
## Slurm
```bash
bin/submitit --module vitookit.evaluation.eval_cls_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv --gin VisionTransformer.global_pool='\"avg\"' -w wandb:dlib/EfficientSSL/lsx2qmys
```
## Test examples
We provide some simple examples.
<p float="center">
<img src="imgs/sample1.JPEG" width="32%" />
<img src="imgs/sample2.JPEG" width="32%" />
<img src="imgs/sample3.jpg" width="32%" />
</p>
## cluster
......
......@@ -142,8 +142,8 @@ def main():
slurm_signal_delay_s=120,
**kwargs
)
executor.update_parameters(name="eval")
evaluation = args.module.split(".")[-1]
executor.update_parameters(name=evaluation)
args.dist_url = get_init_file(args.job_dir).as_uri()
print("args:", args)
trainer = Trainer(args)
......
......@@ -20,9 +20,19 @@ pack_path = pkg_resources.get_distribution('vitookit').location
import re
import sys, os
from torch.distributed.run import main
from torch.distributed.run import parse_args, config_from_args, elastic_launch, uuid
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.argv[1] = os.path.join(pack_path,'vitookit','evaluation',sys.argv[1])
print(sys.argv)
sys.exit(main())
args = parse_args(None)
if args.standalone:
args.rdzv_backend = "c10d"
args.rdzv_endpoint = "localhost:0"
args.rdzv_id = str(uuid.uuid4())
config, cmd, cmd_args = config_from_args(args)
cmd_args[1] = os.path.join(pack_path, 'vitookit', 'evaluation', cmd_args[1])
# print(cmd, cmd_args)
elastic_launch(
config=config,
entrypoint=cmd,
)(*cmd_args)
......@@ -232,13 +232,13 @@ def main(args):
order = OrderOption.RANDOM if args.distributed else OrderOption.QUASI_RANDOM
data_loader_train = Loader(args.train_path, pipelines=ThreeAugmentPipeline(),
data_loader_train = Loader(args.train_path, pipelines=ThreeAugmentPipeline(),batches_ahead=1,
batch_size=args.batch_size, num_workers=args.num_workers,
order=order, distributed=args.distributed,seed=args.seed)
data_loader_val = Loader(args.val_path, pipelines=ValPipeline(),
batch_size=args.batch_size, num_workers=args.num_workers,
batch_size=args.batch_size, num_workers=args.num_workers, batches_ahead=1,
distributed=args.distributed,seed=args.seed)
mixup_fn = None
......
......@@ -39,21 +39,18 @@ from ffcv.loader import OrderOption
def get_args_parser():
parser = argparse.ArgumentParser('MAE linear probing for image classification', add_help=False)
parser.add_argument('--batch_size', default=512, type=int,
parser.add_argument('--batch_size', default=128, type=int,
help='Batch size per GPU (effective batch size is batch_size * accum_iter * # gpus')
parser.add_argument('--epochs', default=90, type=int)
parser.add_argument('--ckpt_freq', default=5, type=int)
parser.add_argument('--accum_iter', default=1, type=int,
parser.add_argument('--accum_iter', default=16, type=int,
help='Accumulate gradient iterations (for increasing the effective batch size under memory constraints)')
# Model parameters
parser.add_argument("--compile", action='store_true', default=False, help="compile model with PyTorch 2.0")
parser.add_argument("--checkpoint_key", default=None, type=str, help="checkpoint key to load")
parser.add_argument("--prefix", default=None, type=str, help="prefix of the model name")
parser.add_argument('--head_type', default=0, choices=[0, 1 ,2], type=int,
help="""How to aggregate global information.
We typically set this to 0 for models with [CLS] token (e.g., DINO), 1 for models encouraging patch semantics e.g. BEiT, 2 for combining mean pool and CLS. 2 works well for all cases. """)
parser.add_argument('--input_size', default=224, type=int,
help='images input size')
......@@ -138,7 +135,7 @@ def main(args):
cudnn.benchmark = True
data_loader_val = Loader(args.val_path, pipelines=ValPipeline(),
batch_size=args.batch_size, num_workers=args.num_workers,
batch_size=args.batch_size, num_workers=args.num_workers, batches_ahead=1,
distributed=args.distributed,seed=args.seed)
global_rank = misc.get_rank()
......@@ -234,7 +231,7 @@ def main(args):
order = OrderOption.RANDOM if args.distributed else OrderOption.QUASI_RANDOM
data_loader_train = Loader(args.train_path, pipelines=SimplePipeline(),
batch_size=args.batch_size, num_workers=args.num_workers,
batch_size=args.batch_size, num_workers=args.num_workers, batches_ahead=1,
order=order, distributed=args.distributed,seed=args.seed)
for epoch in range(args.start_epoch, args.epochs):
......