From be8a78ab4621c3dc0c8596729f081bdcc9a1b0de Mon Sep 17 00:00:00 2001
From: "Wu, Jiantao (PG/R - Comp Sci & Elec Eng)" <jiantao.wu@surrey.ac.uk>
Date: Wed, 14 Feb 2024 11:56:25 +0000
Subject: [PATCH] Update evaluation scripts and configurations

---
 EVALUATION.md | 63 ++++++++++++++-------------------------------------
 README.MD     |  2 +-
 bin/submitit  |  2 +-
 3 files changed, 19 insertions(+), 48 deletions(-)

diff --git a/EVALUATION.md b/EVALUATION.md
index d09947e..9ed4f0f 100644
--- a/EVALUATION.md
+++ b/EVALUATION.md
@@ -53,14 +53,11 @@ Reference results for [MAE in linear probing](https://github.com/facebookresearc

 To train a single classifier on frozen weights, run:
 ```
-submitit --module vitookit.evaluation.eval_linear_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv -w ~/models/mae_pretrain_vit_base.pth --checkpoint_key=model --gin VisionTransformer.global_pool='"avg"' --fast_dir /raid/local_scratch/jxw30-hxc19/ --batch_size=128 --accum_iter=16 --blr=0.1
+submitit --module vitookit.evaluation.eval_linear_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv -w ~/models/mae_pretrain_vit_base.pth --checkpoint_key=model --gin VisionTransformer.global_pool='"avg"' --fast_dir /raid/local_scratch/jxw30-hxc19/ --batch_size=128 --accum_iter=16 --blr=0.05
 ```
+Effective batch size is 16384 = 128 (batch_size per gpu) * 16 (accum_iter) * 8. Learning rate is 3.2 = 0.05 * 16384 / 256. 
+

-### k-NN Classification
-To evaluate k-NN classification on the frozen features, run:
-```
-python -m torch.distributed.launch --master_port=29501 --nproc_per_node=2 evaluation/eval_knn.py --pretrained_weights <weight> --data_location <data_path> --data_set <data_set> --output_dir <output_dir> --head_type <> --dis_fn <cosine/euclidean>
-```

 ### Unsupervised Classification on ImageNet
 To evaluate for unsupervised classification, run:
@@ -80,6 +77,8 @@ For semi-supervsied classification, we use the data split defined in SimCLRv2, s
     --finetune_head_layer 1
 ```

+## Dense Prediction
+
 ### Object Detection and Instance Segmentation on COCO

 To train ViT-S/16 with Cascaded Mask R-CNN as the task layer, run:
@@ -124,28 +123,29 @@ To test ViT-B/16 run:
 torchrun --nproc_per_node=4 evaluation/semantic_segmentation/test.py evaluation/semantic_segmentation/configs/linear/vit_base_512_ade20k_160k.py <ckpt path>/iter_40000.pth --launcher pytorch --eval mIoU --out < file path to write the results>
 ```

-### Transfer Learning on Smaller Datasets
+### linear prob on COCO
+
+``CUDA_VISIBLE_DEVICES=1 MODEL_DIR= screen bash evaluation/awesome-semantic-segmentation-pytorch/scripts/linear_coco_dist.sh``

-For historical issues and reproductivity, we use the default default fine-tuning recipe (i.e., w/o layerwise decay, a smaller learing rate, and a longer training scheduler) proposed in DEiT. 
-The default configuration in [run.sh](https://github.com/bytedance/ibot/blob/main/run.sh) is for ViT-B/16, and just one-line command is easy to go:
+## Nearest Neighbor Retrieval
+
+### k-NN Classification
+To evaluate k-NN classification on the frozen features, run:
 ```
-./run.sh cifar10_cls+cifar_cls+cars_cls+flwrs_cls+inat_cls+inat19_cls $JOB_NAME vit_base teacher 8
+vitrun --nproc_per_node=2 eval_knn.py --pretrained_weights <weight> --data_location <data_path> --data_set <data_set> --output_dir <output_dir> --dis_fn <cosine/euclidean>
 ```

-Note: ViT-S/16 shares most of the configuration, except that we set the `--lr` as 5e-5 for INAT18 dataset and 2.5e-5 for INAT19 dataset.
-
-### Nearest Neighbor Retrieval

 Nerest neighbor retrieval is considered using the frozen pre-trained features following the evaluation protocol as DINO. We consider three settings:

-**Video Object Segmentation on DAVIS**
+### Video Object Segmentation on DAVIS
 ```
 model_dir=<>
 python evaluation/eval_video_segmentation.py --pretrained_weights=${model_dir}/weights.pth --arch=vit_base --data_location=../data/DAVIS --output_dir=${model_dir}/eval_seg/video_knn-davis &&\
 python evaluation/davis2017-evaluation/evaluation_method.py --task semi-supervised --davis_path ../data/DAVIS --results_path ${model_dir}/eval_seg/video_knn-davis
 ```

-**Image Retrieval on Paris and Oxford**
+### Image Retrieval on Paris and Oxford
 ```
 python evaluation/eval_image_retrieval.py --data_location ../data/revisited_paris_oxford/ --arch=vit_base -w outputs/models/SiT/checkpoint.pth --data_set=roxford5k
 ```
@@ -176,37 +176,8 @@ for m in model_dir:



-**Copy Detection on Copydays**
+### Copy Detection on Copydays
 ```
 ./run.sh copydays_copydey $JOB_NAME vit_small teacher 1 \
     --data_location data/copydays
-```
-
-
-## linear prob on COCO
-
-``CUDA_VISIBLE_DEVICES=1 MODEL_DIR= screen bash evaluation/awesome-semantic-segmentation-pytorch/scripts/linear_coco_dist.sh``
-
-## projection
-
-## Fewshot
-
-```bash
-python evaluation/eval_fewshot_cls.py --pretrained_weights <weight> --data_location <data_path> --output_dir <output_dir>
-```
-
-batch evaluation:
-```Python
-model_dir= [
-'outputs/models/MMC',
-'outputs/models/mae-base',
-'outputs/models/SiT','outputs/models/iBOT-ViT_B','outputs/models/MSN-ViT_B','outputs/models/DINO-ViT_B','outputs/models/sit-ViT_B','outputs/models/MC_SSL-ViT_B'
-]
-
-for m in model_dir:
-    for ds in ['ominiglot']:
-        cmd = f"python evaluation/eval_fewshot_cls.py --data_location ../data --arch=vit_base -w {m}/checkpoint.pth --output_dir={m}/eval/fewshot --data_set={ds}"
-        print(cmd)
-        os.system(cmd)
-```
-
+```
\ No newline at end of file
diff --git a/README.MD b/README.MD
index ef98acb..e23fd4f 100644
--- a/README.MD
+++ b/README.MD
@@ -46,7 +46,7 @@ options:

 We move files to **FAST_DIR**. For example, to finetune a pre-trained model on ImageNet, run:
 ```bash
-submitit --module vitookit.evaluation.eval_cls_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv --gin VisionTransformer.global_pool='"avg"' -w wandb:dlib/EfficientSSL/lsx2qmys
+submitit --module vitookit.evaluation.eval_cls_ffcv --train_path ~/data/ffcv/IN1K_train_500_95.ffcv --val_path ~/data/ffcv/IN1K_val_500_95.ffcv --fast_dir /raid/local_scratch/jxw30-hxc19/ --gin VisionTransformer.global_pool='"avg"' -w wandb:dlib/EfficientSSL/lsx2qmys
 ```

 # Evaluation
diff --git a/bin/submitit b/bin/submitit
index 6673e68..15a5042 100755
--- a/bin/submitit
+++ b/bin/submitit
@@ -70,7 +70,7 @@ class Trainer(object):

         # move the dataset to fast_dir
         fast_dir = self.args.fast_dir
-        if fast_dir:
+        if fast_dir and self.module_args.rank==0:
             import shutil
             for key,value in self.module_args.__dict__.items():
                 if isinstance(value,str) and '.ffcv' in value:
-- 
GitLab