How to Create an Animation of the Embeddings During Fine-Tuning

The animation received more than 200,000 impressions. It was well received, and many readers expressed interest in how it was made. This article is intended to help those readers and anyone else who wants to create similar visualizations.

The goal of this post is to offer a thorough tutorial on how to create such an animation, covering all the necessary steps: fine-tuning, embedding generation, outlier detection, PCA, Procrustes analysis, review, and animation creation.

Preparation: Fine-tuning

The first step is to fine-tune a pre-trained Vision Transformer (ViT) model. For this, we use the CIFAR-10 dataset, which comprises 60,000 images divided into ten classes: airplanes, automobiles, birds, cats, deer, dogs, frogs, horses, ships, and trucks.

To carry out the fine-tuning on CIFAR-10, follow the steps in the Hugging Face tutorial for image classification with Transformers. We also use a TrainerCallback to write the loss values logged during training to a CSV file for later use in the animation.

from transformers import TrainerCallback


class PrinterCallback(TrainerCallback):
    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs is None:
            return
        # Drop the FLOPs counter so that only the training metrics remain.
        _ = logs.pop("total_flos", None)
        if state.is_local_process_zero:
            # Only regular training entries are written to the file.
            if len(logs) == 3:  # skip last row
                with open("log.csv", "a") as f:
                    f.write(",".join(map(str, logs.values())) + "\n")

To ensure enough checkpoints for the animation, it is important to save checkpoints frequently by setting save_strategy="steps" and a low value for save_steps in TrainingArguments. Each frame of the animation corresponds to a different checkpoint. During training, a folder is created for every checkpoint, and the CSV file is written alongside so that it is ready for later use.
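
Because each frame is tied to one checkpoint, the checkpoint folders have to be processed in training order. The sketch below assumes the Trainer's default folder naming, checkpoint-<step>, inside an output directory of your choice; the helper name sorted_checkpoints is introduced here for illustration. A plain alphabetical sort would place checkpoint-1000 before checkpoint-200, so the step number is parsed and used as the sort key.

```python
import re
from pathlib import Path


def sorted_checkpoints(output_dir):
    """Return checkpoint folders in output_dir ordered by training step.

    Assumes folders are named "checkpoint-<step>" (the Trainer default).
    """
    pattern = re.compile(r"^checkpoint-(\d+)$")
    found = []
    for path in Path(output_dir).iterdir():
        match = pattern.match(path.name)
        if path.is_dir() and match:
            found.append((int(match.group(1)), path))
    # Sort numerically by step, not alphabetically by folder name.
    return [path for _, path in sorted(found)]
```

Other files and folders in the output directory, such as logs, are ignored because they do not match the naming pattern.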

Embeddings Creation

To create embeddings from the test split of the CIFAR-10 dataset for each model checkpoint, we use the AutoFeatureExtractor and AutoModel classes from the Transformers library.

Each embedding represents one of the 10,000 test pictures for a single model checkpoint as a 768-dimensional vector. These embeddings can be kept in the same folder as the checkpoints to keep track of everything.
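
The embedding step for a single checkpoint can be sketched as follows. To stay independent of a particular model, the sketch takes a hypothetical embed_batch callable that maps a batch of images to an array of shape (batch, 768); in practice it would wrap the feature extractor and the model loaded from the checkpoint folder. The function name, the batch size, and the file name embeddings.npy are assumptions made for illustration.

```python
from pathlib import Path

import numpy as np


def embed_and_save(images, embed_batch, checkpoint_dir, batch_size=64):
    """Embed images in batches and store the result next to the checkpoint.

    embed_batch is a hypothetical callable: list of images -> (n, dim) array.
    """
    chunks = []
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size]
        chunks.append(np.asarray(embed_batch(batch), dtype=np.float32))
    embeddings = np.concatenate(chunks, axis=0)
    # "embeddings.npy" is an assumed file name, not part of any library.
    np.save(Path(checkpoint_dir) / "embeddings.npy", embeddings)
    return embeddings
```

Processing the images in batches keeps memory use bounded, and writing one file per checkpoint folder means the later steps can reload the embeddings without running the model again.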

Extracting Outliers

The Cleanlab library's OutOfDistribution class may be used to find outliers based on the embeddings for each checkpoint. The top 10 outliers for the animation may then be determined using the resultant scores.

from cleanlab.outlier import OutOfDistribution


def get_ood(sorted_checkpoint_folder, df):
    ...
    # Fit the outlier detector on this checkpoint's embeddings and score each image.
    ood = OutOfDistribution()
    ood_train_feature_scores = ood.fit_score(features=embedding_np)
    df["scores"] = ood_train_feature_scores
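
Selecting the top outliers from these scores can be sketched as below. This assumes cleanlab's convention that scores lie between 0 and 1 and that lower values mark less typical examples, so the smallest scores are taken; the helper name top_outliers and the parameter k are introduced here for illustration.

```python
import numpy as np


def top_outliers(scores, k=10):
    """Return indices of the k lowest scores, most atypical first.

    Assumes lower score = more likely to be an outlier.
    """
    scores = np.asarray(scores)
    order = np.argsort(scores, kind="stable")
    return order[:k]
```

The returned indices can be used to look up the corresponding rows of the DataFrame, for example with df.iloc[top_outliers(df["scores"].to_numpy())].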

Using Procrustes Analysis and PCA

We visualize the embeddings in 2D using Principal Component Analysis (PCA) from the scikit-learn package, which reduces the 768-dimensional vectors to 2 dimensions. Because PCA is recalculated for each timestep, axis flips or rotations can cause large jumps in the animation. To solve this problem, we additionally apply a Procrustes analysis [3] from the SciPy package to geometrically align each frame with the previous one. This alignment uses only translation, rotation, and uniform scaling, so the shape of each point cloud is preserved and the transitions in the animation become smoother.

from sklearn.decomposition import PCA
from scipy.spatial import procrustes


def make_pca(sorted_checkpoint_folder, pca_np):
    ...
    # Project this checkpoint's embeddings onto their first two principal components.
    embedding_np_flat = embedding_np.reshape(-1, 768)
    pca = PCA(n_components=2)
    pca_np_new = pca.fit_transform(embedding_np_flat)
    # Align the new frame with the previous one (pca_np).
    _, pca_np_new, disparity = procrustes(pca_np, pca_np_new)
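
The effect of the alignment can be checked on synthetic data. In the minimal sketch below, a random 2D point set stands in for the previous frame, and the next frame is the same set mirrored, rotated, scaled, and shifted, which mimics what a recomputed PCA might produce; the data and variable names are illustrative assumptions. Note that scipy.spatial.procrustes also standardizes both inputs (centered, unit norm), so the returned frames share a common scale.

```python
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(0)
previous_frame = rng.normal(size=(100, 2))

# Simulate a recomputed PCA: mirror the first axis, rotate by 40 degrees,
# scale by 3, and shift the whole cloud.
theta = np.deg2rad(40)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
mirror = np.diag([-1.0, 1.0])
next_frame = 3.0 * previous_frame @ mirror @ rotation + np.array([5.0, -2.0])

# procrustes returns standardized versions of both inputs plus the
# remaining sum of squared differences after the best alignment.
aligned_previous, aligned_next, disparity = procrustes(previous_frame, next_frame)

print(f"disparity after alignment: {disparity:.2e}")
```

Since the second point set is an exact similarity transform of the first, the disparity is zero up to floating-point error. With real embeddings from two neighbouring checkpoints, a small but non-zero disparity remains, which is the actual change that the animation should show.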

Review in Spotlight

Before finalizing the animation, we review the data in Spotlight. For this review, the first and last checkpoints are used to create embeddings, perform PCA, and identify outliers. The resulting DataFrame is then loaded into Spotlight.

In the top left corner of the screen, Spotlight shows a table listing all the fields in the dataset. In the top right corner are two PCA representations, one for the embeddings made using the first checkpoint and the other for the last checkpoint. The bottom section displays the selected images.

Create the Animation

An image is created for each checkpoint and saved alongside that checkpoint.

This is accomplished using the make_pca(...) and get_ood(...) functions, which generate the 2D points that represent the embedding and extract the top 8 outliers, respectively. The 2D points are coloured by class. The outliers are sorted by score, and their images are shown in a high-score leaderboard. The training loss is loaded from the CSV file and plotted as a line graph.
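
Loading the logged values for the loss curve can be sketched with the standard library alone. The sketch assumes that the file has no header row and that the first column holds the loss, which is only true if the loss is the first value in each log entry; the helper name read_loss is introduced here for illustration.

```python
import csv


def read_loss(path, column=0):
    """Read one numeric column from a header-less CSV log file."""
    values = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if row:  # ignore empty lines
                values.append(float(row[column]))
    return values
```

The returned list can be passed to any plotting library to draw the line graph for each frame.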

Finally, all the images can be combined into a GIF using a library such as imageio.
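
As one possible way to combine the frames, the sketch below uses Pillow instead of imageio, which is a substitution made here for illustration. The helper name frames_to_gif, the per-frame duration, and the file paths are assumptions.

```python
from PIL import Image


def frames_to_gif(frame_paths, gif_path, duration_ms=200):
    """Combine image files, in the given order, into a looping GIF."""
    frames = [Image.open(p).convert("RGB") for p in frame_paths]
    frames[0].save(
        gif_path,
        save_all=True,              # write every frame, not just the first
        append_images=frames[1:],
        duration=duration_ms,       # display time per frame in milliseconds
        loop=0,                     # 0 means loop forever
    )
```

The order of frame_paths determines the order of the animation, so the paths should follow the training order of the checkpoints.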

Conclusion

This article has walked through how to create an animation that shows the fine-tuning of a Vision Transformer (ViT) model. It covered the steps for creating and analyzing embeddings, visualizing the results, and creating an animation that integrates these components.

Such an animation is a powerful way to teach these concepts and to help others understand the complex process of fine-tuning a ViT model.
