XF-Blog
PROJECT
Capsule Network in Loma: Compare Loma Automatic Differentiation with PyTorch implementation
In this project, my main objective was to implement a capsule network [1] using the Loma language and then leverage the Loma AD compiler for automatic differentiation. Since this is a source-to-source approach, I wanted to compare its speed and memory usage against PyTorch, which employs a mixture o... Read more
Fine-Tuning a Language Model Using DPO Method
Fine-tune a small GPT2 model (124M) using DPO and LoRA.
Our goal is to fine-tune a small GPT2 model (124M) using DPO and LoRA, with a dataset from Anthropic [1], which contains approximately 16K high-quality conversational interactions. The data represents diverse dialogues measured by metrics such as coherence, relevance, and engagement. We will use thi... Read more
Preliminary Experiment for LLM Distillation and Pretraining
This experiment verifies the effectiveness of several methods proposed in papers.
This is a preliminary experiment on pretraining a language model and using distillation to accelerate training and improve performance. The experiment aims to verify the effectiveness of the following methods: DeepNet (https://arxiv.org/pdf/2203.00555.pdf) Distillation framework, and corresponding los... Read more
Are Small Language Models Low-rank?
Explore causal LM training and increase hidden dimension with low-rank matrices.
Over-parameterized language models are intrinsically low-rank. In this project, I trained 2 causal language models with 28M parameters each, where one is the baseline and the other uses low-rank weights but has a larger hidden dimension, and compared their training speed and accuracy. Although they ... Read more
Revealing Category Preferences of ResNet Layers: Visualization Based on Web
Using web-based technology, this project visualizes the internal activation pattern of ResNet18 on the CIFAR10 dataset.
Using web-based technology, this project visualizes the internal activation pattern of ResNet18 on the CIFAR10 dataset. The visualization tries to show whether the kernels in convolutional layers tend to specialize in a certain class and, at a higher level, whether there exists an internal activation ... Read more
An Evaluation of Four P300 ERP Classifiers' Generalization Performance in the Oddball Paradigm
The P300 ERP is evoked when a person perceives a target stimulus, and it is associated with the decision-making process that registers that something important has occurred.
Classifying the P300 event-related potential usually requires prior knowledge about the EEG signal during target and non-target stimuli. However, different classifiers need different amounts of data to achieve a usable classification ability. In this final project, I explored 4 different classifier... Read more
Use Verlet Integration to Simulate Gravity
Finite Difference, Verlet Integration, and its Application
Online demo: https://xiaonanfu-ucsd.github.io/verlet-gravity/ Differential equations are an important tool in classical mechanics, for example in analyzing force and motion. When simulating the physical behavior of real-world objects, numerical differentiation is usually good enough to revea... Read more