XF-Blog
Recent
[Paper Note] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
https://arxiv.org/abs/2411.05007 https://github.com/mit-han-lab/nunchaku A quantization method in which both weights and activations are quantized to 4 bits. Activation outliers are first processed using a smoothing method. However, this leads to more pronounced weight outliers. SVD is then used to d... Read more
[Paper Note] The Super Weight in Large Language Models
https://arxiv.org/abs/2411.07191 Outliers with large magnitudes in LLMs significantly impact model performance. Pruning even a single such parameter can cause a dramatic boost in perplexity. The most important outliers make up less than 0.01% of the total parameters. These super weights can be identified with just one forward pa... Read more
[Paper Note] Titans: Learning to Memorize at Test Time
https://arxiv.org/abs/2501.00663 Attention mechanisms and recurrent models each have their strengths and weaknesses: Attention can attend to the entire context window, but it comes with a high computational cost. Recurrent models compress the state into a fixed size, but they struggle to model depe... Read more
[Paper Note] Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
https://arxiv.org/abs/2106.06103 Two encoders are used: one to generate a latent variable from the original speech spectrogram, and another to generate one from the text. These variables should be as similar as possible. To address the problem of variable length between text and speech, an unsupe... Read more
[Paper Note] Emerging Properties in Self-Supervised Vision Transformers
https://arxiv.org/abs/2104.14294 DINO introduces a self-supervised method that doesn’t require labels or negative samples. It uses a teacher network and a student network, updating the teacher network’s parameters through EMA. The teacher network sees a global view, while the student network only... Read more
Project
Preliminary Experiment for LLM Distillation and Pretraining
This experiment verifies the effectiveness of various methods from recent papers.
This is a preliminary experiment on pretraining a language model and using distillation to accelerate training and improve performance. The experiment verifies the effectiveness of the following methods: DeepNet (https://arxiv.org/pdf/2203.00555.pdf) Distillation framework, and corresponding los... Read more
Are Small Language Models Low-rank?
Explore causal LM training and increase hidden dimension with low-rank matrices.
Over-parameterized language models are intrinsically low-rank. In this project, I trained two causal language models with 28M parameters each, one as the baseline and the other using low-rank weights but a higher hidden dimension, and compared their training speed and accuracy. Although they ... Read more
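Below is a minimal sketch of the kind of factorized layer the low-rank variant refers to, assuming PyTorch; the dimensions, rank, and class name are illustrative choices, not the exact configuration used in the project.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """A linear layer factorized as W ≈ B @ A with rank r << min(d_in, d_out).

    The parameter count drops from d_in * d_out to r * (d_in + d_out), which is
    what frees budget to enlarge the hidden dimension at the same model size.
    """
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.A = nn.Linear(d_in, rank, bias=False)   # project down to rank r
        self.B = nn.Linear(rank, d_out, bias=True)   # project back up

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.B(self.A(x))

# A full-rank 512x512 layer holds ~262k weights; a rank-64 factorization of a
# wider 768x768 layer holds ~99k, so the hidden size can grow within budget.
layer = LowRankLinear(768, 768, rank=64)
print(sum(p.numel() for p in layer.parameters()))
```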
Revealing Category Preferences of ResNet Layers: Visualization Based on Web
Using web-based technology, this project visualizes the internal activation pattern of ResNet18 on the CIFAR10 dataset.
Using web-based technology, this project visualizes the internal activation pattern of ResNet18 on the CIFAR10 dataset. The visualization tries to show whether the kernels in the convolutional layers tend to specialize in a certain class and, at a higher level, whether there exists an internal activation ... Read more
An Evaluation of Four P300 ERP Classifiers' Generalization Performance in the Oddball Paradigm
P300 ERP is evoked when a person perceives a target stimulus, and it is associated with the decision-making process signaling that something important has occurred.
Classifying the P300 event-related potential usually requires prior knowledge about the EEG signal during target and non-target stimuli. However, different classifiers need different amounts of data to achieve a usable classification ability. In this final project, I explored 4 different classifier... Read more
Use Verlet Integration to Simulate Gravity
Finite Difference, Verlet Integration, and its Application
Online demo: https://xiaonanfu-ucsd.github.io/verlet-gravity/ Differential equations are an important tool in classical mechanics, for example for analyzing forces and motion. When simulating the physical behavior of real-world objects, numerical differentiation is usually good enough to revea... Read more
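As a rough illustration of the post's topic, here is a tiny position-Verlet step for a body under constant gravity; the time step, duration, and starting conditions are made-up values, and this is not code from the linked demo.

```python
# Position Verlet: x(t+dt) = 2*x(t) - x(t-dt) + a(t)*dt^2
# Velocity never appears explicitly; it is implied by the last two positions.

def simulate_fall(x0: float = 100.0, dt: float = 0.01, steps: int = 100, g: float = -9.81):
    x_prev = x0              # position one step in the past (body starts at rest)
    x = x0                   # current position
    trajectory = [x0]
    for _ in range(steps):
        x_next = 2 * x - x_prev + g * dt * dt
        x_prev, x = x, x_next
        trajectory.append(x)
    return trajectory

# After 1 s of free fall the analytic height is 100 - 0.5*9.81 ≈ 95.1 m;
# the Verlet result lands close to that despite the crude start-at-rest init.
print(simulate_fall()[-1])
```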
Machine Learning
[Paper Note] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
https://arxiv.org/abs/2411.05007 https://github.com/mit-han-lab/nunchaku A quantization method in which both weights and activations are quantized to 4 bits. Activation outliers are first processed using a smoothing method. However, this leads to more pronounced weight outliers. SVD is then used to d... Read more
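A rough numpy sketch of the decomposition idea described above: split a (smoothed) weight into a low-rank branch kept in higher precision plus a residual quantized to 4 bits. The rank, the symmetric rounding, and the per-tensor scale are simplifying assumptions, not the paper's actual kernel design.

```python
import numpy as np

def svd_lowrank_plus_int4(W: np.ndarray, rank: int = 32):
    """Split W into L (low-rank, full precision) + a dequantized 4-bit residual."""
    # The low-rank branch absorbs the dominant singular directions, where the
    # outliers migrated into the weights by smoothing tend to concentrate.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * S[:rank]) @ Vt[:rank]
    # The residual is better conditioned, so 4-bit quantization hurts it less.
    R = W - L
    scale = np.abs(R).max() / 7.0                       # symmetric int4 range [-7, 7]
    Q = np.clip(np.round(R / scale), -7, 7).astype(np.int8)
    return L, Q, scale

W = np.random.randn(256, 256).astype(np.float32)
L, Q, scale = svd_lowrank_plus_int4(W)
W_hat = L + Q.astype(np.float32) * scale
print(np.abs(W - W_hat).max())   # reconstruction error of the quantized path
```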
[Paper Note] The Super Weight in Large Language Models
https://arxiv.org/abs/2411.07191 Outliers with large magnitudes in LLMs significantly impact model performance. Pruning even a single such parameter can cause a dramatic boost in perplexity. The most important outliers make up less than 0.01% of the total parameters. These super weights can be identified with just one forward pa... Read more
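A hedged sketch of what a "one forward pass" detection could look like: hook candidate layers and record where activation magnitudes spike. The `down_proj` name filter follows common Llama-style module naming and is an assumption; this is not the authors' released script.

```python
import torch

def find_activation_spikes(model, input_ids, name_filter: str = "down_proj", top_k: int = 5):
    """Run one forward pass and report the largest-magnitude activations flowing
    into the filtered layers, as candidate super-weight locations."""
    records, hooks = [], []

    def make_hook(name):
        def hook(module, inputs, output):
            x = inputs[0].detach()
            val, idx = x.abs().flatten().max(dim=0)
            records.append((val.item(), name, int(idx)))
        return hook

    for name, module in model.named_modules():
        if name_filter in name:
            hooks.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(input_ids)
    for h in hooks:
        h.remove()
    # Layers whose input magnitude dwarfs the others point at the super weight.
    return sorted(records, reverse=True)[:top_k]
```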
[Paper Note] Titans: Learning to Memorize at Test Time
https://arxiv.org/abs/2501.00663 Attention mechanisms and recurrent models each have their strengths and weaknesses: Attention can attend to the entire context window, but it comes with a high computational cost. Recurrent models compress the state into a fixed size, but they struggle to model depe... Read more
[Paper Note] Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
https://arxiv.org/abs/2106.06103 Two encoders are used: one to generate a latent variable from the original speech spectrogram, and another to generate one from the text. These variables should be as similar as possible. To address the problem of variable length between text and speech, an unsupe... Read more
[Paper Note] Emerging Properties in Self-Supervised Vision Transformers
https://arxiv.org/abs/2104.14294 DINO introduces a self-supervised method that doesn’t require labels or negative samples. It uses a teacher network and a student network, updating the teacher network’s parameters through EMA. The teacher network sees a global view, while the student network only... Read more
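For reference, the EMA teacher update mentioned above can be written in a few lines of PyTorch; the momentum value here is an illustrative constant rather than DINO's scheduled one.

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, momentum: float = 0.996):
    """teacher <- m * teacher + (1 - m) * student, applied parameter-wise.
    The teacher receives no gradients; it only follows this moving average."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```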
[Paper Note] Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
https://arxiv.org/abs/2006.16236 This work aims to address the quadratic complexity issue in Transformers. Self-attention can be expressed as a linear dot-product of kernel feature maps, achieving O(N) complexity. When applying a kernel with positive similarity scores on the queries and keys, linear a... Read more
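A compact sketch of the linearized attention described in the note, using the elu(x) + 1 feature map on queries and keys; it shows the non-causal case with unbatched tensors for brevity.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    """Non-causal linear attention: softmax(QK^T)V is replaced by
    phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1), which is O(N) in sequence length."""
    phi_q = F.elu(q) + 1                           # positive feature map
    phi_k = F.elu(k) + 1
    kv = torch.einsum("nd,ne->de", phi_k, v)       # d x e summary, independent of N
    z = phi_q @ phi_k.sum(dim=0)                   # per-query normalizer
    return (phi_q @ kv) / (z.unsqueeze(-1) + eps)

q, k, v = (torch.randn(1024, 64) for _ in range(3))
print(linear_attention(q, k, v).shape)             # torch.Size([1024, 64])
```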
[Paper Note] Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Existing problem: While MaskGIT-like methods offer fast inference, their overall performance isn't great. Innovation: Our method uses a combination of multi-modal and single-modal transformer layers. Language and vision representations are inherently different. We use cross-modal transformers to und... Read more
[Paper Note] MaskGIT: Masked Generative Image Transformer
https://arxiv.org/abs/2202.04200 Unlike text, images are not sequential. This makes auto-regressive models unsuitable for image generation tasks. During training, MaskGIT is trained on a masked prediction task, similar to what is used in BERT. Inference: At each iteration, the model predicts a... Read more
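A simplified sketch of the iterative decoding loop described above: predict every masked position in parallel, keep only the most confident predictions, and re-mask the rest on a cosine schedule. The `predict_logits` callable, mask id, and step count are placeholders, not MaskGIT's actual interface.

```python
import torch

def maskgit_decode(predict_logits, seq_len: int = 256, steps: int = 8, mask_id: int = -1):
    """Iterative parallel decoding: fill every masked position, keep only the
    most confident predictions, and re-mask the rest on a cosine schedule."""
    tokens = torch.full((seq_len,), mask_id, dtype=torch.long)
    for step in range(steps):
        logits = predict_logits(tokens)                        # (seq_len, vocab)
        conf, pred = logits.softmax(-1).max(-1)
        conf = torch.where(tokens == mask_id, conf, torch.ones_like(conf))
        # Fraction of positions that should remain masked after this step.
        frac = torch.cos(torch.tensor((step + 1) / steps * torch.pi / 2))
        n_remask = int(seq_len * frac)
        if n_remask > 0:
            pred[conf.argsort()[:n_remask]] = mask_id          # drop least confident
        tokens = torch.where(tokens == mask_id, pred, tokens)  # keep fixed tokens
    return tokens
```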
[Paper Note] Learning Transferable Visual Models From Natural Language Supervision
CLIP maps images and text into the same embedding space. It is trained using contrastive learning. Within each batch, the similarity between correct image and text feature pairs is maximized, while the similarity between incorrect pairs is minimized. Both the image and text encoders are trained from... Read more
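A minimal sketch of the symmetric contrastive objective described above, assuming precomputed image and text embeddings; the fixed temperature stands in for CLIP's learned logit scale.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature: float = 0.07):
    """Maximize similarity for matching (image, text) pairs within a batch and
    minimize it for all other pairings, symmetrically in both directions."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature            # (B, B) similarities
    targets = torch.arange(len(logits), device=logits.device)  # diagonal = matches
    loss_i2t = F.cross_entropy(logits, targets)                # image -> its text
    loss_t2i = F.cross_entropy(logits.t(), targets)            # text -> its image
    return (loss_i2t + loss_t2i) / 2

img, txt = torch.randn(8, 512), torch.randn(8, 512)
print(clip_contrastive_loss(img, txt))
```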
[Paper Note] Language Models are Unsupervised Multitask Learners
A general-purpose training procedure that can be applied to a variety of NLP tasks in a zero-shot manner.
This paper presents a general-purpose training procedure that can be applied to a variety of NLP tasks, using task instructions and task input as conditioning factors. A model trained with a massive, diverse, and unsupervised dataset can handle many tasks in a zero-shot manner and typically outperfo... Read more
Development
Migrate Ubuntu on Btrfs to a New Disk
System: Kubuntu 22.04. There are two partitions on the original disk: EFI and the Ubuntu root partition. The Ubuntu root partition uses the Btrfs filesystem, with two subvolumes, @ and @home. Use sudo fdisk /dev/[disk] to partition; for example, sudo fdisk /dev/nvme1n1. Create a GPT partition table. Cre... Read more