finetune-transformer-lm is the research codebase accompanying the paper “Improving Language Understanding by Generative Pre-Training,” providing a minimal implementation focused on fine-tuning a transformer language model for evaluation tasks. The repository centers on reproducing the ROCStories Cloze Test result and includes a single-command training workflow (shown below) to run the experiment end to end. It documents that runs are non-deterministic due to non-deterministic GPU ops, and reports a median accuracy of 85.8% over 10 runs, slightly below the single-run 86.5% reported in the paper, reflecting expected variance in practice. The project ships lightweight training, data, and analysis scripts, keeping the footprint small while making the experimental pipeline transparent. It is provided as archived, research-grade code intended for replication and study rather than continuous development.
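Per the repository README, the whole reproduction is launched with one command; the bracketed path is a placeholder for wherever the ROCStories data lives locally:

```
python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path to data here]
```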
Features
- End-to-end ROCStories Cloze Test training via train.py, with --submit and --analysis flags for producing predictions and post-run evaluation
- Dataset loader utilities for ROCStories, with the data location supplied via a --data_dir flag
- Analysis helpers and reporting scripts for post-run evaluation
- Optimization and learning-rate schedule helpers in opt.py, plus supporting utils (see the sketch after this list)
- Documented non-determinism from GPU ops, with median accuracy over multiple runs reported for reference
- MIT-licensed, archived research code aimed at reproduction and study
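The optimization helpers follow the warmup-then-linear-decay style of learning-rate schedule the paper describes for fine-tuning. The sketch below is a self-contained illustration of that kind of schedule, not the exact code in opt.py; the function name, signature, and the 6.25e-5 base rate are illustrative assumptions here.

```python
def warmup_linear(progress, warmup=0.002):
    """Illustrative warmup-then-linear-decay schedule (not the repo's exact code).

    `progress` is the fraction of total updates completed (0.0 to 1.0);
    `warmup` is the fraction of training spent ramping the rate up.
    """
    if progress < warmup:
        return progress / warmup        # warmup: multiplier rises from 0 to ~1
    return max(0.0, 1.0 - progress)     # decay: multiplier falls linearly to 0

# Example: scale an illustrative base learning rate over 1,000 updates.
base_lr = 6.25e-5
n_updates = 1000
lrs = [base_lr * warmup_linear(t / n_updates) for t in range(n_updates)]
print(f"peak lr ~= {max(lrs):.2e}, final lr ~= {lrs[-1]:.2e}")
```

Expressing the multiplier as a function of the fraction of training completed keeps the schedule independent of batch size and total update count, which is why this style of helper travels well between experiments.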