🚀🚀 [LLM] Train a small 26M-parameter GPT entirely from scratch in just 50 minutes!
carlo blog
The most powerful and modular Stable Diffusion GUI, API, and backend with a graph/nodes interface.
Large Language Model Text Generation Inference
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Train a 1B LLM on 1T tokens from scratch as an individual
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.