Model Training
What is Model Training?
Model training is how an AI learns: it's shown enormous amounts of data, makes guesses, gets corrected, and gradually improves until it can perform a task reliably. The result of training is a model's weights: the stored knowledge it carries into every future interaction.
At a glance
- Also known as: Model learning, Neural network training, Fine-tuning (for specialised training)
- Used for: Building AI capabilities from scratch, Fine-tuning models on custom styles, Adapting foundation models for specific tasks
- Common tools: PyTorch, TensorFlow, Hugging Face, Kohya SS, Replicate
- Related terms: Model architecture, Fine-tuning, LoRA, Overfitting, Dataset, Inference
Ready to create?
Direct scenes, design characters, and ship full films
All-in-one AI creative platform with simple, transparent pricing, no speed throttles, and an infinite Canvas for max creativity.
How it compares
Training is the process of teaching a model: it is computationally expensive, time-consuming, and happens before the model is deployed. Inference is the process of using a trained model to generate outputs: it is much faster and cheaper, and is what happens every time you enter a prompt into an AI tool.
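The contrast between training and inference can be made concrete with a toy sketch. The example below is not how production image or video models are trained; it assumes a two-weight linear model and plain gradient descent, and the names `train`, `predict`, and the chosen learning rate are illustrative only. It shows training as many passes of guess-and-correct, and inference as a single cheap pass with the stored weights.

```python
# Toy illustration: learn the rule y = 2x + 1 by gradient descent, then run inference.
# The model, function names, and hyperparameters are illustrative assumptions.

def predict(weights, x):
    """Inference: one cheap forward pass using the stored weights."""
    w, b = weights
    return w * x + b


def train(data, steps=2000, learning_rate=0.05):
    """Training: repeatedly guess, measure the error, and nudge the weights."""
    w, b = 0.0, 0.0  # the model starts out knowing nothing
    n = len(data)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            error = predict((w, b), x) - y  # how wrong was the guess?
            grad_w += 2 * error * x / n     # gradient of the mean squared error
            grad_b += 2 * error / n
        w -= learning_rate * grad_w         # the "correction" step
        b -= learning_rate * grad_b
    return w, b


# Training data generated from the rule y = 2x + 1.
training_data = [(x, 2 * x + 1) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

weights = train(training_data)   # expensive: thousands of passes over the data
print(predict(weights, 10.0))    # cheap: one pass, on an input not seen in training
```

The printed value should be very close to 21, because the learned weights approach 2 and 1. The point of the sketch is the asymmetry: `train` loops over the data thousands of times, while `predict` does a single multiplication and addition.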
Think of it like…
Model training is like a student studying for an exam over an extended period. Each practice question is a piece of training data; each wrong answer leads to a correction that adjusts the student's understanding. After enough study, the student has internalised patterns that let them answer new questions they haven't seen before. The finished, trained model is like that student at the end of revision: ready to be tested in the real world.
Pro tip
When fine-tuning a model on your own images for consistent character or style generation, prioritise image diversity over quantity. Twenty varied, high-quality reference images will typically produce better results than a hundred near-identical shots, because the model needs to see the subject from multiple angles and in different lighting conditions to generalise well.
Types and variations
Model training encompasses several distinct processes:
- Pre-training involves training a model from scratch on a massive, broad dataset to give it general capabilities.
- Supervised fine-tuning involves further training on labelled examples to specialise the model for a particular task.
- Reinforcement learning from human feedback (RLHF) uses human preference signals to align model behaviour with desired outputs.
- Parameter-efficient fine-tuning methods such as LoRA keep the original weights frozen and train only a small number of additional parameters, making customisation accessible without full computational infrastructure.
- Self-supervised learning, used in many large foundation models, derives training signal from the structure of the data itself rather than explicit human labels.
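To illustrate why parameter-efficient methods are so much cheaper, here is a minimal sketch of the low-rank idea behind LoRA: the original weight matrix W stays frozen, and training only updates two small matrices B and A whose product is added to W. The layer size (4096 x 4096), the rank (8), and the helper names are assumptions chosen for illustration, not values taken from any particular model or library.

```python
# Sketch of the low-rank update used by LoRA, written with plain Python lists.
# Layer sizes, rank, and function names are hypothetical illustrative choices.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA trainable params) for one layer."""
    full = d_out * d_in
    lora = d_out * rank + rank * d_in  # B is d_out x rank, A is rank x d_in
    return full, lora


def effective_weight(w_frozen, b, a, scale=1.0):
    """Combine frozen weights with the trained update: W + scale * (B @ A)."""
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w_frozen, delta)]


full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora)  # 16777216 65536

# Tiny worked example: a 2x2 frozen weight with a rank-1 update.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]   # 2 x 1
a = [[0.5, 0.5]]     # 1 x 2
print(effective_weight(w, b, a))  # [[1.5, 0.5], [1.0, 2.0]]
```

For this hypothetical layer, the low-rank update has 65,536 trainable values against 16,777,216 for full fine-tuning, which is 256 times fewer. Real implementations add details such as a scaling factor tied to the rank and a choice of which layers to adapt.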
Common use cases
- Model training in creative AI workflows most commonly appears as fine-tuning: training a base image or video generation model on a set of reference images to capture a specific character's face, a particular visual style, or a brand's aesthetic.
- Platforms such as Replicate and RunPod, along with dedicated tools like Kohya SS, make it possible for individual creators to run fine-tuning jobs.
- Training is also implicitly at the core of every AI tool used in production: the capabilities of Sora, Stable Diffusion, Midjourney, and similar tools are all direct products of their training data and process.
FAQs
How long does it take to train a model?
Full pre-training of a large foundation model can take weeks to months on clusters of hundreds or thousands of GPUs and cost millions of pounds. Fine-tuning a personal LoRA model on a consumer GPU, by contrast, can take anywhere from twenty minutes to a few hours depending on dataset size and hardware.
What data are AI models trained on?
Most large image generation models have been trained on billions of image-text pairs scraped from the internet. Video models add temporal data: sequences of frames with associated captions or metadata. The specific composition of training data varies by model and is often not fully disclosed by developers.
What is overfitting?
Overfitting occurs when a model memorises its training data too closely and loses the ability to generalise. In fine-tuning for creative use, an overfitted model might reproduce your reference images too literally, losing flexibility in response to varied prompts. Controlling training steps and data diversity helps avoid this.
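One common way to control training steps is early stopping: keep a small set of examples out of training, check the loss on them at regular intervals, and stop once that loss no longer improves. The sketch below assumes such a held-out validation set exists. The loss values, the `patience` setting, and the function name `best_stopping_step` are made up for illustration and do not come from any particular tool.

```python
def best_stopping_step(validation_losses, patience=2):
    """Return the index of the checkpoint to keep under simple early stopping.

    Training is treated as finished once the validation loss has failed to
    improve for `patience` consecutive checks.
    """
    best_step = 0
    best_loss = float("inf")
    checks_without_improvement = 0
    for step, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_step, best_loss = step, loss
            checks_without_improvement = 0
        else:
            checks_without_improvement += 1
            if checks_without_improvement >= patience:
                break  # validation loss keeps rising: likely overfitting
    return best_step


# Hypothetical losses recorded at successive checkpoints.
training_losses = [0.90, 0.60, 0.40, 0.25, 0.15, 0.08, 0.04]    # keeps falling
validation_losses = [0.95, 0.70, 0.55, 0.50, 0.58, 0.66, 0.75]  # turns upward

print(best_stopping_step(validation_losses))  # 3
```

In this made-up run the training loss keeps falling while the validation loss starts rising after checkpoint 3, which is the typical signature of overfitting, so checkpoint 3 is the one to keep.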
Can individual creators train their own models?
Yes: parameter-efficient fine-tuning methods like LoRA have been made accessible through tools with graphical interfaces and detailed community guides. Full pre-training from scratch remains the domain of well-resourced teams, but meaningful customisation is within reach for technically curious creators.
What is the difference between training and fine-tuning?
Training (or pre-training) builds a model's capabilities from the ground up on a massive dataset. Fine-tuning takes an already trained model and continues training on a smaller, more specific dataset to specialise its behaviour: it is far cheaper and faster than training from scratch.
How does training data affect bias in a model?
A model reflects the patterns present in its training data. If the data over-represents certain demographics, aesthetics, or cultural viewpoints, the model will reproduce those biases in its outputs. This is a significant and ongoing challenge in AI development, particularly for models used in public-facing creative production.