
LoRA Adapter Size Calculator

Trainable parameters and file size of LoRA adapters from rank, target modules and model architecture.

Outputs: Trainable params (M), Adapter file (FP16) (MB), Of a 7B base (%)

Defaults (7B-class, r=16, q/k/v/o) give a 33.5M-parameter, 67 MB adapter: 0.48% of the base model. Doubling rank doubles size; adding MLP targets roughly quadruples it and usually helps more than raising rank.

Formula

params per adapted matrix = r × (d_in + d_out)
total params = layers × Σ over adapted matrices of r × (d_in + d_out)
file size = total params × 2 B (FP16)
References: Hu et al. (2021), LoRA: Low-Rank Adaptation of Large Language Models; Dettmers et al. (2023), QLoRA: Efficient Finetuning of Quantized LLMs (all-linear targeting)
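The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function names are hypothetical, and the dimensions (hidden size 4096, 32 layers, square q/k/v/o projections) are an assumed Llama-2-7B-like shape rather than values read from the page. Under those assumptions r=16 gives about 16.8M parameters and a 33.6 MB file; the 33.5M default quoted above is what this sketch returns for r=32, so the calculator's 7B-class preset may assume different dimensions.

```python
def lora_params(rank, layers, matrix_shapes):
    """Trainable LoRA parameters: layers * sum of rank * (d_in + d_out)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in matrix_shapes)
    return layers * per_layer


def adapter_size_mb(params, bytes_per_param=2):
    """Adapter file size in MB (10**6 bytes); 2 bytes per parameter is FP16."""
    return params * bytes_per_param / 1e6


# Assumed Llama-2-7B-like dimensions (an assumption, not the calculator's preset).
HIDDEN = 4096
LAYERS = 32
attention_qkvo = [(HIDDEN, HIDDEN)] * 4  # q, k, v, o projections

params = lora_params(rank=16, layers=LAYERS, matrix_shapes=attention_qkvo)
size_mb = adapter_size_mb(params)
share = 100 * params / 7e9  # percentage of a nominal 7B-parameter base

print(f"{params / 1e6:.1f}M params, {size_mb:.1f} MB, {share:.2f}% of a 7B base")
```

Passing a different list of `(d_in, d_out)` pairs is enough to model other target-module choices or architectures.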

About LoRA Adapter Size Calculator

LoRA's economics live in one formula: each adapted weight matrix gains two skinny factors costing rank × (in + out) parameters, typically a fraction of a percent of the base model. This calculator turns rank and target-module choices into exact trainable-parameter counts and the size in MB of the adapter file you upload to the Hub. The defaults match the most common 7B recipe; switch to all-linear targeting to see why QLoRA's authors recommend it (4× the adapter, still under 2% of base) or to q,v-only to reproduce the original paper's minimal setup.
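To compare the three targeting choices side by side, the sketch below applies the same formula to each set of modules. The shapes are an assumption (Llama-2-7B-like: hidden size 4096, MLP width 11008, 32 layers, no grouped-query attention), and `SHAPES`, `TARGET_SETS`, and `count_params` are illustrative names. The multiplier for all-linear targeting depends on the MLP width, so the ratio printed here need not match the 4× figure above.

```python
# Assumed Llama-2-7B-like shapes as (d_in, d_out); a hypothetical preset.
HIDDEN, FFN, LAYERS, RANK = 4096, 11008, 32, 16

SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}

TARGET_SETS = {
    "q,v only": ["q_proj", "v_proj"],
    "q,k,v,o": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "all-linear": list(SHAPES),
}


def count_params(targets, rank=RANK, layers=LAYERS):
    """layers * sum over targeted matrices of rank * (d_in + d_out)."""
    return layers * sum(rank * sum(SHAPES[name]) for name in targets)


counts = {label: count_params(names) for label, names in TARGET_SETS.items()}
baseline = counts["q,k,v,o"]

for label, n in counts.items():
    # Columns: parameters, FP16 file size, size relative to q,k,v,o targeting.
    print(f"{label:<11} {n / 1e6:6.1f}M params  {n * 2 / 1e6:6.1f} MB  {n / baseline:.2f}x")
```

Swapping in the shapes of another architecture (for example narrower k/v projections under grouped-query attention) changes the ratios, which is why the targeting choice is worth recomputing per model.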

How to use LoRA Adapter Size Calculator

  1. Enter your values into LoRA Adapter Size Calculator. Sensible, domain-typical defaults are pre-filled so you see a real result immediately.
  2. The result recomputes live using the formula shown on the page; there is no button to press.
  3. Adjust any input to compare scenarios, then read the worked example to see the substituted numbers.

Why use LoRA Adapter Size Calculator?

  • Computes LoRA adapter size instantly in your browser: no sign-up, no upload, no server round-trip.
  • 100% free and unlimited, with the exact formula shown: params per adapted matrix = r × (d_in + d_out).
  • Runs entirely client-side, so every value you enter stays private on your device.
  • Live recompute as you type, with a worked example and authoritative references for trust.

Frequently asked questions

What rank should I use?

r=8–16 handles most style, format, and domain fine-tunes; use r=32–64 for harder behavioral shifts or multi-task adapters. The QLoRA ablation found that target coverage (adapting all linear layers) mattered more than rank, so broaden targets before raising r.

Why are LoRA files sometimes bigger than this estimate?

Checkpoints may store optimizer states (3× larger), keep FP32 copies, or include the merged base modules. A clean save_pretrained adapter at FP16 should match this calculator within a few MB; anything 10× bigger is carrying training baggage.
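A rough breakdown shows where the extra bytes can come from. The sketch assumes an Adam-style optimizer that keeps two FP32 moment tensors per trainable parameter (8 bytes per parameter) and takes the 33.5M-parameter default from this page as input; the helper name `size_mb` is illustrative. With FP32 weights plus both moments, the checkpoint is 3× the FP32 adapter alone and 6× the clean FP16 file.

```python
def size_mb(params, bytes_per_param):
    """Size in MB (10**6 bytes) for a given storage width."""
    return params * bytes_per_param / 1e6


params = 33_500_000  # the default adapter size quoted on this page

clean_fp16 = size_mb(params, 2)    # adapter weights only, FP16
fp32_adapter = size_mb(params, 4)  # same weights kept in FP32
adam_states = size_mb(params, 8)   # assumed: two FP32 moments per parameter
training_ckpt = fp32_adapter + adam_states

print(f"clean FP16 adapter:       {clean_fp16:6.1f} MB")
print(f"FP32 adapter:             {fp32_adapter:6.1f} MB")
print(f"FP32 adapter + optimizer: {training_ckpt:6.1f} MB "
      f"({training_ckpt / clean_fp16:.0f}x the clean file)")
```

A file far larger than even this training checkpoint would point to something else being saved, such as merged base modules.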

Does a bigger adapter slow inference?

Unmerged, each adapted layer adds a small bypass matmul, costing a few percent of latency. Merged (W + BA baked in), inference cost is exactly the base model's: zero overhead. Merge for deployment; keep adapters separate when hot-swapping many customers' tunes.
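The equivalence between the two paths can be checked on a toy example. The sketch below uses plain Python lists, arbitrary small integer matrices, and a scaling factor of 1 (all assumptions made for illustration); `matmul` and `add` are hypothetical helpers, not library calls.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Toy shapes: d_out = d_in = 3, rank = 1. Values are arbitrary.
W = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]  # base weight, d_out x d_in
B = [[1], [2], [0]]                    # LoRA factor, d_out x r
A = [[0, 1, 1]]                        # LoRA factor, r x d_in
x = [[1], [2], [3]]                    # input as a column vector

# Unmerged: base path plus the low-rank bypass (two extra small matmuls per call).
unmerged = add(matmul(W, x), matmul(B, matmul(A, x)))

# Merged: fold BA into W once, then a single matmul per call.
W_merged = add(W, matmul(B, A))
merged = matmul(W_merged, x)

print(unmerged, merged)
```

Both paths produce the same output; the merged path simply pays the cost of forming W + BA once instead of on every forward pass.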

What is alpha and does it change the size?

Alpha is a scalar scaling factor (effective update = α/r · BA) with zero parameters and zero size impact. Convention sets α = 2r or α = r; what matters is using the same value at training and loading time, otherwise the adapter's effect on generations is subtly weakened or strengthened.
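A short sketch makes both points concrete: the parameter count does not depend on alpha, and a mismatch between the training-time and load-time alpha rescales the update. The function names and the 4096 × 4096 matrix are assumptions chosen for illustration.

```python
def lora_scale(alpha, rank):
    """Scaling applied to the low-rank update BA."""
    return alpha / rank


def lora_params(rank, d_in, d_out):
    """Parameters in one adapted matrix; alpha does not appear."""
    return rank * (d_in + d_out)


rank = 16
for alpha in (rank, 2 * rank):
    print(f"alpha={alpha:>2}  scale={lora_scale(alpha, rank):.1f}  "
          f"params={lora_params(rank, 4096, 4096)}")

# Hypothetical mismatch: trained with alpha = 2r, loaded with alpha = r.
trained_scale = lora_scale(2 * rank, rank)
loaded_scale = lora_scale(rank, rank)
relative_strength = loaded_scale / trained_scale
print(f"update applied at {relative_strength:.0%} of its trained strength")
```

In this hypothetical case the adapter's update is applied at half the strength it was trained with, while the file on disk is byte-for-byte the same size.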
