Tensor Memory Size Calculator

Bytes of any tensor from its shape and dtype — parse '64, 3, 224, 224' and compare FP32/BF16/INT8/INT4.

Inputs: Tensor shape (comma-separated; e.g. a batch of 64 RGB 224×224 images), Data type

Outputs: Elements (M), Memory (MB and GB)

The default — one training batch of ImageNet-sized images — is 38.5 MB at FP32. Multiply by the dozens of activation tensors a deep network keeps for backprop and GPU memory vanishes fast.

Formula

bytes = ∏ dims × sizeof(dtype) — accepts shapes separated by commas, spaces or ×
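As a rough illustration of the formula, a minimal Python sketch might look like the following. The names `parse_shape`, `tensor_bytes` and `DTYPE_BYTES` are assumptions made for this example, not the calculator's actual source.

```python
import math
import re

# Bytes per element; INT4 is the idealized packed size (two values per byte).
DTYPE_BYTES = {"FP64": 8, "FP32": 4, "BF16": 2, "FP16": 2, "INT8": 1, "INT4": 0.5}

def parse_shape(text):
    """Split a shape string on commas, whitespace or '×' and return a tuple of ints."""
    parts = [p for p in re.split(r"[,\s×]+", text.strip()) if p]
    dims = tuple(int(p) for p in parts)
    if not dims or any(d < 0 for d in dims):
        raise ValueError(f"invalid shape: {text!r}")
    return dims

def tensor_bytes(shape_text, dtype="FP32"):
    """bytes = product of dims × sizeof(dtype)."""
    return math.prod(parse_shape(shape_text)) * DTYPE_BYTES[dtype]

n = tensor_bytes("64, 3, 224, 224", "FP32")
print(n, round(n / 1e6, 1))  # 38535168 38.5
```

The result is in decimal megabytes (10^6 bytes); dividing by 2**20 instead gives 36.75 MiB for the same tensor.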

References: PyTorch tensor dtype documentation

About Tensor Memory Size Calculator

Every out-of-memory error is, at bottom, a failure of this multiplication: product of dimensions times bytes per element. This calculator parses any shape string (separated by commas, spaces or ×) and prices it across the dtype menu from FP64 down to INT4. It is deliberately simple because the skill it builds is the estimation habit: knowing instantly that a 64×3×224×224 batch is about 38.5 MB at FP32, that BF16 halves it, and that an 8K-token attention matrix at FP32 would be 256 MiB (about 268 MB) per head if it were materialized in full, which is exactly what FlashAttention avoids.
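The estimation habit can be practiced in a few lines. This is only a sketch: the helper name `size_table` and the dtype table are illustrative assumptions, and each size is reported in both decimal MB (10^6 bytes) and binary MiB (2^20 bytes) because the two units are easy to confuse.

```python
import math

# Bytes per element for the dtype menu; INT4 is the idealized packed size.
DTYPE_BYTES = {"FP64": 8, "FP32": 4, "BF16": 2, "INT8": 1, "INT4": 0.5}

def size_table(shape):
    """Return {dtype: (decimal MB, binary MiB)} for one shape."""
    elements = math.prod(shape)
    return {
        name: (elements * size / 1e6, elements * size / 2**20)
        for name, size in DTYPE_BYTES.items()
    }

# Default image batch, then one head's 8K × 8K attention score matrix.
for shape in [(64, 3, 224, 224), (8192, 8192)]:
    print(shape)
    for name, (mb, mib) in size_table(shape).items():
        print(f"  {name}: {mb:.1f} MB = {mib:.1f} MiB")
```

For the 8192×8192 case at FP32 this prints 268.4 MB = 256.0 MiB, which is the per-head cost of materializing the full score matrix.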

How to use Tensor Memory Size Calculator

1. Enter your values into Tensor Memory Size Calculator; sensible, domain-typical defaults are pre-filled so you see a real result immediately.
2. The result recomputes live using the formula shown on the page; there is no button to press.
3. Adjust any input to compare scenarios, then read the worked example to see the substituted numbers.

Why use Tensor Memory Size Calculator?

✓ Computes Tensor Memory Size instantly in your browser: no sign-up, no upload, no server round-trip.
✓ 100% free and unlimited, with the exact formula shown: bytes = ∏ dims × sizeof(dtype), accepting shapes separated by commas, spaces or ×.
✓ Runs entirely client-side, so every value you enter stays private on your device.
✓ Live recompute as you type, with a worked example and authoritative references for trust.

Frequently asked questions

Why does my GPU OOM when my tensors look small?

Backprop keeps most intermediate activations alive until the backward pass has consumed them: dozens of layer outputs held at once, not one. Add gradients (same size as weights), optimizer states (2× weights for Adam) and allocator fragmentation, and 'small' tensors compound into gigabytes.
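A back-of-the-envelope version of this accounting, as a sketch only: the function `training_memory_bytes`, the 25M parameter count and the 50 saved activation tensors are hypothetical, and the code assumes weights, gradients and Adam states are all stored at the same precision. Real frameworks, mixed-precision setups and allocator overhead will differ.

```python
def training_memory_bytes(n_params, activation_bytes, bytes_per_param=4):
    """Rough training footprint: weights + gradients + Adam states + saved activations.

    Ignores allocator fragmentation and temporary workspace, so treat the
    result as a lower bound rather than a prediction.
    """
    weights = n_params * bytes_per_param
    gradients = weights              # same size as the weights
    adam_states = 2 * weights        # first and second moment estimates
    activations = sum(activation_bytes)
    return {
        "weights": weights,
        "gradients": gradients,
        "adam_states": adam_states,
        "activations": activations,
        "total": weights + gradients + adam_states + activations,
    }

# Hypothetical example: a 25M-parameter model that keeps 50 activation tensors
# the size of the default input batch (38,535,168 bytes each) for backward.
est = training_memory_bytes(25_000_000, [38_535_168] * 50)
print(f"{est['total'] / 1e9:.2f} GB")  # 2.33 GB
```

In this made-up case the activations account for about 1.93 GB of the 2.33 GB total, which is why tensors that look small individually still exhaust memory.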

BF16 vs FP16: same memory, what's the difference?

Both are 2 bytes. BF16 keeps FP32's 8 exponent bits but only 7 mantissa bits (same range, less precision), so it rarely overflows and usually trains without loss scaling; FP16 has 10 mantissa bits but only 5 exponent bits, so its largest finite value is 65504. Memory math is identical; choose by numerical behavior and hardware.
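The range-versus-precision trade-off follows directly from the bit layouts. The sketch below derives the largest finite value and the machine epsilon from the exponent and mantissa widths; the helper name `float_format_stats` is an assumption for this example.

```python
def float_format_stats(exponent_bits, mantissa_bits):
    """Largest finite value and machine epsilon for an IEEE-style binary format."""
    bias = 2 ** (exponent_bits - 1) - 1
    max_finite = (2 - 2.0 ** -mantissa_bits) * 2.0 ** bias
    epsilon = 2.0 ** -mantissa_bits  # spacing between 1.0 and the next value
    return max_finite, epsilon

# (exponent bits, mantissa bits); every format also has one sign bit.
FORMATS = {"FP32": (8, 23), "BF16": (8, 7), "FP16": (5, 10)}

for name, (e, m) in FORMATS.items():
    max_finite, eps = float_format_stats(e, m)
    print(f"{name}: max ≈ {max_finite:.4g}, epsilon = {eps:.3g}")
```

BF16's maximum lands near 3.39e38, essentially FP32's range, while FP16 tops out at 65504; in exchange FP16's epsilon is 8× smaller than BF16's.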

How is a half-byte INT4 value stored?

Two 4-bit values pack into each byte, plus per-group scale factors stored separately (see our quantization calculators for the real effective rates ~4.1–4.8 bits). This tool's 0.5 B option gives the idealized packed size.
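A small sketch of both ideas: nibble packing, and the overhead that per-group scales add. The low-nibble-first layout, the 16-bit scale and the group sizes of 128 and 32 are assumptions chosen for illustration; real quantization kernels use a variety of layouts and metadata formats.

```python
def pack_int4(values):
    """Pack unsigned 4-bit values (0..15) two per byte, low nibble first."""
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    padded = list(values) + [0] * (len(values) % 2)  # pad odd lengths with a zero nibble
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))

def unpack_int4(packed, count):
    """Inverse of pack_int4; `count` drops the padding nibble, if any."""
    out = []
    for byte in packed:
        out.extend((byte & 0x0F, byte >> 4))
    return out[:count]

def effective_bits(group_size, scale_bits=16, zero_point_bits=0):
    """Bits per weight once per-group metadata is amortized over the group."""
    return 4 + (scale_bits + zero_point_bits) / group_size

packed = pack_int4([1, 2, 3, 15, 7])
print(len(packed), unpack_int4(packed, 5))      # 3 [1, 2, 3, 15, 7]
print(effective_bits(128), effective_bits(32))  # 4.125 4.5
```

Under these assumptions the metadata raises the cost from an ideal 4 bits to between 4.125 and 4.5 bits per weight, which is the gap between the 0.5 B option and real checkpoints.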

Do strides and views change memory use?

Views, slices and transposes share the original storage, so they cost no extra memory. But .contiguous() on a non-contiguous tensor, reshape calls that cannot be expressed as a view, and out-of-place elementwise ops each allocate a new full-size storage. When tracing an OOM, count materialized storages, not tensor objects.
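Why a transpose is free while a later .contiguous() is not can be seen from stride arithmetic alone. The following is a framework-free sketch: `contiguous_strides`, `transpose` and `is_contiguous` are hypothetical helpers written for this example, and the contiguity check is simplified (it does not special-case size-1 dimensions the way PyTorch does).

```python
def contiguous_strides(shape):
    """Row-major strides, in elements, for a freshly allocated tensor."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def transpose(shape, strides, a, b):
    """A transpose only swaps shape and stride entries; storage is untouched."""
    shape, strides = list(shape), list(strides)
    shape[a], shape[b] = shape[b], shape[a]
    strides[a], strides[b] = strides[b], strides[a]
    return tuple(shape), tuple(strides)

def is_contiguous(shape, strides):
    """True when the strides match the row-major layout for this shape."""
    return strides == contiguous_strides(shape)

shape = (64, 3, 224, 224)
strides = contiguous_strides(shape)
t_shape, t_strides = transpose(shape, strides, 1, 3)
print(strides, is_contiguous(shape, strides))  # (150528, 50176, 224, 1) True
print(t_shape, t_strides, is_contiguous(t_shape, t_strides))
```

The transposed view reuses the same 9,633,792 elements with reordered strides; because those strides no longer match the row-major layout, making it contiguous means allocating a second storage of the same size.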

Related ML & AI tools

ROC-AUC Calculator (from TPR/FPR points)

Trapezoidal area under the ROC curve from your (FPR, TPR) operating points — the threshold-independent ranking score.

Classification Threshold Cost Calculator

Find the probability cutoff that minimizes expected cost given your false-positive and false-negative penalties.

Silhouette Score Calculator

Cluster cohesion vs separation for one point — the building block of the silhouette metric for choosing K.
