🧠

Machine Learning & AI Engineering

Calculators for ML engineers — VRAM and KV-cache budgets, parameter and FLOPs counts, training memory, and every classification/regression/ranking metric.

⭐ Popular ML & AI tools

🧠

ROC-AUC Calculator (from TPR/FPR points)

Trapezoidal area under the ROC curve from your (FPR, TPR) operating points — the threshold-independent ranking score.

● Live

🧠

Classification Threshold Cost Calculator

Find the probability cutoff that minimizes expected cost given your false-positive and false-negative penalties.

● Live

🧠

Silhouette Score Calculator

Cluster cohesion vs separation for one point — the building block of the silhouette metric for choosing K.

● Live

🧠

Adjusted Rand Index Calculator

Agreement between two clusterings (or clustering vs ground truth), corrected for chance — from pair counts.

● Live

🧠

Expected Calibration Error (ECE) Calculator

Weighted gap between predicted confidence and actual accuracy across bins — the standard model-calibration metric.

● Live

🧠

Top-K Accuracy Calculator

Fraction of samples whose true label is among the model's top-K predictions — the ImageNet-style metric.

● Live

More ML & AI tools

🧠

MAPE & sMAPE Calculator

Percentage forecast error and its symmetric variant — the demand/sales forecasting accuracy metrics, with their pitfalls.

● Live

🧠

MASE (Mean Absolute Scaled Error) Calculator

Scale-free forecast accuracy benchmarked against a naive forecast; unlike MAPE, it stays defined when the series contains zeros.

● Live

🧠

Normalized Mutual Information Calculator

Information shared between two clusterings or label sets, normalized to 0–1 — a chance-aware clustering metric.

● Live

🧠

Recommender Hit Rate & Coverage Calculator

Hit rate, catalog coverage and an aggregate-diversity view for recommendation systems — beyond just accuracy.

● Live

🧠

QA Exact-Match & Token-F1 Calculator

SQuAD-style exact-match and token-overlap F1 between a predicted answer and the reference — the QA evaluation pair.

● Live

🧠

Word Error Rate (WER) & CER Calculator

Speech-recognition and OCR accuracy — substitutions, deletions and insertions per reference word/character.

● Live

🧠

Classification Metrics — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for any binary classifier from TP/FP/FN/TN counts.

● Live

🧠

Spam Filter — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for a spam filter from TP/FP/FN/TN counts.

● Live

🧠

Medical Diagnostic Test — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for a medical diagnostic test from TP/FP/FN/TN counts.

● Live

🧠

Fraud Detection — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for fraud detection from TP/FP/FN/TN counts.

● Live

🧠

Customer Churn Prediction — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for customer churn prediction from TP/FP/FN/TN counts.

● Live

🧠

Sentiment Analysis — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for sentiment analysis from TP/FP/FN/TN counts.

● Live

🧠

Image Classification (binary) — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for binary image classification from TP/FP/FN/TN counts.

● Live

🧠

Manufacturing Defect Detection — Confusion Matrix & Metrics Calculator

Compute accuracy, precision, recall, F1, specificity, MCC and more for manufacturing defect detection from TP/FP/FN/TN counts.

● Live

🧠

F-beta Score Calculator

Weighted harmonic mean of precision and recall — tune β to favor recall (β>1) or precision (β<1) for your problem.

● Live

🧠

Cohen's Kappa Calculator

Inter-rater / model-vs-label agreement corrected for chance — the metric for labeling quality and annotation studies.

● Live

🧠

Macro / Micro / Weighted F1 Calculator

Aggregate per-class precision and recall into macro, micro and weighted F1 — and see why they disagree on imbalanced data.

● Live

🧠

PPV / NPV & Prevalence Calculator

Turn a test's sensitivity and specificity into predictive values at any prevalence — the base-rate calculator.

● Live

🧠

Balanced Accuracy & Youden's J Calculator

Average of sensitivity and specificity, an accuracy measure that is robust to class imbalance, plus Youden's J for optimal threshold selection.

● Live

🧠

Regression Metrics Calculator (MAE, MSE, RMSE, R²)

Paste predicted vs actual values and get MAE, MSE, RMSE, MAPE and R² — every regression error metric at once.

● Live

🧠

Log Loss / Cross-Entropy Calculator

Binary cross-entropy for predicted probabilities vs true labels — the loss that punishes confident wrong answers.

● Live

🧠

Brier Score Calculator

Mean squared error of probability forecasts — the calibration metric for weather, risk and probabilistic models.

● Live

🧠

Precision@K & Recall@K Calculator

Top-K ranking quality for search, recommendations and RAG retrieval — relevance of the results you actually show.

● Live

🧠

Mean Reciprocal Rank (MRR) Calculator

Average of 1/rank of the first correct answer — the metric for QA, autocomplete and single-answer retrieval.

● Live

🧠

NDCG Calculator

Normalized Discounted Cumulative Gain — the gold-standard ranking metric with graded relevance and position discounting.

● Live

🧠

Mean Average Precision (MAP) Calculator

Average precision across recall levels for binary relevance — the IR and object-detection ranking standard.

● Live

🧠

Cosine Similarity Calculator

Angle-based similarity between two vectors — the core of semantic search, embeddings and recommendation.

● Live

🧠

Vector Distance Calculator (Euclidean, Manhattan, Chebyshev, Minkowski)

Compute L1, L2, L∞ and general Lp distances between two vectors — and see how the metric changes the 'nearest' point.

● Live

🧠

Levenshtein Edit Distance Calculator

Minimum insert/delete/substitute edits between two strings — plus normalized similarity for fuzzy matching and WER.

● Live

🧠

BLEU Score Calculator

N-gram precision with brevity penalty for machine translation and generation — the classic MT metric, computed live.

● Live

🧠

ROUGE Score Calculator

ROUGE-N and ROUGE-L recall/precision/F1 for summarization — how much of the reference your summary covers.

● Live

🧠

Perplexity Calculator

Convert a language model's cross-entropy loss (nats or bits) to perplexity — the 'effective branching factor' of predictions.

● Live

🧠

IoU & Dice Coefficient Calculator

Intersection-over-Union and Dice/F1 overlap for object detection and image segmentation — from areas or set counts.

● Live

🧠

Llama 3 8B VRAM Calculator

Estimate GPU memory to run Llama 3 8B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Llama 3 70B VRAM Calculator

Estimate GPU memory to run Llama 3 70B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Llama 3.1 405B VRAM Calculator

Estimate GPU memory to run Llama 3.1 405B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Llama 2 7B VRAM Calculator

Estimate GPU memory to run Llama 2 7B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Mistral 7B VRAM Calculator

Estimate GPU memory to run Mistral 7B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Mixtral 8x7B VRAM Calculator

Estimate GPU memory to run Mixtral 8x7B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Qwen2.5 7B VRAM Calculator

Estimate GPU memory to run Qwen2.5 7B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Qwen2.5 72B VRAM Calculator

Estimate GPU memory to run Qwen2.5 72B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Phi-3 Mini 3.8B VRAM Calculator

Estimate GPU memory to run Phi-3 Mini 3.8B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Gemma 2 9B VRAM Calculator

Estimate GPU memory to run Gemma 2 9B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Gemma 2 27B VRAM Calculator

Estimate GPU memory to run Gemma 2 27B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

DeepSeek-V3 671B (MoE) VRAM Calculator

Estimate GPU memory to run DeepSeek-V3 671B (MoE) — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

StableLM 2 1.6B VRAM Calculator

Estimate GPU memory to run StableLM 2 1.6B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Command R 35B VRAM Calculator

Estimate GPU memory to run Command R 35B — weights, KV cache and overhead at FP16/INT8/INT4, with a fits-on-which-GPU verdict.

● Live

🧠

Llama 3 8B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Llama 3 8B — weights, gradients, optimizer states and activations.

● Live

🧠

Llama 3 70B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Llama 3 70B — weights, gradients, optimizer states and activations.

● Live

🧠

Mistral 7B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Mistral 7B — weights, gradients, optimizer states and activations.

● Live

🧠

Qwen2.5 7B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Qwen2.5 7B — weights, gradients, optimizer states and activations.

● Live

🧠

Qwen2.5 72B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Qwen2.5 72B — weights, gradients, optimizer states and activations.

● Live

🧠

Phi-3 Mini 3.8B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Phi-3 Mini 3.8B — weights, gradients, optimizer states and activations.

● Live

🧠

Gemma 2 9B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Gemma 2 9B — weights, gradients, optimizer states and activations.

● Live

🧠

Gemma 2 27B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Gemma 2 27B — weights, gradients, optimizer states and activations.

● Live

🧠

Llama 2 7B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Llama 2 7B — weights, gradients, optimizer states and activations.

● Live

🧠

Mixtral 8x7B Fine-Tuning VRAM Calculator

Compare full fine-tuning vs LoRA vs QLoRA memory for Mixtral 8x7B — weights, gradients, optimizer states and activations.

● Live

🧠

Llama 3 8B KV-Cache Calculator

Per-token and total key-value cache memory for Llama 3 8B across context length, batch size and cache precision.

● Live

🧠

Llama 3 70B KV-Cache Calculator

Per-token and total key-value cache memory for Llama 3 70B across context length, batch size and cache precision.

● Live

🧠

Llama 3.1 405B KV-Cache Calculator

Per-token and total key-value cache memory for Llama 3.1 405B across context length, batch size and cache precision.

● Live

🧠

Mistral 7B KV-Cache Calculator

Per-token and total key-value cache memory for Mistral 7B across context length, batch size and cache precision.

● Live

🧠

Qwen2.5 7B KV-Cache Calculator

Per-token and total key-value cache memory for Qwen2.5 7B across context length, batch size and cache precision.

● Live

🧠

Qwen2.5 72B KV-Cache Calculator

Per-token and total key-value cache memory for Qwen2.5 72B across context length, batch size and cache precision.

● Live

🧠

Gemma 2 9B KV-Cache Calculator

Per-token and total key-value cache memory for Gemma 2 9B across context length, batch size and cache precision.

● Live

🧠

Llama 2 7B KV-Cache Calculator

Per-token and total key-value cache memory for Llama 2 7B across context length, batch size and cache precision.

● Live

🧠

DeepSeek-V3 671B (MoE) KV-Cache Calculator

Per-token and total key-value cache memory for DeepSeek-V3 671B (MoE) across context length, batch size and cache precision.

● Live

🧠

Command R 35B KV-Cache Calculator

Per-token and total key-value cache memory for Command R 35B across context length, batch size and cache precision.

● Live

🧠

GGUF Q4_K_M Model Size Calculator

File size and RAM needed for any model in llama.cpp's most popular quant — Q4_K_M.

● Live

🧠

GGUF Q5_K_M Model Size Calculator

Size a Q5_K_M GGUF — the 'quality first' k-quant — for any parameter count, with RAM headroom.

● Live

🧠

GGUF Q8_0 Model Size Calculator

Near-lossless 8-bit GGUF sizing — when you want FP16 quality at half the memory.

● Live

🧠

GGUF Q2_K Model Size Calculator

Extreme-compression sizing: what a 2-bit k-quant really costs in GB and what you give up.

● Live

🧠

GPTQ 4-bit Model Size Calculator

VRAM footprint of GPTQ-quantized checkpoints (group size 128) for GPU serving.

● Live

🧠

AWQ 4-bit Model Size Calculator

Activation-aware 4-bit (AWQ) checkpoint sizing for vLLM/TensorRT-LLM deployments.

● Live

🧠

FP8 Model Size Calculator

Size models stored in FP8 (E4M3) — the H100-era serving format — with per-tensor scale overhead.

● Live

🧠

Transformer Parameter Count Calculator

Count parameters of a decoder-only transformer from layers, hidden size, FFN width, vocab and GQA heads.

● Live

🧠

Attention Layer Parameter Calculator

Parameters of one multi-head / GQA / MQA attention block — Wq, Wk, Wv, Wo broken out.

● Live

🧠

Feed-Forward (FFN/MLP) Parameter Calculator

Parameters and memory of a transformer FFN block — classic GELU vs gated SwiGLU, any width multiple.

● Live

🧠

Embedding Parameter & Memory Calculator

Vocab × hidden embedding-table cost — tied vs untied, plus the memory bill at FP16/INT8.

● Live

🧠

LLM FLOPs-per-Token Calculator

Forward-pass compute per token (≈2P) plus attention's context-dependent term — and tokens/s on your GPU.

● Live

🧠

Chinchilla Training FLOPs Calculator

Total training compute C = 6·N·D, the Chinchilla-optimal token count, and your over/under-training ratio.

● Live

🧠

LLM Training GPU-Hours & Cost Calculator

Turn a training budget (params × tokens) into GPU-hours, wall-clock days and rental dollars at your MFU.

● Live

🧠

MFU (Model FLOPs Utilization) Calculator

Measure how efficiently your training run uses its GPUs — observed tokens/s vs theoretical peak.

● Live

🧠

Attention FLOPs Calculator

Quadratic cost of self-attention: score and value FLOPs for a full sequence, vs the linear projection cost.

● Live

🧠

MoE Active-Parameter Calculator

Total vs active parameters of a mixture-of-experts transformer — experts, top-k routing and shared blocks.

● Live

🧠

Conv2D Output Size Calculator

Output H×W of a 2-D convolution from input size, kernel, stride, padding and dilation — with the floor-division gotcha.

● Live

🧠

Conv2D Parameter Calculator

Weights and biases of a conv layer — standard, depthwise and depthwise-separable compared side by side.

● Live

🧠

Conv2D FLOPs Calculator

Multiply-adds of a convolutional layer over a full feature map — see why early layers dominate compute.

● Live

🧠

Conv1D Output Size Calculator (Audio & Sequences)

Output length of 1-D convolutions over audio samples, sensor streams or token sequences — wav2vec-style stacks included.

● Live

🧠

Conv3D Output Size Calculator (Video & Medical)

Output D×H×W of 3-D convolutions for video clips and CT/MRI volumes — with voxel/frame budgeting.

● Live

🧠

Transposed Conv2D Output Calculator

Upsampling output size for deconvolutions in GANs, U-Nets and segmentation decoders — incl. output_padding.

● Live

🧠

Dilated Conv & Receptive Field Calculator

Effective kernel size and stacked receptive field of dilated (atrous) convolutions — WaveNet/DeepLab style.

● Live

🧠

'Same' Padding Calculator

Exact padding to preserve (or precisely control) feature-map size for any kernel/stride/dilation — incl. asymmetric TF cases.

● Live

🧠

Pooling Output Size Calculator

Max/avg pooling output sizes — with ceil_mode, the overlapping-pool case, and global pooling explained.

● Live

🧠

Tensor Memory Size Calculator

Bytes of any tensor from its shape and dtype — parse '64, 3, 224, 224' and compare FP32/BF16/INT8/INT4.

● Live

🧠

Matrix Multiplication FLOPs Calculator

Shape check + exact FLOPs/memory of (M×K)·(K×N) — and whether the matmul is compute- or bandwidth-bound on your GPU.

● Live

🧠

NumPy Broadcasting Shape Checker

Check whether two tensor shapes broadcast together and get the result shape — with the rule applied step by step.

● Live

🧠

Reshape & View Validity Checker

Verify a reshape preserves the element count, solve the -1 wildcard, and learn when view() fails but reshape() works.

● Live

🧠

Linear/Dense Layer Calculator

Parameters, FLOPs and activation memory of fully-connected layers — the MLP-head and projection workhorse.

● Live

🧠

Attention Matrix Memory Calculator

Memory of the s×s attention-score matrix — the quadratic wall FlashAttention removes. See it before you hit it.

● Live

🧠

RTX 3060 12GB LLM Calculator — What Can It Run?

Check which models fit on an RTX 3060 12GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

RTX 4060 Ti 16GB LLM Calculator — What Can It Run?

Check which models fit on an RTX 4060 Ti 16GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

RTX 3090 24GB LLM Calculator — What Can It Run?

Check which models fit on an RTX 3090 24GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

RTX 4090 24GB LLM Calculator — What Can It Run?

Check which models fit on an RTX 4090 24GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

RTX 5090 32GB LLM Calculator — What Can It Run?

Check which models fit on an RTX 5090 32GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

RTX A6000 48GB LLM Calculator — What Can It Run?

Check which models fit on an RTX A6000 48GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

Tesla T4 16GB LLM Calculator — What Can It Run?

Check which models fit on a Tesla T4 16GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

A100 80GB LLM Calculator — What Can It Run?

Check which models fit on an A100 80GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

H100 80GB LLM Calculator — What Can It Run?

Check which models fit on an H100 80GB: max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

Apple M4 Max (128GB unified) LLM Calculator — What Can It Run?

Check which models fit on an Apple M4 Max (128GB unified): max parameters per precision, cache headroom and estimated tokens/sec.

● Live

🧠

Activation Memory Calculator (Training)

Forward-pass activations a transformer stores for backprop — per layer and total, with and without checkpointing.

● Live

🧠

Optimizer State Memory Calculator

Memory of AdamW, SGD, Adafactor, 8-bit Adam and Lion states for any model size — the hidden 2–3× weights tax.

● Live

🧠

ZeRO / FSDP Per-GPU Memory Calculator

Per-GPU memory at ZeRO stages 0–3 (FSDP full-shard) for any model and cluster size — see what each stage shards.

● Live

🧠

Gradient Accumulation Calculator

Steps to simulate a target global batch from your micro-batch and GPU count — with tokens-per-step and LR advice.

● Live

🧠

Learning Rate Scaling Calculator

Scale a known-good LR to a new batch size — linear and square-root rules, with warmup-steps suggestion.

● Live

🧠

Cosine LR Schedule Calculator

Learning rate at any step of a warmup-plus-cosine schedule — the default decay of modern LLM training.

● Live

🧠

Dataset Tokens, Epochs & Steps Calculator

Convert dataset size, batch and sequence length into optimizer steps and epochs — and check repeat-data limits.

● Live

🧠

Words ↔ Tokens Calculator

Estimate LLM tokens from words, characters or pages — per-language ratios for English, code, Hindi, Chinese.

● Live

🧠

LoRA Adapter Size Calculator

Trainable parameters and file size of LoRA adapters from rank, target modules and model architecture.

● Live

🧠

RoPE Context Extension Calculator

Stretch a model's context with RoPE scaling — linear vs NTK-aware factors, effective θ and quality expectations.

● Live

🧠

Vector Database Storage Calculator

RAM/disk for N embeddings at any dimension — float32 vs PQ/binary compression, plus HNSW index overhead.

● Live

🧠

Speculative Decoding Speedup Calculator

Expected speedup from draft-model speculation given acceptance rate, draft length and model cost ratio.

● Live

🧠

API vs Self-Hosting LLM Cost Calculator

Break-even between per-token API pricing and renting GPUs — utilization is the whole story; see yours.

● Live

🧠

Model Download Time Calculator

How long a checkpoint takes to pull at your bandwidth — with the real-world Hub throttle and resume math.

● Live

🧠

Knowledge Distillation Compression Calculator

Size/speed/quality trade of distilling a teacher into a student — compression ratio, capacity gap and serving gains.

● Live

🧠

Pruning & Sparsity Savings Calculator

Memory and FLOPs after unstructured, 2:4 semi-structured, or structured pruning — with realistic speedup caveats.

● Live

🧠

GPU Electricity Cost Calculator

Power cost of running GPUs — per hour, month, and per million tokens, at your local tariff.

● Live

🧠

LLM Batching Throughput & Latency Calculator

How batch size trades per-user latency for aggregate tokens/sec — the roofline crossover made interactive.

● Live

🧠

Custom LLM VRAM Calculator (Any Architecture)

Full inference-memory budget for ANY transformer from raw config.json fields — weights, KV cache, overhead.

● Live

🧠

Context Window Token Budget Calculator

Split a context window between system prompt, RAG chunks, history and output — and catch overflow before the API does.

● Live

🧠

Mixed Precision Training Memory Calculator

FP32 vs FP16/BF16-mixed vs pure-BF16 vs FP8 training memory side by side — where the savings actually are.

● Live

🧠

KV-Cache Size Calculator (Any Model)

Generic per-token and total KV-cache memory from architecture fields — the long-context budgeting workhorse.

● Live

🧠

Warmup Steps Calculator

Right-size LR warmup from total steps, batch size and model scale — with the heuristics labs actually use.

● Live

🧠

Training Tokens-per-Dollar Calculator

Normalize GPU offers into the metric that decides training budgets — billions of tokens trained per dollar.

● Live