Robot Calibration & Hand-Eye Calibration: The Ultimate Guide

Q: Do I need a laser tracker, or can I use a cheaper method?

For true absolute accuracy (~0.15 mm) you need to measure ~10× tighter (~0.015 mm), which means a laser tracker or comparable photogrammetry. Cheaper methods (ballbar, reference sphere, vision target) reach ~0.3–0.5 mm — fine for a sanity check or a budget shop, not for verified sub-0.2 mm accuracy. Rent a tracker for the day if buying isn't justified.

A six-axis industrial arm will return to the same taught point thousands of times and land within ±0.03 mm of where it was last time. Show the same arm a new point (one it has never been taught, computed purely from its kinematic model) and it may miss by 1 mm. Sometimes 2 mm. The datasheet number that says "repeatability ±0.02 mm" is true, and the gap between it and that 1 mm miss is the single most expensive misunderstanding in factory automation. People buy a robot for its repeatability and then write programs that depend on its accuracy, which is a different and much worse number, and then they spend three weeks touching up points by hand wondering why offline programming "doesn't work."

Calibration is how you close that gap. Not one thing — a family of related procedures, each attacking a different error source, each with its own measurement instrument, math, and failure modes. This guide walks the whole family: why accuracy and repeatability diverge, where the errors actually come from (and which ones calibration can fix versus which it can only compensate), kinematic identification with a laser tracker, tool-frame and base-frame calibration, mastering and encoder zeroing, the AX=XB hand-eye problem, payload identification, thermal drift, and how you prove the result with ISO 9283. Numbers with units, math you can read, and opinions with the reasons attached.

The take: Repeatability is a property of the hardware; accuracy is a property of the model, and the model is the cheap thing to fix. A €60k arm calibrated to ±0.15 mm absolute will out-perform a €120k arm running its factory-default kinematics for any task that involves CAD-driven points, vision guidance, or moving a program between two "identical" robots. Kinematic calibration is the highest-leverage half-day of measurement in the building — but only if you measure with something an order of magnitude better than your target, identify the observable parameters and no more, and then validate on poses you did not use to fit. Skip the validation and you have not calibrated, you have curve-fitted noise.

Companion reading: robot kinematics & motion planning, encoders, machine vision, and industrial robot arms.

Key takeaways
Accuracy vs repeatability: the gap that surprises people
Where the errors come from
Kinematic calibration: identifying the model
The measurement step: trackers, CMMs, photogrammetry
TCP and tool-frame calibration
Base and work-object frame calibration
Mastering, homing & encoder zeroing
Hand-eye calibration: the AX=XB problem
Payload & load identification
Thermal compensation & drift
When calibration pays off
Validation per ISO 9283
Tools & practical workflow
Frequently asked questions

Key takeaways

Repeatability and accuracy are different numbers, and the gap is large. A modern 6-axis arm is repeatable to ±0.02–0.05 mm but accurate (out of the box) only to ±0.5–2 mm. Repeatability is set by encoders, backlash, and structural stiffness; accuracy is set by how well the controller's model of the arm matches the steel. Calibration fixes the model, not the steel.
~80–90% of absolute-position error is geometric: wrong link lengths, twists, and joint offsets baked into the controller's nominal Denavit–Hartenberg table. These are constant, observable, and fully correctable by kinematic identification. This is why kinematic calibration delivers the biggest single improvement, typically taking a robot from ~1 mm to ~0.15 mm.
The remaining error is non-geometric and harder. Joint compliance (gravity sag and payload deflection), gearbox backlash and transmission error, thermal growth, and encoder eccentricity. Compliance and thermal effects can be modeled and compensated; backlash you mostly design around with consistent approach directions.
Measure with an instrument ~10× better than your target. Laser trackers (Leica AT960, API Radian, FARO Vantage) give ~15 µm + 6 µm/m volumetric accuracy and are the default for arm calibration. Photogrammetry/Creaform for larger volumes; a CMM only for small workcells or end-effectors.
TCP calibration is geometry, not kinematics. The 4-point method finds tool position by jogging one physical point to a fixed tip from several orientations; you need 5–6 points and orientation references to get the full tool frame. Garbage TCP makes good kinematics look broken.
Mastering/homing must be right first. Encoder zero offsets are part of the kinematic model. If a joint's zero is off by 0.1°, no amount of link-length fitting saves you — the error couples into every pose. Re-master after any motor/encoder/gearbox service.
Hand-eye calibration solves AX=XB — finding the rigid transform between the camera and either the flange (eye-in-hand) or the world (eye-to-hand). Tsai–Lenz and Park–Martin are the classic closed-form solvers; modern pipelines (OpenCV calibrateHandEye, MoveIt hand-eye, ROS) refine with nonlinear least squares. Rotation accuracy depends on having large, varied rotations between poses.
Payload identification matters for accuracy and safety. Wrong mass/CoG/inertia degrades path accuracy, trips collision detection, and on cobots breaks the force estimate. Most controllers (KUKA LoadDataDetermination, ABB LoadIdentify, FANUC) auto-identify it by running a characterization move.
Thermal drift is real and sneaky. A robot can move 0.1–0.3 mm over the first 1–2 hours from cold as joints warm. For sub-0.1 mm work, warm up the robot, or add temperature sensors and a thermal model.
Calibration pays off when you depend on accuracy, not repeatability: offline programming from CAD, vision-guided picking, multi-robot cells where programs must port between arms, metrology/inspection, and any drill/route/dispense task driven by a CAD path.
Validate per ISO 9283 on poses you did not use to fit the model. Report pose accuracy (AP) and pose repeatability (RP) over the standard test cube at 10%/50%/100% rated load and speed. A calibration that isn't validated on a hold-out set is not trustworthy.
Parameter observability is the trap. A naive DH model has redundant parameters that are not observable from the measurement geometry; fitting them blindly amplifies noise. Use a model that drops the unobservable parameters (modified-DH plus the Hayati correction for near-parallel axes) and check the identification Jacobian's condition number.

Accuracy vs repeatability: the gap that surprises people

The two words get used interchangeably in casual speech and they are not interchangeable at all. The dartboard analogy is overused but correct: repeatability is how tightly your throws cluster; accuracy is how close that cluster sits to the bullseye. A robot can be exquisitely repeatable and badly inaccurate — a tight cluster two inches left of center.

Repeatability (RP) is the robot's ability to return to a previously taught pose. You jog the arm to a point, save the joint angles, and command it back. The encoders read the same counts, the joints servo to the same angles, the tool lands in the same place — within the spread caused by encoder resolution, servo settling, backlash on the approach, and structural micro-vibration. This is what the datasheet's headline number describes, and for a quality 6-axis arm it is genuinely ±0.02–0.05 mm.

Accuracy (AP, "absolute accuracy") is the robot's ability to reach a pose specified in Cartesian coordinates it has never been taught — for example, a point read from a CAD file or computed by a vision system. To do this the controller runs inverse kinematics on its internal model of the arm, computes joint angles, and servos there. If the model says link 2 is 700.0 mm long but the steel is actually 700.4 mm, every IK solve inherits that error. Out of the box, absolute accuracy is typically ±0.5–2 mm, sometimes worse near the edge of the workspace.

Rule: Teach-and-repeat programs lean on repeatability and don't care about accuracy. Anything driven by external coordinates — CAD, vision, another robot's frame — leans on accuracy. Know which kind of program you are writing before you trust a number.

Here is the crux: calibration cannot improve repeatability. Repeatability is a hardware property — you change it by buying better encoders, stiffer gearboxes, less backlash, a heavier casting. Calibration improves accuracy by correcting the model, and it can only ever get you as good as your repeatability. If the arm scatters ±0.05 mm on a repeated point, no model on earth makes it accurate to ±0.01 mm. Repeatability is the floor; accuracy after calibration approaches but never beats it.

Property	What it measures	Set by	Typical 6-axis arm	Improved by
Repeatability (RP)	Return to a taught pose	Encoders, backlash, stiffness, servo	±0.02–0.05 mm	Better hardware (not calibration)
Accuracy (AP)	Reach a commanded Cartesian pose	Kinematic model fidelity	±0.5–2 mm (uncalibrated)	Calibration (model fitting)
Accuracy after kinematic cal	same	Model + measurement quality	±0.10–0.20 mm	More measurements, better instrument
Accuracy after full cal (+compliance/thermal)	same	Model + compensation	±0.05–0.10 mm	Compliance & thermal modeling

The gap between the uncalibrated accuracy row and the after-kinematic-cal row in that table (roughly 1 mm down to 0.15 mm, a factor of ~6–10) is what kinematic calibration buys you in a half-day. That is the leverage.

Where the errors come from

To know what calibration can and cannot fix, you have to know the error budget. Errors split cleanly into geometric (constant, in the kinematic geometry) and non-geometric (load- or temperature- or direction-dependent). Roughly 80–90% of absolute error in a well-built arm is geometric, which is the good news: geometric error is constant and fully correctable.

Geometric errors

These are mismatches between the controller's nominal kinematic parameters and the as-built machine. Every revolute joint contributes four DH parameters; manufacturing tolerances and assembly put each one slightly off:

Link length (a) error — the perpendicular distance between consecutive joint axes is off by tenths of a millimeter. Castings and machined surfaces have tolerances.
Link twist (α) error — consecutive joint axes aren't perfectly perpendicular/parallel as the nominal model assumes; they're off by hundredths of a degree. Small angles, long lever arms.
Joint offset (d) error — translation along a joint axis is slightly wrong.
Joint angle offset (θ offset, the encoder zero) — the angle the controller calls "zero" doesn't coincide with the geometric zero. This is the mastering error and it's the biggest single geometric contributor because it sits at the base of the chain and multiplies down it.

The leverage of an angular error is what makes this brutal. A small joint-angle error becomes a Cartesian error proportional to the distance from that joint to the tool:

Tip error from a single joint-angle error:

    e ≈ θ_err · L

where  θ_err = joint angle error (radians)
       L     = distance from that joint axis to the TCP (mm)

Example: θ_err = 0.05° on joint 1, TCP at L = 1500 mm reach
    θ_err = 0.05° × (π/180) = 8.73e-4 rad
    e ≈ 8.73e-4 × 1500 mm ≈ 1.31 mm

A twentieth of a degree at the base = 1.3 mm at the tool.
This is why mastering and base-joint zeros dominate the budget.

That single line — e ≈ θ_err · L — explains most of the surprise. Angular errors are tiny and the lever arm is long. It also explains why the base joints (1, 2, 3) matter far more than the wrist joints (4, 5, 6) for position accuracy: they have the whole arm hanging off them as a lever.
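The lever-arm relation is easy to check numerically. The following is a minimal sketch using the small-angle approximation from the formula above; the function name tip_error_mm and the 150 mm wrist lever arm are illustrative assumptions, not values from any robot datasheet.

```python
import math

def tip_error_mm(joint_error_deg: float, lever_arm_mm: float) -> float:
    """Small-angle tip error e ≈ θ_err · L, with θ_err converted to radians."""
    return math.radians(joint_error_deg) * lever_arm_mm

# Worked example from the text: 0.05° on joint 1, TCP 1500 mm from the axis.
base_error = tip_error_mm(0.05, 1500.0)   # about 1.31 mm

# Same angular error on a wrist joint with an assumed 150 mm lever arm.
wrist_error = tip_error_mm(0.05, 150.0)   # about 0.13 mm
```

The same angular error costs ten times less at the wrist simply because the lever arm is ten times shorter.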

Non-geometric errors

These don't live in the link geometry and a pure DH fit can't capture them:

Joint compliance / structural deflection — gearboxes (especially harmonic drives) and links are not rigid. Under gravity and payload, the arm sags. A 10 kg payload at 1.5 m reach can deflect the tool 0.2–0.5 mm. This is configuration- and load-dependent, so it shows up as a residual that varies across the workspace. Compliance can be modeled (joint stiffness coefficients, often called elasto-geometric or stiffness calibration) and compensated.
Backlash — lost motion in the gear train when a joint reverses direction. Causes the tool to land in a slightly different place depending on approach direction. Hard to model cleanly; the practical fix is to always approach points from the same direction (unidirectional approach), which is also good practice for repeatability.
Gear transmission error — the output angle isn't a perfectly linear function of motor angle. Harmonic drives have a characteristic ripple at two cycles per input (wave-generator) revolution, with an amplitude of tens of arc-seconds. Periodic, position-dependent. Some high-end calibration captures it; most don't bother.
Thermal growth — links and gearboxes expand as they warm from cold start and from gearbox self-heating. Steel expands ~12 µm/m/°C, aluminum ~23 µm/m/°C. A 10 °C rise over a 1.5 m arm is ~0.18 mm (steel) to ~0.35 mm (aluminum). Slow drift over the first hour or two.
Encoder eccentricity / runout — if the encoder disc isn't perfectly centered on its axis, you get a once-per-revolution sinusoidal angle error. See encoders for why mounting and bearing quality dominate here.
Dynamic errors — tracking error during motion, vibration, controller lag. These are speed-dependent and are not what static calibration addresses (path accuracy at speed is its own ISO 9283 test).
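The thermal-growth figures in the list above are plain linear-expansion arithmetic. As a rough sketch, treating the whole arm as one uniform bar (a simplification; a real arm mixes materials and heats unevenly), with a hypothetical helper name:

```python
def thermal_growth_mm(length_m: float, delta_t_c: float, cte_um_per_m_c: float) -> float:
    """Linear expansion: CTE [µm/m/°C] × length [m] × temperature rise [°C], in mm."""
    return cte_um_per_m_c * length_m * delta_t_c / 1000.0

# 1.5 m arm, 10 °C rise, using the expansion coefficients quoted in the text.
steel_mm = thermal_growth_mm(1.5, 10.0, 12.0)      # 0.18 mm
aluminum_mm = thermal_growth_mm(1.5, 10.0, 23.0)   # 0.345 mm
```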

Error source	Type	Typical magnitude	Behavior	Calibration fixes it?
Link length / twist / offset	Geometric	0.1–0.5 mm equiv.	Constant	Yes — kinematic identification
Encoder zero (mastering)	Geometric	0.5–2 mm if off	Constant	Yes — re-master + identify
Joint compliance (gravity/payload)	Non-geometric	0.1–0.5 mm	Config/load-dependent	Partly — stiffness model
Backlash	Non-geometric	0.02–0.1 mm	Direction-dependent	No — design around it
Gear transmission error	Non-geometric	tens of arc-sec	Periodic in joint angle	Rarely — advanced only
Thermal growth	Non-geometric	0.1–0.35 mm	Slow drift, time/temp	Partly — warm-up or thermal model
Encoder eccentricity	Non-geometric	arc-sec to arc-min	Periodic, 1/rev	Partly — per-joint correction
Dynamic / tracking	Dynamic	speed-dependent	Transient	No — controller tuning

Rule: Kinematic calibration corrects the constant geometric ~85% of the budget. To go below ~0.15 mm you have to start fighting the non-geometric residue — compliance and thermal first, because they're the largest and the most modelable.

Kinematic calibration: identifying the model

Kinematic calibration is parameter identification: you measure where the tool actually goes for many known joint configurations, then solve for the kinematic parameters that best explain the measurements. Four steps — model, measure, identify, compensate — and the discipline is mostly in steps one and three.

Step 1: The model

You need a parameterization of the kinematics whose parameters you'll fit. The standard is Denavit–Hartenberg, and you should use modified-DH (Craig's convention), which places the frame at the near end of each link and makes the parameter assignment cleaner for identification. Each joint contributes the four parameters from above: a (link length), α (link twist), d (link offset), θ (joint angle, with the calibrated offset). For an n-joint arm that's 4n nominal parameters plus 6 for the base frame and 6 for the tool — but you will not, and should not, fit all of them. (See robot kinematics for the forward-kinematics machinery these parameters feed.)

There is a famous trap in plain DH: when two consecutive joint axes are parallel (or nearly so — think the shoulder and elbow of most arms), the d and θ parameters become ill-defined and the model is singular with respect to small misalignments. A tiny twist between nominally parallel axes produces a huge, unstable change in d. The fix is the Hayati–Mirmirani correction: for near-parallel joints, replace the d parameter with an extra rotation parameter β about the y-axis. Use modified-DH + Hayati and this whole class of numerical instability disappears.

Rule: Never fit a raw DH model with near-parallel axes. Use modified-DH with the Hayati β correction for the parallel pairs, or your d parameters will run away to absurd values and your fit will look great on the training data and terrible everywhere else.
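To make the parameterization concrete, here is a minimal sketch of one link transform in Craig's modified-DH convention, plus a Hayati-style variant for near-parallel axes. The function names are illustrative. The ordering of the β rotation in hayati_link is an assumption (conventions differ between papers and tools), so check it against whatever calibration software you actually use.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1.0]])

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1.0]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])

def trans(x, y, z):
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

def mdh_link(alpha, a, theta, d):
    """Modified-DH (Craig): Rot_x(alpha) · Trans_x(a) · Rot_z(theta) · Trans_z(d)."""
    return rot_x(alpha) @ trans(a, 0, 0) @ rot_z(theta) @ trans(0, 0, d)

def hayati_link(alpha, a, beta, theta):
    """Near-parallel axes: d is dropped and a small rotation beta about y is fitted
    instead. The placement of Rot_y here is one possible ordering (assumption)."""
    return rot_x(alpha) @ trans(a, 0, 0) @ rot_y(beta) @ rot_z(theta)

# With beta = 0 the Hayati link reduces to a modified-DH link with d = 0.
T_nominal = hayati_link(0.0, 0.7, 0.0, 0.3)
T_misaligned = hayati_link(0.0, 0.7, np.radians(0.02), 0.3)
```

The point of the variant is that a small misalignment between nominally parallel axes shows up as a small β rather than as a large, unstable d.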

Step 2: Measure (covered in detail below)

Drive the robot to a set of m poses spread across the workspace (typically 30–100). At each, record the commanded joint angles q_i and measure the actual tool position (and orientation, if you can) with an external instrument.

Step 3: Identify (the least-squares solve)

This is the heart of it. The measured tool pose is a function of the joint angles and the true-but-unknown parameters p. The nominal model predicts a slightly wrong pose. Linearize the error in the parameters via the identification Jacobian and solve for the parameter corrections:

Kinematic identification — linearized least squares:

  measured pose:    x_i^meas    (from laser tracker)
  predicted pose:   x_i = f(q_i, p_nominal)   (forward kinematics)
  pose residual:    Δx_i = x_i^meas − f(q_i, p_nominal)

  For all m poses, stack:
        Δx = J · Δp        (J = identification Jacobian, ∂x/∂p)

  J has 3m (position-only) or 6m (full-pose) rows
     and as many columns as identifiable parameters.

  Least-squares correction (overdetermined, m >> #params):
        Δp = (Jᵀ J)⁻¹ Jᵀ Δx          (normal equations)
     or solve via SVD / pinv for stability:
        Δp = pinv(J) · Δx

  Update and iterate (it's mildly nonlinear):
        p ← p + Δp   (starting from p = p_nominal),
        recompute Δx and J at the updated p, repeat 2–4×
        until ||Δx|| stops shrinking.

In practice you wrap this in Levenberg–Marquardt rather than raw normal equations — it's more robust when JᵀJ is poorly conditioned, which it often is. The output is a corrected parameter set that you load into the controller (or into your offline model).
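The identify loop can be sketched on a toy problem. The example below is an illustration only, under stated assumptions: a planar 2-link arm with four parameters (two link lengths, two joint-angle offsets), synthetic noise-free "measurements", a finite-difference Jacobian, and plain Gauss–Newton via numpy's least-squares solver rather than Levenberg–Marquardt. All names are hypothetical.

```python
import numpy as np

def fk(q, p):
    """Planar 2-link forward kinematics. q: (m, 2) joint angles [rad].
    p = [l1, l2, off1, off2] (link lengths, joint-angle offsets).
    Returns stacked tool positions as a flat vector [x0, y0, x1, y1, ...]."""
    l1, l2, o1, o2 = p
    a1 = q[:, 0] + o1
    a2 = a1 + q[:, 1] + o2
    x = l1 * np.cos(a1) + l2 * np.cos(a2)
    y = l1 * np.sin(a1) + l2 * np.sin(a2)
    return np.column_stack([x, y]).ravel()

def ident_jacobian(q, p, eps=1e-7):
    """Identification Jacobian ∂x/∂p by central finite differences."""
    J = np.empty((2 * len(q), len(p)))
    for k in range(len(p)):
        dp = np.zeros(len(p))
        dp[k] = eps
        J[:, k] = (fk(q, p + dp) - fk(q, p - dp)) / (2 * eps)
    return J

def identify(q, x_meas, p_nominal, iters=5):
    """Gauss-Newton: solve J·Δp = Δx in the least-squares sense, update, repeat."""
    p = np.array(p_nominal, dtype=float)
    for _ in range(iters):
        dx = x_meas - fk(q, p)                       # pose residual at current p
        J = ident_jacobian(q, p)
        dp, *_ = np.linalg.lstsq(J, dx, rcond=None)  # SVD-based solve
        p = p + dp
    return p, np.linalg.cond(ident_jacobian(q, p))

# Synthetic experiment: 40 poses spread over the joint range (values made up).
rng = np.random.default_rng(0)
q = rng.uniform(-np.pi / 2, np.pi / 2, size=(40, 2))
p_true = np.array([0.7004, 0.5997, 0.0010, -0.0015])   # "as-built" arm
p_nom = np.array([0.7000, 0.6000, 0.0, 0.0])           # controller's model
x_meas = fk(q, p_true)                                 # stands in for tracker data

p_hat, cond = identify(q, x_meas, p_nom)
```

With real tracker data the residual will not go to zero, and the fit should be judged on poses held out of the solve, not on the poses used here.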

Parameter observability

The single most important concept and the one people skip. Not every parameter is observable from your measurements — some combinations of parameters produce identical tool motions and cannot be separated, and some produce motions your measurement geometry never sees. If you try to fit an unobservable parameter, the solver invents a value to soak up noise, and that value makes the model worse on new poses.

Diagnose it with the condition number of the identification Jacobian J. A well-conditioned identification has a condition number in the tens to low hundreds; thousands means you have near-unobservable parameters and the solve is amplifying measurement noise. The fixes: (1) use a minimal, observable parameter set (modified-DH + Hayati already drops the classic redundancies); (2) choose measurement poses that excite the parameters you want — spread orientations and reach widely, don't cluster; (3) optionally run an observability-optimized pose selection (the O1–O5 observability indices in the literature) to pick the most informative configurations.

Rule: Fit only observable parameters, choose poses that excite them, and always check the condition number. An over-parameterized fit with a great training residual and a terrible validation residual is the textbook symptom of fitting noise.
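The effect of pose selection on conditioning can be seen even on a toy arm. This sketch compares the condition number of the identification Jacobian for widely spread poses against poses clustered around one configuration. It assumes a planar 2-link arm with parameters [l1, l2, off1, off2] and an analytic Jacobian; the names and numbers are illustrative, not from any calibration package.

```python
import numpy as np

def ident_jacobian(q, p):
    """Analytic ∂(x, y)/∂[l1, l2, off1, off2] for a planar 2-link arm.
    q: (m, 2) joint angles [rad]. Rows: all x equations, then all y equations."""
    l1, l2, o1, o2 = p
    a1 = q[:, 0] + o1
    a2 = a1 + q[:, 1] + o2
    Jx = np.column_stack([np.cos(a1), np.cos(a2),
                          -l1 * np.sin(a1) - l2 * np.sin(a2), -l2 * np.sin(a2)])
    Jy = np.column_stack([np.sin(a1), np.sin(a2),
                          l1 * np.cos(a1) + l2 * np.cos(a2), l2 * np.cos(a2)])
    return np.vstack([Jx, Jy])

p_nominal = np.array([0.7, 0.6, 0.0, 0.0])
rng = np.random.default_rng(1)

# 40 poses spread over the joint range vs. 40 poses within ±0.02 rad of one pose.
q_spread = rng.uniform(-np.pi / 2, np.pi / 2, size=(40, 2))
q_clustered = np.array([0.3, 0.8]) + rng.uniform(-0.02, 0.02, size=(40, 2))

cond_spread = np.linalg.cond(ident_jacobian(q_spread, p_nominal))
cond_clustered = np.linalg.cond(ident_jacobian(q_clustered, p_nominal))
```

The clustered set has the same number of measurements but far less information: its Jacobian rows are nearly copies of each other, so the condition number climbs and the solve amplifies noise.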

The measurement step: trackers, CMMs, photogrammetry

Your calibration is only as good as your measurement, and the rule is unforgiving: the instrument must be ~10× more accurate than your target. Calibrating to ±0.15 mm means measuring to ~±0.015 mm. That requirement alone rules out most things and points straight at the laser tracker.

Laser trackers are the default for arm calibration. A tracker (Leica Absolute Tracker AT960/AT930, API Radian, FARO Vantage/ION) sends a laser to a spherically-mounted retroreflector (SMR) on the robot flange and measures range by interferometry/absolute distance meter plus two angles. Volumetric accuracy is around ±15 µm + 6 µm/m, so ~±25 µm at 1.5 m. They measure at high rate, track a moving target, and reach across a whole cell. The 6DoF variants (Leica T-Mac, API STS) measure orientation too, which roughly doubles the information per pose and tightens the fit. This is what RoboDK, Dynalog CalibWare, and the OEM calibration services all use.

Photogrammetry / structured-light systems (Creaform MetraSCAN/C-Track, GOM/ZEISS, AICON) track coded targets or a probe with stereo cameras. Accuracy is in the 20–60 µm range over volume, slightly behind a tracker but excellent for large volumes, multi-robot cells, and when you want to digitize a fixture or work-object surface at the same time. C-Track-style dual-camera systems give 6DoF naturally.

CMM (coordinate measuring machine) is the most accurate (single-digit µm) but the worst fit for robot calibration: it's a fixed-volume gantry, you'd have to put the robot inside it, and the working volume rarely matches a robot's reach. Use a CMM to certify a TCP artifact or a small end-effector, not to calibrate the arm in situ.

Low-cost / on-machine methods exist and have their place: a calibrated ballbar or telescoping double-ballbar, a fixed reference sphere probed from many orientations, or vision-based methods using a calibrated camera and target. They get you to ~0.3–0.5 mm — useful for a sanity check or a budget shop, not for true absolute accuracy.

Instrument	Volumetric accuracy	6DoF?	Working volume	Best for	Rough cost
Laser tracker (Leica AT960, API Radian, FARO)	±15 µm + 6 µm/m	Optional (T-Mac/STS)	Whole cell, 10s of m	Arm kinematic calibration (the default)	€80k–150k+
Photogrammetry (Creaform, GOM, AICON)	~20–60 µm	Yes (dual-camera)	Large, multi-robot cells	Large volumes + surface digitizing	€60k–120k
CMM	1–5 µm	Pose via probing	Fixed, small	TCP artifacts, end-effectors	Fixed asset
Ballbar / reference sphere	~30–100 µm	No	Local	Cheap check, partial cal	€5k–20k
Vision target (camera + checkerboard)	~0.1–0.5 mm	Yes	Camera FoV	Hand-eye, budget cal	€1k–10k

Rule: If you can't measure ~10× tighter than your accuracy goal, you can't verify whether you hit it — and an unverifiable calibration is a guess. Borrow or rent a tracker for the day rather than calibrate with the wrong tool.
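A quick way to apply the rule is to evaluate the tracker's distance-dependent spec at your working distance and compare it with the accuracy target. This is a small sketch using the ±15 µm + 6 µm/m figure quoted above; the function names are hypothetical and the spec values should be replaced with those from the actual instrument's datasheet.

```python
def tracker_uncertainty_um(distance_m: float, fixed_um: float = 15.0,
                           per_m_um: float = 6.0) -> float:
    """Volumetric spec of the form (fixed + per-metre × distance), in µm."""
    return fixed_um + per_m_um * distance_m

def accuracy_ratio(target_mm: float, instrument_um: float) -> float:
    """How many times tighter the instrument is than the accuracy target."""
    return target_mm * 1000.0 / instrument_um

u_at_1p5_m = tracker_uncertainty_um(1.5)        # 24 µm
ratio = accuracy_ratio(0.15, u_at_1p5_m)        # 6.25
```

For a 0.15 mm target the ratio at 1.5 m works out to about 6:1, so by this formula the 10× guideline is approached only as the distance term shrinks.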

TCP and tool-frame calibration

The Tool Center Point is the working point of whatever the robot holds — the tip of a welding torch, the center of a gripper's jaws, the nozzle of a dispenser. The controller knows the flange pose from kinematics; the TCP offset is the rigid transform from the flange frame to the tool's working frame. Get it wrong and every Cartesian motion, every reorientation about the tool, every taught point is wrong by that offset.

This is geometry, not kinematics — you're finding a fixed 6-parameter transform, not fitting link parameters — but it's done on the robot and it's done constantly, so it deserves its own discipline.

The 4-point method (position only)

The classic. Place a fixed, sharp reference tip somewhere in the workspace. Jog the tool's working point to touch that single fixed point from four (or more) very different orientations. The flange is in a different pose each time, but the tool tip is at the same world point. The controller solves for the tool offset (x, y, z) that makes all four flange poses map the tip to one common point.

4-point TCP — the constraint:

  For each touch i:   p_world = T_flange,i · t_tool
  where  T_flange,i = flange pose (known from kinematics)
         t_tool     = unknown tool offset [x, y, z, 1]ᵀ
         p_world    = the (also unknown) fixed reference point

  All touches share one p_world ⇒ overdetermined linear system
  in (t_tool, p_world). Solve by least squares.

  Quality depends on ORIENTATION SPREAD: four nearly-identical
  orientations give a near-singular system. Spread them wide
  (≥ 45° apart, mix all wrist axes) for a good solve.

Accuracy is typically ±0.2–0.5 mm and is limited by how precisely a human can jog the tip to the reference and by the robot's own accuracy — a calibrated arm gives a better TCP. Use 5–6 points, not the minimum 4; the extra touches average out jogging error.
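The constraint above reduces to an ordinary linear least-squares problem: for each touch, R_i · t + p_i = p_world, with the tool offset t and the reference point p_world both unknown. Below is a minimal sketch on synthetic, noise-free touches. The flange poses, tool offset, and helper names are made up for illustration, and this is not any controller's internal routine.

```python
import numpy as np

def rodrigues(axis, angle_deg):
    """Rotation matrix from an axis and an angle in degrees (Rodrigues formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    th = np.radians(angle_deg)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def solve_tcp(flange_R, flange_p):
    """Stack [R_i, -I] · [t; p_world] = -p_i over all touches and solve.
    Returns (tool offset, reference point, RMS residual)."""
    A = np.vstack([np.hstack([R, -np.eye(3)]) for R in flange_R])
    b = -np.concatenate(flange_p)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    rms = float(np.sqrt(np.mean((A @ sol - b) ** 2)))
    return sol[:3], sol[3:], rms

# Synthetic ground truth (mm): a tool offset and a fixed reference tip.
t_true = np.array([10.0, -5.0, 120.0])
p_world_true = np.array([800.0, 200.0, 300.0])

# Five touches with widely spread flange orientations.
flange_R = [rodrigues([1, 0, 0], 50), rodrigues([0, 1, 0], -45),
            rodrigues([0, 0, 1], 60), rodrigues([1, 1, 0], 70),
            rodrigues([1, -1, 1], 40)]
# Flange positions that place the tool tip exactly on the reference point.
flange_p = [p_world_true - R @ t_true for R in flange_R]

t_hat, p_hat, rms = solve_tcp(flange_R, flange_p)
```

On real touches the RMS residual is a useful quality figure: it reflects jogging error and robot inaccuracy, and it grows quickly if the orientations are too similar.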

Getting orientation (the full tool frame)

The 4-point method gives only the tool position. For a full tool frame you also need the tool's orientation relative to the flange. Methods:

5/6-point (XYZ + Z, or XYZ + X + Z): after the 4-point position solve, jog the tool along its intended +Z (and +X) from the reference point to teach the tool's axis directions.
Reference-object / abc-world: orient the tool to match a known reference orientation.
CAD value: for a precisely machined tool of known geometry, just type the offset from the drawing. For a well-made part this is often better than a touch-up; confirm it with a touch-check.

Rule: A bad TCP makes a perfectly calibrated arm look broken — reorienting about the tool will sweep the tip through an arc instead of pivoting in place. If "rotate about TCP" doesn't keep the tip stationary, your TCP is wrong, full stop. That test is the fastest TCP sanity check there is.
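The pivot test can also be run on logged data: record the flange pose at several orientations during a "rotate about TCP" move, apply the declared tool offset, and see how far the computed tip wanders. This is a sketch with synthetic poses and hypothetical names, assuming flange poses are available as rotation matrices and positions in mm.

```python
import numpy as np

def rot_z(deg):
    c, s = np.cos(np.radians(deg)), np.sin(np.radians(deg))
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])

def rot_y(deg):
    c, s = np.cos(np.radians(deg)), np.sin(np.radians(deg))
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def tip_spread_mm(flange_R, flange_p, t_tool):
    """Max distance of the computed tip positions (R_i · t + p_i) from their mean."""
    tips = np.array([R @ t_tool + p for R, p in zip(flange_R, flange_p)])
    return float(np.max(np.linalg.norm(tips - tips.mean(axis=0), axis=1)))

# Synthetic reorientation about a tip that truly stays at one point.
t_true = np.array([10.0, -5.0, 120.0])
pivot = np.array([800.0, 200.0, 300.0])
flange_R = [rot_z(0), rot_z(30), rot_z(60), rot_z(-30), rot_y(30)]
flange_p = [pivot - R @ t_true for R in flange_R]

spread_good = tip_spread_mm(flange_R, flange_p, t_true)
spread_bad = tip_spread_mm(flange_R, flange_p, t_true + np.array([1.0, 0.0, 0.0]))
```

With the correct offset the computed tip stays put; with a 1 mm error in the declared offset it sweeps an arc, which is the numerical version of watching the tip move during reorientation.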

Base and work-object frame calibration

Two more frames, both essential for any program that references the world rather than the robot.

Base / world frame locates the robot's base in the cell's coordinate system. You need it whenever coordinates come from outside the robot: a conveyor, a fixture surveyed in CAD, a second robot, or a vision system reporting in world coordinates. Establish it by touching three known points (origin, +X direction, point in the +XY plane) with a calibrated TCP, or far better, by measuring the base frame directly with the laser tracker you already set up for kinematic calibration. Tracker-based base framing removes the human-jog error and is essential in multi-robot cells where two arms must agree on where the world is to better than 0.2 mm.

Work-object / user frame locates the part or fixture you're working on. You teach points in this frame so that if the fixture moves (or you move the program to a second, slightly different fixture), you re-teach only the frame, not every point. The 3-point method (origin, +X, +XY) is standard. The big payoff: programs become portable. A weld program written in a work-object frame survives the fixture being relocated 10 mm and rotated 1° — you re-survey the frame and every taught point follows.
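The 3-point construction is short enough to write out. The sketch below builds a frame from an origin, a point on +X, and a point in the +XY plane, then maps a frame-local point into world coordinates. The function name and coordinates are illustrative, and it assumes the three points are not collinear.

```python
import numpy as np

def frame_from_three_points(origin, x_point, xy_point):
    """Return the 4x4 transform of a frame defined by origin, a point on +X,
    and a point in the +XY plane (right-handed, orthonormal axes)."""
    origin = np.asarray(origin, dtype=float)
    x_axis = np.asarray(x_point, dtype=float) - origin
    x_axis = x_axis / np.linalg.norm(x_axis)
    in_plane = np.asarray(xy_point, dtype=float) - origin
    z_axis = np.cross(x_axis, in_plane)          # normal to the XY plane
    z_axis = z_axis / np.linalg.norm(z_axis)
    y_axis = np.cross(z_axis, x_axis)            # completes the right-handed set
    T = np.eye(4)
    T[:3, 0], T[:3, 1], T[:3, 2], T[:3, 3] = x_axis, y_axis, z_axis, origin
    return T

# A frame aligned with the world axes, shifted to (1, 2, 3).
T_wobj = frame_from_three_points([1, 2, 3], [2, 2, 3], [1, 5, 3])
point_in_world = T_wobj @ np.array([0.1, 0.2, 0.0, 1.0])

# A frame rotated 90° about Z: +X points along world +Y.
T_rotated = frame_from_three_points([0, 0, 0], [0, 1, 0], [-1, 0, 0])
```

Because only the frame is re-surveyed when a fixture moves, every point stored in frame coordinates follows through this one transform.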

Rule: Build the dependency chain deliberately — world → base → work-object → TCP. Each frame inherits the error of the frames above it. A 0.5 mm base-frame error sits under every point in every work-object on that robot, so spend your best measurement on the frames nearest the base.

Mastering, homing & encoder zeroing

Before any of the above means anything, the robot has to know what angle each joint is actually at. Mastering (a.k.a. homing, zeroing, or "syncing") establishes the correspondence between each joint's encoder reading and its true geometric angle. It is the θ-offset parameter from the DH model, and as the e ≈ θ_err · L math showed, it has the longest lever arm of any error in the machine.

Most industrial arms have a mechanical or optical reference per joint — a notch, a dial, a witness mark, or a reference cartridge/EMD that the controller probes — defining the master position. You drive each joint to its reference and tell the controller "this encoder count is the master angle." On absolute-encoder arms this survives power-down; on incremental-encoder arms the robot must home on startup. (The encoder distinction matters a lot here — see encoders.)

Why it must be right:

It's the largest geometric error if wrong. A 0.1° mastering error on joint 1 of a 1.5 m-reach arm is ~2.6 mm at the tool (e ≈ θ_err · L). No link-length fit can recover from a wrong zero — the optimizer will distort other parameters trying to compensate, ruining the whole model.
It changes after service. Replacing a motor, encoder, gearbox, or even a hard collision can shift the master. Always re-master after mechanical service on a joint, and re-run (at least) a quick accuracy check afterward.
It's a prerequisite, not a step. Kinematic identification includes refining the joint-angle offsets, but it converges far better if you start from a good mechanical master. Garbage mastering in, garbage parameters out.

Rule: Re-master after any service that touches a joint's motor, encoder, or gearbox — then re-verify accuracy. A robot that was calibrated to 0.1 mm and then had joint 3's motor swapped is no longer calibrated, regardless of what the controller still claims.

Hand-eye calibration: the AX=XB problem

The moment you bolt a camera to a robot (or aim one at its workspace), you have a new unknown: the rigid transform between the camera's optical frame and the robot's frames. The camera reports object poses in its coordinates; the robot moves in its coordinates; nothing useful happens until you know the transform between them. Finding it is hand-eye calibration, and it underpins all vision-guided robotics. (For the camera side — intrinsics, lens distortion, stereo, depth — see machine vision and LiDAR & depth cameras; for the broader sensing context, robot sensors.)

Two configurations

Eye-in-hand: camera mounted on the robot flange/wrist, moving with the arm. You're solving for X = flange→camera transform. Common in pick-and-place and inspection where the camera needs to get close.
Eye-to-hand (eye-to-base): camera fixed in the cell, watching the workspace. You're solving for X = base→camera (equivalently camera→base). Common when one fixed overhead camera serves the whole cell.

The math: AX = XB

The classic formulation. Move the robot between pairs of poses while observing a fixed calibration target. Between two robot poses, the robot's flange moves by a known relative transform A (from the robot's forward kinematics) and the camera's view of the target moves by a measured relative transform B (from the vision solve). The unknown hand-eye transform X satisfies:

Hand-eye:  A X = X B

  A = relative robot motion between two poses (from kinematics)
  B = relative camera-to-target motion        (from vision)
  X = the unknown camera↔flange (or camera↔base) transform

  Split into rotation and translation:
      R_A R_X = R_X R_B           (rotation: solve first)
      R_A t_X + t_A = R_X t_B + t_X   (translation: solve second)

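  Rotation accuracy needs LARGE, VARIED rotations between poses.
  Pure translation moves make the rotation equation trivial
  (R_A = R_B = I) and cancel t_X from the translation equation,
  so the sequential solve gets no rotation constraint and t_X is
  unobservable. Use ≥ 10–15 poses with big, diverse orientation
  changes (tip the camera ≥ 30–45° about different axes).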

The rotation part is solved first (it's independent of translation), then translation is solved using the recovered rotation. Closed-form solvers:

Tsai–Lenz (1989): the workhorse. Solves rotation via an angle-axis (Rodrigues) formulation, then translation linearly. Fast, well-understood, the reference implementation in OpenCV (CALIB_HAND_EYE_TSAI).
Park–Martin (1994): uses Lie-group / so(3) least squares for the rotation, often more robust to noise than Tsai–Lenz.
Horaud–Dornaika, Daniilidis (dual-quaternion): Daniilidis solves rotation and translation simultaneously using dual quaternions, which can be more accurate when the two are coupled.

Modern practice: get a closed-form initial estimate from one of the above, then refine with nonlinear least squares (bundle-adjustment-style, minimizing reprojection error over all poses jointly). OpenCV's calibrateHandEye offers all the classic methods; the MoveIt hand-eye calibration plugin and ROS pipelines wrap this with a live target (an ArUco/ChArUco board or AprilTag) and pose collection.
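To show what the rotation-then-translation solve looks like, here is a numpy-only sketch of an so(3) least-squares rotation solve in the style of Park–Martin, followed by a linear translation solve. It is not OpenCV's implementation. It runs on synthetic, noise-free eye-in-hand data, all names are illustrative, and the log map assumes relative rotation angles well away from 0 and π.

```python
import numpy as np

def exp_so3(w):
    """Rotation vector (axis × angle) to rotation matrix (Rodrigues formula)."""
    w = np.asarray(w, dtype=float)
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def log_so3(R):
    """Rotation matrix to rotation vector; assumes angle not near 0 or pi."""
    th = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return th * axis / (2 * np.sin(th))

def make_T(R, t):
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def solve_ax_xb(A_list, B_list):
    """Rotation: alpha_i = R_X · beta_i in least squares, with alpha = log(R_A),
    beta = log(R_B). Translation: (R_A - I) t_X = R_X t_B - t_A, stacked."""
    M = sum(np.outer(log_so3(B[:3, :3]), log_so3(A[:3, :3]))
            for A, B in zip(A_list, B_list))
    w, V = np.linalg.eigh(M.T @ M)
    R_X = V @ np.diag(w ** -0.5) @ V.T @ M.T       # (MᵀM)^(-1/2) · Mᵀ
    C = np.vstack([A[:3, :3] - np.eye(3) for A in A_list])
    d = np.concatenate([R_X @ B[:3, 3] - A[:3, 3] for A, B in zip(A_list, B_list)])
    t_X, *_ = np.linalg.lstsq(C, d, rcond=None)
    return make_T(R_X, t_X)

# Synthetic setup: X = camera pose in the flange frame; the target is fixed in base.
X_true = make_T(exp_so3([0.1, -0.2, 0.3]), [0.03, -0.01, 0.08])
T_base_target = make_T(exp_so3([0.0, 0.0, 0.5]), [0.6, 0.1, 0.0])
rotvecs = [[0.6, 0, 0], [0, 0.7, 0], [0, 0, 0.8],
           [0.5, 0.5, 0], [-0.4, 0.3, 0.6], [0.2, -0.7, 0.4]]
flange = [make_T(exp_so3(w), [0.4 + 0.05 * i, 0.1 * i, 0.5])
          for i, w in enumerate(rotvecs)]
cam_target = [np.linalg.inv(X_true) @ np.linalg.inv(G) @ T_base_target for G in flange]

# Relative motions between consecutive poses, so that A · X = X · B.
A_list = [np.linalg.inv(flange[i + 1]) @ flange[i] for i in range(len(flange) - 1)]
B_list = [cam_target[i + 1] @ np.linalg.inv(cam_target[i]) for i in range(len(flange) - 1)]

X_hat = solve_ax_xb(A_list, B_list)
```

The rotation step needs at least three relative motions whose rotation axes are linearly independent for MᵀM to be invertible in this form, which is the algebraic face of the orientation-diversity requirement. With real data, a result like X_hat would be the initial estimate handed to the nonlinear refinement.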

Practical notes

The dominant error driver is rotation diversity. People collect 12 poses that are all small nudges of position with the camera staring the same way, the rotation system is near-singular, and the result is a translation that looks plausible but a rotation that's off by a couple of degrees — which then throws position errors that grow with target distance. Tip and twist the camera aggressively across poses. Use a ChArUco board over a plain checkerboard (it tolerates partial occlusion and gives sub-pixel corners), keep the board flat and rigid, and span the camera's working depth.

Method	Rotation approach	Solves R,t	Noise robustness	Use when
Tsai–Lenz	Angle-axis (Rodrigues)	Sequentially	Good	Default; well-tested baseline
Park–Martin	Lie-group / so(3) LS	Sequentially	Better	Noisier data, want robustness
Horaud–Dornaika	Quaternion / nonlinear	Sequentially or joint	Good	Moderate noise
Daniilidis (dual-quaternion)	Dual quaternion	Simultaneously	Best when R,t coupled	R and t strongly coupled
Nonlinear refinement (BA)	Manifold optimization	Jointly, all poses	Best overall	Always, as a final polish

Rule: Hand-eye rotation accuracy lives or dies on orientation diversity between poses. If your poses don't include large, varied rotations, the rotation is unobservable no matter which solver you pick — and a 1° rotation error becomes a position error that grows linearly with how far the target sits from the camera.
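
To put a number on that lever arm, the sketch below computes how far a point moves when the camera frame is rotated by a small error angle. It uses the chord length 2·d·sin(θ/2), which is the plain geometry of a point at distance d swung through angle θ. The function name and the distances are illustrative assumptions.

```python
import math

def position_error_mm(rotation_error_deg, target_distance_mm):
    """Displacement of a point at the given distance when the frame
    is rotated by the error angle (chord length of the swept arc)."""
    half_angle = math.radians(rotation_error_deg) / 2.0
    return 2.0 * target_distance_mm * math.sin(half_angle)

for distance_mm in (250, 500, 1000):
    err = position_error_mm(1.0, distance_mm)
    print(f"1 deg error at {distance_mm:4d} mm standoff -> {err:.2f} mm")
```

At a 500 mm standoff a 1° error is already about 8.7 mm, and it doubles when the standoff doubles. That is why a hand-eye result can look fine on a close-up target and fall apart at the far end of the working depth.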

Payload & load identification

The controller needs to know the mass, center of gravity, and inertia tensor of whatever the robot carries. This isn't just dynamics housekeeping — it bears directly on accuracy and safety.

Accuracy: the payload deflects the arm (the compliance term from the error budget). The controller's gravity compensation and any stiffness model need the correct mass and CoG to predict and cancel that deflection. Wrong payload, wrong compensation, worse accuracy, especially at long reach and at speed.
Safety and collision detection: the controller estimates external forces by comparing expected joint torques (from the dynamic model + payload) against measured torques. If the declared payload is wrong, the residual is wrong, and collision detection either nuisance-trips or — worse — fails to trip. On cobots this is the foundation of force/torque-based safety and hand-guiding (robot safety covers the safety side).
Path tracking: feedforward dynamic compensation needs the inertia tensor to anticipate the torques for accelerations. Wrong inertia, more tracking error during fast moves.
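
The collision-detection point can be illustrated with a deliberately simplified model: one joint, static, with only the payload's gravity torque considered. Real controllers use the full dynamic model, so treat this as a sketch of the mechanism only. The function names, the 20 N·m trip threshold, and the payload numbers are assumptions.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def payload_gravity_torque(mass_kg, cog_reach_m, joint_angle_deg):
    """Static torque (N*m) the payload puts on one joint.
    The joint angle is measured from horizontal."""
    return mass_kg * G * cog_reach_m * math.cos(math.radians(joint_angle_deg))

def torque_residual(declared_kg, actual_kg, cog_reach_m, joint_angle_deg,
                    contact_torque=0.0):
    """Measured minus expected torque, as a collision detector would see it."""
    expected = payload_gravity_torque(declared_kg, cog_reach_m, joint_angle_deg)
    measured = (payload_gravity_torque(actual_kg, cog_reach_m, joint_angle_deg)
                + contact_torque)
    return measured - expected

TRIP_THRESHOLD = 20.0  # N*m, hypothetical

# Declared 0.5 kg, actually carrying 5 kg, no contact at all.
nuisance = torque_residual(0.5, 5.0, cog_reach_m=1.0, joint_angle_deg=0.0)

# Declared 5 kg, actually carrying 0.5 kg, with a real 40 N*m contact.
masked = torque_residual(5.0, 0.5, cog_reach_m=1.0, joint_angle_deg=0.0,
                         contact_torque=40.0)

print(f"no contact, residual {nuisance:.1f} N*m, trips: {abs(nuisance) > TRIP_THRESHOLD}")
print(f"40 N*m contact, residual {masked:.1f} N*m, trips: {abs(masked) > TRIP_THRESHOLD}")
```

The first case trips with nothing touching the robot. The second hides a genuine 40 N·m contact inside the payload error, which is the failure mode that matters for safety.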

Every major OEM ships a load identification routine: KUKA LoadDataDetermination, ABB LoadIdentify, FANUC payload estimation, UR's built-in payload wizard. You mount the load, run a prescribed characterization motion (the robot moves several joints through a sequence while measuring motor torques), and the controller solves for mass, CoG, and inertia from the torque data. Run it whenever the end-effector or grasped part changes significantly — and for variable payloads (e.g., a gripper that sometimes holds a 0.5 kg part and sometimes a 5 kg part), configure multiple payload records and switch in software.

Rule: Declare the real payload. A wrong payload silently degrades accuracy, defeats collision detection, and on a cobot corrupts the force estimate the safety case depends on. The auto-identify routine takes two minutes; run it.

Thermal compensation & drift

The error that ambushes people who calibrated perfectly in the morning and find the robot off by 0.2 mm by mid-shift. The arm changes shape as it warms — from ambient swings, from sun on a wall, and most of all from the gearboxes generating heat as they work.

The physics is just thermal expansion: steel ~12 µm/m/°C, aluminum ~23 µm/m/°C. A robot's links and gearbox housings warm 5–15 °C from cold start to thermal equilibrium over the first 1–2 hours of operation. Over a 1.5 m arm that's roughly 0.1–0.35 mm of drift — and because the heating is uneven (gearboxes hot, links cooler), it's not a simple uniform scale. For teach-and-repeat work nobody notices (repeatability is unaffected; the whole frame drifts together-ish). For absolute-accuracy work it's a real, time-varying error on top of your calibration.
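
The arithmetic behind those figures is ΔL = α · L · ΔT. The sketch below evaluates it for the uniform, single-material corner cases. A real arm mixes materials and heats unevenly, so these values bracket the quoted drift rather than reproduce it exactly. The function name is an assumption, and the coefficients are the approximate values given above.

```python
def thermal_growth_mm(alpha_um_per_m_per_c, length_m, delta_t_c):
    """Linear thermal expansion, returned in millimetres."""
    return alpha_um_per_m_per_c * length_m * delta_t_c / 1000.0

ARM_LENGTH_M = 1.5
MATERIALS = {"steel": 12.0, "aluminum": 23.0}  # um per m per deg C, approximate

for name, alpha in MATERIALS.items():
    for delta_t in (5.0, 15.0):
        growth = thermal_growth_mm(alpha, ARM_LENGTH_M, delta_t)
        print(f"{name:9s} dT = {delta_t:4.1f} C -> {growth:.3f} mm")
```

An all-steel arm warming by 5 °C grows about 0.09 mm, and an all-aluminum arm warming by 15 °C grows about 0.52 mm. Either way the drift is the same order as the calibrated accuracy target, which is why it cannot be ignored.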

What to do, in order of effort:

Warm up the robot. The cheapest fix. Run a representative motion cycle for 30–60 minutes before precision work, and calibrate when warm. Many shops mandate a warm-up program.
Calibrate at operating temperature. If the robot runs hot, calibrate hot. A calibration done cold is wrong by the drift amount once the robot warms.
Thermal model + temperature sensors. High-end systems (and some OEM accuracy packages) put temperature sensors on the joints and apply a thermal-expansion correction to the kinematic model in real time. This is what gets you stable sub-0.1 mm accuracy across a shift.
Control the environment. Stable ambient temperature, no direct sun, no HVAC blasting one side of the cell.

Rule: A calibration is valid at the temperature it was taken. If you need sub-0.1 mm all shift, either warm the robot to a steady state and keep it there, or instrument it with temperature sensors and a thermal model. "We calibrated it once, cold" is not a thermal strategy.

When calibration pays off

Calibration isn't free — instrument time, downtime, expertise — so spend it where accuracy (not repeatability) is the constraint. The tells:

Offline programming (OLP). Generating robot programs from CAD in RoboDK, Process Simulate, Delmia, or RobotStudio. The whole point of OLP is to skip manual teach-up; that only works if the real robot matches the simulated model, which means it must be accurate, not just repeatable. OLP without calibration is the #1 disappointment in this field — people generate a beautiful program and then spend days touching up every point because the arm is 1 mm off. Calibrate to ~0.15 mm and the touch-up nearly vanishes.
Vision-guided tasks. Bin picking, conveyor tracking, any pick from a vision-reported pose. The robot reaches Cartesian coordinates it was never taught — pure accuracy dependence. Garbage accuracy means the gripper misses the part even with a perfect vision solve.
Multi-robot cells / program portability. When a program must move between "identical" robots (line balancing, replacing a failed arm, deploying the same job to 20 stations), each arm's accuracy must be good enough that one program fits all. Uncalibrated, every arm is uniquely wrong by ~1 mm and programs don't port. Calibrated arms are interchangeable.
Metrology and inspection. The robot is the measuring instrument (or carries one). Accuracy is the spec.
CAD-path process tasks. Drilling, routing, deburring, dispensing, waterjet, additive — anywhere the path comes from CAD and tolerances are tight.

Where calibration buys you little: a fixed pick-place-stack cell with hand-taught points and no external coordinates. That's pure teach-and-repeat; repeatability carries it and calibration adds nothing the program uses. Don't calibrate reflexively — calibrate the robots whose programs depend on accuracy.

Validation per ISO 9283

You haven't calibrated until you've measured the result on poses you didn't use to fit the model. The standard for industrial-robot performance is ISO 9283:1998 (Manipulating industrial robots — Performance criteria and related test methods), and it defines exactly what to measure and how.

ISO 9283 prescribes a test setup: a cube positioned in the working space (typically the largest cube that fits, tilted to use the workspace), with measurement at the cube's diagonal-plane points (P1–P5). The robot is sent to these poses repeatedly (30 cycles per the standard) at specified speeds and loads, and an external instrument records where it actually lands. Key metrics:

Pose accuracy (AP): the distance between the commanded pose and the mean of the attained poses. This is absolute accuracy — what calibration improves. Split into position (APp) and orientation (APa, APb, APc) components.
Pose repeatability (RP): the spread (radius of the sphere containing the attained-pose cluster, at a confidence level) of the attained poses about their mean. This is repeatability — calibration does not change it.
Plus: distance accuracy/repeatability (AD/RD), path accuracy (AT) and path repeatability (RT) for continuous-path work, cornering, velocity accuracy, and more.
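
For the position part, the two headline metrics reduce to a few lines. The sketch below computes AP_p as the distance from the commanded point to the barycentre of the attained points, and RP as the mean distance to the barycentre plus three sample standard deviations of that distance. The data is synthetic and made up for illustration, and the function name is an assumption. Orientation metrics are not covered.

```python
import math
import random
from statistics import mean, stdev

def ap_rp_position(commanded, attained):
    """Return (AP_p, RP_l) in the units of the inputs.

    AP_p: distance from the commanded position to the barycentre of the
    attained positions. RP_l: mean distance to the barycentre plus three
    sample standard deviations of that distance.
    """
    bary = [mean(p[i] for p in attained) for i in range(3)]
    ap = math.dist(commanded, bary)
    dists = [math.dist(p, bary) for p in attained]
    rp = mean(dists) + 3.0 * stdev(dists)
    return ap, rp

# Synthetic data: 30 cycles at one test pose, a fixed model error of
# about 0.15 mm plus 0.01 mm of scatter per axis.
rng = random.Random(0)
commanded = (1000.0, 0.0, 500.0)
bias = (0.12, -0.05, 0.08)
attained = [tuple(c + b + rng.gauss(0.0, 0.01) for c, b in zip(commanded, bias))
            for _ in range(30)]

ap, rp = ap_rp_position(commanded, attained)
print(f"before: AP_p = {ap:.3f} mm, RP_l = {rp:.3f} mm")

# A corrected model moves the mean toward the commanded point (smaller AP)
# but leaves the scatter, and therefore RP, untouched.
corrected = [tuple(v - b for v, b in zip(p, bias)) for p in attained]
ap_cal, rp_cal = ap_rp_position(commanded, corrected)
print(f"after:  AP_p = {ap_cal:.3f} mm, RP_l = {rp_cal:.3f} mm")
```

Removing the bias collapses AP while RP stays where it was, which is the accuracy-versus-repeatability split in numerical form.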

Rule: Test at 10%, 50%, and 100% of rated load and rated speed per the standard — not just unloaded and slow. Accuracy degrades with payload (compliance) and speed (dynamics), and a calibration that's only verified at low load and low speed hides exactly the conditions that bite in production.

The non-negotiable discipline: validate on a hold-out set. Use one set of poses to fit the kinematic parameters and a different, independent set to measure AP and RP. If you report the residual on the fitting poses as your accuracy, you're reporting how well you memorized the noise, not how well the model generalizes. A good calibration shows a fitting residual and a validation residual that are close (e.g., 0.12 mm fit, 0.15 mm validation). A big gap (0.05 mm fit, 0.4 mm validation) is the signature of over-fitting unobservable parameters — go back to the model and the condition number.
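
The bookkeeping for a hold-out check is small enough to sketch. The code below splits a pose list into fit and validation sets and flags a suspicious gap between the two residuals. The 30% hold-out fraction and the 2× ratio threshold are assumptions chosen for illustration, not values from any standard. The residuals themselves would come from your identification software.

```python
import math
import random

def split_poses(poses, holdout_fraction=0.3, seed=0):
    """Shuffle once with a fixed seed and return (fit_set, validation_set)."""
    shuffled = list(poses)
    random.Random(seed).shuffle(shuffled)
    n_hold = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_hold:], shuffled[:n_hold]

def rms(residuals_mm):
    """Root-mean-square of a list of position residuals."""
    return math.sqrt(sum(r * r for r in residuals_mm) / len(residuals_mm))

def overfit_suspected(fit_rms_mm, validation_rms_mm, max_ratio=2.0):
    """True when validation error is much worse than fit error."""
    return validation_rms_mm > max_ratio * fit_rms_mm

pose_ids = list(range(50))  # stand-ins for 50 measured configurations
fit_ids, val_ids = split_poses(pose_ids)
print(len(fit_ids), "poses to fit,", len(val_ids), "held out")

# The two cases quoted above.
print(overfit_suspected(0.12, 0.15))  # close residuals
print(overfit_suspected(0.05, 0.40))  # large gap
```

Fixing the seed keeps the split reproducible, so a re-run after changing the parameter set is compared against the same held-out poses.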

Tools & practical workflow

The software and the order of operations.

Calibration software:

RoboDK — popular, affordable OLP suite with a calibration module that drives a laser tracker, runs the identification, and writes corrected kinematics back to the robot or into the OLP model. Strong for the calibrate-then-OLP workflow.
Dynalog CalibWare / DynaCal — long-established dedicated robot-calibration package, tracker-driven, used by OEMs and integrators.
OEM accuracy packages — ABB Absolute Accuracy, KUKA accuracy options, FANUC, Stäubli. These are factory-calibrated-at-build options where the robot ships with identified parameters and (sometimes) compliance/thermal compensation. Buy the absolute-accuracy option at order time if your application needs it — retrofitting is more work.
MoveIt 2 hand-eye calibration and OpenCV calibrateHandEye for the vision side; ChArUco/AprilTag targets for pose collection.
Metrology software: Leica Tracker Pilot, SpatialAnalyzer, PolyWorks, and Verisurf for the measurement and analysis.

A practical end-to-end workflow:

Mechanical check first. Verify mounting is rigid, no loose bolts, gearboxes serviced. Then master/home every joint to its reference. Mastering is the foundation — do it right or stop here.
Warm up the robot to operating temperature (30–60 min representative cycle) so you calibrate hot if it runs hot.
Set up the laser tracker, mount the SMR/6DoF target on the flange, establish the tracker-to-robot relationship.
Collect calibration poses — 30–100 configurations spread widely across the workspace and orientation range, ideally observability-optimized. Record commanded joint angles and measured tool poses.
Identify the kinematic parameters: modified-DH + Hayati for parallel axes, Levenberg–Marquardt least squares, check the condition number, fit only observable parameters.
Load the corrected parameters into the controller (or OLP model).
Calibrate the TCP (5–6 point + orientation) and the base / work-object frames, ideally tracker-measured.
Identify the payload with the OEM routine.
Validate per ISO 9283 on a hold-out pose set, at 10/50/100% load and speed. Report AP and RP.
Document and schedule re-checks — re-verify periodically and after any service touching a joint.

Rule: Order matters. Master → warm up → kinematic identify → TCP/frames → payload → validate. Each step assumes the previous ones are correct; doing TCP before mastering, or skipping the warm-up before a precision calibration, quietly poisons everything downstream.

Frequently asked questions

Why is my robot repeatable to 0.02 mm but misses CAD points by 1 mm? Because those are different specifications. Repeatability is returning to a taught pose — pure hardware. Reaching a CAD point requires the controller to run inverse kinematics on its internal model, and that model is off by manufacturing tolerances, so every computed pose inherits ~1 mm of geometric error. Kinematic calibration fixes the model and typically brings absolute accuracy to ~0.15 mm.

Can calibration improve repeatability? No. Repeatability is set by encoders, backlash, and structural stiffness — hardware. Calibration corrects the kinematic model, which only affects accuracy. Calibrated accuracy can approach but never beat the repeatability floor: if the arm scatters ±0.05 mm, no model makes it accurate to ±0.01 mm.

Do I need a laser tracker, or can I use a cheaper method? For true absolute accuracy (~0.15 mm) you need to measure ~10× tighter (~0.015 mm), which means a laser tracker or comparable photogrammetry. Cheaper methods (ballbar, reference sphere, vision target) reach ~0.3–0.5 mm, which is fine for a sanity check or a budget shop but not for verified sub-0.2 mm accuracy. Rent a tracker for the day if buying isn't justified.

What's the difference between DH and modified-DH for calibration? Both are 4-parameter-per-joint kinematic conventions. Modified-DH (Craig) puts the frame at the near end of each link, which makes parameter assignment cleaner for identification. For calibration, always use modified-DH plus the Hayati β correction on near-parallel joint pairs — plain DH is numerically singular for parallel axes and the d parameters blow up.

My hand-eye calibration translation looks right but the rotation seems off — why? Almost always insufficient rotation diversity in your poses. The AX=XB rotation is only observable if the camera undergoes large, varied rotations between poses. If your poses are mostly translations with the camera pointing the same way, the rotation solve is near-singular. Tip and twist the camera ≥30–45° about different axes across ≥10–15 poses.

Eye-in-hand or eye-to-hand — which should I use? Eye-in-hand (camera on the flange) when the camera needs to get close to the work, inspect from varied viewpoints, or serve a large workspace from one moving sensor. Eye-to-hand (fixed camera) when one overhead view covers the whole cell and you want the camera out of the way. The AX=XB math is the same; you solve for flange→camera vs base→camera respectively.

How often do I need to re-calibrate? Re-master and re-verify after any service touching a joint's motor, encoder, or gearbox, or after a hard collision. Otherwise schedule a periodic accuracy check (quarterly to yearly depending on duty). The kinematic parameters are set by the arm's metal structure and are stable day to day, but wear, thermal cycling, and minor crashes shift them over time.

Why does my robot drift during the day even though I calibrated it? Thermal growth. Links and gearboxes warm 5–15 °C from cold start to equilibrium, expanding 0.1–0.35 mm over a 1.5 m arm. A cold calibration is wrong once the robot warms. Warm the robot before precision work and calibrate hot, or instrument it with temperature sensors and a thermal model for stable sub-0.1 mm all shift.

Does the payload really affect accuracy, or just dynamics? Both. Payload deflects the arm (compliance), so wrong payload means wrong deflection compensation and worse accuracy — especially at reach and speed. It also corrupts the torque-based collision-detection and force estimate, which is a safety issue on cobots. Run the OEM load-identification routine whenever the end-effector or part changes.

My calibration residual is tiny but the robot is still inaccurate on new points. What happened? Classic over-fitting. You fit unobservable parameters that soaked up measurement noise — great training residual, terrible generalization. Check the identification Jacobian's condition number (should be tens to low hundreds, not thousands), fit only observable parameters (modified-DH + Hayati), and always validate on a hold-out pose set you didn't use to fit.

What accuracy can I realistically expect after calibration? Kinematic calibration alone: ~0.10–0.20 mm absolute on a quality 6-axis arm (from ~0.5–2 mm uncalibrated). Adding joint-compliance (stiffness) and thermal compensation: ~0.05–0.10 mm. The repeatability floor (~0.02–0.05 mm) is the hard limit you can never beat.

Is ISO 9283 the only standard I need to know? It's the core for static and path performance (AP, RP, AT, etc.). For service/mobile robots see ISO 18646; for collaborative-robot safety see ISO/TS 15066 and ISO 10218; for the metrology instruments, ASME B89.4.19 / ISO 10360-10 cover laser-tracker performance. For an industrial arm calibration, ISO 9283 is what you validate against.