Up to 4 hands can play simultaneously. Each detected hand becomes its own independent voice (color-coded).
Point with the index finger · X = position · Y low = tight slice · Y high = wide grain window
Pinch thumb to index per hand to control that hand's bend (touching = clean, fingers apart = wild wobble)
Distance from camera sets each hand's volume (closer = louder), estimated from palm area
Only hands the model is at least 60% confident about are tracked. Mouse drives the first slot as fallback when the camera is off.