CRAFT Hand: From a 3D-Printed Hand to Simulated Teleoperation and Zero-Shot Policy Transfer

Course project report for CS295: Robot Deep Learning. Team: Aditya Mittal and Sadman Sakib. Posted: Jun 11, 2026.

This project explored whether the CRAFT hand, a five-fingered hybrid-rigid/compliant dexterous hand, could be turned from a hardware design into a usable research platform for teleoperation and policy learning. Aditya focused on building the physical hand, mounting it to the I2RT YAM Ultra arm, and teleoperating it with VR hand tracking. I (Sadman) focused on the simulation side: fixing the CRAFT MuJoCo model, connecting HaMeR hand-pose predictions to CRAFT controls, integrating the hand with robosuite and DexJoCo, and testing whether policies trained for an Allegro hand could be mapped zero-shot to CRAFT.

What is new here? We did not propose a new learning algorithm. The main contribution is a systems bridge: a physical CRAFT hand build, a corrected CRAFT MuJoCo model, HaMeR-driven CRAFT teleoperation in simulation, robosuite and DexJoCo integrations, and a first zero-shot action-mapping experiment from four-finger Allegro policies to the five-finger CRAFT hand.

Starting Point

Before our project, several useful pieces existed, but they did not form an end-to-end CRAFT platform. The CRAFT hand design provided CAD files and a mechanical concept: rigid PLA links joined with soft TPU compliance. HaMeR could reconstruct a MANO human hand pose from camera images, but its original release was an image/mesh reconstruction system rather than a CRAFT-in-MuJoCo teleoperation controller. DexJoCo provided simulation tasks, teleoperation infrastructure, and trained policies for a Panda arm with a four-finger Allegro hand, but not for CRAFT. The missing piece was the set of adapters, model fixes, and control mappings needed to make CRAFT behave as a real and simulated robot hand.

Hardware baseline

CRAFT had printable parts and an intended hand design, but our lab setup needed a new wrist mount and a practical teleoperation path.

Vision baseline

HaMeR estimated MANO hand pose. We needed to turn those rotations into CRAFT actuator commands and usable visual feedback.

Policy baseline

DexJoCo policies acted on Panda + Allegro actions. We tested how far those actions could be transferred to CRAFT.

Physical CRAFT Hand and VR Teleoperation

The physical build used fully 3D-printed CRAFT fingers: white PLA for stable rigid geometry and black TPU for compliant joints. Aditya assembled the fingers, designed a custom wrist mount in Autodesk Fusion using the I2RT arm geometry, and mounted the hand to the I2RT YAM Ultra setup. This was different from the original X-Arm mounting target, so the wrist interface itself became part of the project.

We also tested Quest-based hand tracking for physical teleoperation. Open-TeleVision retargeted arm motion from wrist pose, and the hand pose was interpreted relative to the Quest headset to reduce tracking noise. A manual side-to-side calibration step was needed because thumb tracking was unreliable. This worked well enough for qualitative demos, but it remained clunky in practice: calibration was sensitive, the thumb was finicky, and the operator did not get useful contact feedback.

Physical CRAFT hand teleoperation on the I2RT arm. This demonstrates the mounted arm+hand platform rather than a learned policy.

Quest/VR-based CRAFT teleoperation. The setup worked, but calibration and thumb tracking were still fragile.

Fixing the CRAFT MuJoCo Model

On the simulation side, the first blocker was that the CRAFT MJCF did not behave like a clean position-controlled hand. The original model mixed degree-style actuator ranges with a simulator workflow that expected radian targets, and the position actuators were too weak or slow for responsive teleoperation. We created corrected XML variants and small inspection scripts so the hand could be loaded, swept through joint ranges, rendered, and embedded into larger robot scenes.

Issue	What we changed	Why it mattered
Joint/actuator range mismatch	Created a corrected radian-based CRAFT MJCF and used compiled joint/control ranges consistently.	HaMeR and policy outputs could be mapped into real MuJoCo targets instead of degree-like commands.
Slow or weak finger motion	Tuned position feedback gains and stepped physics multiple times per control update.	Finger speed became visibly closer to the intended target motion, in line with the professor's feedback after the presentation.
Different actuator ordering	Remapped commands from CRAFT API order to MJCF actuator order, which lists side MCP, forward MCP, then PIP for each finger, with DIP coupled to PIP.	Without this reorder, visually reasonable hand-pose values moved the wrong simulated joints.
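
As an illustration of the range and gain fixes in the first two rows, the sketch below rewrites degree-style ctrlrange values on position actuators into radians and sets a new kp gain using Python's standard XML library. The MJCF fragment, the joint names, the range values, and the gain of 20 are hypothetical placeholders, not values taken from the CRAFT model.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical two-actuator fragment; names and numbers are illustrative only.
DEGREE_STYLE_MJCF = """
<mujoco>
  <actuator>
    <position name="index_1" joint="index_1_joint" ctrlrange="-15 15" kp="1"/>
    <position name="index_2" joint="index_2_joint" ctrlrange="0 90" kp="1"/>
  </actuator>
</mujoco>
"""


def ctrlranges_to_radians(mjcf_text: str, kp: float) -> str:
    """Return MJCF text whose position actuators use radian ctrlrange and the given kp."""
    root = ET.fromstring(mjcf_text)
    for actuator in root.iter("position"):
        lo_deg, hi_deg = (float(v) for v in actuator.get("ctrlrange").split())
        lo_rad, hi_rad = math.radians(lo_deg), math.radians(hi_deg)
        actuator.set("ctrlrange", f"{lo_rad:.4f} {hi_rad:.4f}")
        actuator.set("kp", str(kp))  # assumed gain, to be tuned per joint
    return ET.tostring(root, encoding="unicode")


fixed_mjcf = ctrlranges_to_radians(DEGREE_STYLE_MJCF, kp=20.0)
print(fixed_mjcf)
```

Rewriting the file once, rather than converting at runtime, keeps every downstream consumer (teleoperation, robosuite, DexJoCo) working in the same units.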

Each non-thumb CRAFT finger has four hinge joints but only three direct actuators. The fourth, distal (DIP) joint is coupled to the PIP joint through a MuJoCo equality constraint. The controller therefore commands MCP side motion, MCP forward flexion, and PIP curl, and the fingertip follows through the coupling. The thumb has the same general pattern with CRAFT-specific names: thumb_mcp, thumb_2, and thumb_3.
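
To make the coupling concrete, here is a minimal sketch of how such a constraint can be written. MuJoCo's joint equality constrains joint1 to a polynomial of joint2 (both measured from their reference positions) through the polycoef attribute. The joint names index_dip and index_pip and the 1:1 ratio are assumptions for illustration; the actual CRAFT names and ratio may differ.

```python
import xml.etree.ElementTree as ET


def add_dip_coupling(equality, finger, ratio=1.0):
    """Append a joint equality that makes the DIP joint follow the PIP joint."""
    return ET.SubElement(equality, "joint", {
        "joint1": f"{finger}_dip",        # dependent joint (hypothetical name)
        "joint2": f"{finger}_pip",        # driving joint (hypothetical name)
        "polycoef": f"0 {ratio} 0 0 0",   # DIP offset = ratio * PIP offset
    })


def coupled_offset(pip_offset, polycoef):
    """Evaluate the polynomial a0 + a1*x + a2*x^2 + ... used by the constraint."""
    return sum(c * pip_offset ** i for i, c in enumerate(polycoef))


equality = ET.Element("equality")
coupling = add_dip_coupling(equality, "index")
print(ET.tostring(equality, encoding="unicode"))
print(coupled_offset(0.5, (0, 1.0, 0, 0, 0)))  # DIP offset for a 0.5 rad PIP offset
```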

CRAFT API order:
Ring, Index, Thumb, Middle, Pinky
each as: PIP, MCP forward, MCP side

MuJoCo actuator order:
index_1, index_2, index_3,
middle_1, middle_2, middle_3,
ring_1, ring_2, ring_3,
pinky_1, pinky_2, pinky_3,
thumb_mcp, thumb_2, thumb_3
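
The reorder implied by the two listings above can be sketched as an index table. This is an illustrative sketch rather than the project's code. It assumes that the _1, _2, _3 actuators of each finger are side MCP, forward MCP, and PIP in that order, and that thumb_mcp, thumb_2, thumb_3 follow the same pattern; the joint labels in the lists are hypothetical names.

```python
# CRAFT API layout: five fingers, three values each.
API_FINGERS = ["ring", "index", "thumb", "middle", "pinky"]
API_JOINTS = ["pip", "mcp_forward", "mcp_side"]

# MuJoCo actuator layout (assumed: _1 = side MCP, _2 = forward MCP, _3 = PIP).
MJCF_FINGERS = ["index", "middle", "ring", "pinky", "thumb"]
MJCF_JOINTS = ["mcp_side", "mcp_forward", "pip"]


def build_api_to_mjcf_index():
    """For each MJCF actuator slot, find the position of its value in API order."""
    index = []
    for finger in MJCF_FINGERS:
        for joint in MJCF_JOINTS:
            index.append(API_FINGERS.index(finger) * 3 + API_JOINTS.index(joint))
    return index


API_TO_MJCF = build_api_to_mjcf_index()


def reorder_api_to_mjcf(api_values):
    """Return the 15 API-ordered values rearranged into MJCF actuator order."""
    if len(api_values) != 15:
        raise ValueError("expected 15 CRAFT values")
    return [api_values[i] for i in API_TO_MJCF]


print(API_TO_MJCF)
```

Building the table from named lists, instead of hard-coding fifteen integers, makes it easy to check the mapping against the XML when an actuator is renamed or moved.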

CRAFT mounted on a Panda arm in simulation. This was an asset and control validation step before teleoperation and policy tests.

HaMeR to CRAFT Teleoperation in MuJoCo

We modified the HaMeR teleoperation path so that a camera or video frame could drive the simulated CRAFT hand. Relative to the original HaMeR release, the important changes were not in HaMeR's neural network weights; they were in the system around it. We updated the hand-detection stack to work with modern MMPose/RTMPose, ran HaMeR live on detected right-hand boxes, extracted MANO wrist and finger rotations, smoothed the axis-angle outputs, calibrated per-joint min/max ranges, and converted those values into 15 CRAFT commands.
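
The smoothing step can be illustrated with an exponential moving average over the flattened axis-angle values. The filter type and the alpha value are assumptions; the text above only says the outputs were smoothed. Component-wise averaging of axis-angle vectors is also only a reasonable approximation when consecutive frames are close together.

```python
class ExponentialSmoother:
    """Per-component exponential moving average over a fixed-length vector."""

    def __init__(self, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha   # higher alpha follows new frames more closely
        self.state = None    # no history until the first frame arrives

    def update(self, values):
        """Blend a new frame into the running estimate and return a copy of it."""
        if self.state is None:
            self.state = list(values)
        else:
            self.state = [
                self.alpha * new + (1.0 - self.alpha) * old
                for new, old in zip(values, self.state)
            ]
        return list(self.state)


smoother = ExponentialSmoother(alpha=0.5)
first = smoother.update([0.0, 1.0])    # first frame passes through unchanged
second = smoother.update([1.0, 1.0])   # second frame is averaged with the first
print(first, second)
```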

The core mapping was deliberately simple: for each target CRAFT actuator, choose a HaMeR/MANO joint, choose the axis-angle component that best matched that CRAFT motion, normalize it through a calibration range, and invert signs where the CRAFT joint direction disagreed with the human-hand convention. For MuJoCo, those 15 CRAFT values then went through the CRAFT-to-MJCF reorder described above.
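
A minimal sketch of that per-actuator mapping follows. The JointMap fields, the example joint index, and the calibration numbers are hypothetical; only the sequence of steps (pick joint, pick component, flip sign, normalize, scale to the control range) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class JointMap:
    mano_joint: int      # index into the list of MANO joint rotations (hypothetical)
    axis: int            # which axis-angle component to read: 0, 1, or 2
    cal_min: float       # calibrated value for the fully open pose, after sign fix
    cal_max: float       # calibrated value for the fully closed pose, after sign fix
    sign: float = 1.0    # -1.0 where CRAFT and human conventions disagree


def to_unit_command(axis_angles, jm):
    """Map one axis-angle component to a clamped value in [0, 1]."""
    raw = jm.sign * axis_angles[jm.mano_joint][jm.axis]
    span = jm.cal_max - jm.cal_min
    unit = (raw - jm.cal_min) / span if span > 0 else 0.0
    return min(1.0, max(0.0, unit))


def to_ctrl(unit, ctrl_lo, ctrl_hi):
    """Scale a [0, 1] command into an actuator control range in radians."""
    return ctrl_lo + unit * (ctrl_hi - ctrl_lo)


# One MANO joint whose third component is negative when the finger curls.
frame = [[0.0, 0.0, -0.5]]
pip_map = JointMap(mano_joint=0, axis=2, cal_min=0.0, cal_max=2.0, sign=-1.0)
unit = to_unit_command(frame, pip_map)
print(unit, to_ctrl(unit, 0.0, 1.6))
```

Clamping after normalization keeps a noisy or out-of-calibration prediction from commanding a target outside the actuator range.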

HaMeR-driven CRAFT finger teleoperation in MuJoCo. The demo shows a side-by-side human hand input and simulated finger motion.

DexJoCo HaMeR teleoperation branch. The arm pose stays in the 7-value end-effector format while CRAFT receives direct 15-value controls.

Robosuite and DexJoCo Integration

After the presentation, we moved beyond finger-only teleoperation. In robosuite, we added a CraftHand gripper wrapper that loads the CRAFT MJCF and attaches it to a Panda arm through robosuite's gripper factory. The first useful scripts were not policies; they were rendering and rollout checks that made sure the hand could be attached, actuated, viewed, and stepped through robosuite's action path.

In DexJoCo, the existing single-arm action format was designed around Panda + Allegro: [end-effector xyz, quaternion, 16 Allegro joint targets]. CRAFT does not match that interface: it has five fingers, 15 actuators, different thumb orientation, different joint couplings, and thinner fingers. We therefore added two paths. For HaMeR teleoperation, CRAFT uses a direct 22-value action: [end-effector xyz, quaternion, 15 CRAFT targets]. For zero-shot policy tests, we kept a compatibility mapping from Allegro-shaped actions to CRAFT controls so trained DexJoCo policies could at least be evaluated without retraining.
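
The two action layouts can be sketched as plain lists. The packing and splitting follow the formats described above. The allegro_to_craft_naive function is a hypothetical stand-in for the compatibility mapping, not the mapping used in the project: it assumes the 16 Allegro targets are ordered index, middle, ring, thumb with four joints each, drops each finger's last joint because CRAFT couples DIP to PIP, copies the ring finger to the pinky, and ignores joint-range rescaling.

```python
POSE_DIM = 7       # end-effector xyz (3) + quaternion (4)
ALLEGRO_DIM = 16   # four fingers, four joints each
CRAFT_DIM = 15     # five fingers, three actuators each


def make_craft_action(xyz, quat, craft_targets):
    """Pack the direct 22-value action: pose followed by 15 CRAFT targets."""
    action = list(xyz) + list(quat) + list(craft_targets)
    if len(action) != POSE_DIM + CRAFT_DIM:
        raise ValueError("expected 3 + 4 + 15 values")
    return action


def split_allegro_action(action):
    """Split a 23-value Panda + Allegro action into pose and hand parts."""
    if len(action) != POSE_DIM + ALLEGRO_DIM:
        raise ValueError("expected 7 + 16 values")
    return list(action[:POSE_DIM]), list(action[POSE_DIM:])


def allegro_to_craft_naive(allegro):
    """Hypothetical remap to MJCF order: index, middle, ring, pinky, thumb."""
    index, middle, ring, thumb = (allegro[i:i + 3] for i in (0, 4, 8, 12))
    pinky = list(ring)  # Allegro has no pinky, so reuse the ring targets
    return list(index) + list(middle) + list(ring) + pinky + list(thumb)


policy_action = [0.0] * POSE_DIM + [float(i) for i in range(ALLEGRO_DIM)]
pose, allegro = split_allegro_action(policy_action)
craft_action = make_craft_action(pose[:3], pose[3:], allegro_to_craft_naive(allegro))
print(len(craft_action))
```

A remap of this kind only changes which numbers reach which actuators. It does nothing about thumb orientation, finger thickness, or contact geometry.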

Vision-based arm+CRAFT teleoperation in simulation. This exposed practical limitations: no haptic contact feedback, occasional hand tracking loss, and camera field-of-view constraints.

Zero-Shot Allegro Policy Transfer to CRAFT

The most interesting negative result came from zero-shot inference. DexJoCo policies trained on Allegro can perform tasks such as hammering or pick-and-place with the original hand. We replaced the hand with CRAFT and mapped the policy's Allegro action outputs into CRAFT controls. This was intentionally unfair to CRAFT in one sense: the policy had never learned the CRAFT kinematics, finger shape, thumb orientation, or contact geometry. But that is exactly why the experiment was useful. It tested whether an action-level retargeting layer alone was enough.

The answer was: sometimes partially, but not reliably. CRAFT's thinner fingers reduced contact surface, the mapped thumb often failed to oppose the other fingers, and the medial/distal curl could over-close relative to what the Allegro-trained policy expected. These failures suggest that CRAFT needs CRAFT-specific demonstrations, retargeted and physically plausible data, or policy fine-tuning in the CRAFT simulator, rather than a purely geometric action remap.

Baseline: the trained DexJoCo policy succeeds on pick-and-bucket with the original Allegro hand.

Zero-shot CRAFT mapping fails on the same style of task, showing that actuator remapping is not enough.

Baseline: Allegro hammer/nail policy succeeds in the original DexJoCo setup.

CRAFT-mapped hammer/nail rollout. The failure highlights thumb opposition and finger-curl mismatch.

A more favorable CRAFT hammer/nail rollout. This suggests the mapping can produce meaningful motion, but success depends heavily on contact geometry and task conditions.

What Worked, What Did Not

Result	Status	What we learned
Physical CRAFT hand build and I2RT mount	Worked for demos	The hand can be physically assembled and mounted, but practical teleoperation still needs better calibration and feedback.
HaMeR finger teleoperation in MuJoCo	Worked qualitatively	Vision-predicted MANO motion can drive CRAFT fingers after calibration, smoothing, actuator reorder, and control tuning.
Arm + CRAFT teleoperation in sim	Partially worked	Camera-based tracking can produce arm+hand motion, but tracking loss, lack of contact feedback, and field-of-view limits remain major issues.
Zero-shot Allegro-to-CRAFT policy transfer	Mostly failed, with partial successes	Action remapping alone cannot hide the kinematic and contact differences between Allegro and CRAFT.

Next Steps

The project suggests a clear next direction: stop spending most effort on teleoperation polish and use the simulator to build policy-learning assets. A strong follow-up would retarget prior Allegro or human manipulation datasets to CRAFT, reject physically implausible samples in MuJoCo, and train or fine-tune policies directly on the corrected CRAFT model. The specific fixes suggested by our failures are to improve thumb opposition, bring the CRAFT fingers closer together when grasping to increase contact area, reduce excessive distal curl, and evaluate whether contact-aware feedback or depth sensing improves teleoperation data quality.

In short, our project turned CRAFT into a working experimental target rather than just a hand design. The resulting system is still rough, but it now has the pieces needed for the next research question: can a five-finger compliant hand learn useful manipulation policies from retargeted simulation data?