Tutorial · June 24, 2026 · 12 min read

HandConnect Web Interface: Real-Time Hand Tracking in the Browser

HandConnect is a browser-based web app using MediaPipe hand tracking and webcam input to create a stunning rainbow string visual effect in real time.


TL;DR — HandConnect is a browser-based computer vision project that uses your webcam and MediaPipe to detect hands in real time, then visually connects them with a dynamic rainbow-colored string effect. No backend. No installation. Just open and play.

Figure: HandConnect live demo. A rainbow string connects two hands detected via webcam in real time.


Introduction

Computer vision used to be reserved for researchers with expensive hardware and Python environments. Not anymore.

With tools like MediaPipe and modern browser APIs, you can now run real-time hand tracking directly in the browser — no installation, no GPU required, just a webcam and a JavaScript file.

HandConnect Web Interface is exactly that kind of project. It's a browser-based web application that detects your hands through your webcam and visually connects them using a dynamic rainbow-colored string effect. It's part art, part engineering — and 100% built with open web technologies.

In this post, I'll walk you through how HandConnect works, the tech stack behind it, the challenges I ran into, and where it's headed next.


What is HandConnect Web Interface?

HandConnect Web Interface is a real-time, browser-based hand tracking application. It uses your device's webcam as input, processes each video frame using Google's MediaPipe Hands model, and renders a visual effect — a glowing rainbow-colored line — that dynamically connects detected hands.

The result is an interactive, visually satisfying experience that responds to your physical hand movements in milliseconds.

It requires:

  • A modern browser (Chrome recommended)
  • A working webcam
  • No installation, no sign-up, no backend

It's a great example of what's possible with JavaScript computer vision on the open web.

Figure: GitHub repository for HandConnect, showing the project structure and README. Open source and ready to explore.

GitHub Repository
Live Demo


How It Works

Here's the core flow of HandConnect, step by step:

1. Webcam Access
The browser's getUserMedia() API requests access to your webcam and streams live video into a <video> element.

2. Frame Capture
Each video frame is drawn onto an HTML <canvas> element using the Canvas 2D API.

3. Hand Detection via MediaPipe
MediaPipe Hands processes each frame and returns the coordinates of 21 hand landmarks per detected hand — fingertips, knuckles, wrist, and more.

4. Rainbow String Rendering
Using the detected landmark positions (specifically the wrist or palm center of each hand), a colorful gradient line is drawn between the two hands on the canvas layer using ctx.createLinearGradient().

5. Loop
This entire process runs on every animation frame via requestAnimationFrame(), creating a smooth, real-time experience.

javascript
// Simplified core loop
// MediaPipe delivers results through the onResults callback;
// hands.send() only submits a frame and resolves without a value.
hands.onResults((results) => {
  ctx.clearRect(0, 0, canvas.width, canvas.height);

  if (results.multiHandLandmarks && results.multiHandLandmarks.length === 2) {
    const hand1 = results.multiHandLandmarks[0][0]; // wrist of hand 1
    const hand2 = results.multiHandLandmarks[1][0]; // wrist of hand 2

    drawRainbowString(hand1, hand2, canvas, ctx);
  }
});

async function detect() {
  await hands.send({ image: videoElement });
  requestAnimationFrame(detect);
}

Note: The above is a simplified illustration of the detection loop. Refer to the actual repository for the complete implementation.
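
The flow above mentions the wrist or the palm center as possible anchor points. As a rough sketch, a palm center could be approximated by averaging the wrist (index 0) with the base joints of the four fingers (indices 5, 9, 13, and 17 in MediaPipe's 21-landmark hand model). The helper name palmCenter is hypothetical and is not taken from the repository.

```javascript
// Indices follow MediaPipe's 21-landmark hand model:
// 0 = wrist, 5/9/13/17 = base joints (MCP) of index, middle, ring, pinky.
const PALM_INDICES = [0, 5, 9, 13, 17];

// Hypothetical helper: average the palm landmarks into a single point.
// `landmarks` is one entry of results.multiHandLandmarks (normalized 0..1).
function palmCenter(landmarks) {
  const sum = PALM_INDICES.reduce(
    (acc, i) => ({ x: acc.x + landmarks[i].x, y: acc.y + landmarks[i].y }),
    { x: 0, y: 0 }
  );
  return {
    x: sum.x / PALM_INDICES.length,
    y: sum.y / PALM_INDICES.length,
  };
}
```

The returned point is still normalized, so it can be passed to drawRainbowString in place of a wrist landmark.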


Technologies Used

MediaPipe Hands

MediaPipe is an open-source framework by Google for building real-time ML pipelines. The Hands solution detects up to two hands by default (configurable through the maxNumHands option) and returns 21 3D landmarks per hand.

It runs entirely client-side using WebAssembly, with GPU acceleration through WebGL where available, making it fast enough for real-time use in the browser without any server calls.

html
<!-- Loading MediaPipe via CDN -->
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/hands/hands.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js"></script>

JavaScript (Vanilla)

HandConnect is written in vanilla JavaScript — no React, no Vue, no framework. This keeps the project lightweight, fast, and easy to understand for beginners.

The core logic handles:

  • Webcam stream setup
  • Canvas rendering
  • MediaPipe model initialization and inference
  • Rainbow gradient drawing

HTML5 Canvas API

The visual effect is rendered on an HTML <canvas> element layered on top of the video feed. The Canvas 2D context (ctx) is used to draw the rainbow gradient string between hand positions on every frame.

javascript
function drawRainbowString(p1, p2, canvas, ctx) {
  const gradient = ctx.createLinearGradient(
    p1.x * canvas.width, p1.y * canvas.height,
    p2.x * canvas.width, p2.y * canvas.height
  );
  gradient.addColorStop(0, "red");
  gradient.addColorStop(0.2, "orange");
  gradient.addColorStop(0.4, "yellow");
  gradient.addColorStop(0.6, "green");
  gradient.addColorStop(0.8, "blue");
  gradient.addColorStop(1, "violet");

  ctx.strokeStyle = gradient;
  ctx.lineWidth = 4;
  ctx.beginPath();
  ctx.moveTo(p1.x * canvas.width, p1.y * canvas.height);
  ctx.lineTo(p2.x * canvas.width, p2.y * canvas.height);
  ctx.stroke();
}
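
The six hard-coded color stops above could also be generated. The sketch below is one possible variation, not the repository's code: the helper names rainbowStops and applyStops are hypothetical, and the hue range of 0 to 270 degrees (red to violet in HSL) is an assumption chosen to resemble the original palette.

```javascript
// Hypothetical helper: build `count` evenly spaced rainbow color stops.
// Hue sweeps from 0 (red) to 270 (violet) in HSL space.
function rainbowStops(count) {
  const stops = [];
  for (let i = 0; i < count; i++) {
    const offset = count === 1 ? 0 : i / (count - 1);
    const hue = Math.round(offset * 270);
    stops.push({ offset, color: `hsl(${hue}, 100%, 50%)` });
  }
  return stops;
}

// Apply the stops to a CanvasGradient (or any object with addColorStop).
function applyStops(gradient, stops) {
  for (const { offset, color } of stops) {
    gradient.addColorStop(offset, color);
  }
  return gradient;
}
```

Inside drawRainbowString, the six addColorStop calls would then become applyStops(gradient, rainbowStops(6)), and raising the count gives a smoother blend.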

Webcam API (getUserMedia)

The browser's native MediaDevices.getUserMedia() API is used to access the device camera. It returns a media stream that is fed into a <video> element.

javascript
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
videoElement.srcObject = stream;
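
The two-line snippet above assumes the user grants permission and a camera exists. A slightly more defensive sketch follows. The function names, the 1280×720 ideal resolution, and the message wording are assumptions for illustration; facingMode and the error names (NotAllowedError, NotFoundError, NotReadableError) come from the getUserMedia specification.

```javascript
// Build getUserMedia constraints. facingMode "user" asks for the front
// camera and "environment" for the rear one; `ideal` lets the browser
// fall back to whatever is available. The resolution is an assumption.
function buildConstraints(facingMode = "user") {
  return {
    audio: false,
    video: {
      facingMode: { ideal: facingMode },
      width: { ideal: 1280 },
      height: { ideal: 720 },
    },
  };
}

// Map common getUserMedia rejection names to user-facing messages.
function describeCameraError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Camera permission was denied.";
    case "NotFoundError":
      return "No camera was found on this device.";
    case "NotReadableError":
      return "The camera could not be read; another app may be using it.";
    default:
      return "Could not start the camera.";
  }
}

// Hypothetical wrapper. In the browser, pass navigator.mediaDevices
// as `mediaDevices`; taking it as a parameter keeps the function testable.
async function startCamera(videoElement, mediaDevices, facingMode) {
  try {
    const stream = await mediaDevices.getUserMedia(buildConstraints(facingMode));
    videoElement.srcObject = stream;
    return { ok: true };
  } catch (err) {
    return { ok: false, message: describeCameraError(err) };
  }
}
```

In the page this might be called as startCamera(videoElement, navigator.mediaDevices, "user"), showing the returned message when ok is false.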



Key Features

  • Real-time hand detection — detects up to two hands simultaneously at high frame rates
  • Rainbow string visual effect — a dynamic gradient line connects the two detected hands
  • Zero installation — runs entirely in the browser, no setup needed
  • Privacy-first — all processing happens locally; no video data is sent to any server
  • Lightweight — built with vanilla JS and loaded via CDN; no heavy framework overhead
  • Mobile compatible — works on devices with front or rear cameras

Technical Architecture

Figure: HandConnect running in a browser, with the webcam feed and the rainbow string rendered on a canvas overlay.

At a high level, the architecture looks like this:

Figure: HandConnect architecture. Webcam input → MediaPipe inference → canvas rendering pipeline.

The key design decision was to overlay the canvas on top of the video element using CSS position: absolute, so the rainbow effect appears to float on top of the live webcam feed.
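
An overlay like this only lines up with the video if the canvas drawing buffer has the same dimensions as the video frames. A minimal sketch of that step follows; the helper name syncCanvasSize is hypothetical, while videoWidth and videoHeight are standard video element properties that stay at 0 until the stream's metadata has loaded.

```javascript
// Hypothetical helper: match the canvas drawing buffer to the video's
// intrinsic resolution so normalized landmarks land on the same pixels.
// Returns false while the video has no dimensions yet.
function syncCanvasSize(canvas, video) {
  if (!video.videoWidth || !video.videoHeight) return false;
  if (canvas.width !== video.videoWidth || canvas.height !== video.videoHeight) {
    // Assigning width or height also clears the canvas, so only do it on change.
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
  }
  return true;
}
```

One reasonable place to call it is a loadedmetadata listener on the video element, before the detection loop starts.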


Challenges and Solutions

Challenge 1: Coordinate Mapping

MediaPipe returns landmark positions as normalized values between 0 and 1. Rendering them on a canvas requires mapping these to actual pixel coordinates.

Solution: Multiply the normalized x and y values by canvas.width and canvas.height respectively.

javascript
const x = landmark.x * canvas.width;
const y = landmark.y * canvas.height;
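
The same mapping can be wrapped in a small helper. The sketch below also measures the on-screen distance between two landmarks, which has to happen after scaling because x and y are normalized against different dimensions. Both function names are hypothetical.

```javascript
// Convert a normalized landmark (0..1) to canvas pixel coordinates.
function toPixels(landmark, width, height) {
  return { x: landmark.x * width, y: landmark.y * height };
}

// Distance in pixels between two normalized landmarks.
function pixelDistance(a, b, width, height) {
  const p = toPixels(a, width, height);
  const q = toPixels(b, width, height);
  return Math.hypot(q.x - p.x, q.y - p.y);
}
```

A value like this could drive the string's line width or color, for example making it thinner as the hands move apart.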

Challenge 2: Canvas Mirroring

When using the front-facing webcam, the video appears mirrored — but the canvas wasn't. This created a disconnect between where your hands physically are and where the effect renders.

Solution: Apply a CSS transform to flip the video and canvas simultaneously:

css
video, canvas {
  transform: scaleX(-1);
}
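
An alternative, which is not what HandConnect does according to this post, is to flip only the video in CSS and mirror the landmark coordinates in code. That can help if the canvas later needs to show text or other content that should not appear reversed. The helper name mirrorLandmark is hypothetical.

```javascript
// Flip a normalized landmark horizontally: x = 0 becomes 1 and vice versa.
// Other fields (y, z) are copied unchanged.
function mirrorLandmark(landmark) {
  return { ...landmark, x: 1 - landmark.x };
}
```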

Challenge 3: Model Loading Latency

MediaPipe's WASM model takes a moment to load on the first visit, which can feel like the app is broken.

Solution: Show a loading indicator until the model is ready and the first frame is processed.
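
One way to know when the model is ready is to treat the first results callback as the signal. The sketch below assumes the page has an element with id "loader" and a results handler called drawFrame; both names, and the helper withReadySignal, are hypothetical. hands.onResults is the MediaPipe Hands callback registration.

```javascript
// Hypothetical helper: wraps a results handler and calls `onReady` once,
// the first time results arrive.
function withReadySignal(handleResults, onReady) {
  let ready = false;
  return (results) => {
    if (!ready) {
      ready = true;
      onReady();
    }
    handleResults(results);
  };
}

// Browser usage (assumes an element with id "loader" and a drawFrame handler):
// hands.onResults(withReadySignal(drawFrame, () => {
//   document.getElementById("loader").hidden = true;
// }));
```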


Step-by-Step: How to Open It

Step 1 — Open Chrome

Launch Chrome (recommended) or another modern web browser on your device.

Step 2 — Go to the Live Demo

Type or paste this link in your browser address bar and hit Enter:

👉 https://hand-connect-orpin.vercel.app/

Step 3 — Allow Camera Access

When the page loads, your browser will ask:

"hand-connect-orpin.vercel.app wants to use your camera"

Click Allow. Without this, the app can't see your hands.

Step 4 — Wait a Second

The app loads a small AI model in the background. Give it 2–3 seconds. Once it's ready, your webcam feed will appear on screen.

Step 5 — You're in! 🎉

That's all. The app is now running.


What You'll See on Screen

Once the app loads, you'll see:

  • Your live webcam feed — like a mirror showing you in real time
  • A dark overlay canvas on top of the video
  • Nothing special yet — until you raise both hands into view

The magic happens when both hands are visible to the camera at the same time.


How to Use It

1. Raise both hands in front of your webcam
Hold them up so the camera can clearly see both. Keep them roughly shoulder-width apart.

2. Watch the rainbow string appear
As soon as HandConnect detects both hands, it draws a colorful gradient line connecting your wrists.

3. Move your hands around
The line follows your hands in real time. Bring them close together, spread them far apart, or move them up and down, and the string moves with you.

4. Try it with a friend
You can use your own two hands, or have two people each put one hand in frame. HandConnect just needs to see two hands.

Performance Considerations

Running a machine learning model in the browser on every animation frame is computationally demanding. Here's what to keep in mind:

  • Frame rate — Performance depends on device hardware. Modern laptops typically run at 25–30 FPS; high-end machines can hit 60 FPS.
  • Model complexity — MediaPipe Hands offers a modelComplexity option (0 or 1). Setting it to 0 improves performance on lower-end devices.

javascript
const hands = new Hands({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`
});

hands.setOptions({
  maxNumHands: 2,
  modelComplexity: 0,      // 0 = faster, 1 = more accurate
  minDetectionConfidence: 0.7,
  minTrackingConfidence: 0.5
});

  • Canvas clearing — Clear the canvas on every frame using ctx.clearRect() to prevent ghost trails from previous frames.
  • Browser support — Chrome and Edge provide the best WebAssembly performance. Safari may have limitations.
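
To check which frame rate a given device actually reaches, a small meter can be updated once per processed frame. This is a sketch under assumptions: the helper name createFpsMeter and the smoothing factor of 0.9 are hypothetical, and the timestamp is expected in milliseconds, such as the value requestAnimationFrame passes to its callback.

```javascript
// Hypothetical helper: exponentially smoothed frames-per-second estimate.
// Call tick() once per processed frame with a millisecond timestamp.
function createFpsMeter(smoothing = 0.9) {
  let last = null; // timestamp of the previous frame
  let fps = 0;     // current smoothed estimate (0 until two frames are seen)
  return {
    tick(nowMs) {
      if (last !== null && nowMs > last) {
        const instant = 1000 / (nowMs - last);
        fps = fps === 0 ? instant : fps * smoothing + instant * (1 - smoothing);
      }
      last = nowMs;
      return fps;
    },
  };
}
```

If the reading stays low on a device, that would be a reasonable cue to set modelComplexity to 0.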

Future Improvements

HandConnect is already fun to use, but there's a lot of room to grow:

  • Gesture recognition — detect specific hand signs (peace sign, thumbs up) and trigger different visual effects
  • Multi-color themes — let users switch between different string color palettes
  • Sound integration — play tones or beats based on hand distance or position
  • 3D effect — use depth data or CSS 3D transforms to give the string a sense of depth
  • Recording / export — let users download a short clip of their HandConnect session
  • Mobile PWA — wrap it as a Progressive Web App for easy mobile access

Open Source Availability

HandConnect is fully open source. You can explore the code, fork it, and build on top of it.

GitHub Repository

If you find it useful or interesting, consider giving it a ⭐ on GitHub — it helps other developers discover the project.

Contributions are welcome! Whether it's fixing a bug, improving performance, or adding a new visual effect — open a PR and let's build together.


Conclusion

HandConnect Web Interface is a small project with a big idea behind it: computer vision should be accessible to everyone, right in the browser, with zero friction.

It shows how far web technologies have come — from static HTML pages to real-time ML inference running client-side. If you're a student, developer, or just someone curious about computer vision, HandConnect is a great starting point to explore what's possible.

Try it yourself:
Live Demo

Explore the source code:
GitHub Repository

If you build something cool on top of this, tag me on Instagram @code.akshat.in — I'd love to see it.


⭐ Enjoyed this project?
Star the repo on GitHub to show support and help other developers find it.
GitHub Repository


FAQ

What is HandConnect Web Interface?

HandConnect Web Interface is a browser-based web application that uses your webcam and the MediaPipe Hands machine learning model to detect your hands in real time and draw a dynamic rainbow-colored line connecting them on screen.

Does HandConnect require any installation?

No. HandConnect runs entirely in the browser. There is no backend, no app to download, and no account to create. Just open the link and allow webcam access.

What technologies does HandConnect use?

HandConnect is built with vanilla JavaScript, HTML5 Canvas API, the browser's getUserMedia webcam API, and Google's MediaPipe Hands for real-time hand landmark detection.

Is my webcam data sent to a server?

No. All hand tracking and processing happens locally in your browser using WebAssembly. Your video feed is never sent to any server.

Does HandConnect work on mobile?

Yes, HandConnect is compatible with mobile browsers that support getUserMedia. Chrome on Android works well. iOS Safari support may vary.

What is MediaPipe hand tracking?

MediaPipe Hands is an open-source ML solution by Google that detects up to two hands in a video frame and returns the coordinates of 21 landmarks per hand — including fingertips, knuckles, and the wrist — in real time.

Can I use HandConnect as a starting point for my own project?

Absolutely. HandConnect is open source. You can fork the repository, explore the code, and build your own hand tracking experiences on top of it.

GitHub Repository

What browser works best for HandConnect?

Google Chrome is recommended for the best WebAssembly and webcam performance. Microsoft Edge also works well. Firefox and Safari may have limited performance.

How accurate is the hand detection?

MediaPipe Hands achieves high accuracy under good lighting conditions. Accuracy may drop in low light or with fast hand movements. Setting minDetectionConfidence to 0.7 provides a reasonable balance between missed detections and false positives.

Where can I see more projects like HandConnect?

Check out my other build logs and projects on blog.akshatcodes.com and follow my build-in-public journey on Instagram @code.akshat.in.

Written by Akshat Singh

Hey, I'm Akshat — a full-stack dev, AI tinkerer, and relentless builder who documents every step of the journey. I share what I learn in real-time — dev tutorials, design insights, and AI + tech news.
