PepperPerception

PepperPerception is a lightweight perception service designed to provide object detection capabilities to the Pepper robot (or other applications) via a ZeroMQ interface.

It supports multiple backends:

Ultralytics YOLOv8: For object detection.
Google MediaPipe Holistic: For face, hand, and pose tracking.
Combined: Runs both YOLO and MediaPipe simultaneously.

This service is designed to run in a Docker container, preferably on a machine with an NVIDIA GPU.

Features

Multi-Backend: Switch between YOLO (Object Detection), MediaPipe (Holistic Tracking), or Combined.
ZeroMQ Interface: Exposes a fast and language-agnostic API using ZMQ (REP/REQ pattern).
Dockerized: Easy to deploy with all dependencies encapsulated.
GPU Accelerated: Configured to leverage NVIDIA GPUs for inference (YOLO).

Prerequisites

Docker
Docker Compose
NVIDIA Container Toolkit (Required for GPU support)

Installation & Running

Start the service:

docker-compose up --build

The service will start and listen on port 5557.

Configuration

You can configure the backend in docker-compose.yml.

YOLO Backend (Default):

command: [ "python", "-m", "perception_service.main", "--backend", "yolo", "--model", "yolov8m.pt" ]

MediaPipe Backend:

command: [ "python", "-m", "perception_service.main", "--backend", "mediapipe" ]

Combined Backend:

command: [ "python", "-m", "perception_service.main", "--backend", "combined" ]

API Documentation

Request Format

Send a multipart ZMQ message where the last frame contains the encoded image bytes.

Frame 0 (Optional): Metadata JSON string.
Frame 1 (Last): Encoded Image Bytes.

Response Format

The service replies with a JSON object.

YOLO Response:

{
  "status": "success",
  "backend": "yolo",
  "data": [
    {
      "class": "person",
      "confidence": 0.92,
      "bbox": [100.0, 50.0, 250.0, 400.0] 
    }
  ]
}

MediaPipe Response:

{
  "status": "success",
  "backend": "mediapipe",
  "data": {
    "pose_landmarks": [{"x": 0.5, "y": 0.5, "z": 0.0, "visibility": 0.9}, ...],
    "face_landmarks": [...],
    "left_hand_landmarks": [...],
    "right_hand_landmarks": [...]
  }
}

Combined Response:

{
  "status": "success",
  "backend": "combined",
  "data": {
     "detections": [...], // YOLO results
     "pose_landmarks": [...], // MediaPipe results
     "face_landmarks": [...],
     ...
  }
}

Tools & Benchmarking

The repository includes a benchmarking utility.

# Run benchmark (requires service to be running)
python benchmark.py

License

This project is licensed under the Apache License 2.0, see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.github/workflows		.github/workflows
perception_service		perception_service
.gitignore		.gitignore
CODEOWNERS		CODEOWNERS
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
benchmark.py		benchmark.py
debug_client.py		debug_client.py
docker-compose.yml		docker-compose.yml
requirements.txt		requirements.txt
test_client.py		test_client.py
validate.py		validate.py
webcam_client.py		webcam_client.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

PepperPerception

Features

Prerequisites

Installation & Running

Configuration

API Documentation

Request Format

Response Format

Tools & Benchmarking

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

PepperPerception

Features

Prerequisites

Installation & Running

Configuration

API Documentation

Request Format

Response Format

Tools & Benchmarking

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages