
🤖 AIfred – Your Clever Robotic Study Companion

AIfred System Diagram

AIfred is an interactive robotic lamp designed to enhance your learning experience. It responds to hand gestures, retrieves real-time information, and seamlessly blends digital and physical spaces using computer vision and AI. Built with ROS, Mediapipe, and Gemini API, it turns study time into an intuitive and focused conversation.

📺 Watch the full demo on YouTube:
AIfred YouTube Video
https://www.youtube.com/watch?v=YOUR_VIDEO_ID

Intro

This project introduces AIfred, the Clever Lamp, an innovative robotic lighting system designed to be the perfect companion for users. The system features a WX250s robotic arm from Trossen Robotics, equipped with a Kodak Mini projector mounted on its end effector. The Clever Lamp combines advanced robotics with customizable projection capabilities, creating a versatile and interactive lighting solution.

The robotic arm's precise movements allow the projector to illuminate and transform the surrounding environment with tailored images and videos. Users can effortlessly manipulate the lamp's position and projection content, offering a unique and personalized lighting experience. The simplicity of the 3D design ensures ease of use and installation, while the customization options open up a wide range of applications, from mood lighting and entertainment to educational and professional uses.

By integrating cutting-edge robotics with user-centric design, the Clever Lamp offers immense potential for creative and practical applications, making it an indispensable addition to modern living spaces.

The objective of the project is to give the user both a companion and a tool. With a simple webcam, AIfred can detect what is happening in the user's workspace and provide useful insights such as YouTube videos and Wikipedia links. Moreover, AIfred can see your workspace, interact with you by voice (speech-to-speech), and solve on-paper math for you.

Before running the code

To run the code you will need some prerequisites:

  1. OptiTrack system: a system of cameras that detects the position and orientation of certain objects thanks to the reflective ball markers.
  2. Install the natnet_ros_cpp ROS package to send messages from OptiTrack to your roscore.
  3. Install the interbotics_ws ROS package to move and interact with the wx250s robot arm from Trossen Robotics.
  4. 3D print our universal marker, an object that is easily detected by OptiTrack. We call it umh_0.
  5. 3D print our custom wx250s base for the robot arm. It has M3 screws that hold the marker balls in place so OptiTrack can detect the position of the robot base. We call it real_base_wx250s.

AIfred Usage (ROS Package)

Hardware set-up

  1. Setup:

    station setup
  2. Take the Trossen Robotics wx250s and secure it to the table. Connect it to its power supply and connect the signal USB cable to the computer.

  3. Mount the Kodak projector on the end effector of the Trossen Robotics robot arm (wx250s) (download this for the attachment).

  4. Attach a Chromecast to the HDMI port of the Kodak mini projector and power it with a USB cable running from the Chromecast to the projector.

  5. Connect a USB camera to the PC and point it at your working station.

  6. Turn on OptiTrack:

    a. Turn on the rigid body for the robot base marker (download real_base_wx250s here)

    b. Turn on the rigid body for the universal marker (download umh_0 here)

Install software

  1. Create a Gemini API key and a YouTube API key and put them in a .env file

  2. Download natnet_ros_cpp ROS package

  3. Download the Trossen Robotics ROS packages (our guide here)

  4. Download AIfred ROS Package:

    cd ~/catkin_ws/src
    git clone https://github.com/IERoboticsAILab/clever_lamp.git
    cd ..
    catkin build  #OR catkin_make
    . devel/setup.bash
    
  5. Open a Chrome tab (ideally signed in to an account with YouTube Premium); it will be used for casting YouTube videos

  6. Open a Firefox tab (it will be used for Wikipedia articles)

  7. Cast the Chrome tab to the Chromecast attached to the Kodak projector on the robot arm end effector

  8. Create a virtual environment and install all the dependencies for computer vision

Dependencies
```python
''' IMPORT MODULES '''
import sys
import os
import cv2                                   # webcam capture
import re
from PIL import Image as PIL_Image           # image handling for screenshots
import google.generativeai as genai          # Gemini API client
import mediapipe as mp                       # hand landmark detection
from dotenv import load_dotenv               # read API keys from .env
sys.path.append(os.path.join(os.path.dirname(__file__), '../scripts'))
from my_functions import countFingers, detectHandsLandmarks, search_yt, circular_list, open_url, format_math
import webbrowser                            # open Wikipedia links
import googleapiclient.discovery             # YouTube Data API client
import rospy                                 # ROS node interface
from tf.transformations import euler_from_quaternion
import geometry_msgs.msg
from geometry_msgs.msg import PoseStamped    # marker pose messages
import math
import subprocess
import pyautogui                             # programmatic screenshots and keypresses
import time
```
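The `.env` file from the install step is loaded through `load_dotenv` above. As a minimal stand-in sketch of what that loading does, assuming the key names `GEMINI_API_KEY` and `YOUTUBE_API_KEY` (check the project's scripts for the exact variable names they read):

```python
import os

# NOTE: these key names are assumptions -- check the scripts for the
# exact names they expect in the .env file.
REQUIRED_KEYS = ["GEMINI_API_KEY", "YOUTUBE_API_KEY"]

def load_env_file(path=".env"):
    """Minimal stand-in for dotenv.load_dotenv(): parse KEY=VALUE lines,
    skip blanks and '#' comments, and export the values to os.environ."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    os.environ.update(values)  # make keys visible to libraries reading the env
    return values
```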

Run demo

  1. Publishing messages from OptiTrack to ROS:

    roslaunch natnet_ros_cpp gui_natnet_ros.launch
    
    natnet setup
    • Check that the topics have been published using:

      rostopic list
      

      If NOT, switch between Multicast and Unicast and press start again until the messages are published

  2. Connect to the Trossen Robotics robot arm. Source the Interbotix workspace and run the control package:

    source interbotics_ws/devel/setup.bash
    roslaunch interbotix_xsarm_control xsarm_control.launch robot_model:=wx250s
    
  3. Make the robot follow and point at the marker. Launch clever_lamp.launch, which executes both nodes, brodcast_marker.py and clever_lamp.py:

    roslaunch alfred_clever_lamp clever_lamp.launch
    

    a. brodcast_marker.py: This part of the project combines the digital space with the real world through a user-friendly interface. In RViz the robot sits at (0,0,0), the world coordinate origin, but in reality the robot is at a different position in space (depending on where you place the working table). Here we take the OptiTrack coordinates of the real robot base (/natnet_ros/real_base_wx250s/pose) relative to the real marker (/natnet_ros/umh_2/pose), apply that relation to the digital robot base (wx250s/base_link), and publish a new tf for the marker (umh_2_new)

    b. clever_lamp.py: Reads the tf transform of the universal marker relative to the digital space and moves the end effector accordingly.
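The frame remapping that brodcast_marker.py performs can be illustrated with a simplified planar sketch. The real node works with full 3D poses and quaternions via tf; the function below is illustrative only, reducing the problem to (x, y, yaw):

```python
import math

def world_to_robot_frame(base_xy, base_yaw, marker_xy):
    """Express a marker position (given in the OptiTrack world frame) in
    the robot base frame, given the base pose in that same world frame.
    Planar simplification of the 3D tf transform published by the node."""
    # Translate so the robot base is the origin
    dx = marker_xy[0] - base_xy[0]
    dy = marker_xy[1] - base_xy[1]
    # Rotate by the inverse of the base yaw
    c, s = math.cos(-base_yaw), math.sin(-base_yaw)
    return (c * dx - s * dy, s * dx + c * dy)
```

The result is what the end-effector controller would consume: the marker position as if the robot base were at the world origin, matching the RViz convention.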

  4. Run the computer vision ROS Node:

    rosrun alfred_clever_lamp computer_vision.py
    

    a. computer_vision.py: Watches the webcam, uses Mediapipe to detect when you point at something with your finger, takes a screenshot, and shows a YouTube video and a Wikipedia article about what you are looking at. If it detects some math, it solves it step by step with you, projecting the solution onto the paper. To run this script you will need to create a virtual environment with all the dependencies and activate it when launching the alfred node (point at things → webcam → Gemini → video casted).
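The pointing detection is built on Mediapipe hand landmarks via the countFingers/detectHandsLandmarks helpers. As a rough sketch of the underlying idea only, assuming landmarks normalized with y growing downward (the real helpers also handle handedness, thumbs, and drawing):

```python
def is_pointing(landmarks):
    """Return True when only the index finger is extended.
    `landmarks` maps Mediapipe landmark names to normalized (x, y) points;
    y grows downward, so an extended finger has its tip ABOVE (smaller y
    than) its PIP joint. Simplified stand-in for the project's helpers."""
    fingers = {
        "INDEX":  ("INDEX_FINGER_TIP", "INDEX_FINGER_PIP"),
        "MIDDLE": ("MIDDLE_FINGER_TIP", "MIDDLE_FINGER_PIP"),
        "RING":   ("RING_FINGER_TIP", "RING_FINGER_PIP"),
        "PINKY":  ("PINKY_TIP", "PINKY_PIP"),
    }
    extended = {name: landmarks[tip][1] < landmarks[pip][1]
                for name, (tip, pip) in fingers.items()}
    # Pointing = index extended, all other fingers folded
    return extended["INDEX"] and not any(
        v for k, v in extended.items() if k != "INDEX")
```

When this gesture fires, the node grabs a webcam frame, sends it to Gemini, and opens the resulting YouTube/Wikipedia content.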

Usage

  1. Move the universal marker and the robot will follow, pointing the projector content at the table.

  2. Point at the workspace with your finger to trigger a screenshot, which is passed to the Gemini API to generate personalized content for you; the content is then projected onto the table.

  3. Lift the Marker to pause the YouTube video.

  4. Rotate the marker to show next/previous YouTube video.
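The marker gestures above can be sketched as a small dispatch function. The thresholds and the clockwise-means-next convention are assumptions for illustration, not the values used by the actual node:

```python
def marker_gesture(z, prev_yaw, yaw, table_z=0.0,
                   lift_threshold=0.15, turn_threshold=0.8):
    """Map marker pose changes to player actions.
    - lifted well above the table       -> pause the video
    - rotated past the turn threshold   -> next / previous video
    Thresholds (in meters / radians) are illustrative; tune for your setup."""
    if z - table_z > lift_threshold:
        return "pause"
    delta = yaw - prev_yaw
    if delta > turn_threshold:
        return "next"
    if delta < -turn_threshold:
        return "previous"
    return None  # no gesture detected
```

The next/previous actions would then index into the video list via a helper like the project's circular_list, wrapping around at either end.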

Resources

Demo

Combine the two parts of the project and this is what you will have:

Demo example: let's say we are looking at giraffes and are curious to know more about them. We can trigger the computer vision to tell us what it sees and send us a YouTube video about it. Then we can move the projector and project onto a specific area of the table. With the Chromecast attached to the projector, we get one more screen to improve our working setup.

real space result
RViz result
