Home
This wiki will serve as a research bank for all information relating to the AI_Vision project. It will be divided as follows.
### Detecting objects in frame
This stage will look for objects in the Xbox Kinect frame, determine each object's size and shape, and then determine which pixels it corresponds to in the regular camera image. This may later extend to 3-D modeling of objects, but that may be too processor-intensive for a real-time application.
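As a rough illustration of the idea, the sketch below finds the bounding box of everything within a distance band of a depth frame and scales that box into color-image coordinates. It rests on several assumptions not stated on this page: the Kinect frame is a depth image in millimetres, a single distance band is enough to isolate one object, and the depth and color frames cover the same field of view and differ only in resolution (a real Kinect needs a calibration step instead). The names `find_object_bbox` and `depth_bbox_to_color` are hypothetical.

```python
def find_object_bbox(depth_mm, near_mm, far_mm):
    """Return (row_min, row_max, col_min, col_max) of the pixels whose depth
    lies in [near_mm, far_mm], or None if no pixel qualifies."""
    if not depth_mm:
        return None
    rows = [r for r, row in enumerate(depth_mm)
            if any(near_mm <= d <= far_mm for d in row)]
    cols = [c for c in range(len(depth_mm[0]))
            if any(near_mm <= row[c] <= far_mm for row in depth_mm)]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def depth_bbox_to_color(bbox, depth_shape, color_shape):
    """Scale an inclusive depth-frame box to color-frame pixels.

    Assumption: both frames see the same field of view, so the mapping is
    a plain resolution rescale.
    """
    r0, r1, c0, c1 = bbox
    depth_h, depth_w = depth_shape
    color_h, color_w = color_shape
    return (r0 * color_h // depth_h,
            (r1 + 1) * color_h // depth_h - 1,
            c0 * color_w // depth_w,
            (c1 + 1) * color_w // depth_w - 1)


# Synthetic 6x8 depth frame: background at 4 m, one object at 1.2 m.
depth = [[4000] * 8 for _ in range(6)]
for r in range(2, 4):
    for c in range(3, 6):
        depth[r][c] = 1200

bbox = find_object_bbox(depth, 500, 2000)
color_bbox = depth_bbox_to_color(bbox, (6, 8), (12, 16))
```

The box size in pixels, combined with the measured depth, is what would later give a physical size estimate for the object.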
### Vision recognition algorithms
This stage will take the picture of a found object, along with its size and shape, and use a recognition algorithm (not yet chosen) to determine whether the object is already in the data banks or is not recognized. At this point we will also bring in user input to "talk" to the robot and help it decide what everything is.
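The recognition algorithm is still undecided, so the following is only a placeholder sketch of the control flow: compare an object's features against the stored ones, and fall back to asking the user when nothing is close enough. Nearest-neighbor matching on a small feature tuple is a stand-in chosen for brevity, not a decision for the project. The `recognize` function, the `ask_user` callback, and the `(width, height)` features are all hypothetical.

```python
import math


def recognize(features, known, max_distance, ask_user):
    """Return a label for `features`.

    `known` maps label -> reference feature tuple. If no reference lies
    within `max_distance`, ask the user and remember the answer.
    """
    best_label, best_dist = None, math.inf
    for label, reference in known.items():
        dist = math.dist(features, reference)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist

    if best_label is not None and best_dist <= max_distance:
        return best_label

    # Not recognized: bring in the user and store the new object.
    label = ask_user(features)
    known[label] = features
    return label


# Features here are (width_cm, height_cm); purely illustrative values.
known = {"mug": (8.0, 10.0)}
first = recognize((8.5, 10.0), known, 1.0, lambda f: "unknown")
second = recognize((30.0, 20.0), known, 1.0, lambda f: "chair")
```

In a real system `ask_user` would prompt through whatever interface is used to talk to the robot; here it is a lambda so the example runs on its own.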
### Algorithm storage
This will most likely take the form of a database that saves the data needed to recognize each object. It will be organized by the class hierarchy so that lookups move from general to specific; see below.
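One possible shape for such a database, sketched with Python's built-in `sqlite3` module: each category row points at its parent, so the general-to-specific chain can be rebuilt from any entry. The table name, column names, and the free-form `descriptor` column are assumptions made for illustration; the actual storage format has not been chosen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute(
    """
    CREATE TABLE categories (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL UNIQUE,
        parent_id  INTEGER REFERENCES categories(id),
        descriptor TEXT  -- placeholder for whatever recognition data is stored
    )
    """
)


def add_category(conn, name, parent=None, descriptor=None):
    """Insert a category, optionally under an existing parent category."""
    parent_id = None
    if parent is not None:
        row = conn.execute(
            "SELECT id FROM categories WHERE name = ?", (parent,)
        ).fetchone()
        if row is None:
            raise KeyError(parent)
        parent_id = row[0]
    conn.execute(
        "INSERT INTO categories (name, parent_id, descriptor) VALUES (?, ?, ?)",
        (name, parent_id, descriptor),
    )


def path_to(conn, name):
    """Return category names from the most general ancestor down to `name`."""
    path = []
    row = conn.execute(
        "SELECT name, parent_id FROM categories WHERE name = ?", (name,)
    ).fetchone()
    while row is not None:
        path.append(row[0])
        # A NULL parent_id matches no row, which ends the loop at the root.
        row = conn.execute(
            "SELECT name, parent_id FROM categories WHERE id = ?", (row[1],)
        ).fetchone()
    return path[::-1]


add_category(conn, "living being")
add_category(conn, "mammal", parent="living being")
add_category(conn, "dog", parent="mammal")
dog_path = path_to(conn, "dog")
```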
### Class hierarchy
This is how the algorithms will be stored in memory. Objects will be stored according to the groups they fall into, moving step by step from general to specific. For example, to find a dog you would first check whether the object falls into the living being category, then the mammal category, and so on down to the dog category.
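The general-to-specific descent described here can be sketched as a small tree walk. Each category carries a test, and classification only descends into a child when the parent's test passes. The `Category` class, the `classify` function, and the boolean properties used as tests (`moves_on_its_own`, `has_fur`, `barks`, `meows`) are hypothetical stand-ins for real recognition algorithms.

```python
class Category:
    """A node in the hierarchy: a name, a membership test, and sub-categories."""

    def __init__(self, name, matches, children=()):
        self.name = name
        self.matches = matches          # callable: object -> bool
        self.children = list(children)


def classify(node, obj):
    """Return category names from `node` down to the most specific match.

    Returns [] if the object does not even belong to `node`.
    """
    if not node.matches(obj):
        return []
    for child in node.children:
        path = classify(child, obj)
        if path:
            return [node.name] + path
    return [node.name]  # nothing more specific matched


dog = Category("dog", lambda o: o.get("barks", False))
cat = Category("cat", lambda o: o.get("meows", False))
mammal = Category("mammal", lambda o: o.get("has_fur", False), [dog, cat])
living = Category(
    "living being", lambda o: o.get("moves_on_its_own", False), [mammal]
)

dog_result = classify(
    living, {"moves_on_its_own": True, "has_fur": True, "barks": True}
)
furry_result = classify(living, {"moves_on_its_own": True, "has_fur": True})
chair_result = classify(living, {"moves_on_its_own": False})
```

A partial result such as `furry_result` (a mammal, but neither dog nor cat) marks the point where the system could ask the user for a more specific label.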
### Basic Logic and Reasoning
Finally, I know very little about this level of the project. We will be working very heavily with Sheyne's comprehend project. The final goal is to show the camera a video of a dog chasing a cat, tell it that the dog is chasing the cat, and then, when asked which one is the dog, have it indicate the dog.
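To make the dog-and-cat goal concrete, here is a toy sketch of the binding step: the vision side reports that one tracked object is chasing another, the user states "the dog is chasing the cat", and matching the two attaches the labels to the tracks. This does not use or represent the comprehend project's API; the `bind_labels` function, the triple format, and the track names are all hypothetical.

```python
def bind_labels(observed, statement):
    """Attach the user's labels to tracked objects.

    observed:  list of (subject_track, relation, object_track) seen in video
    statement: (subject_label, relation, object_label) told by the user
    Returns {label: track} when exactly one observation fits the relation,
    otherwise {} because the statement is ambiguous or unsupported.
    """
    subject_label, relation, object_label = statement
    candidates = [(s, o) for s, r, o in observed if r == relation]
    if len(candidates) != 1:
        return {}
    subject_track, object_track = candidates[0]
    return {subject_label: subject_track, object_label: object_track}


# The video shows track_1 chasing track_2; the user says what is happening.
observed = [("track_1", "chasing", "track_2")]
labels = bind_labels(observed, ("dog", "chasing", "cat"))

# "Which one is the dog?"
which_is_dog = labels.get("dog")
```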