Calico Kinect
This page describes an experimental connection to Microsoft's Kinect 3D camera and microphone.
To use this, you will need:
- A Microsoft Kinect (about $150) connected to a PC running Windows 7
- Microsoft's Kinect SDK installed on the Windows 7 server
- Rolf Lakaemper's Kinect TCP Server, installed and running on the Windows 7 computer (can be the same computer)
- Calico version 1.0.5 or greater running on the same or a different computer (a Mac, a Linux computer, or another Windows computer)
Overview
Microsoft's Kinect is a 3D camera that provides depth information and, more interestingly, skeletal joint positions. You can track up to 6 people at a time.
The Calico Kinect project uses a Windows 7 server to provide the data through the real Microsoft Kinect SDK; nothing has been reverse engineered. The Kinect TCP Server was written by Rolf Lakaemper and serves RGB, depth, and skeletal information over a port. Currently, you must send a request to the server each time you want more data. The Calico code runs on a client, which can be the Windows 7 server itself or a different computer running any operating system (Linux, Mac, or Windows).
Setup
Once the Kinect TCP Server is installed and running, connect to it from Calico Python:
import Kinect
client = Kinect.Client("127.0.0.1", 8001)
Client takes the name (or IP address) of the server machine and a port number. To make the port available to computers other than the one the server is running on, you need to allow access to the port through the firewall.
The parameters in your Calico code must match the parameters the server is running with. For example, if the server is serving 640x480 RGB data, then your client must use the same resolution.
Many computers can connect to the server simultaneously.
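If the client cannot connect, it can help to check that the server's port is reachable at all before involving the Kinect module. The following is a minimal sketch using only Python's standard socket module (assumed to be available in your Python). The helper name server_reachable and the 2-second timeout are illustrative choices, not part of Calico or the Kinect TCP Server.

```python
import socket

def server_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: it only tests that the port accepts connections,
    not that a Kinect TCP Server is the program listening on it.
    """
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Refused, timed out, blocked by a firewall, or unknown host name
        return False
    conn.close()
    return True

# Example: server_reachable("127.0.0.1", 8001)
```

A False result from another computer, combined with True on the server itself, usually points at the firewall rule described above.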
Functions
- Kinect.Client(server, port) - the main interface
- client.hello() - internal
- client.readByte() - internal
- client.initRGB(resX) - call to initialize the RGB API
- client.initDepth(resX) - call to initialize the Depth API
- client.startKinect(default) - set default to True to use the default settings
- client.readData() - internal
- client.readDepth() - get the Depth data
- client.readRGBImageArray() - returns RGB data in a format for display
- client.readDepthXYZ() - get Depth data as a point cloud
- client.readSkeleton() - get the Skeletal data
- client.write(byte, ...) - internal
- client.read(count) - internal
- client.close() - closes the stream and connection
- client.getJointPositions(data, skelIndex) - given Skeletal data and a skeleton index (1-based), get the Joint positions
- client.getJointSegments(data, skelIndex) - given Skeletal data and a skeleton index (1-based), get the Joint segment positions
- client.convertDepthToImageArray(depth) - returns Depth data in a format for display
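The examples on this page call the functions in a fixed order: initRGB, initDepth, startKinect, then the read functions, with close available to end the connection. A minimal sketch of that order is shown here, wrapped so that the connection is always closed. The helper name run_session and its work callback are hypothetical, and whether the server requires exactly this order is an assumption based on the examples.

```python
def run_session(client, rgb_width, depth_width, work):
    """Initialize a Kinect client, run work(client), and always close.

    Hypothetical helper; client is expected to be a Kinect.Client.
    """
    try:
        client.initRGB(rgb_width)      # must match the server's RGB width
        client.initDepth(depth_width)  # must match the server's depth width
        client.startKinect(False)      # False: do not use the default settings
        return work(client)
    finally:
        client.close()                 # close the stream and connection

# Example (inside Calico, with the server running):
# import Kinect
# depth = run_session(Kinect.Client("127.0.0.1", 8001), 640, 320,
#                     lambda c: c.readDepth())
```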
Examples
Camera
import Kinect
client = Kinect.Client("127.0.0.1", 8001)
client.initRGB(640)
client.initDepth(320)
client.startKinect(False)
depthValues = client.readDepth()
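To display depth values, each one has to be turned into a pixel position and a gray level. The sketch below does this on a plain Python list, assuming the values are laid out row by row (as the full example program on this page treats them) and that the image should be mirrored horizontally. The function name depth_to_pixels is hypothetical.

```python
def depth_to_pixels(depths, width):
    """Map a flat, row-by-row list of depths to mirrored (x, y, gray) tuples."""
    maximum = max(depths) if depths else 0
    pixels = []
    for v, depth in enumerate(depths):
        # Scale to 0..255; 255.0 forces floating-point division
        gray = int(depth * 255.0 / maximum) if maximum else 0
        x = width - 1 - v % width   # mirror horizontally, stay within 0..width-1
        y = v // width              # row number
        pixels.append((x, y, gray))
    return pixels
```

With the array returned by client.readDepth(), the values can first be copied into a list, for example [d[v,0] for v in range(d.GetUpperBound(0) + 1)].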
Skeleton
The skeleton comes in a 20-element matrix with the following position meanings:
- 0. HipCenter
- 1. Spine
- 2. ShoulderCenter
- 3. Head
- 4. ShoulderLeft
- 5. ElbowLeft
- 6. WristLeft
- 7. HandLeft
- 8. ShoulderRight
- 9. ElbowRight
- 10. WristRight
- 11. HandRight
- 12. HipLeft
- 13. KneeLeft
- 14. AnkleLeft
- 15. FootLeft
- 16. HipRight
- 17. KneeRight
- 18. AnkleRight
- 19. FootRight
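For readable code, the positions listed above can be kept in a lookup table instead of being used as bare numbers. This is a small sketch; the names JOINT_NAMES and JOINT_INDEX are illustrative and not part of the Kinect module, and how these positions line up with the rows returned by getJointPositions is not confirmed here.

```python
# Joint names in the order listed above (position 0 to 19)
JOINT_NAMES = [
    "HipCenter", "Spine", "ShoulderCenter", "Head",
    "ShoulderLeft", "ElbowLeft", "WristLeft", "HandLeft",
    "ShoulderRight", "ElbowRight", "WristRight", "HandRight",
    "HipLeft", "KneeLeft", "AnkleLeft", "FootLeft",
    "HipRight", "KneeRight", "AnkleRight", "FootRight",
]

# Reverse lookup: joint name -> position
JOINT_INDEX = dict((name, i) for i, name in enumerate(JOINT_NAMES))
```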
Example Programs
This example pulls all three components together: depth, RGB, and skeleton. Note that the Client takes a DNS name or IP address as a string. This example assumes the server is listening on localhost, 127.0.0.1.
import Kinect
from Graphics import *

client = Kinect.Client("127.0.0.1", 8001)
depthWidth, depthHeight = 320, 240
rgbWidth, rgbHeight = 640, 480
client.initRGB(rgbWidth)
client.initDepth(depthWidth)
client.startKinect(False)

win = Window("Depth", depthWidth, depthHeight)
pic = Picture(depthWidth, depthHeight)
pic.draw(win)
win2 = Window("RGB", rgbWidth, rgbHeight)
pic2 = Picture(rgbWidth, rgbHeight)
pic2.draw(win2)

def getDepth():
    # Depth values are indexed as [pixel, 0], one entry per pixel
    depthValues = client.readDepth()
    count = depthValues.GetUpperBound(0) + 1
    maximum = max([depthValues[v,0] for v in range(count)])
    if maximum == 0:
        return  # avoid dividing by zero
    for v in range(count):
        depth = depthValues[v,0]
        # 255.0 keeps the division floating point even for integer depths
        gray = int(depth * 255.0 / maximum)
        # Mirror horizontally; the -1 keeps x within 0..depthWidth-1
        pic.setPixel(depthWidth - 1 - v % depthWidth, v // depthWidth, Color(gray, gray, gray))

def getRGB():
    data = client.readRGB()
    pic2.fromArray(data, "BGRX")

lines = []

def getJoints():
    global lines
    s = client.readSkeleton()
    skeletons = s[0]
    # Remove the shapes drawn for the previous frame
    for line in lines:
        line.undraw()
    lines = []
    for index in range(1, skeletons + 1):
        joints = client.getJointPositions(s, index)
        segments = client.getJointSegments(s, index)
        z = joints[1,3]
        zscale = z/500.0
        for i in range(segments.GetUpperBound(0) + 1):
            if segments[i,4] != 0:
                x1 = 640 - int(segments[i,0]/zscale+320.0)
                y1 = 50 + int(-segments[i,1]/zscale+200.0)
                x2 = 640 - int(segments[i,2]/zscale+320.0)
                y2 = 50 + int(-segments[i,3]/zscale+200.0)
                line = Line((x1, y1), (x2, y2))
                line.setWidth(10)
                line.draw(win2)
                lines.append(line)
        headsize = 40
        head = Circle((640 - (segments[0,0]/zscale+320),
                       480 - 50 - (segments[0,1]/zscale+200)), headsize)
        head.fill = Color("yellow")
        head.draw(win2)
        lines.append(head)

def main():
    while win.IsRealized and win2.IsRealized:
        getRGB()
        getDepth()
        getJoints()

win.run(main)
This example just gets the skeleton data:
# D.S. Blank
# Kinect example: reading skeletons
from Graphics import *
import Kinect

client = Kinect.Client("colossus.brynmawr.edu", 8001)
client.startKinect(False)
rgbWidth, rgbHeight = 640, 480
win = Window("Skeleton", rgbWidth, rgbHeight)

# Global places for graphical objects:
bodies = {}
heads = {}

def getJoints():
    try:
        s = client.readSkeleton()
    except:
        return  # skip this frame if the read fails
    skeletons = s[0]
    for index in range(1, skeletons + 1):
        joints = client.getJointPositions(s, index)
        segments = client.getJointSegments(s, index)
        z = joints[1,3]
        zscale = z/500.0
        bodies[index] = bodies.get(index, {})
        for i in range(segments.GetUpperBound(0) + 1):
            if segments[i,4] != 0:
                x1 = 640 - int(segments[i,0]/zscale+320.0)
                y1 = 50 + int(-segments[i,1]/zscale+200.0)
                x2 = 640 - int(segments[i,2]/zscale+320.0)
                y2 = 50 + int(-segments[i,3]/zscale+200.0)
                # Reuse the line for this segment if it already exists
                if i in bodies[index]:
                    line = bodies[index][i]
                else:
                    line = Line((x1, y1), (x2, y2))
                    line.setWidth(10)
                    line.outline = Color("black")
                    line.draw(win)
                    bodies[index][i] = line
                line.set_points(Point(x1, y1), Point(x2, y2))
                line.update()
        hx = 640 - (segments[0,0]/zscale+320)
        hy = 480 - 50 - (segments[0,1]/zscale+200)
        # Reuse the head circle for this skeleton if it already exists
        if index in heads:
            head = heads[index]
        else:
            head = Circle((hx, hy), 35)
            head.fill = Color("yellow")
            head.draw(win)
            heads[index] = head
        head.moveTo(hx, hy)

while win.IsRealized:
    getJoints()
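Both example programs turn segment coordinates into window coordinates with the same arithmetic: divide by zscale (z/500.0), shift by fixed offsets, and mirror horizontally in a 640-pixel-wide window. The sketch below isolates that calculation as a pure function. The name project and the choice to return None when z is 0 are assumptions added here; the constants come from the examples.

```python
def project(x, y, z, width=640, scale=500.0):
    """Project a skeleton point to window coordinates, as the examples do.

    z is the value the examples read from joints[1,3].
    Returns None when z is 0, where the division would fail.
    """
    if z == 0:
        return None
    zscale = z / scale
    px = width - int(x / zscale + 320.0)   # mirrored horizontally
    py = 50 + int(-y / zscale + 200.0)     # window y grows downward
    return (px, py)

# Example: project(segments[i,0], segments[i,1], joints[1,3])
```

Keeping the arithmetic in one place makes it easier to adjust the offsets for a window of a different size.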
References
- http://research.microsoft.com/pubs/145347/BodyPartRecognition.pdf - "Real-Time Human Pose Recognition in Parts from Single Depth Images", by Shotton et al.
- http://www.youtube.com/watch?v=HNkbG3KsY84 - video
- Kinect.cs - client-side TCP code
- https://sites.google.com/a/temple.edu/kinecttcp/ - Kinect TCP Server





