From Brown University Robotics
Brown Robotics Projects (Current)
The SmURV platform (Small Universal Robotics Vehicle) is a comparatively cheap and easy-to-assemble robotics platform for educational, research, and hobby purposes. Essentially, the platform consists of an iRobot Create with a small computer and camera mounted on top. This guide covers the (solderless) assembly of the platform (with a subnotebook laptop) and a tutorial for use with the Player/Stage middleware package. Topics covered include installation of Player on the subnotebook running Ubuntu Eee, a tutorial for remote teleoperation using playerjoy and playercam, and a few demonstration videos of the robot under human control. Alternatively, instructions for using ROS with this platform are available from brown-ros-pkg.
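Player normally hides the serial protocol, but it can help to see what the Create base actually receives. The sketch below encodes the iRobot Create Open Interface "Drive" command (opcode 137), whose velocity and radius arguments are signed 16-bit big-endian values in mm/s and mm. The device path and the pyserial usage in the comment are assumptions; check your subnotebook's device listing for the actual port.

```python
import struct

def build_drive_packet(velocity_mm_s: int, radius_mm: int) -> bytes:
    """Encode a Create OI Drive command: opcode byte, then two big-endian int16s."""
    return struct.pack(">Bhh", 137, velocity_mm_s, radius_mm)

# Example: arc at 200 mm/s along a 500 mm turning radius.
packet = build_drive_packet(200, 500)

# Sending it would look roughly like this (requires pyserial; not run here):
# import serial
# port = serial.Serial("/dev/ttyUSB0", 57600)  # port path is an assumption
# port.write(bytes([128, 132]))                # START, then FULL mode
# port.write(packet)
```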
Structured light-based depth sensing with standard perception algorithms can enable mobile peer-to-peer interaction between humans and robots. The emergence of depth-based imaging allows robot perception of non-verbal cues in human movement in the face of lighting and minor terrain variations. Our iRobot Packbot-based system is capable of person following and responding to verbal and non-verbal commands under varying lighting conditions and uneven terrain, both indoors and outdoors.
We seek to enable users to teach personal robots arbitrary tasks, allowing robots to better serve users' wants and needs without explicit programming. Robot learning from demonstration is an approach well-suited to this paradigm, as a robot learns new tasks in new environments from observations of the task itself. Many current robot learning algorithms require the existence of basic behaviors that can be combined to perform the desired task. However, robots that exist in the world for long time frames may exhaust this basis set. In particular, a robot may be asked to perform an unknown task for which its built-in behaviors are not appropriate.
We explore the use of full-body 3D physical simulation for human kinematic tracking from monocular and multi-view video sequences within the Bayesian filtering framework. Towards greater physical plausibility, we consider a human's motion to be generated by a "feedback control loop", where Newtonian physics approximates the rigid-body motion dynamics of the human and the environment through the application and integration of forces. The result is more faithful modeling of human-environment interactions, such as ground contacts, resulting from collisions and the human's motor control.
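The Bayesian filtering loop above can be illustrated with a minimal bootstrap particle filter. This is a 1D stand-in sketch only: the random-walk propagation step plays the role that the full-body physics simulation plays in the actual system, and all noise parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation,
                         dynamics_noise=0.1, obs_noise=0.2):
    # Propagate: a random-walk model stands in for the physics simulation
    # that predicts each pose hypothesis forward in time.
    particles = particles + rng.normal(0.0, dynamics_noise, size=particles.shape)
    # Weight by observation likelihood (Gaussian image-likelihood stand-in).
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_noise) ** 2)
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < len(particles) / 2:
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

# Track a constant true state observed with noise.
particles = rng.normal(0.0, 1.0, size=500)
weights = np.full(500, 1.0 / 500)
true_state = 1.0
for _ in range(50):
    z = true_state + rng.normal(0.0, 0.2)
    particles, weights = particle_filter_step(particles, weights, z)

estimate = float(np.sum(weights * particles))  # posterior mean estimate
```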
Spatio-temporal Isomap (ST-Isomap) is an algorithm for registering a single time-series (or a set of time-series) against itself such that data points X corresponding to similar phases of the underlying spatio-temporal process are aligned (placed in close proximity within an estimated embedding Y). This algorithm emphasizes the notion of a registration kernel, where pairwise distance is indicative of how well two points align in space and time, and estimation of new coordinates that preserve distances produced by this kernel.
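The registration-kernel idea can be sketched as follows: start from plain spatial pairwise distances, then shrink the distance between pairs of points whose temporal successors are also mutual spatial neighbors, so that similar phases of the process are pulled together before embedding. The neighbor test and the constant `c_ctn` below are illustrative assumptions, not the published formulation.

```python
import numpy as np

def st_registration_distances(X, k=3, c_ctn=10.0):
    """Sketch of an ST-Isomap-style registration kernel over a time series X
    (one row per time step). Pairs that align in both space and time get
    their pairwise distance divided by c_ctn (an assumed constant)."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn = np.argsort(D, axis=1)[:, 1:k + 1]   # k spatial neighbours per point
    D_st = D.copy()
    for i in range(n - 1):
        for j in range(n - 1):
            # j aligns with i if it is a spatial neighbour of i AND their
            # temporal successors are also spatial neighbours.
            if j in knn[i] and (j + 1) in knn[i + 1]:
                D_st[i, j] /= c_ctn
                D_st[j, i] /= c_ctn
    return D_st

# Two noisy repetitions of the same short trajectory: corresponding phases
# of the two repetitions should end up much closer than raw distance says.
traj = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
X = np.vstack([traj, traj + 0.01])
D_st = st_registration_distances(X)
```

The resulting distance matrix would then feed the usual Isomap pipeline (shortest paths, then classical MDS) to produce the embedding Y.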
Human control of high-degree-of-freedom robotic systems is often difficult due to the overwhelming number of variables that must be specified. Instead, we propose the use of sparse subspaces embedded within the pose space of a robotic system. Driven by human motion, we address this sparse control problem by uncovering 2D subspaces that allow cursor control, or eventually decoded neural activity, to drive a robotic hand. To address the noise in neighborhood graph construction from motion capture encountered in previous work, we introduced a method for denoising these graphs before embedding hand motion into 2D spaces. Such spaces allow control of high-DOF systems through 2D interfaces such as a mouse-driven cursor or neural decoding. We present results demonstrating our approach to interactive sparse control for successful power grasping and precision grasping using a 13-DOF robot hand.
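The round trip from a 2D interface back to a full pose can be sketched with a linear subspace. PCA stands in here for the nonlinear embeddings used in the actual work; the 13-DOF toy data and all dimensions are illustrative assumptions.

```python
import numpy as np

def fit_pose_subspace(poses):
    """poses: (n_samples, n_dof) joint-angle matrix. Returns the data mean
    and the top-2 principal directions (a 2 x n_dof basis)."""
    mean = poses.mean(axis=0)
    _, _, Vt = np.linalg.svd(poses - mean, full_matrices=False)
    return mean, Vt[:2]

def cursor_to_pose(cursor_xy, mean, basis):
    """Lift a 2D cursor position to a full n_dof hand pose."""
    return mean + cursor_xy @ basis

# Toy data: 13-DOF "hand poses" that really vary along 2 latent directions,
# mimicking the low intrinsic dimensionality of grasping motion.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 13))
poses = latent @ mixing + 0.01 * rng.normal(size=(200, 13))

mean, basis = fit_pose_subspace(poses)
pose = cursor_to_pose(np.array([0.5, -0.2]), mean, basis)  # 13-DOF pose
```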
We explore the use of manifold learning techniques to uncover structure in sensorimotor time series from teleoperated humanoid manipulation tasks. Data from Robonaut, NASA's humanoid robot, was recorded while it was being teleoperated through four tool manipulation tasks. We show that one algorithm, Spatio-Temporal Isomap, is capable of uncovering behavioral structures that can be difficult to find with other dimension reduction techniques (Principal Component Analysis, Multidimensional Scaling, and Isomap).
We explore Markov random fields (MRFs) as a probabilistic mathematical model for unifying approaches to multi-robot coordination or, more specifically, distributed action selection. We describe how existing methods for multi-robot coordination fit within an MRF-based model and how they conceptually unify. Further, we offer belief propagation on a multi-robot MRF as an alternative approach to distributed robot action selection.
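A toy instance of this idea: three robots on a chain-structured MRF each choose one of two patrol regions, with a pairwise potential that discourages neighbors from covering the same region. Max-product belief propagation then yields each robot's action from its local beliefs. All potential values below are illustrative assumptions.

```python
import numpy as np

unary = np.array([[2.0, 1.0],   # robot 0 prefers region 0
                  [1.0, 1.2],   # robot 1 mildly prefers region 1
                  [1.0, 2.0]])  # robot 2 prefers region 1
pair = np.array([[0.2, 1.0],    # low compatibility when neighbours match
                 [1.0, 0.2]])

def max_product_chain(unary, pair):
    n, k = unary.shape
    fwd = [np.ones(k)]                     # message arriving at robot 0
    for i in range(1, n):
        incoming = unary[i - 1] * fwd[i - 1]
        fwd.append(np.max(pair * incoming[:, None], axis=0))
    bwd = [np.ones(k) for _ in range(n)]   # messages from the right end
    for i in range(n - 2, -1, -1):
        incoming = unary[i + 1] * bwd[i + 1]
        bwd[i] = np.max(pair * incoming[None, :], axis=1)
    # Each robot's belief combines its unary potential with both messages.
    beliefs = unary * np.array(fwd) * np.array(bwd)
    return beliefs.argmax(axis=1)

actions = max_product_chain(unary, pair)   # one region index per robot
```

On a chain, these messages are exact; on loopy multi-robot graphs the same updates run iteratively as (approximate) loopy belief propagation.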
We are currently developing a book as a resource for other students interested in robotics to get up and running with a robot platform or for professors interested in teaching a similar robotics course.
Brown Robotics Projects (Older)
There is currently a division between real-world human performance and the decision making of socially interactive robots. Specifically, a robot's decision making needs information about the decision making of its human collaborators. This circumstance is partially due to the difficulty of estimating human cues, such as pose and gesture, from robot sensing. Towards crossing this division, we present a method for kinematic pose estimation and action recognition from monocular robot vision through the use of dynamical human motion vocabularies.
We present a methodology for articulating and posing meshes, in particular facial meshes, through a 2D sketching interface. Our method establishes an interface between 3D meshes and 2D sketching with the inference of reference and target curves. Reference curves allow for user selection of features on a mesh and their manipulation to match a target curve. Our articulation system uses these curves to specify the deformations of a character rig, forming a coordinate space of mesh poses. Given such a coordinate space, our posing system uses reference and target curves to find the optimal pose of the mesh with respect to the sketch input. We present results demonstrating the efficacy of our method for mesh articulation, mesh posing with articulations generated in both Maya and our sketch-based system, and mesh animation using human features from video. Through our method, we aim to both provide novice-accessible articulation and posing mesh interfaces and rapid prototyping of complex deformations for more experienced users.
Dynamo (DYNAmic MOtion capture) is an approach to controlling animated characters in a dynamic virtual world. Leveraging existing methods, characters are simultaneously physically simulated and driven to perform kinematic motion (from mocap or other sources). Continuous simulation allows characters to interact more realistically than methods that alternate between ragdoll simulation and pure motion capture. The novel contributions of Dynamo are world-space torques for increased stability and a weak root spring for plausible balance. Promoting joint target angles from the traditional parent-bone reference frame to the world-space reference frame allows a character to set and maintain poses robust to dynamic interactions. It also produces physically plausible transitions between motions without explicit blending.
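The world-space servo idea can be sketched for a single joint: the target angle is expressed in the world frame rather than relative to the parent bone, so disturbing the parent does not drag the child's target along with it. The PD gains, inertia, and time step below are illustrative assumptions, not Dynamo's actual parameters.

```python
def world_space_pd(theta_world, omega, theta_target_world, kp=50.0, kd=5.0):
    """PD torque driving a 1-DOF joint toward a world-frame target angle."""
    return kp * (theta_target_world - theta_world) - kd * omega

# Integrate one joint under the servo (semi-implicit Euler).
theta, omega = 0.0, 0.0      # world-frame angle (rad) and angular velocity
inertia, dt = 1.0, 0.01
for _ in range(1000):
    tau = world_space_pd(theta, omega, theta_target_world=0.8)
    omega += (tau / inertia) * dt
    theta += omega * dt      # settles at the world-frame target
```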
We address the symbol grounding problem for robot perception through a data-driven approach to deriving categories from robot sensor data. Unlike model-based approaches, where human intuitive correspondences are sought between sensor readings and features of an environment (corners, doors, etc.), our method learns intrinsic categories (or natural kinds) from the raw data itself. We approximate a manifold underlying sensor data using Isomap nonlinear dimension reduction and apply Bayesian clustering (Gaussian mixture models) with model identification techniques to discover categories (or kinds). We demonstrate our method through the learning of sensory kinds from trials in various indoor and outdoor environments with different sensor modalities. Learned kinds are then used to classify new sensor data (out-of-sample readings). We present results indicating greater consistency in classifying sensor data employing mixture models in non-linear low-dimensional embeddings.
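The clustering-with-model-identification step can be illustrated with a small diagonal-covariance Gaussian mixture fit by EM, selecting the number of "kinds" by BIC. This is a generic sketch on toy 2D data (standing in for low-dimensional embeddings of sensor readings); the initialization heuristic and data are assumptions, not the group's pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)

def gmm_em(X, k, iters=100):
    """Fit a diagonal-covariance GMM by EM; return the log-likelihood."""
    n, d = X.shape
    # Deterministic farthest-point initialisation of the means.
    means = [X[0]]
    for _ in range(k - 1):
        dists = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(dists)])
    means = np.array(means, dtype=float)
    var = np.ones((k, d))
    w = np.full(k, 1.0 / k)
    ll = -np.inf
    for _ in range(iters):
        # E-step: log responsibilities under each component.
        logp = (-0.5 * np.sum((X[:, None, :] - means) ** 2 / var
                              + np.log(2 * np.pi * var), axis=2) + np.log(w))
        m = logp.max(axis=1, keepdims=True)
        r = np.exp(logp - m)
        ll = np.sum(m.squeeze() + np.log(r.sum(axis=1)))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = r.sum(axis=0) + 1e-9
        w = nk / n
        means = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - means ** 2 + 1e-6
    return ll

def bic(X, k):
    n, d = X.shape
    n_params = k * 2 * d + (k - 1)   # means, variances, mixing weights
    return -2 * gmm_em(X, k) + n_params * np.log(n)

# Toy "sensor readings" in a 2D embedding: two well-separated kinds.
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(5.0, 0.5, (100, 2))])
best_k = min([1, 2, 3], key=lambda k: bic(X, k))  # BIC recovers two kinds
```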
We present a modular system for untethered real-time kinematic motion capture using sensors with inertial measurement units (IMUs). The system comprises a set of small, lightweight inertial sensors. Each sensor provides its own global orientation (3 degrees of freedom) and is physically and computationally independent, requiring only external communication. Orientation information from the sensors is communicated wirelessly to a host computer for processing. The untethered motion capture system has been used to teleoperate NASA's Robonaut.
Prof. Jenkins' dissertation research focused on the problem of automatically deriving behavior modules from human motion data for controlling humanoid robots and classifying human motion.
Collecting motion data is an important tool for controlling robots. Traditional motion capture approaches usually rely on labeled passive markers and suffer from several problems, such as occlusions and cumbersome equipment. In the past few years, markerless, unconstrained posture estimation using only cameras has received much attention from computer vision researchers. One such method is the volume-based approach: instead of deriving kinematic models directly from 2D images, it first builds an intermediate 3D volume of the captured subject, and then fits a 3D body model to the volume data. Here we propose an approach for model-free, markerless, volume-based motion capture of humans. It is centered on generating underlying nonlinear axes (or a skeleton curve) from a volume of a human subject captured by multiple calibrated cameras.
We present a model of movement imitation, implemented for a simulated humanoid that imitates the behavior of a human performer. The tracking data used for this implementation were obtained using a 2.5D upper-body video-based pose tracking system. The attention mechanism in this system focuses on the locations of the endpoints (i.e., the hands). In place of a learned set of primitives, a human subject performed a sequence of motions, including line, circle, and arc trajectories of the endpoints, to yield a set of movements that serve as perceptual-motor primitives. Using these primitives, we implemented a vector-quantization-based classification mechanism. With postprocessing of the classification results, the classifier provides a desired via-point trajectory for each arm endpoint. These trajectories are then actuated using impedance control on our 20-DOF humanoid simulation, Adonis.
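The vector-quantization classification step can be sketched as nearest-codeword matching: each primitive contributes prototype feature vectors to a codebook, and a new trajectory segment is labeled by its closest codeword. The feature choice (a resampled, flattened 2D endpoint path) and the toy prototypes below are assumptions for illustration.

```python
import numpy as np

def make_codebook(examples):
    """examples: dict mapping label -> list of flattened trajectory features."""
    codebook = []
    for label, feats in examples.items():
        for f in feats:
            codebook.append((label, np.asarray(f, dtype=float)))
    return codebook

def classify(feature, codebook):
    """Label a segment by its nearest codeword (Euclidean distance)."""
    feature = np.asarray(feature, dtype=float)
    label, _ = min(codebook, key=lambda c: np.linalg.norm(feature - c[1]))
    return label

# Toy prototypes: a straight line and a circle, each sampled at 8 points.
t = np.linspace(0.0, 1.0, 8)
line = np.column_stack([t, t]).ravel()
circle = np.column_stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)]).ravel()
codebook = make_codebook({"line": [line], "circle": [circle]})

result = classify(line + 0.05, codebook)   # a slightly shifted line segment
```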