Using machine learning to teach robots to get dressed

In the SIGGRAPH Asia 2018 paper Learning to Dress: Synthesizing Human Dressing Motion via Deep Reinforcement Learning, a Georgia Institute of Technology/Google Brain research team describe how they taught body-shame to an AI, leaving it with an unstoppable compulsion to clothe itself before the frowning mien of God.

The team uses reinforcement learning to "automatically discover robust dressing techniques," and manages to train a working get-dressed model despite the high computational expense of simulating cloth.

The paper's fascinating: the secret to getting an AI to get dressed is haptics, a sense of touch that is used to dynamically retune the AI's coordination as it adjusts to the rippling, slithering, treacherous textiles. The model also incorporates the ripping point of the cloth and penalizes AIs that rend their garments asunder while clothing themselves.

In this task, a t-shirt is initialized on the character’s shoulders with the character’s neck contained within the collar. To randomize the initial garment state, we apply a random impulse force with fixed magnitude to all the garment vertices at the beginning of the simulation. We allow the garment to settle for 1s before the character begins to move.
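For the curious, the randomization step in that passage might look something like the sketch below. It is not the authors' code. It assumes a single randomly oriented impulse shared by every vertex (the excerpt does not say whether the direction is shared or sampled per vertex), and the names `apply_random_impulse` and `settle_step_count`, the masses, and the time step are all made up for illustration.

```python
import math
import random


def random_unit_vector(rng):
    """Sample a direction uniformly on the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:  # reject the (vanishingly rare) near-zero sample
            return [c / norm for c in v]


def apply_random_impulse(velocities, masses, magnitude, rng):
    """Return new vertex velocities after one shared impulse of fixed magnitude.

    Assumption: the same impulse vector is applied to every garment vertex,
    so each vertex velocity changes by impulse / mass.
    """
    direction = random_unit_vector(rng)
    impulse = [magnitude * c for c in direction]
    return [
        [v[i] + impulse[i] / m for i in range(3)]
        for v, m in zip(velocities, masses)
    ]


def settle_step_count(settle_time, dt):
    """Number of simulation steps to run before the character may move."""
    return round(settle_time / dt)


if __name__ == "__main__":
    rng = random.Random(0)
    velocities = [[0.0, 0.0, 0.0] for _ in range(4)]  # garment at rest
    masses = [0.01] * 4                               # hypothetical vertex masses
    kicked = apply_random_impulse(velocities, masses, magnitude=0.005, rng=rng)
    print(kicked[0], settle_step_count(1.0, 0.01))
```

Only the direction is random here, so every episode starts with a shove of the same strength, and the one-second settle gives the cloth time to come to rest in a slightly different drape each time.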

The first control policy completes the task of moving the right end effector into gripping range of the specified grip feature. The policy attempts to match a given position and orientation target in the garment feature space. Once the error threshold is reached, control transitions to an alignment policy designed to “tuck” the left end effector and forearm under the waist feature of the garment in preparation for dressing the arm. This policy attempts to contain the arm within the triangle formed by the gripping hand and the shoulders. This heuristic approximates the opening of the garment waist feature. In addition, this policy is rewarded for contact with the garment interior and penalized for geodesic distance from a selected point on the interior of the garment. Once interior contact is detected, and the arm is within the heuristic triangle, control is transitioned to the left sleeve dressing controller which attempts to minimize the end effector contact geodesic distance from the end feature of the sleeve and maximize the containment depth of the arm within the sleeve entrance feature. A task vector is provided which indicates the direction the end effector should move to decrease its contact geodesic distance (or points to the garment feature if not in contact). Once the limb has passed a threshold distance through the sleeve, the re-grip controller directs the hands together into position to exchange grip from right hand to left. Once the left hand is within a threshold distance of its gripping target, the grip exchange is triggered and control is transitioned to the second “tuck” control policy with the same purpose and transition criteria as the first. The second sleeve policy is then run to pass the right arm through the right sleeve. At this point, the seventh and final policy is used to guide the character back to its start pose while avoiding garment tearing.
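The sequence of seven policies described above amounts to a small state machine: each policy runs until its transition criterion fires, then hands control to the next. A minimal sketch follows, assuming hypothetical state fields (`grip_error`, `interior_contact`, `in_triangle`, `limb_depth`, `regrip_distance`) and placeholder thresholds. The paper's learned policies and actual observation space are not reproduced here, only the hand-off logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    done: Callable[[dict], bool]  # transition criterion for this stage


def build_stages(grip_tol, regrip_tol, sleeve_depth):
    """Build the seven-stage sequence; all thresholds are illustrative."""
    def tucked(s):
        # Interior contact detected and arm inside the heuristic triangle.
        return s["interior_contact"] and s["in_triangle"]

    def through_sleeve(s):
        # Limb has passed a threshold distance through the sleeve.
        return s["limb_depth"] >= sleeve_depth

    return [
        Stage("grip_right", lambda s: s["grip_error"] <= grip_tol),
        Stage("tuck_left", tucked),
        Stage("sleeve_left", through_sleeve),
        Stage("regrip", lambda s: s["regrip_distance"] <= regrip_tol),
        Stage("tuck_right", tucked),
        Stage("sleeve_right", through_sleeve),
        Stage("return_to_start", lambda s: False),  # final stage, never exits
    ]


class PolicySequencer:
    """Advance through the stages, at most one transition per update."""

    def __init__(self, stages):
        self.stages = stages
        self.index = 0

    @property
    def active(self):
        return self.stages[self.index].name

    def update(self, state):
        is_last = self.index == len(self.stages) - 1
        if not is_last and self.stages[self.index].done(state):
            self.index += 1
        return self.active


if __name__ == "__main__":
    seq = PolicySequencer(build_stages(0.05, 0.05, 0.3))
    state = {"grip_error": 0.01, "interior_contact": False,
             "in_triangle": False, "limb_depth": 0.0, "regrip_distance": 1.0}
    print(seq.update(state))  # grip criterion met, so control moves on
```

In a real controller the state fields would refer to whichever arm is currently being dressed, and the active stage name would select which trained policy produces the next action.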

Learning to Dress: Synthesizing Human Dressing Motion via Deep Reinforcement Learning [Alexander Clegg, Wenhao Yu, Jie Tan, C. Karen Liu and Greg Turk/ACM Transactions on Graphics, Vol. 37, No. 6]

(via JWZ)