2006 Scavenger Hunt Judging Guidelines Proposal

Motivation of judging guidelines

  The AAAI Scavenger Hunt challenges robots to locate and
obtain objects within the conference's natural setting -
a hotel or convention center. In order to encourage a wide variety
of AI researchers to field a system, the 2006 scavenger hunt
will scaffold its challenge through five different facets of
intelligent interaction with the physical environment:

  A) spatial reasoning
  B) object recognition
  C) language and communication
  D) human awareness
  E) task awareness and planning

  In each of these five, there are several different levels
of competence that a robotic system might demonstrate. For
ease of reference, these levels are labeled in
decreasing order of ability:

  1) Concierge
  2) Gofer
  3) Scavenger
  4) Wanderer
  5) Slacker

  This leads to a matrix of possible capabilities that
scavenger hunt systems might demonstrate.

  Some participants might focus on achieving a high level of
competence in only one facet; others might put more effort
into integrating these components. Both types of entry
are welcome, and we will work with all participants to
make the scavenger hunt a motivating and rewarding venue.

  Note that this framework merely wraps the previous
scavenger hunt with some additional directions that teams
might pursue. Any entry designed with the prior rules in mind
can still participate fully and without change.

  Here are the original point values for reasoning about
and handling the objects listed on the Scavenger Hunt Item Page.

  1. locate an object and approach it (5 pts each)
  2. identify objects based on color and shape (10 pts each)
  3. indicate the relative positions of the objects found (~20 points)
  4. find and move an object to a designated location (30 points)
  5. create a map (of some sort), including the objects found (30 points)

  In addition, these values may be increased or decreased by the judges
depending on the factors described in the tables below.

  The tables below are meant as a guide for how the judges
will try to compare systems that take very different approaches to
the spatial reasoning tasks. They are also intended
to provide some additional structure, e.g., to distinguish
"human awareness" here from the separate human/robot interaction
event at AAAI. An advantage of the AAAI robot competition is that
all participants are welcome to tailor the venue to help motivate
their own particular projects. Thus, these guidelines will change to
meet participants' needs and interests.
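
  As a rough illustration of how the original point values listed
above add up, the following Python sketch tallies a hypothetical run.
The task names, the base_score function, and the adjustment parameter
are assumptions made for this example and are not part of the rules.
The "~20 points" for relative positions is treated as exactly 20, and
every item is assumed to be awarded once per occurrence.

```python
# Base point values from the original scavenger hunt rules.
# The dictionary keys are hypothetical names chosen for this sketch.
BASE_POINTS = {
    "locate_and_approach": 5,       # 5 pts each
    "identify_by_color_shape": 10,  # 10 pts each
    "relative_positions": 20,       # "~20 points", taken as exactly 20 here
    "move_to_location": 30,
    "create_map": 30,
}


def base_score(counts, adjustment=0):
    """Sum base points for the achievements in `counts`.

    `counts` maps a task name to how many times it was achieved.
    `adjustment` stands in for the judges' discretionary increase or
    decrease; how judges arrive at it is not modeled here.
    """
    total = sum(BASE_POINTS[task] * n for task, n in counts.items())
    return total + adjustment
```

  For instance, base_score({"locate_and_approach": 3,
"identify_by_color_shape": 2, "create_map": 1}) gives 65 under these
assumptions, before any adjustment by the judges.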

  Disclaimers: these facets are not mutually exclusive. (Who would
claim AI was perfectly modular?) Also, one might say that a
"concierge"-level system in all five -- or even just one --
spatial reasoning category goes improbably far toward "solving
the AI problem." Whether or not this is true, it is perhaps not
a bad thing to keep such lofty goals in mind, even if they are still distant.

  Additional Disclaimers: there is no way to BOTH
encourage a wide variety of approaches to spatial-reasoning tasks AND
to create a completely objective means of judging very different
entries. The judges reserve the right to present multiple
first-place awards, if the conditions warrant. Technical awards for
particular behavioral strengths or specializations will also be considered.

Human Awareness

  Capability  Points  Initial description
  Slacker     0 pts   No awareness of agents outside the system itself, i.e., humans are annoyingly shaped walls that may or may not correspond to the floorplan. (What floorplan?!)
  Wanderer    5 pts   At this level, the system would show the ability to handle disinterested humans robustly, e.g., it will pursue its task goals even when people walk by the system at ordinary speed, pause to look at it, etc. People will not try to interfere with the system or its functioning at this level. Also, they will be relatively sparse in time and space: perhaps at most 2 people, at most once every minute, and for at most 15 seconds per "interaction."
  Scavenger   10 pts  Here, a system will not only respond robustly to disinterested humans, but will succeed (some/most of the time) in realizing and expressing the fact that a human is "interacting." At this level, the system should also become an active agent in this exchange, though not (necessarily) in a typical human/robot interaction way. For example, the robot might ask the person to move (but not ask a wall to move).
  Gofer       20 pts  Would demonstrate the ability to identify people in the environment and to solicit their help, if they are willing to give it. Characterizing this level of human awareness is the notion that people are a _resource_ and not just an annoyance :)
  Concierge   80 pts  Would actively seek out people and engage with them in order to accomplish a specific part of a spatial task, e.g., getting directions or asking for more information on an object.

Object Recognition

  Slacker     0 pts   No object recognition capabilities demonstrated.
  Wanderer    5 pts   The ability to distinguish up to 5 preselected objects in known poses, based on 1-2 features of their "appearance" to the sensor in question. For visual sensors, these features could be color, texture, color composition, or shape, among others. For direct range-sensors, this would be identifying particular categories of depth patterns. (1 point per object)
  Scavenger   10 pts  The ability to distinguish up to 10 preselected objects in unconstrained (or minimally constrained) poses, using 3 or more facets of the objects' "appearance" to the sensor(s) used.
  Gofer       20 pts  The ability to identify objects within a set of five well-defined categories, e.g., "a pillow," "a chair," "a newspaper," or "a bottle of champagne."
  Concierge   80 pts  The ability to identify objects within a much larger and less well-defined set of categories, or specific instances of the previous categories. For example, "today's USA Today," "some shampoo," "tickets to Spamalot."

Spatial Reasoning

  Slacker     0 pts   No spatial reasoning capabilities demonstrated.
  Wanderer    5 pts   Demonstrating the ability to use a human-provided map of environmental landmarks and obstacles. Reasoning about the robot's location and objects' locations within the map would be demonstrated.
  Scavenger   10 pts  Demonstrating the ability to build a map of environmental features, assuming a static environment. Reasoning about the robot's location and objects' locations within this constructed map would be demonstrated.
  Gofer       20 pts  Would show an ability to recognize and reason about a limited (and a priori known) set of possible changes to the environment during task performance, e.g., doors being open/closed.
  Concierge   80 pts  Would recognize, represent, and reason about any realistic changes to the environment during task performance, e.g., adding/removing furniture and movement of the objects to be retrieved.

Task Awareness and Planning

  Slacker     0 pts   Tasks? What tasks?
  Wanderer    5 pts   Here the system seeks to locate, find, record, and perhaps manipulate objects in the environment with an awareness and external expression of what subtask it is currently performing. Awareness, but not planning, is required at this competence level.
  Scavenger   10 pts  At this level, a system will not only be aware of (and able to express) the current subgoal(s) it is trying to achieve, but it needs to demonstrate the ability to change its approach in the face of changing circumstances. In addition, the system should explicitly indicate that it is going to attempt a different strategy and then articulate that new strategy for achieving objectives.
                      This level of competence would include spatial replanning in the case of an unexpected obstacle, sensor-based replanning if object uncertainty is too high, identifying and asking a human for help if the system becomes lost or otherwise unable to achieve a goal, etc.
  Gofer       20 pts  Would demonstrate the ability to consider many different possible plans of action and then choose and execute the most suitable one. This reasoning should be available to onlookers, along with the system's notion of its current level of success and anticipated success in accomplishing the (sub)task.
  Concierge   80 pts  Should show the ability to explain what was tried and why it failed or succeeded. Tasks that fail will provide fuller tests of sophisticated systems, and such tasks will always be available!

Language and Communication

  Slacker     0 pts   No language-based communication capabilities demonstrated: the system uses menu/GUI inputs alone.
  Wanderer    5 pts   Shows the ability to parse and respond to short phrases entered by humans (who are not the system's designers) using a very restricted vocabulary (20-30 concepts/terms). Only one language modality (visual, i.e., printed-sign reading; audio; or keyboard input) needs to be implemented at this level. Presumably, this would be keyboard input.
                      The language inputs may be requested by the system, e.g., to obtain help, or may be initiated by a user, e.g., to provide direction or a goal. However, the system should demonstrably and appropriately change its behavior in response to the input received. In addition, the system should have a mechanism by which it indicates when a language-interaction has _not_ been understood.
  Scavenger   10 pts  Demonstrating the ability to handle a less restricted vocabulary/grammar in receiving language-based inputs, e.g., full-sentence inputs drawing on more than 100 terms, entered by AI conference attendees with a minimum of prior introduction to the system's limitations. At this level, the system would also respond constructively to inputs it did not understand, e.g., suggesting alternatives that might be intended or providing tips on how humans might interact more fluidly.
  Gofer       20 pts  Would show the ability to handle at least two language modalities (presumably adding visual or audio inputs) at the scavenger level. In addition, the system should be able to handle short interactions created by non-experts (not just AI researchers who did not build the system) and would have a vocabulary greater than 2000 terms.
  Concierge   80 pts  Would recognize the use of visual, text-based, and audio language in humans' efforts to direct the system (human-initiated) and would be able to use visual and audio language effectively in order to seek help in its navigation and other spatial-reasoning tasks.
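
  The five tables above share one point scale (0, 5, 10, 20, 80), so
the matrix of facets and levels can be written down compactly. The
Python sketch below is only an illustration: the facet_score function
is hypothetical, and summing points across facets is an assumption,
since these guidelines do not state how the judges will combine facet
scores.

```python
# Points per competence level, as given in each of the five tables.
LEVEL_POINTS = {
    "Slacker": 0,
    "Wanderer": 5,
    "Scavenger": 10,
    "Gofer": 20,
    "Concierge": 80,
}

# The five facets named in the guidelines.
FACETS = (
    "spatial reasoning",
    "object recognition",
    "language and communication",
    "human awareness",
    "task awareness and planning",
)


def facet_score(levels):
    """Add up the points for the level reached in each facet.

    `levels` maps a facet name to a level name. Facets that are not
    mentioned count as "Slacker" (0 pts). Summation is an assumption
    of this sketch, not a rule stated by the judges.
    """
    unknown = set(levels) - set(FACETS)
    if unknown:
        raise ValueError(f"unknown facet(s): {sorted(unknown)}")
    return sum(LEVEL_POINTS[levels.get(facet, "Slacker")] for facet in FACETS)
```

  Under this assumed summation, a specialist entry that reaches
Concierge in object recognition alone would tally 80 points, while an
integrated entry at Scavenger level in all five facets would tally 50.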