HOMEWORK 3: Recognizing Digital Ink

This is an INDIVIDUAL assignment.

Objective

In this assignment we'll shift gears away from the low-level Swing component stuff, and explore how to effectively use digital ink in applications. We'll see how to build simple recognizers that are robust enough to work for a wide range of gestures, and how to integrate such a recognizer into your application.

This assignment is to create an implementation of the SiGeR recognizer discussed in class and the lecture notes, and integrate it into the Courier application. You'll use the recognizer to detect command gestures that will let you do simple things like move between pages, and move and delete on-screen objects without having to use the regular GUI controls.

The learning goals for this assignment are:

Experience with implementing a basic recognizer for digital ink strokes.
Experience creating recognition templates.
Experience integrating a recognizer into your application.

I've provided a lot of hints below about how you might go about implementing the required features.

Description

In this homework, we'll implement the SiGeR recognizer and integrate it into the Courier application as a way to perform commands. What this means is that certain gestures will be recognized as commands to control the application, rather than as simple digital ink that makes up the content of pages.

We'll use a modal method of telling the system which strokes should be interpreted as command gestures and which should be interpreted as ink. Strokes made with the "normal" (left) mouse button down should be interpreted as ink, while strokes made with the "context/menu" (right) button should be interpreted as command gestures. This means that in the code that handles strokes you'll now need to look at the modifiers to figure out whether to process the strokes as regular ink or pass them off to the recognizer.

To provide feedback to the user, command gestures should appear on the notebook page as the user draws them, but with a different visual appearance than ink; they should then disappear once the command has been completed. For example, if your ink is black, you might draw command gestures in red, and then have them disappear once the gesture has been completed.

Your application should recognize and act on the following set of gestures; I've listed these in order from easiest to most complex:

Page controls:

A "right angle" gesture (>) should move to the next page
A "left angle" gesture (<) should move to the previous page

Content movement and deletion controls:

A "pigtail" gesture (like a lowercase phi: a gesture in which you make a loop then draw downward through the loop) should indicate that you want to delete content from a page. The area enclosed in the loop should be deleted once the gesture is completed.
A basic loop gesture (without the pigtail downward stroke) should select content for movement. For example, you might circle a region of ink to select it. After this, the selected content should be rendered differently to indicate that it is selected (it could change color, or have a box drawn around it, or glow if you want to get really fancy). Then, the next mouse/pen input should allow the user to move the selected area on the page (press on the selected area, drag to new position, and then release).

Implementing the Recognizer

See the slides for details on how to implement SiGeR. Here are a few additional tips.

Decide on the representations you want to use first. By this I mean, figure out how you'll specify templates in your code, and how you'll represent the direction vector of input strokes.

I'd suggest defining a bunch of constants for the 8 ordinal directions SiGeR uses (N, S, E, W, NE, NW, SE, SW). Both direction vectors and templates will be defined as lists of these. You may also want to define some special "combination" constants (a "generally east" or "easts" constant that means either E, SE, or NE for example, or a "norths" or "up" constant that means either NW, N, or NE). These latter combination constants will only be used in the definition of templates, not in the vector strings you will produce from input gesture strokes. In other words, they allow you to define your templates a bit more loosely than just the specific 8 directions.

While defining such a set of human-readable constants isn't specifically necessary (you could just do everything in terms of the strings and regexp patterns described below), it can be very helpful for debugging to be able to write a template as a set of human-readable directions, rather than a raw regexp pattern.

Next, write a routine that takes an input gesture and produces the direction vector from it. In other words, given a series of points, it'll spit out a vector containing a list of the 8 ordinal direction constants. This direction vector represents the true shape of the input gesture.

Here's the only tricky part: you'll need to write a routine that turns the direction vector into an actual string of characters that contain the same information as in the vector, and another routine that takes the template info and produces a regexp pattern from it. The idea is that you'll see if the regexp pattern for the template actually matches the stringified representation of the direction vector.

There's a lot of flexibility in how you define the symbols in these strings. For the direction vector string, you'll probably just have a series of combinations of 8 separate letters, each representing one of the 8 ordinal directions.

For the regexp pattern, you'll want to generate a pattern that can match occurrences of any of these 8 letters, as well as "either-or" combinations of them ("generally east" for example, might be a pattern that matches either the letters representing E, SE, or NE). You'll also need to generate a pattern that can deal with "noise" at the first and end of the input string. The slides have some examples that show how to do this.

The actual matching process will then just compare an input stroke string to the list of template patterns, and report which (if any) it matches.

Defining the Templates

Your templates will be defined in your code, most likely as a set of declarations that look something like this:

int QUESTION_MARK = { UP, RIGHT, DOWN, LEFT, DOWN }
int UP_CARET = { NORTHEAST, SOUTHEAST }

You'll need to define templates for all of the required gestures. It may take a bit of tweaking to define the templates at the proper level of specificity.

Integrating the Recognizer

Remember that we're using a mode to distinguish ink (left mouse button) input versus gesture input (right mouse button). Gesture input should be drawn on screen while the gesture is being made, so that it provides feedback to the user. The gesture should disappear once the mouse is released.

One way to get this effect is to augment your page component slightly, to keep a reference to the single current gesture being drawn, which may be null if no gesture is in progress. The paint code then draws the display lists for strokes, shapes, and text, then the current gesture (if there is one), so that the gesture appears over the top of everything else.

Since the gesture is only a transient feature of the page, it should disappear once the user finishes drawing it: when the gesture is complete, it can be removed from the set of items to be displayed, and handed off to the recognizer to be processed.

If the gesture is not recognized, you should indicate this by displaying a message in the status bar that says "unrecognized gesture" or something.

If the gesture is recognized, what you do next depends on exactly what command was recognized.

For one of the page switching gestures, you should just move through the pages as you normally do when someone clicks on the page forward/page backward button.

For the delete gesture, once you've recognized the gesture you need to figure out what object(s) the gesture was drawn over. The way I'd suggest doing this is to take the bounding box of the gesture and then identifying which on-page objects have bounding boxes that are strictly contained in the gesture's box. Deleted items should simply be taken out of the display list(s) so that they do not appear; be sure to start a repaint so that the display is updated correctly.

The select-to-move gesture is perhaps the weirdest of these basic gestures, because it introduces a new mode into the UI. First, when items are selected through the loop gesture, they should be drawn differently on screen to indicate that they are selected. Again, you probably want to compare the bounding box of the loop to the bounding boxes of the on-page objects to determine what objects the user intends to select.

You need to keep track of the selected objects in some way; I'd suggest adding yet another variable to the component's instance data, which is a list of the selected objects (strokes, shapes, or text); your paint code will then display anything in this list differently (through color or a highlighted bounding box or whatever).

As long as something is selected, you're potentially in "move mode." My suggestion for how to implement this is to look at any mouse press that happens; if something is selected, and the mouse press is within the bounding box of one of the selected objects, then dragging the mouse moves the object (which should just be a matter of updating the X,Y coordinates of the items in the display list, and repainting. If the press happens outside of a selected item, you can "de-select" the selected stuff (take it out of the selected list and just draw it normally). This ends "move mode." The basic behavior here should be much like any paint program--when something is selected you can click into it to drag it; but as soon as you click outside, the object is de-selected.

NOTE: You *don't* have to worry about what happens if the selected area for a move or delete cut through one or more strokes--in other words, you don't have to be concerned with splitting a single stroke, shape, or block of text. Objects that aren't fully contained within the gesture can be considered to be outside the selection area.

Extra Credit

As usual, there are a lot of ways you might make this assignment much fancier than described:

Implement a richer command set than described here (cut/copy/paste, for example). Bonus points vary depending on the complexity, but probably +1 to +3 per command. NOTE: if you implement additional gestures, be sure to include a description of what those gestures are so that we can test them!
Gratuitous graphical richness, such as having deleted objects crumple themselves up or vanish in a puff of smoke, or having selected items surrounded by a pulsating glow. Variable, but probably +1 or +2 points.
Implement the Tivoli 9-square recognizer and use it for some of your gestures (in ADDITION to implementing the SiGeR recognizer): +2 points.
Implement the One-Dollar recognizer and use it for some of your gestures (in ADDITION to implementing the SiGeR recognizer): Variable, but at least +5 poins.
Allow move or delete gestures to "split" objects in a clean way: +3 points

Deliverable

See here for instructions on how to submit your homework. These instructions will be the same for each assignment.