The Workbench: HTML5 canvas: Simple per-pixel image picking

If you’re writing a 2D game in HTML5, chances are you’ll want the user to be able to pick an object on the screen. Since there’s no direct API for this, and no good examples on the web, let’s take a look at how to properly do picking with an HTML5 canvas.

Picking can be a tricky, troublesome, and socially awkward process; especially in kickball...

Pixel Perfect picking

For desktop games that rely on picking, pixel-perfect response to a mouse click is crucial. Multiple images can be overlaid on each other, and each can have an alpha footprint that in no way matches its conservative bounding box. As such, to do a pixel-perfect pick, you’ll need to determine which pixel of which image was clicked; that means keeping a copy of the image data in memory so that your code can query the pixel array.

In order to do this in HTML5, we need to introduce two separate data structures: a sprite prototype, which represents a single loaded image, and a sprite instance, which represents one placement of that prototype on the canvas. That is, we assume that a single image is used multiple times on a canvas. Our sprite prototype will then need to contain the pixel data for its image element, so that we can query against it later.
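As a rough sketch of those two structures: the constructor names SpriteProto and SpriteInstance are made up for illustration, while the field names (imgHandle, imgData, spriteHandle, pos, size, zIndex) mirror the ones used in the snippets later in this article.

```javascript
// Hypothetical constructors; only the field names come from the article's code.
function SpriteProto() {
  this.imgHandle = null; // the loaded Image element
  this.imgData = null;   // RGBA pixel data fetched back from a canvas
}

function SpriteInstance(proto, x, y, w, h, zIndex) {
  this.spriteHandle = proto;   // shared prototype that owns the pixel data
  this.pos = { x: x, y: y };   // top-left corner on the canvas
  this.size = { w: w, h: h };  // conservative bounding box
  this.zIndex = zIndex || 0;   // larger values are drawn on top
}

// Two instances share one prototype, so the pixel data is stored only once.
var treeProto = new SpriteProto();
var spriteInstances = [
  new SpriteInstance(treeProto, 10, 20, 64, 64, 0),
  new SpriteInstance(treeProto, 40, 30, 64, 64, 1)
];
```

Keeping the pixel data on the prototype rather than on each instance is what keeps the extra memory cost proportional to the number of images, not the number of sprites on screen.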

Loading pixel data

The cornerstone of our picking process is the assumption that your images contain alpha values. We generally assume an image is loaded this way:

var img = new Image();
img.onload = function(){ alert('loaded!'); };
img.src = filename;

Notice the problem here: in JavaScript we don’t directly handle the pixel data; it’s handled behind the scenes on our behalf.
To get the data, then, we need to do some extra work. We could start by writing a JavaScript PNG decoder, but that would be massive overkill, since decoding PNG's lossless compression is a sizable job that the browser already does for us. Since we’re really only concerned with the alpha values of an image, we could store the alpha channel in a separate .RAW file that we fetch in parallel; however, this would increase the transfer and asset size of the app.
For our purposes, we ignore those two options, and instead keep both the code footprint and the transfer size low by using the canvas element to fetch the data. To do this, we create an off-screen canvas, render our image to it, and read the pixels of the canvas back into memory.

var offScreenCanvas = document.createElement('canvas');
var fetch_ctx = offScreenCanvas.getContext('2d');
function fetchImageData(imgObject, imgwidth, imgheight)
{
 //size the scratch canvas to the image so larger images are not clipped;
 //setting width/height also clears the canvas
 offScreenCanvas.width = imgwidth;
 offScreenCanvas.height = imgheight;
 fetch_ctx.drawImage(imgObject, 0, 0);

 //note this keeps an additional in-memory copy
 var imgDat = fetch_ctx.getImageData(0, 0, imgwidth, imgheight);

 return imgDat;
}

This lets us keep the transferred asset size small, keep using PNGs, GIFs, or whatever other compressed format you want, and still have the RGBA data available in main memory at load time. To use this, once an image has been loaded, we fetch its image data using the function above:

var img = new Image();
img.onload = function(){
 targetSpriteProto.imgHandle = img;
 //use the image's own dimensions once it has loaded
 targetSpriteProto.imgData = fetchImageData(img, img.width, img.height);
};
img.src = filename;

Testing a mouse click

Once we have the per-image data in memory, we need to test against it when a user clicks. This is broken down into a few sections of a larger function.

First, the findClickedSprite function loops through all the sprite instances in memory and does a conservative bounding-box test against the picking point. We assume that array lookup is a performance-limiting action in JavaScript, and this bounding-box test allows an early out for items that can’t possibly intersect the pick position.

//--------------
function findClickedSprite(x,y)
{
 var pickedSprite = null;
 var tgtents = spriteInstances;

 //loop through all sprites 
 for(var i =0; i < tgtents.length; i++)
 {
  var sp = tgtents[i];
  //pick is not intersecting with this sprite
  //(>= on the far edges keeps the local coordinate inside the image)
  if( x < sp.pos.x || x >= sp.pos.x + sp.size.w ||
   y < sp.pos.y || y >= sp.pos.y + sp.size.h)
   continue;

Once we find a sprite instance whose bounding box intersects with the picking point, we grab the sprite prototype, and translate the canvas-relative mouse position to a sprite-instance-relative position that we use to test against. These values are passed to a function on the sprite prototype to determine if the target pixel is transparent or not. If we’re clicking an opaque pixel for this sprite, we set this as the selected sprite.

  var ps = sp.spriteHandle;
  //get local coords and find the alpha of the pixel
  var lclx = x - sp.pos.x;
  var lcly = y - sp.pos.y;
  //despite its name, isPixelTransparent returns true when the pixel is opaque enough to pick
  if(ps.isPixelTransparent(lclx,lcly))
  {
   pickedSprite = sp;
  }
 }
 
return pickedSprite;
 
}
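The function above expects canvas-relative coordinates, so the raw mouse event has to be converted first. Here is a minimal sketch of one way to do that, assuming the standard getBoundingClientRect and clientX/clientY values; the helper names toCanvasCoords and attachPicking are hypothetical and not part of the original source.

```javascript
// Convert a mouse event's viewport coordinates into canvas pixel coordinates.
function toCanvasCoords(canvas, evt) {
  var rect = canvas.getBoundingClientRect();
  // Scale in case CSS stretches the canvas away from its backing-store size.
  var scaleX = canvas.width / rect.width;
  var scaleY = canvas.height / rect.height;
  return {
    x: Math.floor((evt.clientX - rect.left) * scaleX),
    y: Math.floor((evt.clientY - rect.top) * scaleY)
  };
}

// Hypothetical wiring: findClickedSprite is the function from this article,
// and onPick is whatever callback your game uses to react to a selection.
function attachPicking(canvas, onPick) {
  canvas.addEventListener('click', function (evt) {
    var p = toCanvasCoords(canvas, evt);
    onPick(findClickedSprite(p.x, p.y));
  });
}
```

Flooring the result keeps the coordinates integral, which matters later when they are used to index into the pixel array.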

Note that an important part of isometric and top-down 2D games is the notion of zOrder, in which a separate index determines the order in which elements are rendered on screen. Our picking algorithm needs to take this into account, so that what the user sees as top-most is what actually gets picked. The opaque-pixel branch of the loop becomes:

if(ps.isPixelTransparent(lclx,lcly))
{
 //do depth test (if applicable)
 if(pickedSprite && sp.zIndex < pickedSprite.zIndex)
  continue;
 pickedSprite = sp;
}
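To see the bounding-box test, the alpha test, and the depth test working together, here is a small self-contained sketch that runs against synthetic pixel data instead of loaded images. The helpers makeImgData, isOpaqueAt, and pickTopmost are hypothetical, and for brevity the pixel data sits directly on each sprite rather than on a shared prototype.

```javascript
// Build an ImageData-like object whose alpha channel is set by a callback.
function makeImgData(w, h, alphaAt) {
  var data = new Uint8ClampedArray(w * h * 4);
  for (var y = 0; y < h; y++)
    for (var x = 0; x < w; x++)
      data[(y * w + x) * 4 + 3] = alphaAt(x, y);
  return { width: w, height: h, data: data };
}

function isOpaqueAt(imgData, lx, ly, threshold) {
  return imgData.data[(ly * imgData.width + lx) * 4 + 3] > threshold;
}

function pickTopmost(sprites, x, y) {
  var picked = null;
  for (var i = 0; i < sprites.length; i++) {
    var sp = sprites[i];
    // conservative bounding-box early out
    if (x < sp.pos.x || x >= sp.pos.x + sp.size.w ||
        y < sp.pos.y || y >= sp.pos.y + sp.size.h) continue;
    // per-pixel alpha test in sprite-local coordinates
    if (!isOpaqueAt(sp.imgData, x - sp.pos.x, y - sp.pos.y, 50)) continue;
    // depth test: keep the hit with the highest zIndex
    if (picked && sp.zIndex < picked.zIndex) continue;
    picked = sp;
  }
  return picked;
}

var solid = makeImgData(4, 4, function () { return 255; });
var hollow = makeImgData(4, 4, function (x, y) {
  return (x === 0 || y === 0 || x === 3 || y === 3) ? 255 : 0; // opaque border only
});
var sprites = [
  { name: 'back',  pos: { x: 0, y: 0 }, size: { w: 4, h: 4 }, zIndex: 0, imgData: solid },
  { name: 'front', pos: { x: 0, y: 0 }, size: { w: 4, h: 4 }, zIndex: 1, imgData: hollow }
];

console.log(pickTopmost(sprites, 0, 0).name); // front (opaque border, higher zIndex)
console.log(pickTopmost(sprites, 1, 1).name); // back (front is transparent here)
```

The second pick is the interesting one: the front sprite's bounding box covers the point, but its pixel there is transparent, so the click falls through to the sprite behind it.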

The isPixelTransparent function on the sprite prototype does very simple logic. First it determines which entry in the image data corresponds to the target pixel (note that data given back from the canvas is always in RGBA form!), then it tests the alpha value against a threshold. Despite its name, it returns true when the pixel is opaque enough to count as a hit. The threshold is important, as artists often add gradient falloffs, drop shadows, and other effects which improve the look of the item but shouldn't be considered for picking purposes.

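isPixelTransparent:function(lclx,lcly)
{
 var alphaThreshold = 50;
 //each pixel is 4 bytes (RGBA); floor so fractional coordinates still map to a valid index
 var idx = (Math.floor(lclx)*4) + Math.floor(lcly) * (this.imgData.width*4);
 var alpha = this.imgData.data[idx + 3];
 //test against a threshold; true means opaque enough to pick
 return alpha > alphaThreshold;
}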
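As a worked example of the index arithmetic, take a 4x2 image and the pixel at column 2, row 1. Each row occupies 4 pixels times 4 bytes, so the pixel starts at byte 1*16 + 2*4 = 24 and its alpha sits at byte 27. The helper name alphaIndex below is hypothetical, and a plain object stands in for the canvas ImageData.

```javascript
// Stand-in for canvas ImageData: 4 pixels wide, 2 pixels tall, RGBA bytes.
var imgData = { width: 4, height: 2, data: new Uint8ClampedArray(4 * 2 * 4) };

// Byte offset of the alpha component for pixel (x, y).
function alphaIndex(imgData, x, y) {
  return (y * imgData.width + x) * 4 + 3;
}

var idx = alphaIndex(imgData, 2, 1);
imgData.data[idx] = 200;  // mark that one pixel as mostly opaque
console.log(idx);         // 27
```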

Results

The results are quite nice. We can select the right object out of a very complex pixel coverage area.

Caveats

While this method works, and produces pixel-perfect results, it presents two primary issues:

1) Image data, which is normally held for you behind the scenes, now has to be duplicated in your scripts. This results in a larger memory footprint, often more than double the size, since your in-memory copy is uncompressed.

2) It’s currently unclear how an array look-up affects performance in JavaScript under the hood. In C++ you have the ability to tune array traversal around CPU concerns such as L2 cache behavior, which is completely missing in JavaScript. On my 12-core work machine, a single pick against 4096 images takes around 2ms. I’d imagine on a phone, that would be significantly higher.
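One possible way to soften the first caveat, offered here as a sketch rather than something from the original source, is to keep only the alpha byte of each pixel once the data has been fetched. Since picking never looks at the color channels, this cuts the script-side copy to a quarter of the RGBA size. The names extractAlpha and isMaskOpaque are hypothetical.

```javascript
// Copy only the alpha channel out of an ImageData-like object.
function extractAlpha(imgData) {
  var count = imgData.width * imgData.height;
  var alpha = new Uint8Array(count);
  for (var i = 0; i < count; i++) {
    alpha[i] = imgData.data[i * 4 + 3]; // every 4th byte, starting at offset 3
  }
  return { width: imgData.width, height: imgData.height, alpha: alpha };
}

// Same threshold test as before, but against the one-byte-per-pixel mask.
function isMaskOpaque(mask, x, y, threshold) {
  return mask.alpha[y * mask.width + x] > threshold;
}

// Example with a 2x2 stand-in image whose bottom-right pixel is opaque.
var rgba = { width: 2, height: 2, data: new Uint8ClampedArray(16) };
rgba.data[15] = 255;
var mask = extractAlpha(rgba);
console.log(mask.alpha.length);              // 4
console.log(isMaskOpaque(mask, 1, 1, 50));   // true
```

The full RGBA copy can then be dropped so that only the mask stays referenced by the sprite prototype.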

And finally, it’s unclear whether you really need pixel-perfect picking for your game. For instance, the user may benefit from a more loosely defined picking area, one that extends the valid picking area beyond the pixel boundaries of the object, in an attempt to reduce user picking frustration.
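A looser pick of that kind could be sketched as follows, assuming a square tolerance of a few pixels around the click; the function name isOpaqueNear and the tolerance parameter are illustrative, not part of the original source.

```javascript
// True if any pixel within `tolerance` pixels of (lx, ly) passes the alpha threshold.
function isOpaqueNear(imgData, lx, ly, tolerance, threshold) {
  for (var dy = -tolerance; dy <= tolerance; dy++) {
    for (var dx = -tolerance; dx <= tolerance; dx++) {
      var x = lx + dx;
      var y = ly + dy;
      // skip neighbours that fall outside the image
      if (x < 0 || y < 0 || x >= imgData.width || y >= imgData.height) continue;
      if (imgData.data[(y * imgData.width + x) * 4 + 3] > threshold) return true;
    }
  }
  return false;
}

// 5x5 stand-in image with a single opaque pixel in the centre.
var dot = { width: 5, height: 5, data: new Uint8ClampedArray(5 * 5 * 4) };
dot.data[(2 * 5 + 2) * 4 + 3] = 255;

console.log(isOpaqueNear(dot, 1, 1, 0, 50)); // false (exact pixel is transparent)
console.log(isOpaqueNear(dot, 1, 1, 1, 50)); // true (centre pixel is one step away)
```

With a tolerance of 0 this behaves like the exact test. Note that the bounding-box early out would need to be padded by the same tolerance, otherwise clicks just outside the box are rejected before the pixel test runs, and each pick now costs up to (2 * tolerance + 1) squared lookups.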

Next Time

In the next article, we'll talk about some advanced and faster ways to do picking on an HTML5 canvas. Stay tuned!

Source Code

You can grab the working source code from my github page. Note that you’ll have to access the page via the python server (run python httpd.py first, and browse to localhost:5103), as the getImageData function on a canvas can’t be used when the page is loaded directly from the local file system.

~Main

You can find Colt McAnlis here:


Oct 29, 2012
