A couple of weeks ago I had an itch to start Go… and OpenCV. I decided to scratch that itch by working on a Sudoku puzzle solver web app powered by OpenCV and with a little machine learning to boot. The idea was simple enough: learning Go Upload an image and “parse” the puzzle with OpenCV e.g. find the grid and digits Use a test set of digits from sample puzzles to train a machine learning model to identify the digits Solve the puzzle using some constraint propagation algorithms I learned while working through the program Udacity AI nanodegree Along the way, diving into OpenCV led to cutting my teeth on C++, spending a fair amount of time understanding CGO (the bridge between Go and C/C++) and a few new nuggets of Docker knowledge. Here’s a quick demo of how it ended up: Read on for how it came together… “Parsing” Sudoku Images with OpenCV The current 3.x version of OpenCV exposes a C++ API so I installed the for Visual Studio Code and started reading sample OpenCV code. I wanted to be able to detect both “digital puzzle” images such as the one on the left and pictures of puzzles that may have some skew and poor lighting like the example on the right. C/C++ Tools plugin Original Sudoku Puzzle Images I ended up with the following image processing steps: — Subsequent steps don’t rely on color and assuming a consistent size made some of the later processing more simple Convert to Grayscale and Resize —The next task was detecting the Sudoku grid in the image. Supporting the “digital puzzle” images only would’ve been significantly easier but the newspaper images required more work. It started with some and . Extract and “Un-warp” the Grid denoising adaptive thresholding Thresholding and Denoising Applied These steps prepare for the ( ) which further reduces the information in the image to just object edges like the grid we’re after. Canny edge detection algorithm OpenCV docs Canny Edge Detection Algorithm Now we can use OpenCV’s algorithm to detect shapes in the image; the mode is used to limit the search to only extreme contours skipping any that lie other contours. [findContours](http://docs.opencv.org/3.2.0/d3/dc0/group__imgproc__shape.html#ga17ed9f5d79ae97bd4c7cf18403e1689a) [RETR_EXTERNAL](http://docs.opencv.org/3.2.0/d3/dc0/group__imgproc__shape.html#ga819779b9857cc2f8601e6526a3a5bc71) outer inside Contour Detection (there is a single contour in the image on the left if you look close) With an array of contours, which are simply lists of X/Y coordinates that “trace” the shape, the next task was to find the largest contour with the help of and then identify which X/Y coordinates are the corners of the grid by finding the outermost point in the contour within each quadrant of the area covered by the contour. [contourArea](http://docs.opencv.org/3.2.0/d3/dc0/group__imgproc__shape.html#ga2c759ed9f497d4a618048a2f56dc97f1) Detecting Perspective Transform Points for the Largest Contour These corners are then used to “un-warp” the grid section of the image with OpenCV’s and . While not much has changed in the “digital puzzle”, now our newspaper photo and “digital puzzle” image are on more equal footing for subsequent processing. [getPerspectiveTransform](http://docs.opencv.org/3.2.0/da/d54/group__imgproc__transform.html#ga8c1ae0e3589a9d77fffc962c49b22043) [warpPerspective](http://docs.opencv.org/3.2.0/da/d54/group__imgproc__transform.html#gaf73673a7e8e18ec6963e3774e6a94b87) Perspective Transform Applied to Detected Grid — Before we looking for the digit regions of the image, it’s useful to get rid of the grid lines. A (I used 1px horizontal and vertical kernels) are passed to and in order clean up the grid lines. is again used but this time the contours are filtered by aspect ratio to just those that approximate horizontal and vertical lines. Eliminate Grid Lines structuring element [erode](http://docs.opencv.org/3.2.0/d4/d86/group__imgproc__filter.html#gaeb1e0c1033e3f6b891a25d0511362aeb) [dilate](http://docs.opencv.org/3.2.0/d4/d86/group__imgproc__filter.html#ga4ff0f3318642c4f469d0e11f242f3b6c) [findContours](http://docs.opencv.org/3.2.0/d3/dc0/group__imgproc__shape.html#ga17ed9f5d79ae97bd4c7cf18403e1689a) Grid lines detected in each image (pink is just the image border) Sometimes the newspaper lines were still too “curvy” or indistinct to pass as grid lines (in this implementation) as in the example here. These regions are then expanded slightly and from the image yielding the following result. subtracted Removing Grid Lines and Denoising to Prepare for Digit Detection — Finally it’s time to detect the digit regions of the image. Once again is used followed by another aspect-ratio filter inspired by my friend . This filter allows us to skip over contours for some of the horizontal artifacts left in the newspaper image on the right. Finding Digits [findContours](http://docs.opencv.org/3.2.0/d3/dc0/group__imgproc__shape.html#ga17ed9f5d79ae97bd4c7cf18403e1689a) Kevin Kazmierczak’s article Detecting Digit Regions of the Image Machine Learning with OpenCV Having found digit regions in the image, the next step was to identify what number was present in each digit image. I eyeballed several sample images and put together a simple csv data file of sample sudoku images and the digits they contained as follows: Then each sample image was fed through the “parsing” logic described earlier to find the digit regions. The center of each digit region was matched to the nearest center point of a 9x9 grid imposed on the image and then assigned a digit value based on the data in the .csv file. Each digit was then resized to a consistent size and combined into an overall “training” image where the top row contains all the “1” images, the second row contains all the “2” images, etc. This becomes our raw data to train a machine learning model. Digit regions extracted from several sample Sudoku puzzle images and combined together There are plenty of digit recognition tutorials out there but I found of particular value for it’s clear and detailed explanation. I adapted his approach for this project. To quickly summarize, a (see also of the HOG Descriptor) is computed for each digit image. 80% of these descriptors along with the digit label (inferred from the row in our training image) are then fed as our “observations” into to train the model. The trained SVM model is then used to predict the appropriate digit for the remaining 20% of the digits. Satya Mallick’s Histogram of Gradients Descriptor Satya’s great explanation OpenCV’s [SVM](http://docs.opencv.org/3.2.0/d1/d2d/classcv_1_1ml_1_1SVM.html) class I built a to invoke training process and ended up with 97.73% accuracy on the set of sample puzzles I gathered. The trained SVM model is saved and loaded again later for use in solving novel puzzles. small C++ CLI tool albeit limited OpenCV 3.x C++ Integration with Go via CGO All the work thus far was done with C++ ( ). Invoking this from Go turned out to be trickier than I anticipated, an easy scenario for my first work in Go. if you look at the code be gentle, I’m a C++ newbie as of this writing not enables Go to call C code using a special comment called a preamble to specific import of a “C” package (this import should be on a line by itself) like this. Notice that the preamble supports platform-specific values to support cross-compiling the safe codebase on different platforms e.g. darwin (my local setup) and linux (the docker container created to run this app). [cgo](https://golang.org/cmd/cgo/) Quoting (emphasis mine): the documentation When the Go tool sees that one or more Go files use the special import “C”, it will look for other non-Go files and compile them as part of the Go package. Any .c, .s, or .S files will be compiled with the C compiler. Any .cc, , or .cxx files . in the directory .cpp will be compiled with the C++ compiler That seemed simple enough but I lost time due to a couple of the points I emphasized above: — So what’s the deal with the mention of files being compiled with the C++ compiler? Well, as a newbie to C/C++ it took me a while to understand that will compile C++ that is that is being imported. However, you cannot directly import C++ code which means C++ types such as and have to be converted to their C equivalent types. CGO let’s you call C code (not call C++ code) .cpp cgo referenced from the C code [string](http://www.cplusplus.com/reference/string/string/) [vector](http://www.cplusplus.com/reference/vector/vector/?kw=vector) — My original C++ Sudoku puzzle parsing code was organized into src/ and include/ directories and my preamble was trying to include a C header from a subdirectory. That did not work. CGO compiles files in the same directory ONLY cgo I ended up with the following directory structure to get properly compiling and linking my C++ OpenCV code into the imported “C” package cgo There were a few more notable gotchas: — , the Go debugger is awesome. However, I wasn’t able get it working once I started using . I couldn’t tell if it was actually supported or I just had issues. I found it easier to launch and debug the C++ code separately Debugging with CGO Delve cgo — Early on I was passing back a string from C++ which represented the “parsed” sudoku puzzle and getting strange garbage results intermittently. I ended up allocating C memory beforehand and passing pointers into the C code allowing it to read/write from that allocated memory. Go’s and ensure we free it up when we’re done. See below for an example. Watch Your Pointers! defer C.free() The result of this call is simply an 81 character string identifying the digits that were detected from the image: Solving the Sudoku Puzzle Going into how to solve a Sudoku puzzle given the digits is a primary focus of this article; if that’s what you’re after, checkout . However, I’ll just note that the main technique is constraint propagation, where we essentially use constraints on the problem domain to limit the number of options to search. If we conceive of a Sudoku board as a 9x9 grid in which each cell can take a value between 1–9 then we have 9⁸¹ possible board states (over 3 billion). Of course we know there are many constraints which limit the search space: not Peter Norvig’s approach and explanation naively The known starting digits Each row must contain all digits 1–9 Each column must contain all digits 1–9 etc. By applying these constraints we can drastically reduce the possible search space. Once a constraint is applied to narrow the search space we can re-evaluate other constraints to iteratively narrow in on valid values for each cell in the puzzle. Combined with a search algorithm for cases when constraints do not eliminate all possibilities, most puzzles can be solved in a few seconds or less. Check out for the Go implementation used in this project. [sudokuboard.go](https://github.com/jamesandersen/go-sudoku/blob/master/sudokuboard.go) In addition to the Sudoku solving code, the Go portion of the project contains some unremarkable code for serving a static directory of web content where the UI of the project lives and accepting a multipart file upload to which Sudoku puzzle images are sent. It also uses the to compress responses. NY Times Gzip handler Docker Deployment OK, so we’ve got OpenCV code to “parse” a puzzle image (passed as a byte array), we’ve integrated that with a simple Go web app that takes in a puzzle image, hands it off to be parsed by OpenCV and then solves the missing digits. Great! But where to deploy it? The dependency on OpenCV means we do not have a self-contained binary and rules out something like the on Google Cloud Platform. A containerized solution was the next logical choice so off to DockerHub… Go Standard Environment ? ✔ Go image OpenCV image? ! Plenty of ‘em Go OpenCV 3.x? ☹ and … so I cobbled together my , an Alpine Linux based image with Go 1.8.3 and OpenCV 3 libraries that are in Alpine Linux There were a few difficulties here such as getting to the version specific opencv libs but eventually this container, plus the platform specific preamble seen earlier got me successfully building the app inside a docker container. first contribution to DockerHub fortunately available Edge. symbolic links setup cgo Having read Kelsey Hightower’s article on , I was conscious that, while I could run my app from this image, it would be inflated because all the Go tooling was still present. Luckily for me, the Docker team added in Docker 17.05 since Kelsey wrote his article and now, in a Dockerfile I can build the app, and copy the Go binary from a previous stage to a clean, lighter weight image! Great work Docker team! tiny Docker containers for Go apps multi-stage builds single discard the build image with SDK tools, etc. Thanks to folks at for offering a free tier for dead-simple hosting of a docker container. Hyper.sh Conclusion If you’ve hung in this long… thanks! I hope there’s been something valuable for you. The solver is by no means foolproof; it still has trouble with some images and either will fail to parse them or parse them incorrectly leading to failure to solve them (because it may have parsed as an invalid puzzle). The goal of course was to ramp up on some new technologies and the project has been valuable from that perspective. The solver is live (as of this writing) at and if you’re interested. Drop me a note if you find it useful or have any follow-up questions. Thanks! http://gosudoku.jander.me/ source code is on GitHub