Invoice Scanning
I’ve just recently worked on invoice scanning for Finances. It lets you scan invoices on iPhone or iPad and add them as a PDF document to transactions. In this post I will show you how I’ve implemented that feature using the frameworks available on iOS.
Let’s start by looking at the final result. You can see the invoice scanning in the Finances trailer. The user interface looks very similar to the document scanning UI in Apple’s Notes app on iOS 11. That’s not a coincidence. I’ve reimplemented the exact same user interface because most iOS users are already familiar with it. I also found it an interesting challenge to implement it myself.
Scanning Screen
The scanning screen is the first screen presented to the user. Even though it provides a lot of information, it doesn’t feel overwhelming. On iPhone the user interface elements are placed at the top and bottom of the screen. On iPad the layout differs slightly, but the overall arrangement is the same.
The bar at the top is a UIToolbar pinned to the top of the screen. There was no need to implement a custom user interface element for this.
All other user interface elements are implemented by view controllers, which are managed by a container view controller. Doing it this way let me create simple and reusable controllers and avoid view controllers with hundreds of lines of code. Such massive view controllers become complex very quickly and are hard to maintain. Dave DeLong has a great write-up about the problems of massive view controllers.
Now let’s take a look at the actual implementation.
Camera Output
The camera output view displays the output of an AVCaptureSession in an AVCaptureVideoPreviewLayer.
Invoice Outline
The outline of an invoice is detected in an image by using a VNDetectRectanglesRequest, which is provided by the Vision framework on iOS 11. Performing the request can be done with just a few lines of code.
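import Vision

// let image: CVImageBuffer = …

let request = VNDetectRectanglesRequest { request, error in
    // `results` is optional, so cast conditionally instead of force-casting.
    guard let observations = request.results as? [VNRectangleObservation] else { return }
    // ...
}
let requestHandler = VNSequenceRequestHandler()
// perform(_:on:) throws, so it has to be called with `try`
// from a throwing context or inside a do/catch block.
try requestHandler.perform([request], on: image)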
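The corner points of a VNRectangleObservation are normalized to the range 0...1 with the origin in the lower-left corner, so they have to be converted before the outline can be drawn in a top-left-origin coordinate space such as a UIKit view. A minimal sketch of that conversion follows. The helper name `viewPoint(fromNormalized:in:)` is hypothetical, and the sketch assumes the image fills the target size exactly, with no aspect-fill cropping:

```swift
import Foundation

/// Hypothetical helper: converts a normalized point (0...1, origin at the
/// lower left) into a point in a top-left-origin space of the given size.
/// Assumes the image fills `size` exactly.
func viewPoint(fromNormalized point: CGPoint, in size: CGSize) -> CGPoint {
    return CGPoint(x: point.x * size.width,
                   y: (1 - point.y) * size.height) // flip the y-axis
}

// Example: a corner detected in the upper-left quarter of a 400 × 600 view.
let corner = viewPoint(fromNormalized: CGPoint(x: 0.25, y: 0.75),
                       in: CGSize(width: 400, height: 600))
// corner is (100.0, 150.0)
```

If the preview layer crops or letterboxes the camera image, the conversion also has to account for that offset and scale.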
Info Message
This view controller positions a UILabel centered in a rounded view. The view controller provides one method to show and hide the text, optionally with a fade animation.
Shutter Release
The shutter release button consists of two views – the inner circle and outer ring. This view controller handles touch events and animates the inner circle accordingly.
Photo Preview
The photo preview holds a preview of all scanned invoices. The previews are displayed in a UICollectionView. I’m using a subclass of UICollectionViewFlowLayout to lay out the cells. As you can see from the following videos, the custom flow layout stacks the cells once there is no more space left between them.
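The stacking behaviour can be illustrated without UIKit. The following is only a sketch of one possible approach, not the layout used in Finances: cells keep a preferred spacing while they fit, and once they no longer fit the spacing shrinks and turns negative, so the cells overlap. The function `stackedOrigins` and all of its parameters are hypothetical.

```swift
import Foundation

/// Hypothetical helper: returns the x origin of each cell.
/// Cells use `preferredSpacing` while they fit into `availableWidth`.
/// Beyond that the spacing shrinks (and can become negative) so cells overlap.
func stackedOrigins(count: Int, itemWidth: Double,
                    preferredSpacing: Double, availableWidth: Double) -> [Double] {
    guard count > 0 else { return [] }
    guard count > 1 else { return [0] }
    let gaps = Double(count - 1)
    // Spacing at which the last cell ends exactly at `availableWidth`.
    let fittingSpacing = (availableWidth - Double(count) * itemWidth) / gaps
    // Use the smaller spacing, but never overlap by more than a full cell.
    let spacing = max(min(preferredSpacing, fittingSpacing), -itemWidth)
    return (0..<count).map { Double($0) * (itemWidth + spacing) }
}

// Three cells fit comfortably and keep their preferred spacing.
let loose = stackedOrigins(count: 3, itemWidth: 60,
                           preferredSpacing: 10, availableWidth: 300)
// Six cells no longer fit, so they overlap by 12 points each.
let stacked = stackedOrigins(count: 6, itemWidth: 60,
                             preferredSpacing: 10, availableWidth: 300)
```

In a UICollectionViewFlowLayout subclass, origins like these could be applied to the attributes returned from layoutAttributesForElements(in:).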
If a photo is recorded, it is presented with a 3D animation. The animation is done by calculating a transformation matrix and applying the CATransform3D to an image view. The transformation is based on the invoice outline which is detected using a VNDetectRectanglesRequest. A similar animation is used when editing a photo.
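The post does not show how the transformation matrix is calculated. One common way to map a rectangle onto an arbitrary quadrilateral is a projective transform (homography), computed here with Heckbert’s square-to-quadrilateral formula. The sketch below is an assumption about how such a matrix could be derived, not the code used in Finances. The `Homography` type, the `Point` alias, and the `homography` function are hypothetical names, and plain Double values are used so the example has no UIKit dependency.

```swift
import Foundation

typealias Point = (x: Double, y: Double)

/// A 2D projective transform:
/// x' = (a·u + b·v + c) / (g·u + h·v + 1)
/// y' = (d·u + e·v + f) / (g·u + h·v + 1)
struct Homography {
    var a, b, c, d, e, f, g, h: Double

    func apply(u: Double, v: Double) -> Point {
        let w = g * u + h * v + 1
        return ((a * u + b * v + c) / w, (d * u + e * v + f) / w)
    }
}

/// Maps the unit square (0,0), (1,0), (1,1), (0,1) onto the given corners.
/// Returns nil if the quadrilateral is degenerate.
func homography(topLeft p0: Point, topRight p1: Point,
                bottomRight p2: Point, bottomLeft p3: Point) -> Homography? {
    let dx1 = p1.x - p2.x, dx2 = p3.x - p2.x, dx3 = p0.x - p1.x + p2.x - p3.x
    let dy1 = p1.y - p2.y, dy2 = p3.y - p2.y, dy3 = p0.y - p1.y + p2.y - p3.y
    let den = dx1 * dy2 - dx2 * dy1
    guard den != 0 else { return nil }
    // Perspective terms; both are zero when the quad is a parallelogram.
    let g = (dx3 * dy2 - dx2 * dy3) / den
    let h = (dx1 * dy3 - dx3 * dy1) / den
    return Homography(a: p1.x - p0.x + g * p1.x, b: p3.x - p0.x + h * p3.x, c: p0.x,
                      d: p1.y - p0.y + g * p1.y, e: p3.y - p0.y + h * p3.y, f: p0.y,
                      g: g, h: h)
}

// Example: an outline whose bottom-right corner is pulled outwards.
if let transform = homography(topLeft: (x: 0, y: 0), topRight: (x: 2, y: 0),
                              bottomRight: (x: 3, y: 3), bottomLeft: (x: 0, y: 2)) {
    // The unit square's corner (1, 1) lands on the quad's bottom-right corner (3, 3).
    print(transform.apply(u: 1, v: 1))
}
```

Because Core Animation multiplies row vectors by the matrix, the coefficients would go into a CATransform3D as m11 = a, m12 = d, m14 = g, m21 = b, m22 = e, m24 = h, m41 = c, m42 = f, with m33 and m44 set to 1. The layer’s size and anchor point still have to be accounted for.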
Container View Controller
All view controllers are managed by a container view controller. This view controller is responsible for laying out the child view controllers and mediating between them. For example, when a photo is taken by tapping the shutter release, the container view controller gets notified. It then asks another view controller to record a photo using the AVCaptureSession. The photo is then rectified based on the invoice outline, presented to the user with a 3D animation, and moved to the photo preview stack.
Photo Editing
Once a photo is taken, the invoice outline can be edited by the user. The photo editing screen lets the user change the corner positions of the invoice outline. The outline rectangle is a quadrilateral. Based on the outline, the invoice is rectified using the CIPerspectiveCorrection filter.
import CoreImage

// let image: CIImage = …
// let quadrilateral: Quadrilateral = …

// CIPerspectiveCorrection expects each corner as a CIVector in Core Image
// coordinates (origin at the bottom left). CGPoint corners can be wrapped
// with CIVector(cgPoint:).
let parameters: [String: Any] = [
    "inputTopLeft": quadrilateral.topLeft,
    "inputTopRight": quadrilateral.topRight,
    "inputBottomRight": quadrilateral.bottomRight,
    "inputBottomLeft": quadrilateral.bottomLeft,
]
let rectified = image.applyingFilter("CIPerspectiveCorrection", parameters: parameters)
The quadrilateral is also used to transform the layer to get the 3D animations, as you can see in the following video. I love this animation. It gives you a sense of what is going on when an invoice is cut out from the photo and rectified.
Conclusion
Apple’s Notes app on iOS 11 has such a good user interface that I had to implement it myself. It has a lot of nice little touches and animations. I’ve split up every UI element into its own view controller to create independent components. I’ve tried to avoid massive view controllers as much as possible and I ended up with view controllers with simple implementations and clear interfaces.
The Vision framework on iOS is used to detect the invoice outline. Rectifying the image is done by the Core Image framework.
Overall I’m really happy with how this turned out. You can try it out yourself in Finances for iOS.