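Today we will scan a document and display the text recognized in it. No third-party libraries are needed: VisionKit handles the scanning and Vision handles the text recognition.
First, make sure you have Xcode 11 and iOS 13 installed, then create a new project with Storyboard support. Scanning uses the camera, so we need to add NSCameraUsageDescription to Info.plist; without it, the app will crash.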
Scanning
To scan documents we use the VisionKit framework. To open the scanning screen, create a new instance of VNDocumentCameraViewController and present it:
let scanner = VNDocumentCameraViewController()
scanner.delegate = self
present(scanner, animated: true)
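Document scanning may not be available in every environment, so it can be worth checking before presenting the scanner. The sketch below relies on the isSupported class property of VNDocumentCameraViewController. The helper name presentScannerIfSupported is hypothetical and not part of the original project.

```swift
import UIKit
import VisionKit

extension UIViewController {
    /// Hypothetical helper (not part of VisionKit): presents the document
    /// scanner only when the current device supports it.
    /// Returns false when scanning is unavailable.
    @discardableResult
    func presentScannerIfSupported(delegate: VNDocumentCameraViewControllerDelegate) -> Bool {
        guard VNDocumentCameraViewController.isSupported else {
            return false
        }
        let scanner = VNDocumentCameraViewController()
        scanner.delegate = delegate
        present(scanner, animated: true)
        return true
    }
}
```

From the view controller this could be called as presentScannerIfSupported(delegate: self). A false result is a reasonable place to show an alert instead of the scanner.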
Add VNDocumentCameraViewControllerDelegate to the ViewController:
class ViewController: UIViewController, VNDocumentCameraViewControllerDelegate {
    ...
}
When the user taps Cancel or an error occurs, dismiss the scanner screen:
func documentCameraViewControllerDidCancel(_ controller: VNDocumentCameraViewController) {
    controller.dismiss(animated: true)
}
func documentCameraViewController(_ controller: VNDocumentCameraViewController, didFailWithError error: Error) {
    controller.dismiss(animated: true)
}
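After the user scans and taps Save, the following method is called:
func documentCameraViewController(_ controller: VNDocumentCameraViewController, didFinishWith scan: VNDocumentCameraScan) {
    for i in 0 ..< scan.pageCount {
        // Each scanned page is available as a UIImage.
        let img = scan.imageOfPage(at: i)
    }
    controller.dismiss(animated: true)
}
Each page can be processed separately.

Text recognition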
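We have the scanning process sorted out; now let's extract the text. To keep the interface responsive, we will run the recognition in the background. For that, create a DispatchQueue:
lazy var workQueue = {
    return DispatchQueue(label: "workQueue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
}()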
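For recognition we need a VNImageRequestHandler initialized with the image, and a VNRecognizeTextRequest. The request offers the options recognitionLevel, customWords, and recognitionLanguages, and takes a completion handler that delivers the results as text. When recognition finishes, we collect the best candidate for each observation and display the result:
lazy var textRecognitionRequest: VNRecognizeTextRequest = {
    let req = VNRecognizeTextRequest { (request, error) in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
        var resultText = ""
        for observation in observations {
            // Skip an observation that has no candidate; the others are still collected.
            guard let topCandidate = observation.topCandidates(1).first else { continue }
            resultText += topCandidate.string
            resultText += "\n"
        }
        // UI updates must happen on the main queue.
        DispatchQueue.main.async {
            self.txt.text = resultText
        }
    }
    return req
}()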
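The request above is created with default settings. As a minimal sketch of how the options mentioned in the text could be set, the function below configures recognitionLevel, recognitionLanguages, and customWords, plus usesLanguageCorrection. The function name makeTextRequest and the sample values ("en-US", "VisionKit", "Xcode") are assumptions chosen for illustration, not taken from the original project.

```swift
import Vision

/// Hypothetical factory: builds a text recognition request with explicit options.
func makeTextRequest(completion: VNRequestCompletionHandler? = nil) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest(completionHandler: completion)
    // .accurate favors quality; .fast favors speed.
    request.recognitionLevel = .accurate
    // Languages to try, in priority order (sample value).
    request.recognitionLanguages = ["en-US"]
    // Language correction; custom words are intended to work together with it.
    request.usesLanguageCorrection = true
    // Extra vocabulary that supplements the built-in lexicon (sample values).
    request.customWords = ["VisionKit", "Xcode"]
    return request
}
```

The accurate level is usually a sensible choice for scanned documents, while the fast level may suit live camera input better. Which languages are available depends on the OS version.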
The request is performed with a VNImageRequestHandler:
func recognizeText(inImage: UIImage) {
    guard let cgImage = inImage.cgImage else { return }
    // Run recognition off the main thread.
    workQueue.async {
        let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try requestHandler.perform([self.textRecognitionRequest])
        } catch {
            print(error)
        }
    }
}
The final ViewController
import UIKit
import Vision
import VisionKit

class ViewController: UIViewController, VNDocumentCameraViewControllerDelegate {
    @IBOutlet weak var txt: UITextView!

    // Serial background queue for text recognition.
    lazy var workQueue = {
        return DispatchQueue(label: "workQueue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
    }()

    lazy var textRecognitionRequest: VNRecognizeTextRequest = {
        let req = VNRecognizeTextRequest { (request, error) in
            guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
            var resultText = ""
            for observation in observations {
                // Skip an observation that has no candidate; the others are still collected.
                guard let topCandidate = observation.topCandidates(1).first else { continue }
                resultText += topCandidate.string
                resultText += "\n"
            }
            // Append the text of each page on the main queue.
            DispatchQueue.main.async {
                self.txt.text = self.txt.text + "\n" + resultText
            }
        }
        return req
    }()

    @IBAction func startScan(_ sender: Any) {
        txt.text = ""
        let scanner = VNDocumentCameraViewController()
        scanner.delegate = self
        present(scanner, animated: true)
    }

    func recognizeText(inImage: UIImage) {
        guard let cgImage = inImage.cgImage else { return }
        workQueue.async {
            let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
            do {
                try requestHandler.perform([self.textRecognitionRequest])
            } catch {
                print(error)
            }
        }
    }

    func documentCameraViewController(_ controller: VNDocumentCameraViewController, didFinishWith scan: VNDocumentCameraScan) {
        // Recognize the text of every scanned page.
        for i in 0 ..< scan.pageCount {
            let img = scan.imageOfPage(at: i)
            recognizeText(inImage: img)
        }
        controller.dismiss(animated: true)
    }

    func documentCameraViewControllerDidCancel(_ controller: VNDocumentCameraViewController) {
        controller.dismiss(animated: true)
    }

    func documentCameraViewController(_ controller: VNDocumentCameraViewController, didFailWithError error: Error) {
        print(error)
        controller.dismiss(animated: true)
    }
}
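The completion handler keeps only the top candidate of each observation. That selection step can be isolated as a pure function, which makes it possible to check the logic without a device or a camera. The sketch below is an illustration only: the name joinBestCandidates and the use of plain string arrays in place of VNRecognizedTextObservation are assumptions.

```swift
/// Hypothetical helper: `candidatesPerLine[i]` holds the candidate strings
/// for recognized line i, ordered best first (possibly empty).
/// Returns the best candidate of every line, one per row.
func joinBestCandidates(_ candidatesPerLine: [[String]]) -> String {
    return candidatesPerLine
        .compactMap { $0.first }      // lines without candidates are skipped
        .joined(separator: "\n")
}
```

For example, joinBestCandidates([["Hello", "Hallo"], [], ["World"]]) returns "Hello\nWorld": the empty line is skipped and the remaining lines are still kept.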

What's next?
Documentation: developer.apple.com/documentation/vision and developer.apple.com/documentation/visionkit
WWDC videos on the Vision framework: developer.apple.com/videos/all-videos/?q=Vision
GitHub project: github.com/usenbekov (vision demo)