Fooling OCR Systems with Adversarial Text Images

arXiv preprint arXiv:1802.05385, 2018.


Abstract:

We demonstrate that state-of-the-art optical character recognition (OCR) based on deep learning is vulnerable to adversarial images. Minor modifications to images of printed text, which do not change the meaning of the text to a human reader, cause the OCR system to recognize a different text in which certain words chosen by the adversary are replaced by their semantic opposites.

Introduction
  • Machine learning (ML) techniques based on deep neural networks have led to major advances in the state of the art for many image analysis tasks, including object classification [29] and face recognition [52].
  • Optical character recognition (OCR) is another image analysis task where deep learning has led to great improvements in the quality of ML models.
  • It differs from image classification in several essential ways.
  • Modern OCR models are not based on classifying individual characters; instead, they assign sequences of discrete labels to variable-sized inputs.
  • Both the input image and the output text can vary in length, and the alignment of image regions to the corresponding text characters is not known a priori (see the sketch below).
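
This sequence-labeling structure is what separates OCR from ordinary image classification. As a rough illustration (not the paper's code), modern line recognizers such as Tesseract 4 are trained with a CTC-style loss that scores a variable-length transcription against a variable-width image. The toy shapes, label indices, and the TensorFlow call below are assumptions for illustration only:

```python
import tensorflow as tf

# Hypothetical toy setup: a text-line image has been reduced to a sequence of
# per-timestep character scores (logits); the target transcription is a
# sequence of character indices. Shapes and values are illustrative, not the
# paper's actual model.
batch, time_steps, num_chars = 1, 40, 80        # 80 symbols, last one used as the CTC blank

logits = tf.random.normal([time_steps, batch, num_chars])   # [T, B, C], time-major
labels = tf.constant([[12, 4, 31, 31, 24]], dtype=tf.int32)  # e.g. indices of a 5-letter word

loss = tf.nn.ctc_loss(
    labels=labels,
    logits=logits,
    label_length=tf.constant([5]),           # length of the target text
    logit_length=tf.constant([time_steps]),  # number of image "time steps"
    logits_time_major=True,
    blank_index=-1,                          # last class is the CTC blank
)
print(loss.numpy())  # CTC marginalizes over all alignments of text to image columns
```

Because the loss sums over every possible alignment of characters to image columns, the perturbation must move the model's preferred alignment and labels jointly, which is why the whole recognized line, rather than a single character, is the natural unit of attack.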
Highlights
  • Machine learning (ML) techniques based on deep neural networks have led to major advances in the state of the art for many image analysis tasks, including object classification [29] and face recognition [52]
  • Modern OCR models assign sequences of discrete labels to variable-sized inputs and recognize text line by line, as opposed to character by character. This presents a challenge for the adversary because, as we show, attacks that paste adversarial images of individual characters into an input image are ineffective against the state-of-the-art models.
  • Tiny changes to input images can dramatically change the output of natural language processing models that operate on the OCR-recognized text.
  • We demonstrated that OCR systems based on deep learning are vulnerable to targeted adversarial examples.
  • Minor modifications to images of printed text cause the OCR system to “recognize” not the word in the image but its semantic opposite chosen by the adversary.
  • The adversarial examples in this paper were developed for the latest version of Tesseract, a popular open-source OCR system based on deep learning.
Methods
  • 4.1 Setup

    The authors used the latest Tesseract version 4.00 alpha, which employs the deep learning model described in Table 1 for recognition.
  • The authors implemented the attack described in Section 3.2 with the Adam optimizer [28], generated adversarial examples using the TensorFlow implementation, and evaluated them by directly applying Tesseract.
  • Some examples of the pairs in the list are presence/absence, superiority/inferiority, disabling/enabling, defense/offense, and ascend/descend.
  • The authors render these words with 16 common fonts and set their antonyms as the target output.
  • The perturbation is very minor, but the output of Tesseract is the opposite of the word appearing in the image (a minimal sketch of the attack optimization follows below).
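
The attack outlined above is a gradient-based targeted optimization: perturb the image so that the recognizer's CTC loss for the target word (the antonym) becomes small while the perturbation itself stays small. The following is a minimal sketch of that idea under stated assumptions; `model`, the hyperparameters, and the clipping range are illustrative placeholders, not the authors' Section 3.2 implementation:

```python
import tensorflow as tf

def targeted_ocr_attack(model, clean_image, target_labels, target_len,
                        c=1.0, steps=1000, lr=0.01):
    """Targeted attack sketch: find a small perturbation that makes a
    CTC-based recognizer output `target_labels` (e.g. an antonym).
    `model` maps an image to time-major logits; all names are illustrative."""
    delta = tf.Variable(tf.zeros_like(clean_image))
    opt = tf.keras.optimizers.Adam(learning_rate=lr)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            adv = tf.clip_by_value(clean_image + delta, 0.0, 1.0)
            logits = model(adv)                              # [T, B, C]
            ctc = tf.nn.ctc_loss(
                labels=target_labels, logits=logits,
                label_length=target_len,
                logit_length=tf.fill([tf.shape(logits)[1]], tf.shape(logits)[0]),
                logits_time_major=True, blank_index=-1)
            # Trade off "recognize the target text" against "stay close to the clean image".
            loss = tf.reduce_mean(ctc) + c * tf.nn.l2_loss(delta)
        grads = tape.gradient(loss, [delta])
        opt.apply_gradients(zip(grads, [delta]))

    return tf.clip_by_value(clean_image + delta, 0.0, 1.0)
```

As in the setup described above, the resulting images would then be evaluated by running Tesseract itself on them, not the differentiable reimplementation used to compute gradients.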
Results
  • The logistic regression model achieves 78.4% accuracy on RT’s test data and the CNN model achieves 90.1% accuracy on IMDB’s test data.
  • For RT, the adversarial images achieve 92% target accuracy.
  • For IMDB, the adversarial images achieve 88.7% target accuracy, and the sentiment classifier’s accuracy drops from 100% to 0% on the OCR-recognized text (see the evaluation sketch below).
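
These numbers imply a simple evaluation loop: OCR each rendered review image, then score the recognized text with the sentiment model. A hedged sketch follows; the `pytesseract` call is a stand-in for invoking Tesseract, and `vectorizer`, `clf`, and the image paths are hypothetical placeholders rather than the paper's models:

```python
import pytesseract
from PIL import Image

def ocr_sentiment_accuracy(image_paths, true_labels, vectorizer, clf):
    """Run OCR on review images and measure the sentiment classifier's
    accuracy on the recognized text. The classifier and vectorizer are
    assumed to be trained elsewhere; this only illustrates the loop."""
    texts = [pytesseract.image_to_string(Image.open(p)) for p in image_paths]
    preds = clf.predict(vectorizer.transform(texts))
    return (preds == true_labels).mean()

# e.g. compare accuracy on clean vs. adversarial renderings of the same reviews:
# acc_clean = ocr_sentiment_accuracy(clean_paths, y, vec, clf)
# acc_adv   = ocr_sentiment_accuracy(adv_paths,   y, vec, clf)
```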
Conclusion
  • Conclusions and Future Work

    The authors demonstrated that OCR systems based on deep learning are vulnerable to targeted adversarial examples.
  • Minor modifications to images of printed text cause the OCR system to “recognize” not the word in the image but its semantic opposite chosen by the adversary.
  • The adversarial examples in this paper were developed for the latest version of Tesseract, a popular open-source OCR system based on deep learning.
  • They do not transfer to the legacy version of Tesseract, which employs character-based recognition.
  • Transferability of adversarial images across different types of OCR models is an open problem
Tables
  • Table1: Neural network architecture of Tesseract’s text recognition model
  • Table2: Results of attacking single words rendered with different fonts (B indicates bold, I indicates italic). Clean acc is the accuracy of Tesseract (percentage of predictions that match the ground truth) on clean images. Target accuracy is the accuracy of Tesseract predicting the target word (antonym) on adversarial images. Rejected is the percentage of adversarial images that are rejected by Tesseract due to large perturbation. Avg L2 is the average L2 distance between clean and adversarial images
  • Table3: Class transformation accuracy on the 20 Newsgroup dataset. Classes 1 through 4 are atheism, religion, graphics, space respectively. An a/b entry in row i and column j of the table means that, on average, fraction b of the words in each text needs to be changed so that texts from class i are misclassified by the model as class j with accuracy a
Related work
  • Adversarial examples for computer vision. Recent research has shown that deep learning models are vulnerable to adversarial examples, where a small change in the input image causes the model to produce a different output. Prior work focused mainly on image classification tasks [10, 16, 41, 50], where the input is a fixed-size image and the output is a class label. Carlini and Wagner demonstrated an attack that improves on the prior state of the art in terms of the amount of perturbation and success rate [10]. Their method of generating adversarial examples is designed for classification problems and cannot be directly applied to OCR models.

    The Houdini approach [12] is based on a family of loss functions that can generate adversarial examples for structured prediction problems, including semantic segmentation and sequence labeling. Houdini is tailored for minimizing the performance of the model, as opposed to constructing targeted examples, and is not ideal for targeted attacks against OCR that aim to trick the model into outputting a specific text chosen by the adversary.
Funding
  • This work is partially supported by a grant from Schmidt Sciences
Study subjects and analysis
pairs: 120

We downloaded the parameters of Tesseract’s recognition model and loaded them into our TensorFlow [1] implementation of the same recognition model. We implemented the attack described in Section 3.2 with the Adam optimizer [28], generated adversarial examples using our TensorFlow implementation, and evaluated them by directly applying Tesseract.

4.2 Attacking single words

We selected 120 pairs of antonyms from WordNet [37] that meet our threshold requirement on the edit distance. We set the threshold adaptively according to the number of characters in the word (2, 3, or 4 if the number of characters is, respectively, 5 or less, 6 to 9, or above 9); a sketch of this selection criterion follows below.
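
The selection criterion above can be made concrete with a short sketch. This is an illustration only: the paper states that the antonym pairs come from WordNet [37] and gives the adaptive thresholds, but the WordNet access via NLTK and the helper names below are assumptions, not the authors' code.

```python
from nltk.corpus import wordnet as wn   # assumes nltk with the WordNet corpus installed

def edit_distance(a, b):
    """Plain Levenshtein distance via dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def threshold(word):
    """Adaptive threshold from the text above: 2, 3, or 4 for words of
    length <= 5, 6-9, and > 9 respectively."""
    n = len(word)
    return 2 if n <= 5 else 3 if n <= 9 else 4

def candidate_pairs():
    """Collect word/antonym pairs from WordNet whose edit distance is small
    enough that a targeted attack only needs to alter a few characters."""
    pairs = set()
    for syn in wn.all_synsets():
        for lemma in syn.lemmas():
            for ant in lemma.antonyms():
                w, a = lemma.name(), ant.name()
                if "_" not in w and "_" not in a and edit_distance(w, a) <= threshold(w):
                    pairs.add((w, a))
    return sorted(pairs)
```

The paper's actual list contains 120 such pairs (e.g. presence/absence, defense/offense); the sketch only illustrates the filtering criterion, not the exact list.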

Reference
  • [1] ABADI, M., BARHAM, P., CHEN, J., CHEN, Z., DAVIS, A., DEAN, J., DEVIN, M., GHEMAWAT, S., IRVING, G., ISARD, M., ET AL. TensorFlow: A system for large-scale machine learning. In OSDI (2016).
  • [2] ABBYY automatic document classification. https://www.abbyy.com/en-eu/ocr-sdk/key-features/classification, 2016.
  • [3] AMODEI, D., ANANTHANARAYANAN, S., ANUBHAI, R., BAI, J., BATTENBERG, E., CASE, C., CASPER, J., CATANZARO, B., CHENG, Q., CHEN, G., CHEN, J., CHEN, J., CHEN, Z., CHRZANOWSKI, M., COATES, A., DIAMOS, G., DING, K., DU, N., ELSEN, E., ENGEL, J., FANG, W., FAN, L., FOUGNER, C., GAO, L., GONG, C., HANNUN, A., HAN, T., JOHANNES, L., JIANG, B., JU, C., JUN, B., LEGRESLEY, P., LIN, L., LIU, J., LIU, Y., LI, W., LI, X., MA, D., NARANG, S., NG, A., OZAIR, S., PENG, Y., PRENGER, R., QIAN, S., QUAN, Z., RAIMAN, J., RAO, V., SATHEESH, S., SEETAPUN, D., SENGUPTA, S., SRINET, K., SRIRAM, A., TANG, H., TANG, L., WANG, C., WANG, J., WANG, K., WANG, Y., WANG, Z., WANG, Z., WU, S., WEI, L., XIAO, B., XIE, W., XIE, Y., YOGATAMA, D., YUAN, B., ZHAN, J., AND ZHU, Z. Deep Speech 2: End-to-end speech recognition in English and Mandarin. In ICML (2016).
  • [4] ATHALYE, A., CARLINI, N., AND WAGNER, D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420 (2018).
  • [5] ATHALYE, A., AND SUTSKEVER, I. Synthesizing robust adversarial examples. arXiv preprint arXiv:1707.07397 (2017).
  • [6] BELINKOV, Y., AND BISK, Y. Synthetic and natural noise both break neural machine translation. arXiv preprint arXiv:1711.02173 (2017).
  • [7] BENGIO, Y., LECUN, Y., NOHL, C., AND BURGES, C. LeRec: A NN/HMM hybrid for on-line handwriting recognition. Neural Computation 7, 6 (1995), 1289–1303.
  • [8] BREUEL, T. M., UL-HASAN, A., AL-AZAWI, M. A., AND SHAFAIT, F. High-performance OCR for printed English and Fraktur using LSTM networks. In ICDAR (2013).
  • [9] CARLINI, N., MISHRA, P., VAIDYA, T., ZHANG, Y., SHERR, M., SHIELDS, C., WAGNER, D., AND ZHOU, W. Hidden voice commands. In USENIX Security (2016).
  • [10] CARLINI, N., AND WAGNER, D. Towards evaluating the robustness of neural networks. In S&P (2017).
  • [11] CARLINI, N., AND WAGNER, D. Audio adversarial examples: Targeted attacks on speech-to-text. arXiv preprint arXiv:1801.01944 (2018).
  • [12] CISSE, M. M., ADI, Y., NEVEROVA, N., AND KESHET, J. Houdini: Fooling deep structured visual and speech recognition models with adversarial examples. In NIPS (2017).
  • [13] ESPANA-BOQUERA, S., CASTRO-BLEDA, M. J., GORBE-MOYA, J., AND ZAMORA-MARTINEZ, F. Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 4 (2011), 767–779.
  • [14] EVTIMOV, I., EYKHOLT, K., FERNANDES, E., KOHNO, T., LI, B., PRAKASH, A., RAHMATI, A., AND SONG, D. Robust physical-world attacks on deep learning models. arXiv preprint arXiv:1707.08945 (2017).
  • [15] GOCR. http://jocr.sourceforge.net/, 2016.
  • [16] GOODFELLOW, I. J., SHLENS, J., AND SZEGEDY, C. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
  • [17] Google Books. https://books.google.com/, 2016.
  • [18] Google Speech API. https://cloud.google.com/speech/, 2016.
  • [19] Google Translate. https://translate.google.com/, 2016.
  • [20] GRAVES, A., FERNÁNDEZ, S., GOMEZ, F., AND SCHMIDHUBER, J. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. In ICML (2006).
  • [21] GUO, C., RANA, M., CISSE, M., AND VAN DER MAATEN, L. Countering adversarial images using input transformations. arXiv preprint arXiv:1711.00117 (2017).
  • [22] HOCHREITER, S., AND SCHMIDHUBER, J. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
  • [23] HOSSEINI, H., KANNAN, S., ZHANG, B., AND POOVENDRAN, R. Deceiving Google’s Perspective API built for detecting toxic comments. arXiv preprint arXiv:1702.08138 (2017).
  • [24] HUANG, S., PAPERNOT, N., GOODFELLOW, I., DUAN, Y., AND ABBEEL, P. Adversarial attacks on neural network policies. arXiv preprint arXiv:1702.02284 (2017).
  • [25] JIA, R., AND LIANG, P. Adversarial examples for evaluating reading comprehension systems. In EMNLP (2017).
  • [26] KAMESHIRO, T., HIRANO, T., OKADA, Y., AND YODA, F. A document retrieval method from handwritten characters based on OCR and character shape information. In ICDAR (2001).
  • [27] KIM, Y. Convolutional neural networks for sentence classification. In EMNLP (2014).
  • [28] KINGMA, D. P., AND BA, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  • [29] KRIZHEVSKY, A., SUTSKEVER, I., AND HINTON, G. E. ImageNet classification with deep convolutional neural networks. In NIPS (2012).
  • [30] KURAKIN, A., GOODFELLOW, I., AND BENGIO, S. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533 (2016).
  • [31] LARSSON, A., AND SEGERÅS, T. Automated invoice handling with machine learning and OCR. Tech. Rep. 2016:53, KTH, Computer and Electronic Engineering, 2016.
  • [32] LIU, Y., CHEN, X., LIU, C., AND SONG, D. Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770 (2016).
  • [33] LU, Z., SCHWARTZ, R., NATARAJAN, P., BAZZI, I., AND MAKHOUL, J. Advances in the BBN BYBLOS OCR system. In ICDAR (1999).
  • [34] MAAS, A. L., DALY, R. E., PHAM, P. T., HUANG, D., NG, A. Y., AND POTTS, C. Learning word vectors for sentiment analysis. In ACL (2011).
  • [35] MADRY, A., MAKELOV, A., SCHMIDT, L., TSIPRAS, D., AND VLADU, A. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017).
  • [36] MAHLER, T., CHEUNG, W., ELSNER, M., KING, D., DE MARNEFFE, M.-C., SHAIN, C., STEVENS-GUILLE, S., AND WHITE, M. Breaking NLP: Using morphosyntax, semantics, pragmatics and world knowledge to fool sentiment analysis systems. In First Workshop on Building Linguistically Generalizable NLP Systems (2017).
  • [37] MILLER, G. A. WordNet: A lexical database for English. Communications of the ACM 38, 11 (1995), 39–41.
  • [38] OCRopus. https://github.com/tmbdev/ocropy/, 2016.
  • [39] OpenALPR. http://www.openalpr.com/, 2016.
  • [40] PANG, B., AND LEE, L. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In ACL (2004).
  • [41] PAPERNOT, N., MCDANIEL, P., AND GOODFELLOW, I. Transferability in machine learning: From phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277 (2016).
  • [42] PAPERNOT, N., MCDANIEL, P., JHA, S., FREDRIKSON, M., CELIK, Z. B., AND SWAMI, A. The limitations of deep learning in adversarial settings. In EuroS&P (2016).
  • [43] PAPERNOT, N., MCDANIEL, P., SWAMI, A., AND HARANG, R. Crafting adversarial input sequences for recurrent neural networks. In MILCOM (2016).
  • [44] PAPERNOT, N., MCDANIEL, P., WU, X., JHA, S., AND SWAMI, A. Distillation as a defense to adversarial perturbations against deep neural networks. In S&P (2016).
  • [45] REDDY, S., AND KNIGHT, K. Obfuscating gender in social media writing. In First Workshop on NLP and Computational Social Science (2016).
  • [46] SAMANTA, S., AND MEHTA, S. Towards crafting text adversarial samples. arXiv preprint arXiv:1707.02812 (2017).
  • [47] SHARIF, M., BHAGAVATULA, S., BAUER, L., AND REITER, M. K. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In CCS (2016).
  • [48] SMITH, R. An overview of the Tesseract OCR engine. In ICDAR (2007).
  • [49] SMITH, R., GU, C., LEE, D.-S., HU, H., UNNIKRISHNAN, R., IBARZ, J., ARNOUD, S., AND LIN, S. End-to-end interpretation of the French street name signs dataset. In ECCV (2016).
  • [50] SZEGEDY, C., ZAREMBA, W., SUTSKEVER, I., BRUNA, J., ERHAN, D., GOODFELLOW, I., AND FERGUS, R. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
  • [51] TAGHVA, K., BORSACK, J., AND CONDIT, A. Results of applying probabilistic IR to OCR text. In SIGIR (1994).
  • [52] TAIGMAN, Y., YANG, M., RANZATO, M., AND WOLF, L. DeepFace: Closing the gap to human-level performance in face verification. In CVPR (2014).
  • [53] Tesseract OCR. https://github.com/tesseract-ocr/tesseract, 2016.
  • [54] Tesseract training data. issuecomment-274574951, 2016.
  • [55] Tesseract trained model. https://github.com/tesseract-ocr/tessdata_best, 2016.
  • [56] Neural nets in Tesseract 4.00. NeuralNetsInTesseract4.00, 2016.
  • [57] TRAMÈR, F., KURAKIN, A., PAPERNOT, N., BONEH, D., AND MCDANIEL, P. Ensemble adversarial training: Attacks and defenses. arXiv preprint arXiv:1705.07204 (2017).
  • [58] VAIDYA, T., ZHANG, Y., SHERR, M., AND SHIELDS, C. Cocaine noodles: Exploiting the gap between human and machine speech recognition. In WOOT (2015).
  • [59] WANG, T., WU, D. J., COATES, A., AND NG, A. Y. End-to-end text recognition with convolutional neural networks. In ICPR (2012).
  • [60] XIE, C., WANG, J., ZHANG, Z., ZHOU, Y., XIE, L., AND YUILLE, A. Adversarial examples for semantic segmentation and object detection. In ICCV (2017).
  • [61] XU, X., CHEN, X., LIU, C., ROHRBACH, A., DARRELL, T., AND SONG, D. Can you fool AI with adversarial examples on a visual Turing test? arXiv preprint arXiv:1709.08693 (2017).
  • [62] Yandex Translate. https://translate.yandex.com/ocr, 2016.
  • [63] YASSER, A. Classifying receipts and invoices in Visma Mobile Scanner, 2016.
  • [64] ZHANG, G., YAN, C., JI, X., ZHANG, T., ZHANG, T., AND XU, W. DolphinAttack: Inaudible voice commands. In CCS (2017).
  • [65] ZHAO, Z., DUA, D., AND SINGH, S. Generating natural adversarial examples. arXiv preprint arXiv:1710.11342 (2017).
  • [66] ZUCCON, G., NGUYEN, A., BERGHEIM, A., WICKMAN, S., AND GRAYSON, N. The impact of OCR accuracy on automated cancer classification of pathology reports. Studies in Health Technology and Informatics 178 (2012), 250.