With the rapid development of the electronics industry, electronic products such as computers, mobile phones, and smart televisions are now seen everywhere. These products rely on electronic components that transmit signals to one another, and printed circuit boards (PCBs) are required to carry the signal transmission between components. Before a PCB is produced, electronic design automation (EDA) is often applied to layout design and functional verification. EDA requires the characteristics of electronic components, such as their appearance and pin configuration, but organizing these characteristics from datasheets by hand is very time-consuming. In this thesis, we propose an automatic process for extracting the dimension parameters shown in three-view drawings. The process is divided into two stages. In the first stage, we detect three-view drawings in datasheets, locate the text regions containing parameters in the drawings by deep learning, and then recognize the values in these regions. At this point, however, we still do not know which parameter each recognized value belongs to. In the second stage, we therefore design two matching algorithms, based on k-nearest neighbors (k-NN) and statistical evaluation, respectively, to match the digitized parameters with the recognized values. The matched parameter values can be stored automatically in an electronic-component characteristic database to assist layout engineers in designing PCBs. We also conducted experiments that show high accuracy in both stages.
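The k-NN-based matching mentioned above can be illustrated with a minimal sketch: each recognized dimension value is assigned to the parameter label whose text region lies closest in the drawing. The function name, the coordinate data, and the choice of Euclidean distance are hypothetical illustrations under the assumption that text-region centers are available from the first stage; this is not the thesis implementation.

```python
# Hedged sketch of nearest-neighbor parameter-value matching.
# Assumes stage one has produced the center (x, y) of each text region.
import math

def match_values_to_parameters(labels, values):
    """labels/values: dicts mapping text -> (x, y) center of its region.
    Returns {parameter_label: value_text} by nearest-neighbor distance."""
    matches = {}
    for val_text, (vx, vy) in values.items():
        # rank candidate parameter labels by Euclidean distance to the value
        ranked = sorted(
            labels.items(),
            key=lambda item: math.hypot(item[1][0] - vx, item[1][1] - vy),
        )
        nearest_label = ranked[0][0]  # k = 1: take the closest label
        matches[nearest_label] = val_text
    return matches

# Toy example: labels "A" and "D" in a package drawing, two values nearby.
labels = {"A": (10.0, 5.0), "D": (40.0, 5.0)}
values = {"1.75": (12.0, 8.0), "9.90": (38.0, 8.0)}
print(match_values_to_parameters(labels, values))
# -> {'A': '1.75', 'D': '9.90'}
```

A real matcher would also have to break ties and handle values shared by dimension lines, which is where the thesis's statistical-evaluation alternative comes in.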
Certification of Thesis Approval i
Acknowledgements ii
Abstract iii
Chinese Abstract v
Contents vi
List of Tables viii
List of Figures ix
Chapter 1 Introduction 1
1.1 Background 1
1.2 Contribution 3
1.3 Thesis Organization 4
Chapter 2 Related Works 5
2.1 Content-Based Image Retrieval 5
2.1.1 Feature Extraction 6
2.1.2 Similarity Measurement 9
2.2 Object Detection 10
2.2.1 Anchor-based Methods 11
2.2.2 Anchor-free Methods 12
2.3 Optical Character Recognition 13
2.3.1 Text Detection 13
2.3.2 Text Recognition 15
2.3.3 End-to-End Methods 16
Chapter 3 Architecture and Methodology 17
3.1 Data Collection and Processing 18
3.2 Logo Recognition Module 19
3.3 Three-View Search Module 22
3.4 Text Extraction Module 23
3.5 Table Extraction Module 26
3.6 Parameter Matching Module 29
Chapter 4 Experimental Results 41
4.1 Experimental Datasets and Evaluation Metrics 41
4.2 Logo Recognition Module 44
4.3 Three-View Search Module 45
4.4 Text Extraction Module and Table Extraction Module 49
4.5 Parameter Matching Module 50
Chapter 5 Conclusions and Future Works 58
5.1 Conclusions 58
5.2 Future Works 59
References 60