
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation.

CVPR (2019): 82-92


Abstract

Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human-designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure, while hand-designing the outer network ...

Introduction
  • Deep neural networks have proved successful across a wide variety of artificial intelligence tasks, including image recognition [38, 25], speech recognition [27], and machine translation [73, 81].
  • While better optimizers [36] and better normalization techniques [32, 80] have certainly played an important role, much of the progress comes from the design of neural network architectures.
  • This holds true for both image classification [38, 72, 75, 76, 74, 25, 85, 31, 30] and dense image prediction [16, 51, 7, 64, 56, 55].
Highlights
  • Deep neural networks have proved successful across a wide variety of artificial intelligence tasks, including image recognition [38, 25], speech recognition [27], and machine translation [73, 81]
  • For the outer network level (Sec. 3.2), we propose a novel search space based on observing and summarizing many popular designs
  • We report our architecture search implementation details as well as the search results
  • We report semantic segmentation results on benchmark datasets with our best found architecture
  • We present one of the first attempts to extend Neural Architecture Search beyond image classification to dense image prediction problems
  • The result of the search, Auto-DeepLab, is evaluated by training on benchmark semantic segmentation datasets from scratch
Methods
  • The authors begin by introducing a continuous relaxation of the discrete architectures that exactly matches the hierarchical architecture search described above.
  • Continuous Relaxation of Architectures. The authors reuse the continuous relaxation described in [49].
  • Every block's output tensor $H_i^l$ is connected to all hidden states in $\mathcal{I}_i^l$:
    $$H_i^l = \sum_{H_j^l \in \mathcal{I}_i^l} O_{j \to i}(H_j^l) \qquad (1)$$
    In addition, the authors approximate each $O_{j \to i}$ with its continuous relaxation $\bar{O}_{j \to i}$, defined as
    $$\bar{O}_{j \to i}(H_j^l) = \sum_{O^k \in \mathcal{O}} \alpha_{j \to i}^{k}\, O^k(H_j^l) \qquad (2)$$
    where the $\alpha_{j \to i}^{k}$ are normalized architecture weights over the candidate operations $O^k \in \mathcal{O}$ (see the sketch below).
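A minimal PyTorch sketch of the continuous relaxation in Eq. (1)-(2) above; the reduced candidate operation set, the softmax normalization of the α's, and the module names are illustrative assumptions made here, not the authors' exact implementation:

```python
# Illustrative sketch of the cell-level continuous relaxation (Eq. 1-2).
import torch
import torch.nn as nn
import torch.nn.functional as F


def sep_conv(channels, kernel_size):
    # Depthwise-separable convolution: one of the candidate operations O^k.
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size,
                  padding=kernel_size // 2, groups=channels, bias=False),
        nn.Conv2d(channels, channels, 1, bias=False),
        nn.BatchNorm2d(channels),
        nn.ReLU(inplace=True),
    )


class MixedOp(nn.Module):
    """O-bar_{j->i}(H) = sum_k alpha^k_{j->i} * O^k(H), i.e. Eq. (2)."""

    def __init__(self, channels):
        super().__init__()
        # Reduced candidate set for illustration (the paper uses 8 operations).
        self.ops = nn.ModuleList([
            nn.Identity(),            # skip connection
            sep_conv(channels, 3),    # 3x3 separable conv
            sep_conv(channels, 5),    # 5x5 separable conv
        ])
        # One architecture parameter per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, h):
        weights = F.softmax(self.alpha, dim=0)   # normalized alpha^k
        return sum(w * op(h) for w, op in zip(weights, self.ops))


# Eq. (1): a block output is the sum of mixed ops over its input hidden states.
h_prev_prev = torch.randn(1, 16, 32, 32)   # H^{l-2}
h_prev = torch.randn(1, 16, 32, 32)        # H^{l-1}
block = nn.ModuleList([MixedOp(16), MixedOp(16)])
h_i = block[0](h_prev_prev) + block[1](h_prev)   # H_i^l
```

Because the mixed output is differentiable with respect to both the architecture weights α and the ordinary network weights, both can be updated by gradient descent, which is what makes the search efficient.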
Results
  • The authors report the architecture search implementation details as well as the search results.
  • The authors report semantic segmentation results on benchmark datasets with the best found architecture.
  • The best found cell (Fig. 3) mixes atrous and separable convolutions with 3x3 and 5x5 kernels, taking H^{l-2} and H^{l-1} as inputs to produce H^l.
  • Architecture Search Implementation Details.
  • The authors evaluate the performance of the found best architecture (Fig. 3) on Cityscapes [13], PASCAL VOC 2012 [15], and ADE20K [90] datasets
Conclusion
  • The authors present one of the first attempts to extend Neural Architecture Search beyond image classification to dense image prediction problems.
  • Instead of fixating on the cell level, the authors acknowledge the importance of spatial resolution changes, and embrace the architectural variations by incorporating the network level into the search space.
  • The authors develop a differentiable formulation that allows efficient architecture search over the two-level hierarchical search space (see the sketch after this list).
  • On Cityscapes, Auto-DeepLab significantly outperforms the previous state-of-the-art by 8.6%, and performs comparably with ImageNet-pretrained top models when exploiting the coarse annotations.
  • On PASCAL VOC 2012 and ADE20K, Auto-DeepLab outperforms several ImageNet-pretrained state-of-the-art models
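As referenced above, the network-level half of the two-level relaxation blends features arriving from adjacent spatial resolutions of the previous layer with scalar weights β. The sketch below is a simplified illustration: normalizing β over incoming paths, assuming equal channel counts across resolution levels, and using plain pooling/bilinear resizing are assumptions made here, not the paper's exact formulation.

```python
# Illustrative sketch of the network-level relaxation: features from the finer
# (s/2), same (s), and coarser (2s) resolutions of the previous layer are
# blended with softmax-normalized scalar weights beta.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NetworkLevelMix(nn.Module):
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(3))  # one weight per incoming path

    def forward(self, from_finer, from_same, from_coarser):
        # Resample both neighbors to the current spatial resolution.
        finer = F.avg_pool2d(from_finer, kernel_size=2)            # s/2 -> s
        coarser = F.interpolate(from_coarser, scale_factor=2,      # 2s -> s
                                mode="bilinear", align_corners=False)
        w = F.softmax(self.beta, dim=0)
        return w[0] * finer + w[1] * from_same + w[2] * coarser


mix = NetworkLevelMix()
out = mix(torch.randn(1, 16, 64, 64),   # previous layer, finer resolution
          torch.randn(1, 16, 32, 32),   # previous layer, same resolution
          torch.randn(1, 16, 16, 16))   # previous layer, coarser resolution
assert out.shape == (1, 16, 32, 32)
```

After the search, the continuous weights are decoded into a discrete architecture: the paper keeps the strongest cell operations according to α and recovers the network-level path from β with a Viterbi-style decoding.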
Tables
  • Table1: Comparing our work against other CNN architectures with two-level hierarchy. The main differences include: (1) we directly search CNN architecture for semantic segmentation, (2) we search the network level architecture as well as the cell level one, and (3) our efficient search only requires 3 P100 GPU days
  • Table2: Cityscapes validation set results with different Auto-DeepLab model variants. F : the filter multiplier controlling the model capacity. All our models are trained from scratch and with single-scale input during inference
  • Table3: Cityscapes validation set results. We experiment with the effect of adopting different training iterations (500K, 1M, and 1.5M iterations) and the Scheduled Drop Path method (SDP). All models are trained from scratch
  • Table4: Cityscapes test set results with multi-scale inputs during inference. ImageNet: Models pretrained on ImageNet. Coarse: Models exploit coarse annotations
  • Table5: PASCAL VOC 2012 validation set results. We experiment with the effect of adopting multi-scale inference (MS) and COCO-pretrained checkpoints (COCO). Without any pretraining, our best model (Auto-DeepLab-L) outperforms DropBlock by 20.36%. None of our models is pretrained on ImageNet images
  • Table6: PASCAL VOC 2012 test set results. Our AutoDeepLab-L attains comparable performance with many state-of-the-art models which are pretrained on both ImageNet and COCO datasets. We refer readers to the official leader-board for other state-of-the-art models
  • Table7: ADE20K validation set results. We employ multiscale inputs during inference. †: Results are obtained from their up-to-date model zoo websites respectively. ImageNet: Models pretrained on ImageNet. Avg: Average of mIOU and Pixel-Accuracy
Related work
  • Semantic Image Segmentation. Convolutional neural networks [42] deployed in a fully convolutional manner (FCNs [68, 51]) have achieved remarkable performance on several semantic segmentation benchmarks. Within the state-of-the-art systems, there are two essential components: the multi-scale context module and the neural network design. It has been known that context information is crucial for pixel labeling tasks [26, 70, 37, 39, 16, 54, 14, 10]. Therefore, PSPNet [88] performs spatial pyramid pooling [21, 41, 24] at several grid scales (including image-level pooling [50]), while DeepLab [8, 9] applies several parallel atrous convolutions [28, 20, 68, 57, 7] with different rates. On the other hand, improvements in neural network design have significantly driven performance from AlexNet [38], VGG [72], and Inception [32, 76, 74] to ResNet [25] and more recent architectures such as Wide ResNet [86], ResNeXt [85], DenseNet [31], and Xception [12, 61]. In addition to adopting those networks as backbones for semantic segmentation, one could employ encoder-decoder structures [64, 2, 55, 44, 60, 58, 33, 79, 18, 11, 87, 83], which efficiently capture long-range context information while keeping detailed object boundaries. Nevertheless, most of these models require initialization from ImageNet [65] pretrained checkpoints, with FRRN [60] and GridNet [17] as exceptions for the task of semantic segmentation. Specifically, FRRN [60] employs a two-stream system, where full-resolution information is carried in one stream and context information in the other, pooling stream. GridNet, building on a similar idea, contains multiple streams with different resolutions. In this work, we apply neural architecture search to network backbones specifically for semantic segmentation. We further show state-of-the-art performance without ImageNet pretraining, significantly outperforming FRRN [60] and GridNet [17] on Cityscapes [13].
Funding
  • Our light-weight model attains the performance only 1.2% lower than DeepLabv3+ [11], while requiring 76.7% fewer parameters and being 4.65 times faster in Multi-Adds
  • On PASCAL VOC 2012 and ADE20K, our best model also outperforms several state-of-the-art models
  • Our models outperform some state-of-the-art models, including RefineNet [44], UPerNet [83], and PSPNet (ResNet-152) [88]; however, without any ImageNet [65] pretraining, our performance lags behind the latest work of [11]
Reference
  • K. Ahmed and L. Torresani. MaskConnect: Connectivity learning by gradient descent. In ECCV, 2018.
  • V. Badrinarayanan, A. Kendall, and R. Cipolla. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. arXiv:1511.00561, 2015.
  • B. Baker, O. Gupta, N. Naik, and R. Raskar. Designing neural network architectures using reinforcement learning. In ICLR, 2017.
  • S. R. Bulo, L. Porzi, and P. Kontschieder. In-place activated batchnorm for memory-optimized training of DNNs. In CVPR, 2018.
  • H. Cai, T. Chen, W. Zhang, Y. Yu, and J. Wang. Efficient architecture search by network transformation. In AAAI, 2018.
  • L.-C. Chen, M. D. Collins, Y. Zhu, G. Papandreou, B. Zoph, F. Schroff, H. Adam, and J. Shlens. Searching for efficient multi-scale architectures for dense image prediction. In NIPS, 2018.
  • L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Semantic image segmentation with deep convolutional nets and fully connected CRFs. In ICLR, 2015.
  • L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. TPAMI, 2017.
  • L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam. Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587, 2017.
  • L.-C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille. Attention to scale: Scale-aware semantic image segmentation. In CVPR, 2016.
  • L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. In ECCV, 2018.
  • F. Chollet. Xception: Deep learning with depthwise separable convolutions. In CVPR, 2017.
  • M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The Cityscapes dataset for semantic urban scene understanding. In CVPR, 2016.
  • J. Dai, K. He, and J. Sun. Convolutional feature masking for joint object and stuff segmentation. In CVPR, 2015.
  • M. Everingham, S. M. A. Eslami, L. V. Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes challenge: A retrospective. IJCV, 2014.
  • C. Farabet, C. Couprie, L. Najman, and Y. LeCun. Learning hierarchical features for scene labeling. PAMI, 2013.
  • D. Fourure, R. Emonet, E. Fromont, D. Muselet, A. Tremeau, and C. Wolf. Residual conv-deconv grid network for semantic segmentation. In BMVC, 2017.
  • J. Fu, J. Liu, Y. Wang, and H. Lu. Stacked deconvolutional network for semantic segmentation. arXiv:1708.04943, 2017.
  • G. Ghiasi, T.-Y. Lin, and Q. V. Le. DropBlock: A regularization method for convolutional networks. In NIPS, 2018.
  • A. Giusti, D. Ciresan, J. Masci, L. Gambardella, and J. Schmidhuber. Fast image scanning with deep max-pooling convolutional neural networks. In ICIP, 2013.
  • K. Grauman and T. Darrell. The pyramid match kernel: Discriminative classification with sets of image features. In ICCV, 2005.
  • K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber. LSTM: A search space odyssey. arXiv:1503.04069, 2015.
  • B. Hariharan, P. Arbelaez, L. Bourdev, S. Maji, and J. Malik. Semantic contours from inverse detectors. In ICCV, 2011.
  • K. He, X. Zhang, S. Ren, and J. Sun. Spatial pyramid pooling in deep convolutional networks for visual recognition. In ECCV, 2014.
  • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • X. He, R. S. Zemel, and M. Carreira-Perpiñán. Multiscale conditional random fields for image labeling. In CVPR, 2004.
  • G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012.
  • M. Holschneider, R. Kronland-Martinet, J. Morlet, and P. Tchamitchian. A real-time algorithm for signal analysis with the help of the wavelet transform. In Wavelets: Time-Frequency Methods and Phase Space, pages 289–297. Springer, 1989.
  • A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861, 2017.
  • J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. In CVPR, 2018.
  • G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, 2017.
  • S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
  • M. A. Islam, M. Rochan, N. D. Bruce, and Y. Wang. Gated feedback refinement network for dense image labeling. In CVPR, 2017.
  • R. Jozefowicz, W. Zaremba, and I. Sutskever. An empirical exploration of recurrent network architectures. In ICML, 2015.
  • A. Kae, K. Sohn, H. Lee, and E. Learned-Miller. Augmenting CRFs with Boltzmann machine shape priors for image labeling. In CVPR, 2013.
  • D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
  • P. Kohli, P. H. Torr, et al. Robust higher order potentials for enforcing label consistency. IJCV, 82(3):302–324, 2009.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
  • L. Ladicky, C. Russell, P. Kohli, and P. H. Torr. Associative hierarchical CRFs for object class image segmentation. In ICCV, 2009.
  • G. Larsson, M. Maire, and G. Shakhnarovich. FractalNet: Ultra-deep neural networks without residuals. In ICLR, 2017.
  • S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.
  • Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4):541–551, 1989.
  • D. Lin, Y. Ji, D. Lischinski, D. Cohen-Or, and H. Huang. Multi-scale context intertwining for semantic segmentation. In ECCV, 2018.
  • G. Lin, A. Milan, C. Shen, and I. Reid. RefineNet: Multi-path refinement networks with identity mappings for high-resolution semantic segmentation. In CVPR, 2017.
  • T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie. Feature pyramid networks for object detection. In CVPR, 2017.
  • T.-Y. Lin et al. Microsoft COCO: Common objects in context. In ECCV, 2014.
  • C. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy. Progressive neural architecture search. In ECCV, 2018.
  • H. Liu, K. Simonyan, O. Vinyals, C. Fernando, and K. Kavukcuoglu. Hierarchical representations for efficient architecture search. In ICLR, 2018.
  • H. Liu, K. Simonyan, and Y. Yang. DARTS: Differentiable architecture search. arXiv:1806.09055, 2018.
  • W. Liu, A. Rabinovich, and A. C. Berg. ParseNet: Looking wider to see better. arXiv:1506.04579, 2015.
  • J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
  • R. Luo, F. Tian, T. Qin, and T.-Y. Liu. Neural architecture optimization. In NIPS, 2018.
  • R. Miikkulainen, J. Liang, E. Meyerson, A. Rawal, D. Fink, O. Francon, B. Raju, H. Shahrzad, A. Navruzyan, N. Duffy, and B. Hodjat. Evolving deep neural networks. arXiv:1703.00548, 2017.
  • M. Mostajabi, P. Yadollahpour, and G. Shakhnarovich. Feedforward semantic segmentation with zoom-out features. In CVPR, 2015.
  • A. Newell, K. Yang, and J. Deng. Stacked hourglass networks for human pose estimation. In ECCV, 2016.
  • H. Noh, S. Hong, and B. Han. Learning deconvolution network for semantic segmentation. In ICCV, 2015.
  • G. Papandreou, I. Kokkinos, and P.-A. Savalle. Modeling local and global deformations in deep learning: Epitomic convolution, multiple instance learning, and sliding window detection. In CVPR, 2015.
  • C. Peng, X. Zhang, G. Yu, G. Luo, and J. Sun. Large kernel matters – improve semantic segmentation by global convolutional network. In CVPR, 2017.
  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean. Efficient neural architecture search via parameter sharing. In ICML, 2018.
  • T. Pohlen, A. Hermans, M. Mathias, and B. Leibe. Full-resolution residual networks for semantic segmentation in street scenes. In CVPR, 2017.
  • H. Qi, Z. Zhang, B. Xiao, H. Hu, B. Cheng, Y. Wei, and J. Dai. Deformable convolutional networks – COCO detection and segmentation challenge 2017 entry. ICCV COCO Challenge Workshop, 2017.
  • E. Real, A. Aggarwal, Y. Huang, and Q. V. Le. Regularized evolution for image classifier architecture search. arXiv:1802.01548, 2018.
  • E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin. Large-scale evolution of image classifiers. In ICML, 2017.
  • O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In MICCAI, 2015.
  • O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. IJCV, 2015.
  • M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR, 2018.
  • S. Saxena and J. Verbeek. Convolutional neural fabrics. In NIPS, 2016.
  • P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. In ICLR, 2014.
  • R. Shin, C. Packer, and D. Song. Differentiable neural network architecture search. In ICLR Workshop, 2018.
  • J. Shotton, J. Winn, C. Rother, and A. Criminisi. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. IJCV, 2009.
  • A. Shrivastava, R. Sukthankar, J. Malik, and A. Gupta. Beyond skip connections: Top-down modulation for object detection. arXiv:1612.06851, 2016.
  • K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
  • I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, 2014.
  • C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In AAAI, 2017.
  • C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
  • C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the Inception architecture for computer vision. In CVPR, 2016.
  • M. Tan, B. Chen, R. Pang, V. Vasudevan, and Q. V. Le. MnasNet: Platform-aware neural architecture search for mobile. arXiv:1807.11626, 2018.
  • P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang, X. Hou, and G. Cottrell. Understanding convolution for semantic segmentation. In WACV, 2018.
  • Z. Wojna, V. Ferrari, S. Guadarrama, N. Silberman, L.-C. Chen, A. Fathi, and J. Uijlings. The devil is in the decoder. In BMVC, 2017.
  • Y. Wu and K. He. Group normalization. In ECCV, 2018.
  • Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144, 2016.
  • Z. Wu, C. Shen, and A. van den Hengel. Wider or deeper: Revisiting the ResNet model for visual recognition. arXiv:1611.10080, 2016.
  • T. Xiao, Y. Liu, B. Zhou, Y. Jiang, and J. Sun. Unified perceptual parsing for scene understanding. In ECCV, 2018.
  • L. Xie and A. Yuille. Genetic CNN. In ICCV, 2017.
  • S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In CVPR, 2017.
  • S. Zagoruyko and N. Komodakis. Wide residual networks. In BMVC, 2016.
  • Z. Zhang, X. Zhang, C. Peng, D. Cheng, and J. Sun. ExFuse: Enhancing feature fusion for semantic segmentation. In ECCV, 2018.
  • H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid scene parsing network. In CVPR, 2017.
  • Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu. Practical block-wise neural network architecture generation. In CVPR, 2018.
  • B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso, and A. Torralba. Scene parsing through ADE20K dataset. In CVPR, 2017.
  • Y. Zhuang, F. Yang, L. Tao, C. Ma, Z. Zhang, Y. Li, H. Jia, X. Xie, and W. Gao. Dense relation network: Learning consistent and context-aware representation for semantic image segmentation. In ICIP, 2018.
  • B. Zoph and Q. V. Le. Neural architecture search with reinforcement learning. In ICLR, 2017.
  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. In CVPR, 2018.