Computer Vision Paper Links

[TOC]

Backbone Networks

Inception v1: Going deeper with convolutions

Inception v2: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Inception v3: Rethinking the Inception Architecture for Computer Vision

Inception v4: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning

Aggregated Residual Transformations for Deep Neural Networks

Xception: Deep Learning with Depthwise Separable Convolutions

PolyNet: A Pursuit of Structural Diversity in Very Deep Networks : https://arxiv.org/pdf/1611.05725.pdf

NASNet: Learning Transferable Architectures for Scalable Image Recognition: https://arxiv.org/pdf/1707.07012.pdf

ResNet v1: Deep Residual Learning for Image Recognition: https://arxiv.org/pdf/1512.03385.pdf

ResNet v2: Identity Mappings in Deep Residual Networks: https://arxiv.org/pdf/1603.05027.pdf

Wide Residual Networks: https://arxiv.org/pdf/1605.07146.pdf

DenseNet: Densely Connected Convolutional Networks: https://arxiv.org/pdf/1608.06993.pdf

ResNeXt: Aggregated Residual Transformations for Deep Neural Networks: https://arxiv.org/pdf/1611.05431.pdf

Residual Attention Network for Image Classification: https://arxiv.org/pdf/1704.06904.pdf

SENet: Squeeze-and-Excitation Networks: https://arxiv.org/pdf/1709.01507.pdf

Memory-Efficient Implementation of DenseNets: https://arxiv.org/pdf/1707.06990.pdf

Efficient Networks

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications: https://arxiv.org/pdf/1704.04861.pdf

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices: https://arxiv.org/pdf/1707.01083.pdf

MobileNetV2: Inverted Residuals and Linear Bottlenecks: https://arxiv.org/pdf/1801.04381.pdf

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design: https://arxiv.org/pdf/1807.11164.pdf

Object Detection

R-CNN: https://arxiv.org/pdf/1311.2524.pdf

SPPNet: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition: https://arxiv.org/pdf/1406.4729.pdf

Fast R-CNN: https://arxiv.org/pdf/1504.08083.pdf

Faster R-CNN: https://arxiv.org/pdf/1506.01497.pdf

FPN: Feature Pyramid Networks for Object Detection: https://arxiv.org/pdf/1612.03144.pdf

Mask R-CNN: https://arxiv.org/pdf/1703.06870.pdf

Cascade R-CNN: Delving into High Quality Object Detection:https://arxiv.org/pdf/1712.00726.pdf

SSD: Single Shot MultiBox Detector: https://arxiv.org/pdf/1512.02325.pdf

DSSD : Deconvolutional Single Shot Detector:https://arxiv.org/pdf/1701.06659.pdf

You Only Look Once: Unified, Real-Time Object Detection:https://arxiv.org/abs/1506.02640

YOLO9000: Better, Faster, Stronger: https://arxiv.org/pdf/1612.08242.pdf

RetinaNet: Focal Loss for Dense Object Detection: https://arxiv.org/pdf/1708.02002.pdf

G-CNN: an Iterative Grid Based Object Detector :https://arxiv.org/pdf/1512.07729.pdf

R-FCN: Object Detection via Region-based Fully Convolutional Networks: https://arxiv.org/pdf/1605.06409.pdf

R-FCN-3000 :https://arxiv.org/pdf/1712.01802.pdf

DetNet: A Backbone network for Object Detection: https://arxiv.org/abs/1804.06215

Relation Networks for Object Detection : https://arxiv.org/abs/1711.11575

RefineDet: Single-Shot Refinement Neural Network for Object Detection: https://arxiv.org/abs/1711.06897

Bag of Freebies for Training Object Detection Neural Networks:https://arxiv.org/abs/1902.04103v2

FCOS: Fully Convolutional One-Stage Object Detection: https://arxiv.org/pdf/1904.01355.pdf

Semantic Segmentation / Instance Segmentation

Fully Convolutional Networks for Semantic Segmentation:https://arxiv.org/pdf/1411.4038.pdf

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs:http://arxiv.org/pdf/1412.7062.pdf

U-Net: Convolutional Networks for Biomedical Image Segmentation:https://arxiv.org/pdf/1505.04597.pdf

DeepLab v2: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs: http://arxiv.org/pdf/1606.00915

DeepLab v3: Rethinking Atrous Convolution for Semantic Image Segmentation: https://arxiv.org/pdf/1706.05587.pdf

DeepLab v3+: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation: https://arxiv.org/abs/1802.02611

UNet++: A Nested U-Net Architecture for Medical Image Segmentation : https://arxiv.org/pdf/1807.10165.pdf

FCIS: Fully Convolutional Instance-aware Semantic Segmentation: https://arxiv.org/pdf/1611.07709.pdf

PAN: Path Aggregation Network for Instance Segmentation: https://arxiv.org/pdf/1803.01534.pdf

Mask Scoring R-CNN: https://arxiv.org/pdf/1903.00241.pdf

YOLACT: Real-time Instance Segmentation: https://arxiv.org/pdf/1904.02689.pdf

Human Pose Estimation

DeepPose: Human Pose Estimation via Deep Neural Networks: https://arxiv.org/pdf/1312.4659.pdf

Stacked Hourglass Networks for Human Pose Estimation: https://arxiv.org/pdf/1603.06937.pdf

DensePose: Dense Human Pose Estimation In The Wild:https://arxiv.org/pdf/1802.00434.pdf

Visual Tracking

On-line Boosting and Vision:http://read.pudn.com/downloads108/doc/445004/01640768.pdf

Online Object Tracking: A Benchmark:http://faculty.ucmerced.edu/mhyang/papers/cvpr13_benchmark.pdf

Transferring Rich Feature Hierarchies for Robust Visual Tracking:https://arxiv.org/pdf/1501.04587.pdf

Visual tracking with fully convolutional networks(FCNT): http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Wang_Visual_Tracking_With_ICCV_2015_paper.pdf

Learning to Track at 100 FPS with Deep Regression Networks(GOTURN):https://arxiv.org/pdf/1604.01802.pdf

Fully-Convolutional Siamese Networks for Object Tracking : https://arxiv.org/pdf/1606.09549.pdf

Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking:https://arxiv.org/pdf/1608.03773.pdf

Face Detection and Recognition

MTCNN: Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks: https://kpzhang93.github.io/MTCNN_face_detection_alignment/

Bayesian Face Revisited: A Joint Formulation:https://www.microsoft.com/en-us/research/wp-content/uploads/2012/01/JointBayesian.pdf

DeepID1: Deep Learning Face Representation from Predicting 10,000 Classes: https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Sun_Deep_Learning_Face_2014_CVPR_paper.pdf

DeepID2: Deep Learning Face Representation by Joint Identification-Verification: https://arxiv.org/pdf/1406.4773.pdf

DeepID2+: Deeply learned face representations are sparse, selective, and robust: https://arxiv.org/abs/1412.1265v1

DeepID3: Face Recognition with Very Deep Neural Networks: https://arxiv.org/pdf/1502.00873.pdf

FaceNet: A Unified Embedding for Face Recognition and Clustering: https://arxiv.org/pdf/1503.03832.pdf

Deep Face Recognition: http://www.robots.ox.ac.uk/~vgg/publications/2015/Parkhi15/parkhi15.pdf

A Discriminative Feature Learning Approach for Deep Face Recognition : http://ydwen.github.io/papers/WenECCV16.pdf

SphereFace: Deep Hypersphere Embedding for Face Recognition: https://arxiv.org/pdf/1704.08063.pdf

ArcFace/InsightFace : Additive Angular Margin Loss for Deep Face Recognition : https://arxiv.org/abs/1801.07698

MobileFaceNets: Efficient CNNs for Accurate Real-Time Face Verification on Mobile Devices: https://arxiv.org/ftp/arxiv/papers/1804/1804.07573.pdf

DocFace+: ID Document to Selfie Matching: https://arxiv.org/pdf/1809.05620.pdf

Low-Resolution Face Recognition: https://arxiv.org/pdf/1811.08965.pdf

Accurate and Efficient Similarity Search for Large Scale Face Recognition:https://arxiv.org/pdf/1806.00365.pdf

OCR / Scene Text Detection and Recognition

CRNN: An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition: https://arxiv.org/pdf/1507.05717.pdf

Synthetic Data for Text Localisation in Natural Images: https://arxiv.org/pdf/1604.06646.pdf

Scene text detection via holistic, multi-channel prediction: https://arxiv.org/pdf/1606.09002.pdf

CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network: https://arxiv.org/abs/1609.03605

SegLink: Detecting Oriented Text in Natural Images by Linking Segments: https://arxiv.org/pdf/1703.06520.pdf

Attention-based Extraction of Structured Information from Street View Imagery: https://arxiv.org/abs/1704.03549

EAST: An Efficient and Accurate Scene Text Detector: https://arxiv.org/pdf/1704.03155.pdf

TextBoxes: A Fast Text Detector with a Single Deep Neural Network: https://arxiv.org/pdf/1611.06779.pdf

TextBoxes++: A Single-Shot Oriented Scene Text Detector: https://arxiv.org/pdf/1801.02765.pdf

Medical Imaging

Evaluate the Malignancy of Pulmonary Nodules Using the 3D Deep Leaky Noisy-or Network:https://arxiv.org/pdf/1711.08324v1.pdf

DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification:

Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer: https://arxiv.org/pdf/1810.02492.pdf

Lung Nodule Classification using Deep Local-Global Networks:https://arxiv.org/ftp/arxiv/papers/1904/1904.10126.pdf

Gated-Dilated Networks for Lung Nodule Classification in CT scans:https://arxiv.org/ftp/arxiv/papers/1901/1901.00120.pdf

Diagnostic Classification of Lung Nodules Using 3D Neural Networks: https://arxiv.org/pdf/1803.07192.pdf

Other

Crowd Density Estimation

Context-Aware Crowd Counting:https://arxiv.org/pdf/1811.10452.pdf

Summary: https://github.com/gjy3035/Awesome-Crowd-Counting

Anomaly Detection / Violence Detection

Reference: Overview of Anomaly Detection

Eye in the Sky: Real-time Drone Surveillance System (DSS) for Violent Individuals Identification using ScatterNet Hybrid Deep Learning Network : https://arxiv.org/pdf/1806.00746

Future Frame Prediction for Anomaly Detection – A New Baseline: https://arxiv.org/pdf/1712.09867.pdf

Real-world Anomaly Detection in Surveillance Videos: https://arxiv.org/pdf/1801.04264.pdf

Activation Functions

PReLU: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification: https://arxiv.org/pdf/1502.01852.pdf

ELU: Fast and Accurate Deep Network Learning by Exponential Linear Units: https://arxiv.org/pdf/1511.07289.pdf

SELU: Self-Normalizing Neural Networks: https://arxiv.org/pdf/1706.02515.pdf

RNN

Higher Order Recurrent Neural Networks: https://arxiv.org/pdf/1605.00064.pdf