Original English documentation: https://github.com/tensorflow/models/tree/master/object_detection

Make sure the following libraries are installed; the TensorFlow Object Detection API depends on:

Protobuf 2.6
Pillow 1.0
lxml
tf Slim (which is included in the "tensorflow/models" checkout)
Jupyter notebook
Matplotlib
Tensorflow
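
Protobuf is handled separately via the protoc download below, and tf Slim already ships inside the models checkout; the remaining packages can usually be installed from the Anaconda Prompt with pip (a sketch based on the API's installation guide; adjust to your own environment):

pip install pillow lxml jupyter matplotlib
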
Model download link: https://github.com/tensorflow/models (the repository includes a short description of each module; Chrome is recommended for the download, and the archive is named models-master.zip)

Save the download to a local directory (D:\TensorFlow\TensorFlow Object Detection API Tutorial); I extracted it there and renamed the extracted folder to models.

 

You also need to visit https://github.com/google/protobuf/releases and download the protoc build for your platform; I downloaded the Windows build protoc-3.4.0-win32.zip, which extracts into two folders, bin and include.

Resulting directories: D:\TensorFlow\TensorFlow Object Detection API Tutorial\include and D:\TensorFlow\TensorFlow Object Detection API Tutorial\bin (the latter contains protoc.exe, which we will use in a moment to compile the proto files under models/object_detection).

My environment is Windows 10 with TensorFlow installed through Anaconda, so open the Anaconda Prompt that ships with Anaconda (similar to cmd).

As shown in the figure below, change into the extracted models directory and use the protoc executable to compile the proto files under object_detection/protos into Python files. (Following up on reader feedback, I tried several protoc versions: 3.5 and 3.6 do cause problems here, so use a version earlier than 3.5; the 3.4.0 protoc I used works fine.)
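
The compile command below follows the API's installation instructions and is run from inside the models directory; the ..\bin\protoc path is an assumption based on the directory layout described above. Reportedly, Windows protoc builds newer than 3.4 stopped expanding the *.proto wildcard themselves, which is likely why 3.5/3.6 fail here; with those versions you would have to compile the files one at a time:

..\bin\protoc object_detection/protos/*.proto --python_out=.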

Then open Jupyter Notebook.

Open the Object Detection Demo notebook and click Run All. The notebook downloads a model pre-trained on COCO and runs detection on the bundled test images; readers can also train a model of their own if needed. When it finishes, the detection results are shown below:

Five networks trained on COCO have been released, with the following architectures: SSD+MobileNet, SSD+Inception, R-FCN+ResNet101, Faster RCNN+ResNet101, and Faster RCNN+Inception_ResNet.

To run detection on your own images, just point TEST_IMAGE_PATHS at their paths:

#PATH_TO_TEST_IMAGES_DIR = 'test_images'
#TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]

TEST_IMAGE_PATHS = ['person.jpg'] # place the image in the corresponding directory
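
If you have a whole folder of your own images, a small sketch like this builds the list automatically (my_images is a hypothetical folder name):

import glob
import os
# collect every .jpg in the folder, in a stable order
TEST_IMAGE_PATHS = sorted(glob.glob(os.path.join('my_images', '*.jpg')))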


To try one of the other pre-trained models, simply swap in the corresponding MODEL_NAME (MODEL_FILE and the download URL are derived from it in the code below, so nothing else needs to change):

MODEL_NAME = 'ssd_inception_v2_coco_11_06_2017'

MODEL_NAME = 'rfcn_resnet101_coco_11_06_2017'

MODEL_NAME = 'faster_rcnn_resnet101_coco_11_06_2017'

MODEL_NAME = 'faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017'

------------------------------------------------------------The section above runs one of the API's demo files, testing a pre-trained model on a few images------------------------------------------------------------

 

 

------------------------------------------------------------The section below uses the same API and downloaded model for object detection and localization on video------------------------------------------------------------

Export the notebook with Download as → Python (.py), and rename the file so the code can be edited separately; in this post it is renamed object_detection_tutorial_CONVERT.py (placed in D:\TensorFlow\TensorFlow Object Detection API Tutorial\models\object_detection).

Next, modify object_detection_tutorial_CONVERT.py so that it reads frames from the webcam and performs object detection and localization on each one.

The modified code:

# coding: utf-8
 
# # Object Detection Demo
# Welcome to the object detection inference walkthrough!  This notebook will walk you step by step through the process of using a pre-trained model to detect objects in an image. Make sure to follow the [installation instructions](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/installation.md) before you start.
 
# # Imports
 
# In[1]:
 
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
 
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
 
import cv2                  #add 20170825
cap = cv2.VideoCapture(0)   #add 20170825
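# Note: cv2.VideoCapture(0) opens the default webcam; it also accepts a file
# path, e.g. cv2.VideoCapture('test.mp4'), if you would rather run the
# detector on a saved video instead.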
 
# ## Env setup
 
# In[2]:                                  #delete 20170825
# This is needed to display the images.    #delete 20170825
#get_ipython().magic('matplotlib inline')   #delete 20170825
 
# This is needed since the notebook is stored in the object_detection folder.  
sys.path.append("..")
 
 
# ## Object detection imports
# Here are the imports from the object detection module.
 
# In[3]:
 
from utils import label_map_util
 
from utils import visualization_utils as vis_util
 
 
# # Model preparation 
 
# ## Variables
# 
# Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_CKPT` to point to a new .pb file.  
# 
# By default we use an "SSD with Mobilenet" model here. See the [detection model zoo](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md) for a list of other models that can be run out-of-the-box with varying speeds and accuracies.
 
# In[4]:
 
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
 
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
 
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')
 
NUM_CLASSES = 90
 
 
# ## Download Model
 
# In[5]:
 
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
  file_name = os.path.basename(file.name)
  if 'frozen_inference_graph.pb' in file_name:
    tar_file.extract(file, os.getcwd())
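# Note: the model archive only needs to be downloaded once; on later runs
# this block could be skipped by checking os.path.exists(PATH_TO_CKPT) first.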
 
 
# ## Load a (frozen) Tensorflow model into memory.
 
# In[6]:
 
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
 
 
# ## Loading label map
# Label maps map indices to category names, so that when our convolution network predicts `5`, we know that this corresponds to `airplane`.  Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine
 
# In[7]:
 
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
 
 
# ## Helper code
 
# In[8]:
 
def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
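# (This helper is only used by the original still-image demo; frames returned
# by cv2 are already numpy arrays, so the webcam loop below does not need it.)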
 
 
# # Detection
 
# In[9]:
 
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
 
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
 
 
# In[10]:
 
with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    while True:    #for image_path in TEST_IMAGE_PATHS:    #changed 20170825
      ret, image_np = cap.read()
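      # Caveat: cap.read() returns BGR frames, while the model was trained on
      # RGB images; converting with cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
      # before detection may improve accuracy slightly (convert back before
      # cv2.imshow, which expects BGR).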
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      # Each box represents a part of the image where a particular object was detected.
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      # Each score represent how level of confidence for each of the objects.
      # Score is shown on the result image, together with the class label.
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8)
      cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
      if cv2.waitKey(25) & 0xFF == ord('q'):
        cap.release()            # release the camera before exiting
        cv2.destroyAllWindows()
        break
      #plt.figure(figsize=IMAGE_SIZE)   #delete 20170825
      #plt.imshow(image_np)             #delete 20170825
 
 

Then run the code with IDLE or Spyder; the result looks like the following.