Introduction

Presentation

Pastec is an open source index and search engine for image recognition, based on OpenCV. It can recognize flat objects such as covers, packaged goods or artworks. However, it has not been designed to recognize faces, 3D objects, barcodes or QR codes.

For example, Pastec can be used to recognize DVD covers in a mobile app or to detect near-duplicate images in a large database.

Pastec does not store the pixels of the images in its database. Instead, it stores the features of each image, encoded using the visual words technique.
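To give an intuition of the visual words idea, here is a minimal sketch in Python. It is not Pastec's implementation: the function names, the toy two-byte descriptors and the nearest-word assignment by Hamming distance are assumptions made for illustration (real ORB descriptors are longer binary strings).

```python
from typing import List, Sequence


def hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length binary descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def nearest_word(descriptor: bytes, vocabulary: Sequence[bytes]) -> int:
    """Return the index of the vocabulary entry closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: hamming(descriptor, vocabulary[i]))


def signature(descriptors: Sequence[bytes], vocabulary: Sequence[bytes]) -> List[int]:
    """Summarize an image as the sorted set of visual word ids it contains.

    Only word ids are kept; no pixel data is needed afterwards.
    """
    return sorted({nearest_word(d, vocabulary) for d in descriptors})
```

Two images that share many word ids are then likely to show the same flat object, which is what an index of signatures can exploit.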

Intellectual property

Pastec relies only on the free OpenCV packages that are available for commercial use. You should therefore be free to use Pastec without paying for any patent licence.

More precisely, Pastec uses the patent-free ORB descriptor and not the well-known SIFT and SURF descriptors, which are patented.

Setup

Compilation

Dependencies

To be compiled, Pastec requires OpenCV, jsoncpp and libmicrohttpd. On Ubuntu 14.10, those packages can be installed using the following command:

sudo apt-get install libopencv-dev libmicrohttpd-dev libjsoncpp-dev

If you are using another distribution or operating system, you may have to compile them yourself.

Building

Pastec uses cmake as its build system. You also need Git to get the source code. On Ubuntu, both can be installed using the following command:

sudo apt-get install cmake git

To compile Pastec, first get the sources with the following command:

git clone https://github.com/Visu4link/pastec.git
cd pastec

Then create a compilation folder:

mkdir build

Go to this subdirectory and run cmake:

cd build
cmake ../

Finally, run make to compile Pastec:

make

Running

To start Pastec, run the pastec executable. Its one mandatory argument is the path to a file containing a list of ORB visual words. For now, please use the file visualWordsORB.dat. Future Pastec releases will include tools that let you generate your own list of visual words.

./pastec visualWordsORB.dat

By default, Pastec listens on port 4212 for the REST API. You can set another port with the -p argument. You can also give the path of an index file to load with the -i argument.

API

HTTP API

Pastec can be controlled using a simple HTTP API. By default, it listens on port 4212, but you can change this using the -p argument.

Pastec answers are always formatted in JSON. They contain a mandatory type field that describes the result obtained or an error. Each image has an associated id that is a 32-bit unsigned integer. This id establishes the link in the index between the images and their signatures.
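As an illustration of the mandatory type field, a client might check it before using an answer. The sketch below is hypothetical: parse_answer and PastecError are names made up for this example and are not part of Pastec.

```python
import json


class PastecError(Exception):
    """Raised when an answer does not carry the expected type (hypothetical helper)."""


def parse_answer(body: str, expected_type: str) -> dict:
    """Decode a JSON answer and verify its mandatory "type" field."""
    answer = json.loads(body)
    answer_type = answer.get("type")
    if answer_type != expected_type:
        # The type field then holds an error name such as "IMAGE_NOT_DECODED".
        raise PastecError(str(answer_type))
    return answer
```

For instance, parse_answer(body, "IMAGE_ADDED") returns the decoded answer when the image was added and raises PastecError otherwise.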

Both dimensions of every uploaded image must be above 150 pixels. If either dimension exceeds 1000 pixels, the image is resized so that its largest dimension is 1000 pixels, keeping the original aspect ratio.
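The size rule can be sketched as a small function, which may help when preparing images on the client side. This is an approximation under stated assumptions: the function name is hypothetical, a dimension below 150 pixels is treated as too small, and the rounding used here may differ from what Pastec does internally.

```python
from typing import Tuple

MIN_DIMENSION = 150    # smallest accepted width or height, in pixels
MAX_DIMENSION = 1000   # largest dimension kept after resizing, in pixels


def normalized_size(width: int, height: int) -> Tuple[int, int]:
    """Return the size an image would have after applying the resize rule."""
    if width < MIN_DIMENSION or height < MIN_DIMENSION:
        raise ValueError("IMAGE_SIZE_TOO_SMALL")
    longest = max(width, height)
    if longest <= MAX_DIMENSION:
        return width, height  # small enough, left untouched
    # Scale both sides by the same factor so the aspect ratio is kept.
    return (round(width * MAX_DIMENSION / longest),
            round(height * MAX_DIMENSION / longest))
```

With these assumptions, a 2000x1000 image becomes 1000x500, while an 800x600 image is left unchanged.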

Here is a detailed list of the API calls:

Adding an image to the index

This call adds the signature of an image to the index to make it available for searching. You need to provide the compressed binary data of the image and an id to identify it.

  • Path: /index/images/<image id>
  • HTTP method: PUT
  • Data: the binary data of the image to add compressed in JPEG
  • Answer type: "IMAGE_ADDED"
  • Possible error types: "IMAGE_NOT_DECODED", "IMAGE_SIZE_TOO_BIG", "IMAGE_SIZE_TOO_SMALL"
  • Example:
    • Command line:
      curl -X PUT --data-binary @/home/test/img/1.jpg http://localhost:4212/index/images/23
      
    • Answer:
      {
         "image_id" : 23,
         "type" : "IMAGE_ADDED"
      }
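
The same call can be issued from Python. Here is a minimal sketch using only the standard library; the function names and the base_url default are illustrative assumptions and do not describe the Python wrapper shipped with Pastec.

```python
import json
import urllib.request


def build_add_request(image_id: int, jpeg_bytes: bytes,
                      base_url: str = "http://localhost:4212") -> urllib.request.Request:
    """Prepare the PUT request that adds an image under the given id."""
    if not 0 <= image_id < 2 ** 32:
        raise ValueError("image id must fit in a 32-bit unsigned integer")
    return urllib.request.Request(
        f"{base_url}/index/images/{image_id}", data=jpeg_bytes, method="PUT")


def add_image(image_id: int, jpeg_bytes: bytes,
              base_url: str = "http://localhost:4212") -> dict:
    """Send the request to a running Pastec server and decode the JSON answer."""
    request = build_add_request(image_id, jpeg_bytes, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Assuming a server is running locally, add_image(23, open("/home/test/img/1.jpg", "rb").read()) should return an answer whose type is "IMAGE_ADDED".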
      

Removing an image from the index

This call removes the signature of an image from the index, identified by its id. Avoid calling this method often if your index is big, because it is currently very slow.

  • Path: /index/images/<image id>
  • HTTP method: DELETE
  • Answer type: "IMAGE_REMOVED"
  • Possible error type: "IMAGE_NOT_FOUND"
  • Example:
    • Command line:
      curl -X DELETE http://localhost:4212/index/images/23
      
    • Answer:
      {
         "image_id" : 23,
         "type" : "IMAGE_REMOVED"
      }
      

Search request

This call searches the index using a request image. It returns the ids of the matched images, from the most to the least relevant.

Request JPEG images with a size of approximately 450x340 pixels and a 75% quality are usually enough to achieve good matching. Their small size allows them to be sent quickly over a mobile network.

  • Path: /index/searcher
  • HTTP method: POST
  • Data: the binary data of the request image compressed in JPEG
  • Answer: "SEARCH_RESULTS" as type field and a list of the matched image ids, from the most to the least relevant, in the "image_ids" field
  • Possible error types: "IMAGE_NOT_DECODED", "IMAGE_SIZE_TOO_BIG", "IMAGE_SIZE_TOO_SMALL"
  • Example:
    • Command line:
      curl -X POST --data-binary @/home/test/img/request.jpg http://localhost:4212/index/searcher
      
    • Answer:
      {
         "image_ids" : [ 2, 5, 43 ],
         "type" : "SEARCH_RESULTS"
      }
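
A search can be sketched in Python along the same lines. The helper names below are hypothetical, and the code assumes the answer format shown in the example above.

```python
import json
import urllib.request
from typing import List, Optional


def build_search_request(jpeg_bytes: bytes,
                         base_url: str = "http://localhost:4212") -> urllib.request.Request:
    """Prepare the POST request that submits a request image to the searcher."""
    return urllib.request.Request(
        f"{base_url}/index/searcher", data=jpeg_bytes, method="POST")


def matched_ids(answer_body: str) -> List[int]:
    """Extract the matched image ids, most relevant first, from a JSON answer."""
    answer = json.loads(answer_body)
    if answer.get("type") != "SEARCH_RESULTS":
        raise RuntimeError(f"search failed: {answer.get('type')}")
    return list(answer.get("image_ids", []))


def best_match(answer_body: str) -> Optional[int]:
    """Return the most relevant id, or None when nothing matched."""
    ids = matched_ids(answer_body)
    return ids[0] if ids else None
```

Since the ids are ordered by relevance, an application that only needs one result, such as a cover recognizer, can simply keep the first one.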
      

Clear an index

This call erases all the data currently contained in the index.

  • Path: /index/io
  • HTTP method: POST
  • Answer type: "INDEX_CLEARED"
  • Possible error types: -
  • Example:
    • Command line:
      curl -X POST -d '{"type":"CLEAR"}' http://127.0.0.1:4212/index/io
      
    • Answer:
      {
         "type" : "INDEX_CLEARED"
      }
      

Load an index

This call loads the index data from a provided path.

  • Path: /index/io
  • HTTP method: POST
  • Data: a JSON object with a "type" field of value "LOAD" and an "index_path" field that sets the path from which to read the index
  • Answer type: "INDEX_LOADED"
  • Possible error types: "INDEX_NOT_FOUND"
  • Example:
    • Command line:
      curl -X POST -d '{"type":"LOAD", "index_path":"test.dat"}' http://127.0.0.1:4212/index/io
      
    • Answer:
      {
         "type" : "INDEX_LOADED"
      }
      

Save an index

This call saves the index data in a specified path.

  • Path: /index/io
  • HTTP method: POST
  • Data: a JSON object with a "type" field of value "WRITE" and an "index_path" field that sets the path where the index is written
  • Answer type: "INDEX_WRITTEN"
  • Possible error types: "INDEX_NOT_WRITTEN"
  • Example:
    • Command line:
      curl -X POST -d '{"type":"WRITE", "index_path":"test.dat"}' http://127.0.0.1:4212/index/io
      
    • Answer:
      {
         "type" : "INDEX_WRITTEN"
      }
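
The clear, load and save calls all post a small JSON object to /index/io. The following sketch builds that request body in Python; index_io_body is a hypothetical helper name, and the validation it performs is an assumption about sensible client behaviour rather than something Pastec requires.

```python
import json
from typing import Optional


def index_io_body(command: str, index_path: Optional[str] = None) -> bytes:
    """Build the JSON body for a CLEAR, LOAD or WRITE call on /index/io."""
    if command not in ("CLEAR", "LOAD", "WRITE"):
        raise ValueError(f"unknown command: {command}")
    payload = {"type": command}
    if command in ("LOAD", "WRITE"):
        # LOAD and WRITE need the path of the index file to read or write.
        if not index_path:
            raise ValueError(f"{command} requires an index_path")
        payload["index_path"] = index_path
    return json.dumps(payload).encode("utf-8")
```

The returned bytes can be sent with a POST request to http://127.0.0.1:4212/index/io, exactly like the -d argument of the curl commands.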
      

Ping Pastec

This call sends a simple PING command to Pastec, which answers with a PONG.

  • Path: /
  • HTTP method: POST
  • Data: a JSON object with a "type" field of value "PING"
  • Answer type: "PONG"
  • Possible error types: -
  • Example:
    • Command line:
      curl -X POST -d '{"type":"PING"}' http://localhost:4212/
      
    • Answer:
      {
         "type" : "PONG"
      }
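
A ping is a convenient way to check that the server is up before sending images. Here is a small Python sketch; both function names are hypothetical.

```python
import json
import urllib.request


def build_ping_request(base_url: str = "http://localhost:4212") -> urllib.request.Request:
    """Prepare the POST request carrying the PING command."""
    body = json.dumps({"type": "PING"}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/", data=body, method="POST")


def is_pong(answer_body: str) -> bool:
    """Tell whether a JSON answer is the expected PONG."""
    return json.loads(answer_body).get("type") == "PONG"
```

A client could send build_ping_request() with urllib.request.urlopen and pass the decoded response body to is_pong.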
      

Python API

In the python subdirectory of the source directory, you will also find a tiny Python API that is just a wrapper around the HTTP API. We encourage you to read its short source code to understand it.