Create MS COCO style dataset

How can I create an MS COCO style dataset to use with TensorFlow? Does anyone have experience with this? I have images and annotations, as well as ground truth masks. I need to convert them to be compatible with MS COCO, and any help is appreciated. I can't find any open source tool to create COCO style JSON annotations.

The TensorFlow MS COCO tooling reads JSON files, which I'm not very experienced with.

Adhibit answered 7/8, 2017 at 10:54 Comment(2)
Did you find an answer to this? - Glia
Also, how did you annotate the images and prepare the ground truth masks? - Glia

I'm working on a Python library that has many useful classes and functions for doing this. It's called Image Semantics (imported as imantics).

Here is an example of adding masks and exporting them in COCO format:

from imantics import Mask, Image, Category

image = Image.from_path('path/to/image.png')
# mask_array is assumed to be a binary NumPy array with the same height and width as the image
mask = Mask(mask_array)
image.add(mask, category=Category("Category Name"))

# dict of coco
coco_json = image.export(style='coco')
# Saves to file
image.save('coco/annotation.json', style='coco')
Aggarwal answered 20/1, 2019 at 0:47 Comment(0)

You can try pycococreator, which includes a set of tools to convert binary masks to the polygon and RLE formats that COCO uses.

https://github.com/waspinator/pycococreator/

Here is an example of how you could use it to create annotation information from a binary mask:

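from pycococreatortools import pycococreatortools

# binary_mask is assumed to be a 2D array of 0s and 1s; image is assumed to be a PIL image, so image.size is (width, height)
annotation_info = pycococreatortools.create_annotation_info(
    segmentation_id, image_id, category_info, binary_mask,
    image.size, tolerance=2)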

You can read more details about how to use pycococreator here: https://patrickwasp.com/create-your-own-coco-style-dataset/

Kubetz answered 17/4, 2018 at 16:3 Comment(0)

To convert a mask array of 0s and 1s into polygons like those in a COCO-style dataset, use skimage.measure.find_contours. The code below is based on code by waleedka.

import numpy as np
from matplotlib.collections import PatchCollection
from matplotlib.patches import Polygon
from skimage.measure import find_contours

# Binary mask of 0s and 1s; height and width are the image dimensions, defined elsewhere.
mask = np.zeros((height, width), dtype=np.uint8)
mask_polygons = []  # One matplotlib patch collection per contour

# Pad to ensure proper polygons for masks that touch image edges.
padded_mask = np.zeros(
    (mask.shape[0] + 2, mask.shape[1] + 2), dtype=np.uint8)
padded_mask[1:-1, 1:-1] = mask
contours = find_contours(padded_mask, 0.5)
for verts in contours:
    # Subtract the padding and flip (y, x) to (x, y)
    verts = np.fliplr(verts) - 1
    pat = PatchCollection([Polygon(verts, closed=True)], facecolor='green', linewidths=0, alpha=0.6)
    mask_polygons.append(pat)

To generate the JSON file for a COCO-style dataset, look into Python's json module. Beyond that, it is just a matter of matching the format used by the COCO dataset's JSON file.

You should take a look at my COCO style dataset generator GUI repo. I built a very simple tool to create COCO-style datasets.

The specific file you're interested in is create_json_file.py, which takes matplotlib polygon coordinates in the form (x1, y1, x2, y2 ...) for every polygon annotation and converts them into a JSON annotation file quite similar to the default format of COCO.
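
The conversion from flat polygon coordinates to a COCO-style annotation can be sketched roughly as follows. This is not the code from create_json_file.py; it is a minimal illustration using only the standard library. The helper names (polygon_area, polygon_to_annotation), the file name, the image size, and the category are all hypothetical.

```python
import json

def polygon_area(flat):
    """Shoelace area of a polygon given as [x1, y1, x2, y2, ...]."""
    xs, ys = flat[0::2], flat[1::2]
    n = len(xs)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n
        total += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(total) / 2.0

def polygon_to_annotation(flat, ann_id, image_id, category_id):
    """Build one COCO-style annotation dict from a flat polygon."""
    xs, ys = flat[0::2], flat[1::2]
    x_min, y_min = min(xs), min(ys)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [flat],  # list of polygons, each a flat [x, y, ...] list
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],  # x, y, width, height
        "area": polygon_area(flat),
        "iscrowd": 0,
    }

# Hypothetical example: one 10x20 rectangle on image 1.
rectangle = [0, 0, 10, 0, 10, 20, 0, 20]
dataset = {
    "images": [{"id": 1, "file_name": "example.png", "width": 64, "height": 64}],
    "categories": [{"id": 1, "name": "object"}],
    "annotations": [polygon_to_annotation(rectangle, 1, 1, 1)],
}
coco_json = json.dumps(dataset)
```

The resulting string can be written to disk with an ordinary file write. Objects made of several disjoint parts would carry several flat lists in "segmentation", in which case the area and box would need to cover all of them.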

Vrablik answered 12/3, 2018 at 12:55 Comment(2)
Do you have any suggestions on how to generate uncompressed RLE encoded masks for iscrowd: 1 annotations? - Kubetz
As far as I understand, iscrowd=1 annotations are just collections of polygons representing a cluster of similar objects. I haven't looked into it myself, but if they are just collections of polygons, supporting them would mean adding multiple-polygon support for a single object cluster. - Vrablik
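
On the RLE question in the comments: in the COCO annotation format, iscrowd=1 segmentations are stored as run-length encodings rather than polygon lists. The uncompressed form is a dict with "size" as [height, width] and "counts" as alternating run lengths, read column by column and starting with a run of zeros. A minimal pure-Python sketch, assuming the mask is a list of rows of 0s and 1s (the function name is hypothetical):

```python
def mask_to_uncompressed_rle(mask):
    """Encode a binary mask (list of rows) as COCO-style uncompressed RLE."""
    height, width = len(mask), len(mask[0])
    counts = []
    previous = 0  # the first run always counts zeros, even if it is empty
    run = 0
    # Walk the mask column by column (column-major order).
    for x in range(width):
        for y in range(height):
            value = 1 if mask[y][x] else 0
            if value != previous:
                counts.append(run)
                run = 0
                previous = value
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}

mask = [
    [0, 1, 1],
    [0, 1, 0],
]
rle = mask_to_uncompressed_rle(mask)
```

For the mask above, the column-major sequence is 0, 0, 1, 1, 1, 0, so the counts are [2, 3, 1]. A mask whose first pixel is 1 gets a leading count of 0.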

CREATING COCO STYLE DATASETS AND USING ITS API TO EVALUATE METRICS

Let's assume that we want to create annotation and result files for an object detection task (so we are interested only in bounding boxes). Here is a simple, lightweight example that shows how to create annotation and result files formatted appropriately for the COCO API metrics.

Annotation file: ann.json

{"images":[{"id": 73}],"annotations":[{"image_id":73,"category_id":1,"bbox":[10,10,50,100],"id":1,"iscrowd": 0,"area": 10}],"categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "bicycle"}, {"id": 3, "name": "car"}]}

Result file: res.json

[{"image_id":73,"category_id":1,"bbox":[10,10,50,100],"score":0.9}]
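
If the boxes already live in Python, both files can be generated with the json module instead of being written by hand. The sketch below assumes hypothetical in-memory lists gt_boxes and detections and hypothetical helper names. It also sets "area" to the box width times height, which is an assumption for box-only data and differs from the hand-written value in ann.json above.

```python
import json

def build_ground_truth(gt_boxes, categories):
    """Assemble a COCO-style ground-truth dict from (image_id, category_id, bbox) tuples."""
    image_ids = sorted({image_id for image_id, _, _ in gt_boxes})
    annotations = []
    for ann_id, (image_id, category_id, bbox) in enumerate(gt_boxes, start=1):
        x, y, w, h = bbox
        annotations.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": category_id,
            "bbox": [x, y, w, h],  # x, y, width, height
            "area": w * h,         # assumption: box area stands in for object area
            "iscrowd": 0,
        })
    return {
        "images": [{"id": image_id} for image_id in image_ids],
        "annotations": annotations,
        "categories": categories,
    }

def build_results(detections):
    """Assemble a results list from (image_id, category_id, bbox, score) tuples."""
    return [
        {"image_id": image_id, "category_id": category_id, "bbox": list(bbox), "score": score}
        for image_id, category_id, bbox, score in detections
    ]

# Hypothetical data matching the example files.
categories = [{"id": 1, "name": "person"}, {"id": 2, "name": "bicycle"}, {"id": 3, "name": "car"}]
gt_boxes = [(73, 1, [10, 10, 50, 100])]
detections = [(73, 1, [10, 10, 50, 100], 0.9)]

ground_truth = build_ground_truth(gt_boxes, categories)
results = build_results(detections)
ann_text = json.dumps(ground_truth)  # contents for ann.json
res_text = json.dumps(results)       # contents for res.json
```

Writing ann_text and res_text to ann.json and res.json gives files in the same shape as the hand-written ones.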

Now, you can simply use the following script to evaluate the COCO metrics:

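from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

annFile = './ann.json'
resFile = './res.json'

# Load the ground truth, then load the detections as a results object tied to it
cocoGt = COCO(annFile)
cocoDt = cocoGt.loadRes(resFile)

annType = 'bbox'  # evaluate bounding boxes
cocoEval = COCOeval(cocoGt, cocoDt, annType)
cocoEval.evaluate()    # match detections to ground truth per image and category
cocoEval.accumulate()  # aggregate the matches into precision and recall
cocoEval.summarize()   # print the standard AP/AR summary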
Butlery answered 16/5, 2019 at 6:3 Comment(0)
