<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Voxel51-Brian</title>
    <description>The latest articles on Forem by Voxel51-Brian (@voxel51-brian).</description>
    <link>https://forem.com/voxel51-brian</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F895780%2F457b5c48-3d53-4118-a6a3-29931e89c837.jpg</url>
      <title>Forem: Voxel51-Brian</title>
      <link>https://forem.com/voxel51-brian</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/voxel51-brian"/>
    <language>en</language>
    <item>
      <title>FiftyOne Computer Vision Tips and Tricks for Adding and Merging Data – Feb 17, 2023</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Sat, 18 Feb 2023 16:30:14 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/fiftyone-computer-vision-tips-and-tricks-for-adding-and-merging-data-feb-17-2023-3kfm</link>
      <guid>https://forem.com/voxel51-brian/fiftyone-computer-vision-tips-and-tricks-for-adding-and-merging-data-feb-17-2023-3kfm</guid>
      <description>&lt;p&gt;Welcome to our weekly FiftyOne tips and tricks blog where we give practical pointers for using FiftyOne on topics inspired by discussions in the open source community. This week we’ll cover &lt;a href="https://docs.voxel51.com/recipes/merge_datasets.html" rel="noopener noreferrer"&gt;adding and merging data&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Wait, What’s FiftyOne?&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://voxel51.com/fiftyone/" rel="noopener noreferrer"&gt;FiftyOne&lt;/a&gt; is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbxalwomx9800o7cgl3m.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbxalwomx9800o7cgl3m.gif" alt=" " width="80" height="44"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html" rel="noopener noreferrer"&gt;Get started!&lt;/a&gt; We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack community&lt;/a&gt;; we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ok, let’s dive into this week’s tips and tricks!&lt;/p&gt;

&lt;h2&gt;A primer on adding and merging&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#using-datasets" rel="noopener noreferrer"&gt;Datasets&lt;/a&gt; are the core data structure in FiftyOne, allowing you to represent your raw data, labels, and associated metadata. &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/basics.html#samples" rel="noopener noreferrer"&gt;Samples&lt;/a&gt; are the atomic elements of a &lt;code&gt;Dataset&lt;/code&gt; that store all the information related to a given piece of data. When you query and manipulate a Dataset object using &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/using_views.html#using-views" rel="noopener noreferrer"&gt;dataset views&lt;/a&gt;, a &lt;code&gt;DatasetView&lt;/code&gt; object is returned, which represents a filtered view into a subset of the underlying dataset’s contents.&lt;/p&gt;

&lt;p&gt;Many computer vision workflows involve operations that merge data from multiple sources, such as adding new samples to an existing dataset, or merging a model’s predictions into a dataset which contains ground truth labels. In FiftyOne, &lt;code&gt;Dataset&lt;/code&gt; and &lt;code&gt;DatasetView&lt;/code&gt; objects come with a variety of methods that make performing these add and merge operations easy.&lt;/p&gt;

&lt;p&gt;Continue reading for some tips and tricks to help you master adding and merging data in FiftyOne!&lt;/p&gt;

&lt;h2&gt;Encountering a sample multiple times&lt;/h2&gt;

&lt;p&gt;If you want to add a completely new collection of samples, &lt;code&gt;samples&lt;/code&gt;, to a dataset, &lt;code&gt;dataset&lt;/code&gt;, then the &lt;code&gt;add_samples()&lt;/code&gt; and &lt;code&gt;add_collection()&lt;/code&gt; methods are, for the most part, interchangeable. However, if some samples appear multiple times in your workflows (due to sources of randomness, for instance), the two methods behave differently.&lt;/p&gt;

&lt;p&gt;When &lt;code&gt;add_samples()&lt;/code&gt; encounters samples that are already present in the dataset to which the method is applied, it generates a &lt;em&gt;new&lt;/em&gt; sample with a new &lt;code&gt;id&lt;/code&gt; and adds it to the dataset. &lt;code&gt;add_collection()&lt;/code&gt;, on the other hand, ignores the duplicate sample and moves on.&lt;/p&gt;
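&lt;p&gt;Conceptually (a plain-Python sketch using dicts keyed by sample &lt;code&gt;id&lt;/code&gt;, not FiftyOne’s actual implementation), the two behaviors look like this:&lt;/p&gt;

```python
import uuid

def add_samples(dataset, samples):
    # Duplicates are re-added under a brand-new id, so the dataset grows
    for sample in samples.values():
        dataset[uuid.uuid4().hex] = dict(sample)

def add_collection(dataset, samples):
    # Samples whose id already exists in the dataset are ignored
    for sample_id, sample in samples.items():
        dataset.setdefault(sample_id, dict(sample))

dataset = {"a": {"filepath": "/img0.jpg"}, "b": {"filepath": "/img1.jpg"}}
subset = {"a": {"filepath": "/img0.jpg"}}

add_collection(dataset, subset)
print(len(dataset))  # 2 -- unchanged

add_samples(dataset, subset)
print(len(dataset))  # 3 -- the duplicate was re-added under a new id
```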

&lt;p&gt;In the code block below, applied to the quickstart dataset with a random subset of its own samples as input, &lt;code&gt;add_collection()&lt;/code&gt; leaves the dataset unchanged, whereas &lt;code&gt;add_samples()&lt;/code&gt; increases its size:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
# 200 samples
dataset = foz.load_zoo_dataset("quickstart")
# randomly select 50 samples
samples = dataset.take(50)
# doesn’t change dataset size
dataset.add_collection(samples)
# 200 samples → 250 samples
dataset.add_samples(samples)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.utils.random.html" rel="noopener noreferrer"&gt;FiftyOne’s random utils&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;Add samples by directory&lt;/h2&gt;

&lt;p&gt;FiftyOne supports a variety of common computer vision data formats, making it easy to load your data into FiftyOne and accelerating your computer vision workflows. FiftyOne’s &lt;code&gt;DatasetImporter&lt;/code&gt; classes allow you to import data in various formats without needing to write your own loops and I/O scripts. &lt;/p&gt;

&lt;p&gt;If you have VOC-style data stored in a single directory on disk, for instance, you can create a dataset from this data using the &lt;code&gt;from_dir()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
name = "my-dataset"
data_path = "/path/to/images"
labels_path = "/path/to/voc-labels"
# Import dataset by explicitly providing paths to the source media and labels
dataset = fo.Dataset.from_dir(
    dataset_type=fo.types.VOCDetectionDataset,
    data_path=data_path,
    labels_path=labels_path,
    name=name,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the &lt;code&gt;add_dir()&lt;/code&gt; method, you can extend the logic of any existing &lt;code&gt;DatasetImporter&lt;/code&gt; to data that is stored in multiple directories. To add &lt;code&gt;train&lt;/code&gt; and &lt;code&gt;val&lt;/code&gt; data in YOLOv5 format to a single dataset, you can run the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
name = "my-dataset"
dataset_dir = "/path/to/yolov5-dataset"
# The splits to load
splits = ["train", "val"]
# Load the dataset, using tags to mark the samples in each split
dataset = fo.Dataset(name)
for split in splits:
    dataset.add_dir(
        dataset_dir=dataset_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        split=split,
        tags=split,
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows you to add the contents of each directory directly to the final dataset without having to instantiate temporary datasets. The &lt;code&gt;merge_dir()&lt;/code&gt; method can be similarly useful!&lt;/p&gt;

&lt;p&gt;Learn more about &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/index.html" rel="noopener noreferrer"&gt;loading data into FiftyOne&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;Add from archive&lt;/h2&gt;

&lt;p&gt;On a related note, if you have data in a common archive format, such as &lt;code&gt;.zip&lt;/code&gt;, &lt;code&gt;.tar&lt;/code&gt;, or &lt;code&gt;.tar.gz&lt;/code&gt;, stored on disk, you can use the &lt;code&gt;add_archive()&lt;/code&gt; or &lt;code&gt;merge_archive()&lt;/code&gt; methods to add this data to your dataset. If the archived data has not been unpacked yet, FiftyOne will handle the extraction for you!&lt;/p&gt;
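&lt;p&gt;If you ever need to unpack an archive yourself before loading it, Python’s standard library can do so directly. A minimal sketch using &lt;code&gt;shutil&lt;/code&gt; (the file names and &lt;code&gt;.zip&lt;/code&gt; format here are illustrative):&lt;/p&gt;

```python
import os
import shutil
import tempfile

# Build a toy "dataset" directory, archive it, then unpack it again
workdir = tempfile.mkdtemp()
srcdir = os.path.join(workdir, "data")
os.makedirs(srcdir)
with open(os.path.join(srcdir, "sample.txt"), "w") as f:
    f.write("placeholder for a media file")

# make_archive/unpack_archive infer the format (.zip, .tar, .tar.gz, ...)
archive = shutil.make_archive(os.path.join(workdir, "data"), "zip", srcdir)
outdir = os.path.join(workdir, "unpacked")
shutil.unpack_archive(archive, outdir)
print(sorted(os.listdir(outdir)))  # ['sample.txt']
```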

&lt;p&gt;Learn more about &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.from_archive" rel="noopener noreferrer"&gt;from_archive()&lt;/a&gt;, &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.add_archive" rel="noopener noreferrer"&gt;add_archive()&lt;/a&gt;, and &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.merge_archive" rel="noopener noreferrer"&gt;merge_archive()&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;Add model predictions&lt;/h2&gt;

&lt;p&gt;In machine learning workflows, it is common practice to withhold ground truth information at inference time. To accomplish this, it is often beneficial to separate the various fields of your dataset so that only certain subsets of information are available at different steps. &lt;/p&gt;

&lt;p&gt;When it comes to evaluating model performance at the end of the day, however, we would like to merge ground truth labels and predictions into a common dataset. In FiftyOne, this is possible with the &lt;code&gt;merge_samples()&lt;/code&gt; method. If we have a &lt;code&gt;predictions_view&lt;/code&gt; only containing predictions, and a &lt;code&gt;dataset&lt;/code&gt; with all other information, we can merge the predictions into our base dataset as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
# Create a dataset containing only ground truth objects
src_dataset = foz.load_zoo_dataset("quickstart")
dataset = src_dataset.exclude_fields("predictions").clone()
# Example view containing only the predictions
predictions_view = src_dataset.select_fields("predictions")
# Merge the predictions
dataset.merge_samples(predictions_view)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.select_fields" rel="noopener noreferrer"&gt;selecting&lt;/a&gt; and &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.exclude_fields" rel="noopener noreferrer"&gt;excluding&lt;/a&gt; fields in the FiftyOne Docs.&lt;/p&gt;
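&lt;p&gt;By default, &lt;code&gt;merge_samples()&lt;/code&gt; matches samples between collections by their &lt;code&gt;filepath&lt;/code&gt;. Conceptually, the merge behaves like an upsert keyed on filepath (a plain-Python sketch, not FiftyOne’s actual implementation):&lt;/p&gt;

```python
def merge_by_key(dst, src, key="filepath"):
    # Upsert: update records with a matching key, append the rest
    index = {record[key]: record for record in dst}
    for record in src:
        if record[key] in index:
            index[record[key]].update(record)
        else:
            dst.append(dict(record))
    return dst

dataset = [{"filepath": "/img0.jpg", "ground_truth": "cat"}]
predictions = [
    {"filepath": "/img0.jpg", "predictions": "cat"},
    {"filepath": "/img1.jpg", "predictions": "dog"},
]
merged = merge_by_key(dataset, predictions)
print(len(merged))  # 2
print(merged[0]["ground_truth"], merged[0]["predictions"])  # cat cat
```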

&lt;h2&gt;Export multiple labels with merge_labels()&lt;/h2&gt;

&lt;p&gt;If you have multiple &lt;code&gt;Label&lt;/code&gt; fields and you want to export your data using a common format, you can use the &lt;code&gt;merge_labels()&lt;/code&gt; method to merge all of these label fields into one field for export. &lt;/p&gt;

&lt;p&gt;For instance, if you have three label fields, &lt;code&gt;ground_truth&lt;/code&gt;, &lt;code&gt;model1_predictions&lt;/code&gt;, and &lt;code&gt;model2_predictions&lt;/code&gt;, you can merge them all as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
dataset = fo.load_dataset(...)
# Clone the label fields into temporary fields
dataset.clone_sample_field("ground_truth", "tmp")
dataset.clone_sample_field("model1_predictions", "tmp1")
dataset.clone_sample_field("model2_predictions", "tmp2")
# Merge the model1 predictions into the merged "tmp" field
dataset.merge_labels("tmp1", "tmp")
# Merge the model2 predictions into the merged "tmp" field
dataset.merge_labels("tmp2", "tmp")
# Export the merged labels field
dataset.export(..., label_field="tmp")
# Clean up the temporary fields
dataset.delete_sample_fields(["tmp", "tmp1", "tmp2"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to export the data just so that you can import it at a later time, however, then you can avoid all of this and instead make your dataset persistent!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dataset.persistent = True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about labels and dataset persistence in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;Join the FiftyOne community!&lt;/h2&gt;

&lt;p&gt;Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,350+ &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack members&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;2,500+ &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;stars on GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;3,100+ &lt;a href="https://www.meetup.com/pro/computer-vision-meetups/" rel="noopener noreferrer"&gt;Meetup members&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/voxel51/fiftyone/network/dependents?package_id=UGFja2FnZS0xNzAxODM0MjUx" rel="noopener noreferrer"&gt;Used by&lt;/a&gt; 246+ repositories&lt;/li&gt;
&lt;li&gt;56+ &lt;a href="https://github.com/voxel51/fiftyone/graphs/contributors" rel="noopener noreferrer"&gt;contributors&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What’s next?&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html" rel="noopener noreferrer"&gt;Get started!&lt;/a&gt; We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the FiftyOne &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;Slack community&lt;/a&gt;; we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>top7</category>
    </item>
    <item>
      <title>Announcing FiftyOne 0.19 with Spaces, In-App Embeddings Visualization, Saved Views, and More!</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Fri, 17 Feb 2023 04:23:02 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/announcing-fiftyone-019-with-spaces-in-app-embeddings-visualization-saved-views-and-more-5eb3</link>
      <guid>https://forem.com/voxel51-brian/announcing-fiftyone-019-with-spaces-in-app-embeddings-visualization-saved-views-and-more-5eb3</guid>
      <description>&lt;p&gt;Voxel51 in conjunction with the FiftyOne community is excited to announce the general availability of &lt;a href="https://docs.voxel51.com/release-notes.html#fiftyone-0-19-0" rel="noopener noreferrer"&gt;FiftyOne 0.19&lt;/a&gt;. This release is packed with new features that make it even easier and faster to visualize your computer vision datasets and boost the performance of your machine learning models. How? Read on!&lt;/p&gt;

&lt;h2&gt;Wait, what’s FiftyOne?&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://voxel51.com/fiftyone/" rel="noopener noreferrer"&gt;FiftyOne&lt;/a&gt; is the open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm32n9gdz7w957tng174e.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm32n9gdz7w957tng174e.gif" alt="FiftyOne in action" width="80" height="44"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.voxel51.com/" rel="noopener noreferrer"&gt;Get started!&lt;/a&gt; We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack community&lt;/a&gt;; we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ok, let’s dive into the release.&lt;/p&gt;

&lt;h2&gt;tl;dr: What’s new in FiftyOne 0.19?&lt;/h2&gt;

&lt;p&gt;This release includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spaces&lt;/strong&gt;: an all-new customizable framework for organizing interactive information panels within the FiftyOne App, allowing you to visualize and query your datasets in powerful new ways through a convenient interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In-App embeddings visualization&lt;/strong&gt;: you can now interactively explore embeddings visualizations natively in the App by opening an embeddings panel with one click&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Saved views&lt;/strong&gt;: you can now save views into your datasets and switch between them natively in the App&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On-disk segmentations&lt;/strong&gt;: you can now store your semantic segmentation masks and heatmaps on disk, rather than in the database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New UI filtering options&lt;/strong&gt;: the App’s sidebar now contains upgraded options for filtering datasets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FiftyOne Teams documentation&lt;/strong&gt;: documentation for &lt;a href="https://docs.voxel51.com/teams/index.html#fiftyone-teams" rel="noopener noreferrer"&gt;FiftyOne Teams&lt;/a&gt; is now publicly available!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out the &lt;a href="https://docs.voxel51.com/release-notes.html#fiftyone-0-19-0" rel="noopener noreferrer"&gt;release notes&lt;/a&gt; for a full rundown of additional enhancements and bug fixes.&lt;/p&gt;

&lt;h2&gt;Live demo &amp;amp; AMA on Feb. 28 @ 10 AM PT&lt;/h2&gt;

&lt;p&gt;You can see all of the new features in action in a live webinar and AMA on February 28, 2023 at 10 AM Pacific Time. I’ll be demoing all of the new features in FiftyOne 0.19, followed by an open Q&amp;amp;A where you can get answers to any questions you might have. &lt;a href="https://voxel51.com/computer-vision-events/whats-new-in-fiftyone-0-19/?utm_source=blog" rel="noopener noreferrer"&gt;Register here.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, here’s a quick overview of some of the new features we packed into this release.&lt;/p&gt;

&lt;h2&gt;Spaces&lt;/h2&gt;

&lt;p&gt;FiftyOne 0.19 debuts Spaces, a customizable framework for organizing interactive information Panels in the App.&lt;/p&gt;

&lt;p&gt;As of FiftyOne 0.19, the following Panel types are included natively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.voxel51.com/user_guide/app.html#app-samples-panel" rel="noopener noreferrer"&gt;Samples Panel&lt;/a&gt;: the media grid that loads by default when you launch the App&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.voxel51.com/user_guide/app.html#app-histograms-panel" rel="noopener noreferrer"&gt;Histograms Panel&lt;/a&gt;: a dashboard of histograms for the fields of your dataset&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;(New!)&lt;/strong&gt; &lt;a href="https://docs.voxel51.com/user_guide/app.html#app-embeddings-panel" rel="noopener noreferrer"&gt;Embeddings Panel&lt;/a&gt;: a canvas for working with embeddings visualizations&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.voxel51.com/user_guide/app.html#app-map-panel" rel="noopener noreferrer"&gt;Map Panel&lt;/a&gt;: visualizes the geolocation data of datasets that have a GeoLocation field&lt;/li&gt;
&lt;li&gt;You can also configure &lt;a href="https://docs.voxel51.com/plugins/index.html#fiftyone-plugins" rel="noopener noreferrer"&gt;custom Panels via plugins&lt;/a&gt;!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the screenshot below, for example, we’ve added the Embeddings and Map Panels to the default Samples Panel so we can visualize all three together seamlessly in the App.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feva2ghp1qs6xcl8oeoki.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feva2ghp1qs6xcl8oeoki.png" alt="Viewing three panels in FiftyOne, customized through the new Spaces feature" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can configure Spaces visually in the App in a variety of ways described below.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click the + icon in any Space to add a new Panel:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faddob5jg3f9v7hu9ksee.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faddob5jg3f9v7hu9ksee.gif" alt="Add a new panel in the FiftyOne App" width="760" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When you have multiple Panels open in a Space, you can use the divider buttons to split the Space either horizontally or vertically:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fask3e5ttnwsh7sax46f3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fask3e5ttnwsh7sax46f3.gif" alt="Split Spaces and Panels in the FiftyOne App horizontally or vertically" width="1171" height="831"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can rearrange Panels at any time by dragging their tabs between Spaces, or close a Panel by clicking on its x icon:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd98a5cy7l0pb29p61fq4.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd98a5cy7l0pb29p61fq4.gif" alt="Easily rearrange Panels in the FiftyOne App" width="800" height="568"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also programmatically configure your Spaces layout from Python!&lt;/p&gt;

&lt;p&gt;The code sample below shows an end-to-end example of loading a dataset, generating an embeddings visualization &lt;a href="https://docs.voxel51.com/user_guide/brain.html" rel="noopener noreferrer"&gt;via the FiftyOne Brain&lt;/a&gt;, and launching the App with a customized Spaces layout that includes the Samples Panel, Histograms Panel, and Embeddings Panel with the Brain result already loaded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
fob.compute_visualization(dataset, brain_key="img_viz")
samples_panel = fo.Panel(
    type="Samples",
    pinned=True,  # don’t allow closing
)
histograms_panel = fo.Panel(
    type="Histograms",
    state=dict(plot="Labels"),  # open label fields by default
)
# Open the visualization we generated above by default
embeddings_panel = fo.Panel(
    type="Embeddings",
    state=dict(brainResult="img_viz", colorByField="metadata.size_bytes"),
)
spaces = fo.Space(
    children=[
        fo.Space(
            children=[
                fo.Space(children=[samples_panel]),
                fo.Space(children=[histograms_panel]),
            ],
            orientation="horizontal",
        ),
        fo.Space(children=[embeddings_panel]),
    ],
    orientation="vertical",
)
session = fo.launch_app(dataset, spaces=spaces)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out &lt;a href="https://docs.voxel51.com/user_guide/app.html#spaces" rel="noopener noreferrer"&gt;the docs&lt;/a&gt; for more information about using and configuring Spaces layouts.&lt;/p&gt;

&lt;h2&gt;In-App embeddings visualization&lt;/h2&gt;

&lt;p&gt;New in FiftyOne 0.19 (and enabled by the Spaces feature above), when you load a dataset in the App that contains an &lt;a href="https://docs.voxel51.com/user_guide/brain.html#brain-embeddings-visualization" rel="noopener noreferrer"&gt;embeddings visualization&lt;/a&gt;, you can open the &lt;a href="https://docs.voxel51.com/user_guide/app.html#embeddings-panel" rel="noopener noreferrer"&gt;Embeddings Panel&lt;/a&gt; to visualize and interactively explore a scatterplot of the embeddings in the App.&lt;/p&gt;

&lt;p&gt;For example, try running the code below to download a dataset, generate two embeddings visualizations on it, and launch the App:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
# Image embeddings
fob.compute_visualization(dataset, brain_key="img_viz")
# Object patch embeddings
fob.compute_visualization(
    dataset, patches_field="ground_truth", brain_key="gt_viz"
)
session = fo.launch_app(dataset)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then click on the + icon next to the Samples tab to open the Embeddings Panel and use the two menus in the upper-left corner of the Panel to configure your plot:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Brain key&lt;/strong&gt;: the Brain key associated with the &lt;a href="https://docs.voxel51.com/user_guide/brain.html#visualizing-embeddings" rel="noopener noreferrer"&gt;compute_visualization()&lt;/a&gt; run to display&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Color by&lt;/strong&gt;: an optional sample field (or label attribute, for patches embeddings) to color the points by&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From there you can lasso points in the plot to show only the corresponding samples/patches in the Samples Panel:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwjuvw38b7fhtogruc3km.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwjuvw38b7fhtogruc3km.gif" alt="FiftyOne App, In-App Embeddings Visualization" width="800" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Embeddings Panel also provides a number of additional controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Press the &lt;strong&gt;pan&lt;/strong&gt; icon in the menu (or type &lt;code&gt;g&lt;/code&gt;) to switch to pan mode, in which you can click and drag to change your current field of view&lt;/li&gt;
&lt;li&gt;Press the &lt;strong&gt;lasso&lt;/strong&gt; icon (or type &lt;code&gt;s&lt;/code&gt;) to switch back to lasso mode&lt;/li&gt;
&lt;li&gt;Press the &lt;strong&gt;locate&lt;/strong&gt; icon to reset the plot’s viewport to a tight crop of the current view’s embeddings&lt;/li&gt;
&lt;li&gt;Press the &lt;strong&gt;x&lt;/strong&gt; icon (or double click anywhere in the plot) to clear the current selection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When coloring points by categorical fields (strings and integers) with fewer than 100 unique classes, you can also use the legend to toggle the visibility of each class of points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single click on a legend trace to show/hide that class in the plot&lt;/li&gt;
&lt;li&gt;Double click on a legend trace to show/hide all other classes in the plot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fepfcesg22opt0dyy8giw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fepfcesg22opt0dyy8giw.gif" alt="Embeddings Panel in FiftyOne - New Controls Available" width="800" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As demonstrated in the previous section, the Embeddings Panel can also be programmatically configured via Python.&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://docs.voxel51.com/user_guide/app.html#embeddings-panel" rel="noopener noreferrer"&gt;the docs&lt;/a&gt; for more information about working with embeddings visualizations in the App.&lt;/p&gt;

&lt;h2&gt;Saved views&lt;/h2&gt;

&lt;p&gt;In FiftyOne 0.19 you can use a new menu in the upper-left of the App to record the current state of the App’s view bar and filters sidebar as a &lt;strong&gt;saved view&lt;/strong&gt; into your dataset:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpehwij4n7t7lzon8rzhl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpehwij4n7t7lzon8rzhl.gif" alt="Save a dataset view in FiftyOne App" width="800" height="527"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Saved views are persisted on your dataset under a name of your choice so that you can quickly load them in a future session via the UI or Python.&lt;/p&gt;

&lt;p&gt;Saved views are a convenient way to record semantically relevant subsets of a dataset, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Samples in a particular state, e.g. with certain tag(s)&lt;/li&gt;
&lt;li&gt;A subset of a dataset that was used for a task, e.g. training a model&lt;/li&gt;
&lt;li&gt;Samples that contain content of interest, e.g. object types or image characteristics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Remember that saved views only store the rules used to extract content from the underlying dataset, not the actual content itself. You can save hundreds of views into a dataset if desired without worrying about storage space.&lt;/p&gt;
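
&lt;p&gt;The underlying idea is simple: a saved view records a named recipe of view stages that is replayed against the dataset when it is loaded. Here is that concept in miniature, sketched in plain Python (illustrative only, not FiftyOne’s actual implementation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## A "saved view" stores rules, not data: a named list of stages
## that is replayed against the source collection each time it loads
saved_views = {}

def save_view(name, stages):
    ## Record the pipeline of rules, not the resulting samples
    saved_views[name] = stages

def load_saved_view(name, samples):
    ## Replay the recorded stages against the current data
    result = list(samples)
    for stage in saved_views[name]:
        result = stage(result)
    return result

samples = [
    {"label": "cat", "confidence": 0.9},
    {"label": "dog", "confidence": 0.8},
    {"label": "cat", "confidence": 0.6},
]

save_view("cats-view", [
    lambda s: [x for x in s if x["label"] == "cat"],
    lambda s: sorted(s, key=lambda x: x["confidence"], reverse=True),
])

cats = load_saved_view("cats-view", samples)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because only the rules are stored, a saved view always reflects the current contents of the dataset when it is loaded.&lt;/p&gt;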

&lt;p&gt;You can load a saved view at any time by selecting it from the saved view menu:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmbqo8ka6jixtmpmt7h13.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmbqo8ka6jixtmpmt7h13.gif" alt="Load a saved view in the FiftyOne App" width="800" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also edit or delete saved views by clicking on their pencil icon:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9el7u22ibqinwqw54cf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9el7u22ibqinwqw54cf.gif" alt="Editing a Saved View in the FiftyOne App" width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also programmatically create saved views &lt;a href="https://docs.voxel51.com/user_guide/using_views.html#saving-views" rel="noopener noreferrer"&gt;via Python&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F
dataset = foz.load_zoo_dataset("quickstart")
dataset.persistent = True
# Create a view
cats_view = (
    dataset
    .select_fields("ground_truth")
    .filter_labels("ground_truth", F("label") == "cat")
    .sort_by(F("ground_truth.detections").length(), reverse=True)
)
# Save the view
dataset.save_view("cats-view", cats_view)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And load them in future sessions (including saved views created via the App):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
dataset = fo.load_dataset("quickstart")
# Retrieve a saved view
cats_view = dataset.load_saved_view("cats-view")
print(cats_view)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out &lt;a href="https://docs.voxel51.com/user_guide/app.html#saving-views" rel="noopener noreferrer"&gt;the docs&lt;/a&gt; for more information about using saved views in the App and Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  On-disk segmentations
&lt;/h2&gt;

&lt;p&gt;In prior FiftyOne versions, &lt;a href="https://docs.voxel51.com/user_guide/using_datasets.html#semantic-segmentation" rel="noopener noreferrer"&gt;semantic segmentations&lt;/a&gt; and &lt;a href="https://docs.voxel51.com/user_guide/using_datasets.html#heatmaps" rel="noopener noreferrer"&gt;heatmaps&lt;/a&gt; could only be stored as compressed bytes directly in the database.&lt;/p&gt;

&lt;p&gt;Now in FiftyOne 0.19, you can store segmentations and heatmaps as images on disk and store (only) their paths on your FiftyOne datasets, just like you do for the primary media of each sample:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import cv2
import numpy as np
import fiftyone as fo
# Example segmentation mask
mask_path = "/tmp/segmentation.png"
mask = np.random.randint(10, size=(128, 128), dtype=np.uint8)
cv2.imwrite(mask_path, mask)
sample = fo.Sample(filepath="/path/to/image.png")
sample["segmentation"] = fo.Segmentation(mask_path=mask_path)
print(sample)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Segmentation masks can be stored in either of these formats on disk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2D 8-bit or 16-bit images&lt;/li&gt;
&lt;li&gt;3D 8-bit RGB images&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you load datasets with segmentation fields containing 2D masks in the App, each pixel value is rendered as a different color from the App’s color pool so that you can visually distinguish the classes. When you view RGB segmentation masks in the App, the mask colors are always used.&lt;/p&gt;
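
&lt;p&gt;In NumPy terms, the two on-disk formats look like this (a sketch; the class indices and colors are arbitrary examples):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

## 2D format: each pixel stores an integer class index (8- or 16-bit)
mask_2d = np.zeros((128, 128), dtype=np.uint8)
mask_2d[32:64, 32:64] = 1  ## e.g. "cat"
mask_2d[64:96, 64:96] = 2  ## e.g. "dog"

## RGB format: each pixel stores an 8-bit color; the hex color
## identifies the class, e.g. #499CEF corresponds to (0x49, 0x9C, 0xEF)
mask_rgb = np.zeros((128, 128, 3), dtype=np.uint8)
mask_rgb[32:64, 32:64] = (0x49, 0x9C, 0xEF)  ## "cat"
mask_rgb[64:96, 64:96] = (0x6D, 0x04, 0xFF)  ## "dog"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;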

&lt;p&gt;You can also &lt;a href="https://docs.voxel51.com/user_guide/using_datasets.html#storing-mask-targets" rel="noopener noreferrer"&gt;store semantic labels&lt;/a&gt; for your segmentation fields on your dataset. Then, when you view the dataset in the App, label strings will appear in the App’s tooltip when you hover over pixels.&lt;/p&gt;

&lt;p&gt;If you are working with 2D segmentation masks, specify target keys as integers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
dataset = fo.Dataset()
dataset.default_mask_targets = {1: "cat", 2: "dog"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And if you are working with RGB segmentation masks, specify target keys as RGB hex strings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
dataset = fo.Dataset()
dataset.default_mask_targets = {"#499CEF": "cat", "#6D04FF": "dog"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire FiftyOne API was upgraded to support on-disk and/or RGB segmentations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Evaluation via &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html#semantic-segmentations" rel="noopener noreferrer"&gt;evaluate_segmentations()&lt;/a&gt; natively supports on-disk and/or RGB segmentations&lt;/li&gt;
&lt;li&gt;The &lt;a href="https://docs.voxel51.com/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.apply_model" rel="noopener noreferrer"&gt;apply_model()&lt;/a&gt; method now has an optional output_dir argument specifying where to store semantic segmentation inferences as images on disk &lt;/li&gt;
&lt;li&gt;There’s a new &lt;a href="https://docs.voxel51.com/api/fiftyone.utils.labels.html#fiftyone.utils.labels.export_segmentations" rel="noopener noreferrer"&gt;export_segmentations()&lt;/a&gt; utility for conveniently exporting in-database segmentations to on-disk images&lt;/li&gt;
&lt;li&gt;Other new utilities like &lt;a href="https://docs.voxel51.com/api/fiftyone.utils.labels.html#fiftyone.utils.labels.transform_segmentations" rel="noopener noreferrer"&gt;transform_segmentations()&lt;/a&gt; are now available for manipulating segmentations&lt;/li&gt;
&lt;/ul&gt;
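
&lt;p&gt;The pattern behind &lt;code&gt;export_segmentations()&lt;/code&gt;, moving per-sample arrays out of the database and keeping only their paths, can be sketched in plain Python with &lt;code&gt;.npy&lt;/code&gt; files (FiftyOne itself writes standard image formats; this is just the pattern, not its implementation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import tempfile

import numpy as np

def export_masks(samples, output_dir):
    ## Replace each in-memory mask array with a path to an on-disk file
    os.makedirs(output_dir, exist_ok=True)
    for i, sample in enumerate(samples):
        mask = sample.pop("mask")  ## remove the in-memory array...
        path = os.path.join(output_dir, "%d.npy" % i)
        np.save(path, mask)
        sample["mask_path"] = path  ## ...and store only its location

samples = [{"mask": np.ones((4, 4), dtype=np.uint8)} for _ in range(3)]
export_masks(samples, tempfile.mkdtemp())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;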

&lt;p&gt;Check out &lt;a href="https://docs.voxel51.com/user_guide/using_datasets.html#semantic-segmentation" rel="noopener noreferrer"&gt;the docs&lt;/a&gt; for more information about adding on-disk segmentations to your FiftyOne datasets.&lt;/p&gt;

&lt;h2&gt;
  
  
  New UI filtering options
&lt;/h2&gt;

&lt;p&gt;We’re constantly improving and extending the filtering options available natively in the App to provide more powerful and intuitive ways to query datasets. In FiftyOne 0.19, we added a new selector that allows you to fine-tune your filters in the sidebar.&lt;/p&gt;

&lt;p&gt;For example, when filtering by the label attribute of a Detections field, you can choose between the following options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;(default): Filter to only show objects with the specified labels (omitting samples with no matching objects)&lt;/li&gt;
&lt;li&gt;Exclude objects with the specified labels&lt;/li&gt;
&lt;li&gt;Show samples that contain the specified labels (without filtering)&lt;/li&gt;
&lt;li&gt;Omit samples that contain the specified labels&lt;/li&gt;
&lt;/ul&gt;
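
&lt;p&gt;These four options boil down to the difference between &lt;em&gt;filtering&lt;/em&gt; (trimming objects inside each sample) and &lt;em&gt;matching&lt;/em&gt; (keeping or dropping whole samples). A rough sketch of the semantics in plain Python (illustrative only, not how the App implements them):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;samples = [
    {"id": 1, "labels": ["cat", "dog"]},
    {"id": 2, "labels": ["dog"]},
    {"id": 3, "labels": ["cat"]},
]

def filter_labels(samples, target):
    ## Default: keep only matching objects, omit samples with none
    out = []
    for s in samples:
        kept = [l for l in s["labels"] if l == target]
        if kept:
            out.append({**s, "labels": kept})
    return out

def exclude_labels(samples, target):
    ## Exclude objects with the specified labels
    return [
        {**s, "labels": [l for l in s["labels"] if l != target]}
        for s in samples
    ]

def match_samples(samples, target):
    ## Show samples that contain the label, without filtering objects
    return [s for s in samples if target in s["labels"]]

def omit_samples(samples, target):
    ## Omit samples that contain the label
    return [s for s in samples if target not in s["labels"]]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;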

&lt;p&gt;All applicable filtering options are available from both the grid view and the &lt;a href="https://docs.voxel51.com/user_guide/app.html#viewing-a-sample" rel="noopener noreferrer"&gt;sample modal&lt;/a&gt;, and for all field types, including top-level fields and &lt;a href="https://docs.voxel51.com/user_guide/using_datasets.html#dynamic-attributes" rel="noopener noreferrer"&gt;dynamic label attributes&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6un2cat43pmcz9xj9mnu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6un2cat43pmcz9xj9mnu.gif" alt="New UI Filtering Options in the FiftyOne App" width="800" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  FiftyOne Teams documentation
&lt;/h2&gt;

&lt;p&gt;Exciting news! Documentation for FiftyOne Teams is now publicly available at &lt;a href="https://docs.voxel51.com/teams" rel="noopener noreferrer"&gt;https://docs.voxel51.com/teams/index.html&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;FiftyOne Teams enables multiple users to securely collaborate on the same datasets and models, either on-premises or in the cloud, all built on top of the open source FiftyOne workflows that you’re already relying on. Look interesting? &lt;a href="https://voxel51.com/get-fiftyone-teams/" rel="noopener noreferrer"&gt;Schedule a demo&lt;/a&gt; to get started with FiftyOne Teams yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Community contributions
&lt;/h2&gt;

&lt;p&gt;Shoutout to the following community members who contributed to this release!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/kalpit-S" rel="noopener noreferrer"&gt;kalpit-S&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2354" rel="noopener noreferrer"&gt;#2354 – added help link for Mapbox configuration in App&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/flakeice" rel="noopener noreferrer"&gt;flakeice&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2359" rel="noopener noreferrer"&gt;#2359 – fix bug when loading datasets in VOC format&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Rusteam" rel="noopener noreferrer"&gt;Rustem Galiullin&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2353" rel="noopener noreferrer"&gt;#2353 – add support for custom CVAT task names&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Rusteam" rel="noopener noreferrer"&gt;Rustem Galiullin&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2373" rel="noopener noreferrer"&gt;#2373 – exact frame count support&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/oguz-hanoglu" rel="noopener noreferrer"&gt;Oguz-hanoglu&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2297" rel="noopener noreferrer"&gt;#2297 – improved explanation of sidebar modes in the App&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/jwertherUM" rel="noopener noreferrer"&gt;Jamie Werther&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2427" rel="noopener noreferrer"&gt;#2427 – show only supported eval keys&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/nmanovic" rel="noopener noreferrer"&gt;Nikita Manovich&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2478" rel="noopener noreferrer"&gt;#2478 – Fix several CVAT links&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/shortcipher3" rel="noopener noreferrer"&gt;Chris Hall&lt;/a&gt; contributed &lt;a href="https://github.com/voxel51/fiftyone/pull/2561" rel="noopener noreferrer"&gt;#2561 – updated CVAT links&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FiftyOne community updates
&lt;/h2&gt;

&lt;p&gt;The FiftyOne community continues to grow!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,300+ &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack&lt;/a&gt; members&lt;/li&gt;
&lt;li&gt;2,500+ stars on &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;3,000+ &lt;a href="https://www.meetup.com/pro/computer-vision-meetups/" rel="noopener noreferrer"&gt;Meetup members&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/voxel51/fiftyone/network/dependents?package_id=UGFja2FnZS0xNzAxODM0MjUx" rel="noopener noreferrer"&gt;Used by&lt;/a&gt; 245+ repositories&lt;/li&gt;
&lt;li&gt;56+ &lt;a href="https://github.com/voxel51/fiftyone/graphs/contributors" rel="noopener noreferrer"&gt;contributors&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html" rel="noopener noreferrer"&gt;Get started!&lt;/a&gt; We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the FiftyOne &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;Slack community&lt;/a&gt;, we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>gratitude</category>
      <category>writing</category>
      <category>career</category>
    </item>
    <item>
      <title>FiftyOne Computer Vision Tips and Tricks – Feb 10, 2023</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Sat, 11 Feb 2023 16:29:04 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/fiftyone-computer-vision-tips-and-tricks-feb-10-2023-52m6</link>
      <guid>https://forem.com/voxel51-brian/fiftyone-computer-vision-tips-and-tricks-feb-10-2023-52m6</guid>
      <description>&lt;p&gt;Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ"&gt;Slack&lt;/a&gt;, &lt;a href="https://github.com/voxel51/fiftyone"&gt;GitHub&lt;/a&gt;, Stack Overflow, and Reddit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wait, what’s FiftyOne?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://voxel51.com/fiftyone/"&gt;FiftyOne&lt;/a&gt; is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--V9vrOSta--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lyqflcy6j7jhhjsmdiuv.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--V9vrOSta--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lyqflcy6j7jhhjsmdiuv.gif" alt="open source FiftyOne in action" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html"&gt;Get started&lt;/a&gt;! We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ"&gt;FiftyOne Slack community&lt;/a&gt;, we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ok, let’s dive into this week’s tips and tricks!&lt;/p&gt;

&lt;h2&gt;
  
  
  Isolating spurious or missing objects
&lt;/h2&gt;

&lt;p&gt;Community Slack member George Pearse asked,&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“Is there a way to just get bounding boxes around the possibly missing and possibly spurious objects in my dataset?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here, George is asking about how to isolate potential mistakes in ground truth labels on a dataset. When working with a new dataset, it is always important to validate the quality of the ground truth annotations. Even highly regarded and well-cited datasets &lt;a href="https://deepomatic.com/how-we-improved-computer-vision-metrics-by-more-than-5-percent-only-by-cleaning-labelling-errors"&gt;can contain a plethora of errors&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Two such common types of errors in object detection labels are: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A ground truth label was &lt;em&gt;spuriously&lt;/em&gt; added to the data, and does not correspond to an object in the allowed object classes&lt;/li&gt;
&lt;li&gt;An object is not annotated, so the ground truth detection is &lt;em&gt;missing&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Fortunately, the &lt;a href="https://docs.voxel51.com/user_guide/brain.html"&gt;FiftyOne Brain&lt;/a&gt; provides a built-in method that identifies possible spurious and missing detections. These are stored at both the sample level and the detection level.&lt;/p&gt;

&lt;p&gt;With FiftyOne’s filtering capabilities, it is easy to create a view containing only the detections that are possibly spurious, or possibly missing, or both. In these cases, you might also find it helpful to convert the filtered view to a &lt;a href="https://docs.voxel51.com/api/fiftyone.core.patches.html#fiftyone.core.patches.PatchView"&gt;PatchView&lt;/a&gt; so you can view each potential error on its own. Here is some code to get you started:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz
from fiftyone import ViewField as F

## load example dataset
dataset = foz.load_zoo_dataset("quickstart")

## find possible mistakes
fob.compute_mistakenness(dataset, "predictions")

## create a view containing only objects whose
## ground truth detections are possibly missing
pred_field = "predictions"
missing_view = dataset.filter_labels(
    pred_field, 
    F("possible_missing") &amp;gt; 0, 
    only_matches=True
).to_patches(pred_field)

## create a view containing only objects whose
## ground truth detections are possibly spurious
gt_field = "ground_truth"
spurious_view = dataset.filter_labels(
    gt_field, 
    F("possible_spurious") &amp;gt; 0, 
    only_matches=True
).to_patches(gt_field)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can then view these in the &lt;a href="https://docs.voxel51.com/user_guide/app.html"&gt;FiftyOne App&lt;/a&gt;. Inspect the possibly spurious detection patches, for instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;session = fo.launch_app(spurious_view)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZDed8Cq_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vb7zh61kk22id7npw6cs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZDed8Cq_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vb7zh61kk22id7npw6cs.png" alt="View containing object patches of potentially spurious ground truth detection labels in the FiftyOne Quickstart Dataset, a subset of MS COCO.&amp;lt;br&amp;gt;
" width="800" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/tutorials/detection_mistakes.html"&gt;identifying detection mistakes&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;
&lt;h2&gt;
  
  
  Filtering by ID
&lt;/h2&gt;

&lt;p&gt;Community Slack member Sylvia Schmitt asked,&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“I am storing related sample IDs as &lt;code&gt;StringField&lt;/code&gt; objects in a separate field on my data and I want to use them to match sample IDs that are stored as &lt;code&gt;ObjectIdField&lt;/code&gt; objects. How do I do this?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you were comparing the values in two &lt;code&gt;StringFields&lt;/code&gt;, you could use the &lt;a href="https://docs.voxel51.com/api/fiftyone.core.expressions.html#fiftyone.core.expressions.ViewField"&gt;ViewField&lt;/a&gt; as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
from fiftyone import ViewField as F
dataset = fo.Dataset(..)
dataset.match(F('field_a') == F('field_b'))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, sample IDs are represented as &lt;code&gt;ObjectIdField&lt;/code&gt; objects. They are stored under an &lt;code&gt;_id&lt;/code&gt; key in the underlying database, and need to be referenced with this same syntax, prepending an underscore. Additionally, the object needs to be converted to a string for the comparison.&lt;/p&gt;

&lt;p&gt;Here is what such a matching operation might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

# Add a `str_id` field that matches `id` on 10 samples
view = dataset.take(10)
view.set_values("str_id", view.values("id"))

matching_view = dataset.match(
    F("str_id") == F("_id").to_string()
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/user_guide/basics.html#fields"&gt;fields&lt;/a&gt; and &lt;a href="https://docs.voxel51.com/user_guide/using_views.html#filtering"&gt;filtering&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Merging datasets with shared media files
&lt;/h2&gt;

&lt;p&gt;Community Slack member Joy Timmermans asked,&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“I have three datasets, and some of my samples are in multiple datasets. I’d like to combine all of these datasets into one dataset for export, retaining each copy of each of the samples. How do I do this?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If your datasets were created independently, their samples will have different sample IDs, even when they share the same media files (located at the same file paths). In this case, you can create a combined dataset with the &lt;code&gt;add_collection()&lt;/code&gt; method without passing in any optional arguments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
ds1 = dataset[:100].clone()
ds2 = dataset[100:].clone()

## create temporary dataset for combining
tmp = ds1.clone()
## add ds2 samples
tmp.add_collection(ds2)

## export
tmp.export(..)
## delete temporary dataset
tmp.delete()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If, on the other hand, your datasets have samples with the same sample IDs, then applying the &lt;code&gt;add_collection()&lt;/code&gt; method without options will only lead to the “combined” dataset having a single copy of each media file. &lt;/p&gt;

&lt;p&gt;Fortunately, you can bypass this by passing &lt;code&gt;new_ids=True&lt;/code&gt; to &lt;code&gt;add_collection()&lt;/code&gt;. In your case, combining three datasets would look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
## start with dataset1, dataset2, dataset3
tmp = dataset1.clone()
tmp.add_collection(dataset2, new_ids = True)
tmp.add_collection(dataset3, new_ids = True)
tmp.export(..)
tmp.delete()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/recipes/merge_datasets.html"&gt;merging datasets&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exporting GeoJSON annotations
&lt;/h2&gt;

&lt;p&gt;Community Slack member Kais Bedioui asked,&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“I am logging some of my production data in GeoJSON format, and I want to save it in the database in that same format. Is there a way to include the &lt;code&gt;ground_truth&lt;/code&gt; label in the &lt;code&gt;labels.json&lt;/code&gt; file so that when I reload the GeoJSON dataset, it comes with its annotations?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To do this, you can use the optional &lt;code&gt;property_makers&lt;/code&gt; argument of the GeoJSON exporter to include additional properties directly in GeoJSON format. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart-geo")
dataset.export(
    labels_path="/tmp/labels.json",
    dataset_type=fo.types.GeoJSONDataset,
    property_makers={"ground_truth": lambda d: len(d.detections)},
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, if you want to save the annotations but do &lt;em&gt;not&lt;/em&gt; need to save everything in GeoJSON format, you can export it as a FiftyOne &lt;code&gt;Dataset&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
dataset.export(
    export_dir="/tmp/all",
    dataset_type=fo.types.FiftyOneDataset,
    export_media=False,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you take this approach, all you have to do to load the dataset back in is use FiftyOne’s &lt;code&gt;from_dir()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
dataset = fo.Dataset.from_dir(
    dataset_dir="/tmp/all",
    dataset_type=fo.types.FiftyOneDataset,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/user_guide/dataset_creation/datasets.html#basic-recipe"&gt;from_dir()&lt;/a&gt;, and &lt;a href="https://docs.voxel51.com/user_guide/dataset_creation/datasets.html#loading-datasets-from-disk"&gt;importing&lt;/a&gt; and &lt;a href="https://docs.voxel51.com/user_guide/export_datasets.html"&gt;exporting data&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Picking random frames from videos
&lt;/h2&gt;

&lt;p&gt;Community Slack member Joy Timmermans asked,&lt;/p&gt;

&lt;p&gt;&lt;em&gt;“Is there an equivalent of &lt;code&gt;take()&lt;/code&gt; for frames in a video dataset so that I can randomly select a subset of frames for each sample?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;One way to accomplish this would be to use a Python library for random sampling in conjunction with the &lt;code&gt;select_frames()&lt;/code&gt; method. First, you can use random sampling without replacement to pick a set of frame numbers for each video. Then, you can get the frame &lt;code&gt;id&lt;/code&gt; for each of these frames. Finally, you can pass this list of &lt;code&gt;id&lt;/code&gt;s into &lt;code&gt;select_frames()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here’s one implementation using numpy’s random choice method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from numpy.random import choice as nrc
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F
dataset = foz.load_zoo_dataset("quickstart-video")
## get nested list of frame ids for each sample
frame_ids = dataset.values("frames.id")
## number of frames for each sample
nframes = dataset.values(F("frames").length())
## number of samples in dataset
nsample = len(dataset)
sample_frames = [nrc(nframe, 10, replace=False) for nframe in nframes]
keep_frame_ids = []
for i in range(nsample):
    curr_frame_ids = frame_ids[i]
    for s in sample_frames[i]:
        keep_frame_ids.append(curr_frame_ids[s])
kept_view = dataset.select_frames(keep_frame_ids)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’d like, at this point you can also convert the videos to frames:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kept_frames_view = kept_view.to_frames()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/user_guide/using_views.html#video-views"&gt;video views&lt;/a&gt; and &lt;a href="https://docs.voxel51.com/user_guide/using_views.html#frame-views"&gt;frame views&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Join the FiftyOne community!
&lt;/h2&gt;

&lt;p&gt;Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,300+ &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ"&gt;FiftyOne Slack&lt;/a&gt; members&lt;/li&gt;
&lt;li&gt;2,500+ stars on &lt;a href="https://github.com/voxel51/fiftyone"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;3,000+ &lt;a href="https://www.meetup.com/pro/computer-vision-meetups/"&gt;Meetup members&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/voxel51/fiftyone/network/dependents?package_id=UGFja2FnZS0xNzAxODM0MjUx"&gt;Used by&lt;/a&gt; 241+ repositories&lt;/li&gt;
&lt;li&gt;55+ &lt;a href="https://github.com/voxel51/fiftyone/graphs/contributors"&gt;contributors&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html"&gt;Get started!&lt;/a&gt; We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ"&gt;FiftyOne Slack community&lt;/a&gt;, we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>computervision</category>
      <category>machinelearning</category>
      <category>ai</category>
      <category>datascience</category>
    </item>
    <item>
      <title>FiftyOne Computer Vision Model Evaluation Tips and Tricks – Feb 03, 2023</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Fri, 03 Feb 2023 18:40:54 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/fiftyone-computer-vision-model-evaluation-tips-and-tricks-feb-03-2023-41ne</link>
      <guid>https://forem.com/voxel51-brian/fiftyone-computer-vision-model-evaluation-tips-and-tricks-feb-03-2023-41ne</guid>
      <description>&lt;p&gt;Welcome to our weekly &lt;a href="https://voxel51.com/blog/category/tips-tricks/" rel="noopener noreferrer"&gt;FiftyOne tips and tricks blog&lt;/a&gt; where we give practical pointers for using FiftyOne on topics inspired by discussions in the open source community. This week we’ll cover &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html" rel="noopener noreferrer"&gt;model evaluation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wait, what’s FiftyOne?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://voxel51.com/fiftyone/" rel="noopener noreferrer"&gt;FiftyOne&lt;/a&gt; is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbeoxjznijsbr5v1np6bf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbeoxjznijsbr5v1np6bf.gif" alt="Short gif showing key features of open source FiftyOne" width="8" height="4"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.voxel51.com/" rel="noopener noreferrer"&gt;Get started&lt;/a&gt;! We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack community&lt;/a&gt;, we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ok, let’s dive into this week’s tips and tricks!&lt;/p&gt;

&lt;h2&gt;
  
  
  A primer on model evaluations
&lt;/h2&gt;

&lt;p&gt;FiftyOne provides a variety of builtin methods for evaluating your model predictions, including regressions, classifications, detections, polygons, instance and semantic segmentations, on both image and video datasets.&lt;/p&gt;

&lt;p&gt;When you evaluate a model in FiftyOne, you get access to the standard &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html#aggregate-metrics" rel="noopener noreferrer"&gt;aggregate metrics&lt;/a&gt; such as classification reports, &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html#id11" rel="noopener noreferrer"&gt;confusion matrices&lt;/a&gt;, and &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html#map-and-pr-curves" rel="noopener noreferrer"&gt;PR curves&lt;/a&gt; for your model. In addition, FiftyOne can also record fine-grained statistics like accuracy and false positive counts at the sample-level, which you can leverage via dataset views and the FiftyOne App to interactively explore the strengths and weaknesses of your models on individual data samples.&lt;/p&gt;

&lt;p&gt;FiftyOne’s model evaluation methods are conveniently exposed as methods on all &lt;code&gt;Dataset&lt;/code&gt; and &lt;code&gt;DatasetView&lt;/code&gt; objects, which means that you can evaluate entire datasets or specific views into them via the same syntax.&lt;/p&gt;

&lt;p&gt;Continue reading for some tips and tricks to help you master evaluations in FiftyOne!&lt;/p&gt;

&lt;h2&gt;
  
  
  Task-specific evaluation methods
&lt;/h2&gt;

&lt;p&gt;In FiftyOne, the Evaluation API supports common computer vision tasks like object detection and classification with default evaluation methods that implement some of the standard routines in the field. For standard object detection, for instance, the default evaluation style is MS COCO. In most other cases, the default evaluation style is denoted &lt;code&gt;"simple"&lt;/code&gt;. If the default style for a given task is what you are looking for, then there is no need to specify the &lt;code&gt;method&lt;/code&gt; argument.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, you can explicitly specify a method to use for model evaluation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    method="open-images"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each evaluation method has an associated evaluation config, which specifies what arguments can be passed into the evaluation routine when using that style of evaluation. For &lt;a href="https://docs.voxel51.com/api/fiftyone.utils.eval.activitynet.html#fiftyone.utils.eval.activitynet.ActivityNetEvaluationConfig" rel="noopener noreferrer"&gt;ActivityNet-style evaluation&lt;/a&gt;, for example, you can pass in an &lt;code&gt;iou&lt;/code&gt; argument specifying the IoU threshold to use, and you can pass in &lt;code&gt;compute_mAP=True&lt;/code&gt; to tell the method to compute the mean average precision.&lt;/p&gt;
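&lt;p&gt;As a sketch (the field names &lt;code&gt;predictions&lt;/code&gt; and &lt;code&gt;ground_truth&lt;/code&gt; here are assumed names on a hypothetical video dataset with temporal detection labels), an ActivityNet-style evaluation might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## assumes `dataset` is a video dataset with TemporalDetection labels
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    method="activitynet",
    iou=0.5,
    compute_mAP=True,
)
print(results.mAP())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;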

&lt;p&gt;To see which label types are available for a dataset, check out the section detailing that dataset in the FiftyOne Dataset Zoo documentation.&lt;/p&gt;

&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/tutorials/evaluate_detections.html" rel="noopener noreferrer"&gt;evaluating object detections&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evaluations on views
&lt;/h2&gt;

&lt;p&gt;All methods in FiftyOne’s Evaluation API that are applicable to &lt;code&gt;Dataset&lt;/code&gt; instances are also exposed on &lt;code&gt;DatasetView&lt;/code&gt; instances. This means that you can compute evaluations on subsets of your dataset obtained by filtering, matching, and chaining together any number of view stages.&lt;/p&gt;

&lt;p&gt;As an example, we can evaluate detections only on samples that are highly &lt;a href="https://docs.voxel51.com/tutorials/uniqueness.html" rel="noopener noreferrer"&gt;unique&lt;/a&gt; in our dataset, and which have fewer than 10 predicted detections:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz
from fiftyone import ViewField as F
dataset = foz.load_zoo_dataset("quickstart")
## compute uniqueness of each sample
fob.compute_uniqueness(dataset)
## create DatasetView with 50 most unique images
unique_view = dataset.sort_by(
    "uniqueness", 
    reverse=True
).limit(50)
## get only the unique images with fewer than 10 predicted detections
few_pred_unique_view = unique_view.match(
    F("predictions.detections").length() &amp;lt; 10
)
## evaluate detections for this view
few_pred_unique_view.evaluate_detections(
    "predictions", 
    gt_field="ground_truth",
    eval_key="eval_few_unique"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Learn more about the &lt;a href="https://docs.voxel51.com/user_guide/brain.html" rel="noopener noreferrer"&gt;FiftyOne Brain&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Plotting interactive confusion matrices
&lt;/h2&gt;

&lt;p&gt;For classification and detection evaluations, FiftyOne’s evaluation routines generate &lt;a href="https://en.wikipedia.org/wiki/Confusion_matrix" rel="noopener noreferrer"&gt;confusion matrices&lt;/a&gt;. You can plot these confusion matrices in FiftyOne with the &lt;code&gt;plot_confusion_matrix()&lt;/code&gt; method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
## generate evaluation results
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth"
)
## plot confusion matrix
classes = ["person", "kite", "car", "bird"]
plot = results.plot_confusion_matrix(classes=classes)
plot.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because the confusion matrix is implemented in &lt;a href="https://plotly.com/python/" rel="noopener noreferrer"&gt;plotly&lt;/a&gt;, it is interactive! To interact visually with your data via the confusion matrix, attach the plot to a session launched with the dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## create a session and attach plot
session = fo.launch_app(dataset)
session.plots.attach(plot)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clicking into a cell in the confusion matrix then changes which samples appear in the sample grid in the &lt;a href="https://docs.voxel51.com/user_guide/app.html" rel="noopener noreferrer"&gt;FiftyOne App&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ljcm01swcsvib01ym56.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ljcm01swcsvib01ym56.gif" alt="confusion matrix in FiftyOne App" width="1536" height="703"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/user_guide/plots.html" rel="noopener noreferrer"&gt;interactive plotting&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evaluating frames of a video
&lt;/h2&gt;

&lt;p&gt;All of the evaluation methods in FiftyOne’s Evaluation API can be applied to frame-level labels in addition to sample-level labels. This means that you can evaluate video samples without needing to convert the frames of a video sample to standalone image samples. &lt;/p&gt;

&lt;p&gt;Applying FiftyOne evaluation methods to video frames also has the added benefit that useful statistics are computed at both the frame and sample levels. For instance, the following code populates the fields &lt;code&gt;eval_tp&lt;/code&gt;, &lt;code&gt;eval_fp&lt;/code&gt;, and &lt;code&gt;eval_fn&lt;/code&gt; as summary statistics on the sample level, containing the total number of true positives, false positives, and false negatives across all frames in the sample. Additionally, on each frame, the evaluation populates an &lt;code&gt;eval&lt;/code&gt; field for each detection with a value of either &lt;code&gt;tp&lt;/code&gt;, &lt;code&gt;fp&lt;/code&gt;, or &lt;code&gt;fn&lt;/code&gt;, as well as an &lt;code&gt;eval_iou&lt;/code&gt; field where appropriate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import random
import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset(
    "quickstart-video", 
    dataset_name="video-eval-demo"
)
## Create some test predictions 
classes = dataset.distinct("frames.detections.detections.label")
def jitter(val):
    if random.random() &amp;lt; 0.10:
        return random.choice(classes)
    return val
predictions = []
for sample_gts in dataset.values("frames.detections"):
    sample_predictions = []
    for frame_gts in sample_gts:
        sample_predictions.append(
            fo.Detections(
                detections=[
                    fo.Detection(
                        label=jitter(gt.label),
                        bounding_box=gt.bounding_box,
                        confidence=random.random(),
                    )
                    for gt in frame_gts.detections
                ]
            )
        )
    predictions.append(sample_predictions)
dataset.set_values("frames.predictions", predictions)
dataset.evaluate_detections(
    "frames.predictions",
    gt_field="frames.detections",
    eval_key="eval",
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that, in practice, the only difference is the &lt;code&gt;frames.&lt;/code&gt; prefix used to specify the predictions field and the ground truth field.&lt;/p&gt;

&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/user_guide/using_views.html#video-views" rel="noopener noreferrer"&gt;video views&lt;/a&gt; and &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html#evaluating-videos" rel="noopener noreferrer"&gt;evaluating videos&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Managing multiple evaluations
&lt;/h2&gt;

&lt;p&gt;With all of the flexibility the Evaluation API provides, you’d be well within reason to wonder which evaluation you should perform. Fortunately, FiftyOne makes it easy to perform, manage, and store the results from multiple evaluations!&lt;/p&gt;

&lt;p&gt;The results from each evaluation can be stored and accessed via an evaluation key, specified by the &lt;code&gt;eval_key&lt;/code&gt; argument. This allows you to compare different evaluation methods on the same data,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
dataset = foz.load_zoo_dataset("quickstart")
dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    method="coco",
    eval_key="coco_eval"
)
dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    method="open-images",
    eval_key="oi_eval"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;evaluate predictions generated by multiple models,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dataset.evaluate_detections(
    "model1_predictions", 
    gt_field = "ground_truth", 
    eval_key = "model1_eval"
)
dataset.evaluate_detections(
    "model2_predictions", 
    gt_field = "ground_truth", 
    eval_key = "model2_eval"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or compare evaluations on different subsets or views of your data, such as a view with only small bounding boxes and a view with only large bounding boxes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from fiftyone import ViewField as F
bbox_area = (
    F("bounding_box")[2] *
    F("bounding_box")[3]
)
large_boxes = bbox_area &amp;gt; 0.7
small_boxes = bbox_area &amp;lt; 0.3
# Create a view that contains only small-sized objects
small_view = (
    dataset
    .filter_labels(
        "ground_truth", 
        small_boxes
    )
)
# Create a view that contains only large-sized objects
large_view = (
    dataset
    .filter_labels(
        "ground_truth", 
        large_boxes
    )
)
small_view.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval_small",
)
large_view.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval_large",
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
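&lt;p&gt;Once results are stored under distinct evaluation keys, they can be listed, reloaded, and deleted later. A minimal sketch, using the &lt;code&gt;eval_small&lt;/code&gt; and &lt;code&gt;eval_large&lt;/code&gt; keys from the views above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## list the evaluation keys stored on the dataset
print(dataset.list_evaluations())
## reload the stored results for a given key
results = dataset.load_evaluation_results("eval_small")
## delete an evaluation and its recorded statistics
dataset.delete_evaluation("eval_large")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;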



&lt;p&gt;Learn more about &lt;a href="https://docs.voxel51.com/user_guide/evaluation.html#managing-evaluations" rel="noopener noreferrer"&gt;managing model evaluations&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Join the FiftyOne community!
&lt;/h2&gt;

&lt;p&gt;Join the thousands of engineers and data scientists already using FiftyOne to solve some of the most challenging problems in computer vision today!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,300+ &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack&lt;/a&gt; members&lt;/li&gt;
&lt;li&gt;2,500+ stars on &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;2,900+ &lt;a href="https://www.meetup.com/pro/computer-vision-meetups/" rel="noopener noreferrer"&gt;Meetup members&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/voxel51/fiftyone/network/dependents?package_id=UGFja2FnZS0xNzAxODM0MjUx" rel="noopener noreferrer"&gt;Used by&lt;/a&gt; 241+ repositories&lt;/li&gt;
&lt;li&gt;55+ &lt;a href="https://github.com/voxel51/fiftyone/graphs/contributors" rel="noopener noreferrer"&gt;contributors&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, give the &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;FiftyOne project a star&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.voxel51.com/" rel="noopener noreferrer"&gt;Get started with FiftyOne&lt;/a&gt;! We’ve made it easy to get up and running in a few minutes.&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;FiftyOne Slack community&lt;/a&gt;, we’re always happy to help.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>computervision</category>
      <category>machinelearning</category>
      <category>ai</category>
      <category>datascience</category>
    </item>
    <item>
      <title>FiftyOne Computer Vision Tips and Tricks - Sept 30, 2022</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Fri, 30 Sep 2022 15:07:04 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/fiftyone-computer-vision-tips-and-tricks-sept-30-2022-1b49</link>
      <guid>https://forem.com/voxel51-brian/fiftyone-computer-vision-tips-and-tricks-sept-30-2022-1b49</guid>
      <description>&lt;p&gt;Welcome to our weekly FiftyOne tips and tricks blog where we recap interesting questions and answers that have recently popped up on &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;Slack&lt;/a&gt;, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, Stack Overflow, and Reddit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wait, what’s FiftyOne?
&lt;/h2&gt;

&lt;p&gt;FiftyOne is an open source machine learning toolset that enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpnro7iicvktx25rqcjpr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpnro7iicvktx25rqcjpr.gif" alt="fiftyone in action gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html" rel="noopener noreferrer"&gt;Get started&lt;/a&gt;: we’ve made it easy to get up and running in a few minutes&lt;/li&gt;
&lt;li&gt;Join the FiftyOne &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;Slack community&lt;/a&gt;, we’re always happy to help&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ok, let’s dive into this week’s tips and tricks!&lt;/p&gt;

&lt;h2&gt;
  
  
  Custom plugins for the FiftyOne App
&lt;/h2&gt;

&lt;p&gt;Community Slack member Gerard Corrigan asked,&lt;/p&gt;

&lt;p&gt;“I’d like to integrate FiftyOne with another app. How can I do that?”&lt;/p&gt;

&lt;p&gt;With the latest &lt;a href="https://medium.com/voxel51/announcing-fiftyone-0-17-0-with-grouped-datasets-3d-geolocation-and-custom-plugins-339600ab73a1" rel="noopener noreferrer"&gt;FiftyOne 0.17.0&lt;/a&gt; release you can now customize and extend the FiftyOne App’s behavior. For example, if you need a unique way to visualize individual samples, plot entire datasets, or fetch FiftyOne data, a custom plugin just might be the ticket!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzcq055du81xssi3ml8vl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzcq055du81xssi3ml8vl.png" alt="custom plugin in FiftyOne"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Learn more about &lt;a href="https://github.com/voxel51/fiftyone/blob/develop/app/packages/plugins/README.md" rel="noopener noreferrer"&gt;how to develop custom plugins&lt;/a&gt; for the FiftyOne App on GitHub.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exporting visualization options from the FiftyOne App
&lt;/h2&gt;

&lt;p&gt;Community Slack member Murat Aksoy asked,&lt;/p&gt;

&lt;p&gt;“Is there a way to export videos from the FiftyOne App? Specifically after an end-user makes adjustments to visualization options such as which labels to show, opacity, etc?”&lt;/p&gt;

&lt;p&gt;The best workflow to accomplish this would be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Interactively select/filter in the FiftyOne App in IPython/notebook&lt;/li&gt;
&lt;li&gt;Press the “bookmark” icon to save your current filters into the view bar&lt;/li&gt;
&lt;li&gt;Render the labels for the current view from Python via:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;session.view.draw_labels(...)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;draw_labels()&lt;/code&gt; method provides a bunch of options for configuring the look-and-feel of the exported labels, including font sizes, transparency, etc.&lt;/p&gt;

&lt;p&gt;In a nutshell, you can use the FiftyOne App to visually filter, but you must use &lt;code&gt;draw_labels()&lt;/code&gt; in Python to trigger the rendering and provide any look-and-feel customizations you want, like transparency.&lt;/p&gt;
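&lt;p&gt;For example, a sketch of such a call (the output directory and label field here are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## render the current view's media with labels overlaid
session.view.draw_labels(
    "/tmp/rendered-labels",
    label_fields=["ground_truth"],
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;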

&lt;p&gt;Learn more about the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/draw_labels.html" rel="noopener noreferrer"&gt;&lt;code&gt;draw_labels()&lt;/code&gt; method&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Retrieving aggregations per video
&lt;/h2&gt;

&lt;p&gt;Community Slack member Tadej Svetina asked,&lt;/p&gt;

&lt;p&gt;“I have a video dataset and I am interested in getting some aggregations (let’s say count of detections) per video. How do I do that?”&lt;/p&gt;

&lt;p&gt;You can use the &lt;code&gt;values()&lt;/code&gt; aggregation for this, along with some fairly advanced view expression usage. Specifically, you can reduce the frames in each video to a single value based on the length of the detections in a field of each video. For example, here’s a way to get the number of detections in every video of the dataset.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
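&lt;p&gt;A sketch of the approach, under the assumption that detections are stored in a &lt;code&gt;frames.detections&lt;/code&gt; field (as in the quickstart-video zoo dataset):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone.zoo as foz
from fiftyone import ViewField as F
from fiftyone.core.expressions import VALUE

dataset = foz.load_zoo_dataset("quickstart-video")

## reduce each video's frames to the total number of detections
num_objects = F("frames").reduce(
    VALUE + F("detections.detections").length(), init_val=0
)
counts = dataset.values(num_objects)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;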


&lt;p&gt;Or you can modify this code slightly to get a dictionary mapping of video IDs to the number of objects in each video:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;id_num_objects_map = dict(zip(*dataset.values(["id", num_objects])))&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Learn more about how to use &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.expressions.html#fiftyone.core.expressions.ViewExpression.reduce" rel="noopener noreferrer"&gt;&lt;code&gt;reduce()&lt;/code&gt;&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Working with polylines and labels using the CVAT integration
&lt;/h2&gt;

&lt;p&gt;Community Slack member Guillaume Dumont asked,&lt;/p&gt;

&lt;p&gt;“I am using the CVAT integration with a local CVAT server and somehow, in the cases where the polylines have the same &lt;code&gt;label_id&lt;/code&gt;, the last one would override the previous one when downloading annotations. This ended up leaving a single polyline where I expected there to be many. Any ideas what’s going on here?”&lt;/p&gt;

&lt;p&gt;When calling &lt;code&gt;to_polylines()&lt;/code&gt;, make sure to pass &lt;code&gt;mask_types="thing"&lt;/code&gt;, which gives each segment a unique ID, rather than the default &lt;code&gt;mask_types="stuff"&lt;/code&gt;. You can also directly annotate semantic segmentation masks and let FiftyOne manage the conversion to polylines for you.&lt;/p&gt;

&lt;p&gt;Here’s the relevant snippet from &lt;a href="https://voxel51.com/docs/fiftyone/api/fiftyone.core.labels.html?highlight=mask#fiftyone.core.labels.Segmentation.to_detections" rel="noopener noreferrer"&gt;the Docs&lt;/a&gt; regarding &lt;code&gt;mask_types&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;`mask_types`(“stuff”) — whether the classes are “stuff” (amorphous regions of pixels) or “thing” (connected regions, each representing an instance of the thing).

Can be any of the following:

- “stuff” if all classes are stuff classes

- “thing” if all classes are thing classes

- a dict mapping pixel values to “stuff” or “thing” for each class
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Learn more about the &lt;a href="https://voxel51.com/docs/fiftyone/integrations/cvat.html?highlight=cvat" rel="noopener noreferrer"&gt;CVAT integration&lt;/a&gt; and &lt;code&gt;to_polylines()&lt;/code&gt; and &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#semantic-segmentation" rel="noopener noreferrer"&gt;semantic segmentation&lt;/a&gt; in the FiftyOne Docs.&lt;/p&gt;
&lt;h2&gt;
  
  
  Exporting only landscape orientation images
&lt;/h2&gt;

&lt;p&gt;Community Slack member Stan asked,&lt;/p&gt;

&lt;p&gt;“How would I go about exporting only images in a certain orientation, for example landscape vs portrait? I have a script to tag images as landscape by checking if width is greater than height and then removing all images and annotations for the images that are not in the landscape orientation. What would be the FiftyOne approach for this?”&lt;/p&gt;

&lt;p&gt;Here’s a way to isolate landscape image samples, and then remove all other samples, using for example the quickstart dataset:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
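&lt;p&gt;A minimal sketch of the approach using the quickstart dataset (the export directory is hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

dataset = foz.load_zoo_dataset("quickstart")

## populate image width/height metadata
dataset.compute_metadata()

## delete every sample that is not in landscape orientation
non_landscape = dataset.match(F("metadata.width") &amp;lt;= F("metadata.height"))
dataset.delete_samples(non_landscape)

## export the remaining landscape samples and their annotations
dataset.export(
    export_dir="/tmp/landscape",
    dataset_type=fo.types.FiftyOneImageDetectionDataset,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;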



&lt;p&gt;Learn more about the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/dataset_zoo/index.html" rel="noopener noreferrer"&gt;FiftyOne Dataset Zoo&lt;/a&gt; quickstart dataset in the FiftyOne Docs.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;If you like what you see on GitHub, &lt;a href="https://github.com/voxel51/fiftyone" rel="noopener noreferrer"&gt;give the project a star&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://voxel51.com/docs/fiftyone/index.html" rel="noopener noreferrer"&gt;Get started&lt;/a&gt;: we’ve made it easy to get up and running in a few minutes&lt;/li&gt;
&lt;li&gt;Join the FiftyOne &lt;a href="https://join.slack.com/t/fiftyone-users/shared_invite/zt-s6936w7b-2R5eVPJoUw008wP7miJmPQ" rel="noopener noreferrer"&gt;Slack community&lt;/a&gt;, we’re always happy to help&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>computervision</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>ai</category>
    </item>
    <item>
      <title>Announcing Our $12.5M Series A Funding to Bring Transparency and Clarity to the World’s Data</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Fri, 23 Sep 2022 17:28:48 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/announcing-our-125m-series-a-funding-to-bring-transparency-and-clarity-to-the-worlds-data-2joj</link>
      <guid>https://forem.com/voxel51-brian/announcing-our-125m-series-a-funding-to-bring-transparency-and-clarity-to-the-worlds-data-2joj</guid>
      <description>&lt;p&gt;We're delighted to announce that we raised $12.5 million in Series A funding from new investors Drive Capital, Top Harvest Capital, and Shasta Ventures as well as existing investors eLab Ventures and ID Ventures. This financing allows us to accelerate the next phase of our growth in bringing data-centric machine learning to the world.&lt;/p&gt;

&lt;p&gt;Since we started &lt;a href="https://voxel51.com/"&gt;Voxel51&lt;/a&gt; in October 2018, we’ve been building open source and commercial software that enables developers, scientists, and organizations to build high-quality datasets and computer vision models that are powering some of today’s most remarkable machine learning and artificial intelligence. Our software provides the infrastructure for these users to analyze and modulate their datasets allowing them to address critical issues like data bias.&lt;/p&gt;

&lt;p&gt;One of our first big milestones was in August 2020 with the launch of the &lt;a href="https://github.com/voxel51/fiftyone"&gt;open source FiftyOne project&lt;/a&gt;. Since then, FiftyOne has seen incredible growth and adoption. Today FiftyOne is used by tens of thousands of engineers and scientists and has reached 150,000+ monthly active machines, 1900+ GitHub stars, and 1000+ members in our Slack community. We’re grateful for everyone in the enthusiastic and growing FiftyOne community — for your support, contributions, and for being a part of this journey with us!&lt;/p&gt;

&lt;p&gt;Late last year, we began working with dozens of startups and Fortune 500 enterprises as early adopters of FiftyOne Teams, our commercial product that enables teams to securely collaborate on their datasets and models. Our early adopters span a variety of industries — automotive, robotics, security, retail, healthcare, and more — including large organizations, which are typically risk averse; a testament to the utility and value that FiftyOne Teams provides! By providing us with real-world usage, input, and feedback, our early adopters helped us shape and harden the &lt;a href="https://voxel51.com/fiftyone-teams/"&gt;FiftyOne Teams&lt;/a&gt; solution that we’re publicly announcing today. (You can find the full press release &lt;a href="https://www.prnewswire.com/news-releases/voxel51-raises-12-5m-series-a-to-bring-transparency-and-clarity-to-computer-vision-data-301629679.html"&gt;here&lt;/a&gt;.) A huge shoutout and thank you to all of our early adopters for making FiftyOne Teams ready for the broader ML/AI community to enjoy!&lt;/p&gt;

&lt;p&gt;So… what’s next for Voxel51? It’s no secret that there has been an explosion of visual data. For example, there are an estimated &lt;a href="https://www.ldv.co/blog/2017/8/8/45-billion-cameras-by-2022-fuel-business-opportunities"&gt;45 billion cameras&lt;/a&gt; in the world today, and the growth will continue to accelerate for decades. This creates a tremendous opportunity for computer vision applications, but only if the data can be properly organized, indexed, and labeled.&lt;/p&gt;

&lt;p&gt;As Voxel51 enters its next phase of growth, we’re doubling down on our commitment to build FiftyOne alongside the community so that it remains the leading open source tool for building high-quality datasets and computer vision models. We’re also accelerating the development of FiftyOne Teams as critical and trusted infrastructure for managing organization’s visual data, enabling them to build machine learning systems based on high quality, transparent data that brings their ML-powered products to market faster.&lt;/p&gt;

&lt;p&gt;We’re going to need many more exceptional and diverse people to achieve our mission of bringing transparency and clarity to the world’s data. If our mission excites you, check out our &lt;a href="https://voxel51.com/jobs/"&gt;open positions across product, engineering, community, and more&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Thank you to everyone who has supported Voxel51 over the years — our amazing team, investors, customers, partners, and open source community. Today is a significant milestone on our journey and I’m excited for many more to come!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>computervision</category>
      <category>ai</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Nearest Neighbor Embeddings Search with Qdrant and FiftyOne</title>
      <dc:creator>Voxel51-Brian</dc:creator>
      <pubDate>Thu, 21 Jul 2022 18:52:49 +0000</pubDate>
      <link>https://forem.com/voxel51-brian/nearest-neighbor-embeddings-search-with-qdrant-and-fiftyone-32ef</link>
      <guid>https://forem.com/voxel51-brian/nearest-neighbor-embeddings-search-with-qdrant-and-fiftyone-32ef</guid>
      <description>&lt;p&gt;Neural network embeddings are a low-dimensional representation of input data that give rise to a variety of applications. Embeddings have some interesting capabilities, as they are able to capture the semantics of the data points. This is especially useful for unstructured data like images and videos, so you can not only encode pixel similarities but also some more complex relationships.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ywh46j7dhc0q9drfbr1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ywh46j7dhc0q9drfbr1.png" alt="Embeddings from the [BDD100K dataset](https://voxel51.com/docs/fiftyone/user_guide/dataset_zoo/datasets.html?highlight=bdd100k#dataset-zoo-bdd100k) visualized using [FiftyOne](https://voxel51.com/docs/fiftyone/) and [Plotly](https://voxel51.com/docs/fiftyone/user_guide/plots.html)"&gt;&lt;/a&gt;&lt;em&gt;Embeddings from the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/dataset_zoo/datasets.html?highlight=bdd100k#dataset-zoo-bdd100k" rel="noopener noreferrer"&gt;BDD100K dataset&lt;/a&gt; visualized using &lt;a href="https://voxel51.com/docs/fiftyone/" rel="noopener noreferrer"&gt;FiftyOne&lt;/a&gt; and &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/plots.html" rel="noopener noreferrer"&gt;Plotly&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Performing searches over these embeddings gives rise to many use cases, such as classification, building recommendation systems, or even anomaly detection. One of the primary benefits of performing a nearest neighbor search on embeddings to accomplish these tasks is that there is no need to create a custom network for every new problem; you can often just use pre-trained models. In fact, it is often possible to use the embeddings generated by publicly available models without any further fine-tuning.&lt;/p&gt;

&lt;p&gt;While there are many powerful use cases for embeddings, workflows that perform searches over them face real challenges. Specifically, performing a nearest neighbor search on a large dataset, and then effectively acting on the results of that search (for example, auto-labeling data), are both technical and tooling challenges. To that end, Qdrant and FiftyOne can help make these workflows effortless.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://qdrant.tech/" rel="noopener noreferrer"&gt;Qdrant&lt;/a&gt; is an open-source vector database designed to perform an approximate nearest neighbor search (ANN) on dense neural embeddings which is necessary for any production-ready system that is expected to scale to large amounts of data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://voxel51.com/docs/fiftyone/" rel="noopener noreferrer"&gt;FiftyOne&lt;/a&gt; is an open-source dataset curation and model evaluation tool that allows you to effectively manage and visualize your dataset, generate embeddings, and improve your model results.&lt;/p&gt;

&lt;p&gt;In this article, we’re going to load the MNIST dataset into FiftyOne and perform classification based on ANN: each test data point will be classified by selecting the most common ground truth label among the K nearest points from our training dataset. In other words, for each test example, we’re going to select its K nearest neighbors using a chosen distance function and then pick the winning label by voting. To speed things up, all of the search in the vector space will be done with Qdrant. We will then evaluate the results of this classification in FiftyOne.&lt;/p&gt;
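Conceptually, the voting scheme above can be sketched as a brute-force K-nearest-neighbor classifier in a few lines of NumPy. This is only a toy illustration with hypothetical names (`train_embeddings`, `train_labels`, `query`); later in the post the actual neighbor search is delegated to Qdrant so it scales:

```python
from collections import Counter

import numpy as np

def knn_vote(query, train_embeddings, train_labels, k=15):
    """Brute-force K-nearest-neighbor voting under cosine similarity."""
    # Cosine similarity reduces to a dot product of L2-normalized vectors
    train_norm = train_embeddings / np.linalg.norm(
        train_embeddings, axis=1, keepdims=True
    )
    query_norm = query / np.linalg.norm(query)
    similarities = train_norm @ query_norm

    # Indices of the k most similar training samples
    top_k = np.argsort(similarities)[-k:]

    # Majority vote among the neighbors' ground truth labels
    votes = Counter(train_labels[i] for i in top_k)
    label, count = votes.most_common(1)[0]
    return label, count / k

# Toy example: two well-separated clusters of 4-d "embeddings"
rng = np.random.default_rng(0)
train = np.vstack([
    rng.normal([1, 0, 0, 0], 0.05, (20, 4)),
    rng.normal([0, 1, 0, 0], 0.05, (20, 4)),
])
labels = ["0 - zero"] * 20 + ["1 - one"] * 20
label, confidence = knn_vote(np.array([1.0, 0.0, 0.0, 0.0]), train, labels, k=5)
```

Qdrant performs the same lookup with an approximate index, which is what makes the approach practical beyond toy data.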

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;If you want to start using semantic search with Qdrant, you need to run an instance of it, as this tool works in a client-server manner. The easiest way to do this is to use the official Docker image and start Qdrant with just a single command:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker run -p "6333:6333" -p "6334:6334" -d qdrant/qdrant&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;After running the command, we’ll have the Qdrant server running, with the HTTP API exposed on port 6333 and the gRPC interface on port 6334.&lt;/p&gt;

&lt;p&gt;We will also need to install a few Python packages. We’re going to use FiftyOne to visualize the data, along with both the ground truth labels and the ones predicted by our embedding similarity model. The embeddings will be created by MobileNetV2, available in torchvision. Of course, we also need to communicate with the Qdrant server, and since we’re working in Python, the &lt;code&gt;qdrant_client&lt;/code&gt; package is the preferred way of doing that.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install fiftyone&lt;br&gt;
pip install torchvision&lt;br&gt;
pip install qdrant_client&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Processing pipeline
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Loading the dataset&lt;/li&gt;
&lt;li&gt;Generating embeddings&lt;/li&gt;
&lt;li&gt;Loading embeddings into Qdrant&lt;/li&gt;
&lt;li&gt;Nearest neighbor classification&lt;/li&gt;
&lt;li&gt;Evaluation in FiftyOne&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Loading the dataset
&lt;/h3&gt;

&lt;p&gt;There are several steps we need to take to get things running smoothly. First of all, we need to load the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/dataset_zoo/datasets.html#mnist" rel="noopener noreferrer"&gt;MNIST dataset&lt;/a&gt; and extract the train examples from it, as we’re going to use them in our search operations. To make everything even faster, we’re not going to use all the examples, but just 2500 samples. We can use the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/dataset_zoo/index.html" rel="noopener noreferrer"&gt;FiftyOne Dataset Zoo&lt;/a&gt; to load the subset of MNIST we want in just one line of code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import fiftyone as fo
import fiftyone.zoo as foz

# Load the data
dataset = foz.load_zoo_dataset("mnist", max_samples=2500)

# Get all training samples
train_view = dataset.match_tags(tags=["train"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s start by taking a look at the dataset in the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/app.html" rel="noopener noreferrer"&gt;FiftyOne App&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Visualize the dataset in FiftyOne
session = fo.launch_app(train_view)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03zy78sdep0i1809mv0r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03zy78sdep0i1809mv0r.png" alt="subset of MNIST data visualized"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Generating embeddings
&lt;/h3&gt;

&lt;p&gt;The next step is to generate embeddings for the samples in the dataset. This can always be done outside of FiftyOne with your own custom models. However, FiftyOne also provides a variety of models in the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/model_zoo/index.html" rel="noopener noreferrer"&gt;FiftyOne Model Zoo&lt;/a&gt; that can be used right out of the box to generate embeddings.&lt;/p&gt;

&lt;p&gt;In this example, we use &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/model_zoo/models.html#mobilenet-v2-imagenet-torch" rel="noopener noreferrer"&gt;MobileNetv2&lt;/a&gt; trained on ImageNet to compute an embedding for each image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Compute embeddings
model = foz.load_zoo_model("mobilenet-v2-imagenet-torch")

train_embeddings = train_view.compute_embeddings(model)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Loading embeddings into Qdrant
&lt;/h3&gt;

&lt;p&gt;Qdrant allows storing not only vectors but also corresponding attributes: each data point has a related vector and, optionally, a JSON payload attached to it. We will use the payload to pass in the ground truth label so that we can make our predictions later on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ground_truth_labels = train_view.values("ground_truth.label")
train_payload = [
    {"ground_truth": gt} for gt in ground_truth_labels
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the embeddings created, we can start communicating with the Qdrant server. An instance of &lt;code&gt;QdrantClient&lt;/code&gt; is helpful here, as it encapsulates all the required methods. Let’s connect and create a collection of points, simply called &lt;code&gt;"mnist"&lt;/code&gt;. The vector size depends on the model output, so if we want to experiment with a different model another day, we only need to import a different one; the rest stays the same. Finally, once the collection exists, we can send all the vectors along with payloads containing their true labels.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import qdrant_client as qc
from qdrant_client.http.models import Distance

# Load the train embeddings into Qdrant
def create_and_upload_collection(
    embeddings, payload, collection_name="mnist"
):
    client = qc.QdrantClient(host="localhost")
    client.recreate_collection(
        collection_name=collection_name,
        vector_size=embeddings.shape[1],
        distance=Distance.COSINE,
    )
    client.upload_collection(
        collection_name=collection_name,
        vectors=embeddings,
        payload=payload,
    )
    return client

client = create_and_upload_collection(train_embeddings, train_payload)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Nearest neighbor classification
&lt;/h3&gt;

&lt;p&gt;Now to perform inference on the dataset. We can create the embeddings for our test split, ignore its ground truth labels, predict them using ANN, and then compare the predictions against the ground truth. Let’s take it one step at a time and start by creating the embeddings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Assign the labels to test embeddings by selecting
# the most common label among the neighbours of each sample
test_view = dataset.match_tags(tags=["test"])
test_embeddings = test_view.compute_embeddings(model)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Time for some magic. Let’s simply iterate through the test dataset’s samples and their corresponding embeddings, and use the search operation to find the 15 closest embeddings from the training set. We’ll also need to fetch the payloads, as they contain the ground truth labels required to find the most common label in the neighborhood of a particular point. Python’s &lt;code&gt;Counter&lt;/code&gt; class will help us avoid boilerplate code. The most common label will be stored in an &lt;code&gt;"ann_prediction"&lt;/code&gt; field on each test sample in FiftyOne.&lt;/p&gt;

&lt;p&gt;This is encompassed in the function below which takes an embedding vector as input, uses the Qdrant search capability to find the nearest neighbors to the test embedding, generates a class prediction, and returns a FiftyOne &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#classification" rel="noopener noreferrer"&gt;Classification&lt;/a&gt; object that we can store in our &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html" rel="noopener noreferrer"&gt;FiftyOne dataset&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import collections
from tqdm import tqdm

def generate_fiftyone_classification(
    embedding, collection_name="mnist"
):
    search_results = client.search(
        collection_name=collection_name,
        query_vector=embedding,
        with_payload=True,
        top=15,
    )
    # Count the occurrences of each class and select the most common label
    # with the confidence estimated as the number of occurrences of 
    # the most common label divided by a total number of results.
    counter = collections.Counter(
        [point.payload["ground_truth"] for point in search_results]
    )
    predicted_class, occurrences_num = counter.most_common(1)[0]
    confidence = occurrences_num / sum(counter.values())
    prediction = fo.Classification(
        label=predicted_class, confidence=confidence
    )
    return prediction

predictions = []

# Call Qdrant to find the closest data points
for embedding in tqdm(test_embeddings):
    prediction = generate_fiftyone_classification(embedding)
    predictions.append(prediction)

test_view.set_values("ann_prediction", predictions)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By the way, we estimated the confidence by calculating the fraction of neighbors belonging to the most common label. That gives us an intuition of how sure the model was when predicting each label, and it can be used in FiftyOne to easily spot confusing examples.&lt;/p&gt;
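As a quick worked example of that confidence estimate, suppose (hypothetically) that 11 of the 15 nearest neighbors share a label:

```python
from collections import Counter

# Hypothetical ground truth labels of the 15 nearest neighbors
neighbor_labels = ["7 - seven"] * 11 + ["1 - one"] * 3 + ["9 - nine"]

counter = Counter(neighbor_labels)
predicted_class, occurrences_num = counter.most_common(1)[0]
confidence = occurrences_num / sum(counter.values())  # 11 / 15, about 0.73
```

A confidence near 1.0 means the neighborhood is unanimous, while a value close to 1/K signals a highly confusing sample worth inspecting in the App.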

&lt;h3&gt;
  
  
  Evaluation in FiftyOne
&lt;/h3&gt;

&lt;p&gt;It’s high time for some results! Let’s start by visualizing how this classifier has performed. We can easily launch the &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/app.html" rel="noopener noreferrer"&gt;FiftyOne App&lt;/a&gt; to view the ground truth, predictions, and images themselves.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;session = fo.launch_app(test_view)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3uecwota1yfcxx86vbjb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3uecwota1yfcxx86vbjb.png" alt="Viewing ground truth, predictions, and images in FiftyOne"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;FiftyOne provides a variety of built-in &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/evaluation.html" rel="noopener noreferrer"&gt;methods for evaluating your model&lt;/a&gt; predictions, including regressions, classifications, detections, polygons, instance and semantic segmentations, on both image and video datasets. In two lines of code, we can compute and print an evaluation report of our &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/evaluation.html#classifications" rel="noopener noreferrer"&gt;classifier&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Evaluate the ANN predictions, with respect to the values in ground_truth
results = test_view.evaluate_classifications(
    "ann_prediction", gt_field="ground_truth", eval_key="eval_simple"
)

# Display the classification metrics
results.print_report()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;precision    recall  f1-score   support

    0 - zero       0.87      0.98      0.92       219
     1 - one       0.94      0.98      0.96       287
     2 - two       0.87      0.72      0.79       276
   3 - three       0.81      0.87      0.84       254
    4 - four       0.84      0.92      0.88       275
    5 - five       0.76      0.77      0.77       221
     6 - six       0.94      0.91      0.93       225
   7 - seven       0.83      0.81      0.82       257
   8 - eight       0.95      0.91      0.93       242
    9 - nine       0.94      0.87      0.90       244

    accuracy                           0.87      2500
   macro avg       0.88      0.87      0.87      2500
weighted avg       0.88      0.87      0.87      2500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After performing the evaluation in FiftyOne, we can use the results object to generate an &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/plots.html#confusion-matrices" rel="noopener noreferrer"&gt;interactive confusion matrix&lt;/a&gt; allowing us to click on cells and automatically update the App to show the corresponding samples.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;plot = results.plot_confusion_matrix()
plot.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn70ah4m9byhi7f962s26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn70ah4m9byhi7f962s26.png" alt="confusion matrix"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s dig in a bit further. We can use FiftyOne’s &lt;a href="https://voxel51.com/docs/fiftyone/user_guide/using_views.html#filtering" rel="noopener noreferrer"&gt;sophisticated query language&lt;/a&gt; to easily find all predictions that did not match the ground truth yet were predicted with high confidence. These will generally be the most confusing samples in the dataset and the ones from which we can gather the most insight.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from fiftyone import ViewField as F

# Display FiftyOne app, but include only the wrong predictions that 
# were predicted with high confidence
false_view = (
    test_view
    .match(F("eval_simple") == False)
    .filter_labels("ann_prediction", F("confidence") &amp;gt; 0.7)
)
session.view = false_view
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffh57du0rjzq6sz389id9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffh57du0rjzq6sz389id9.png" alt="view updated mnist data sample"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These are the most confusing samples for the model and, as you can see, they are fairly irregular compared to other images in the dataset. A next step to improve the model’s performance could be to use FiftyOne to curate additional samples similar to these. From there, those samples can be annotated through the integrations between FiftyOne and tools like &lt;a href="https://voxel51.com/docs/fiftyone/integrations/cvat.html" rel="noopener noreferrer"&gt;CVAT&lt;/a&gt; and &lt;a href="https://voxel51.com/docs/fiftyone/integrations/labelbox.html" rel="noopener noreferrer"&gt;Labelbox&lt;/a&gt;. Additionally, we could use more vectors for training, or fine-tune the model with similarity learning, for example using the triplet loss. But even as-is, this example of using FiftyOne and Qdrant for vector similarity classification works pretty well.&lt;/p&gt;

&lt;p&gt;And that’s it! As simple as that, we created an ANN classification model using FiftyOne with Qdrant as an embeddings backend, so finding similar vectors no longer becomes the bottleneck it would be with a traditional brute-force k-NN.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it yourself!
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://colab.research.google.com/github/voxel51/fiftyone-examples/blob/feature/qdrant-recipe/examples/Qdrant_FiftyOne_Recipe.ipynb" rel="noopener noreferrer"&gt;Click here for the notebook&lt;/a&gt; containing the source code of what you saw in this. Additionally, it includes a realistic use case of this process to perform pre-annotation of night and day attributes on the BDD100K road-scene dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;FiftyOne and Qdrant can be used together to efficiently perform a nearest neighbor search on embeddings and act on the results on your image and video datasets. The beauty of this process lies in its flexibility and repeatability. You can easily load additional ground truth labels for new fields into both FiftyOne and Qdrant and repeat this pre-annotation process using the existing embeddings. This can quickly cut down on annotation costs and result in higher-quality datasets, faster.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This blog post was made in collaboration between the teams at &lt;a href="https://qdrant.tech/" rel="noopener noreferrer"&gt;Qdrant&lt;/a&gt; and &lt;a href="https://voxel51.com/" rel="noopener noreferrer"&gt;Voxel51&lt;/a&gt; and is co-authored by &lt;a href="https://www.linkedin.com/in/kacperlukawski/" rel="noopener noreferrer"&gt;Kacper Łukawski&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/eric-hofesmann/" rel="noopener noreferrer"&gt;Eric Hofesmann&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>computervision</category>
      <category>datascience</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
