Underwater video analysis · AI · Open source
Community Fish Detector
A community-built AI model trained to detect one class, 'fish', and generalise across marine environments and species.
Video from the Global Finprint Project YouTube channel
Through a community-driven effort, we standardised the publicly available, fragmented fish-imaging datasets into a single unified dataset, and used it to train a "universal fish model" that aims to detect fish in any underwater video or image, regardless of camera, depth, environment or species.
The problem
Fragmented AI approaches
- Every project trains its own model from scratch.
- Useful datasets/models exist, but in incompatible formats.
- Standardising data is costly; few groups can do it alone.
The approach
Community-led standardisation
- Contributors each adopt one dataset and convert it to COCO format.
- The cost of standardisation is split across many groups.
- Dataset, model and tooling are openly shared.
The Dataset
Each contributor adopted one open dataset and converted it to a common COCO-format schema, with a single fish class. The 17 sources span BRUVS, ROVs, deep-sea cameras, river-herring weirs, coral reefs, aquaculture pens, lab tanks and freshwater rivers, covering most of the imaging conditions a fish detector might encounter in the wild.
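As an illustration of what converting to a single-class COCO schema involves, here is a minimal sketch. The input record layout (`file_name`, `width`, `height`, `boxes`, `xyxy`) and the function name `to_single_class_coco` are hypothetical and not taken from the project's conversion scripts; only the output follows the standard COCO layout, where `bbox` is `[x, y, width, height]`.

```python
FISH_CATEGORY = {"id": 1, "name": "fish"}


def to_single_class_coco(records):
    """Convert hypothetical source records into a COCO-style dict with one class.

    Each record is assumed to look like:
      {"file_name": str, "width": int, "height": int,
       "boxes": [{"label": str, "xyxy": [x_min, y_min, x_max, y_max]}, ...]}
    The original species label is discarded: every box becomes "fish".
    """
    images, annotations = [], []
    ann_id = 1
    for image_id, rec in enumerate(records, start=1):
        images.append({
            "id": image_id,
            "file_name": rec["file_name"],
            "width": rec["width"],
            "height": rec["height"],
        })
        for box in rec["boxes"]:
            x_min, y_min, x_max, y_max = box["xyxy"]
            w, h = x_max - x_min, y_max - y_min
            if w <= 0 or h <= 0:
                continue  # skip degenerate boxes
            annotations.append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": FISH_CATEGORY["id"],
                "bbox": [x_min, y_min, w, h],  # COCO uses top-left x, y, width, height
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return {"images": images, "annotations": annotations,
            "categories": [FISH_CATEGORY]}
```

Collapsing every species label into one category is what lets datasets with very different taxonomies be pooled without first reconciling their label sets.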
We aggregated the converted datasets, deduplicated labels, audited bounding boxes for consistency, and split the result into training, validation and held-out evaluation sets, with the held-out split used to measure how the model generalises beyond the data it was trained on.
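The aggregation step can be sketched roughly as follows. This is an assumption-laden illustration, not the project's pipeline: the function names, the extra `source` field on each image, the exact-match duplicate rule and the hash-based train/validation assignment are all hypothetical choices, and it does not show how the held-out evaluation set was actually chosen.

```python
import hashlib


def merge_coco(datasets):
    """Merge {source_name: coco_dict} into one COCO dict with fresh, unique ids.

    Annotations with an identical box on the same image are treated as
    duplicates and kept only once (a simplifying assumption).
    """
    merged = {"images": [], "annotations": [],
              "categories": [{"id": 1, "name": "fish"}]}
    seen = set()
    next_image_id, next_ann_id = 1, 1
    for source, coco in datasets.items():
        id_map = {}  # old image id -> new image id, per source
        for img in coco["images"]:
            id_map[img["id"]] = next_image_id
            # "source" is a non-standard extra field recording provenance.
            merged["images"].append({**img, "id": next_image_id, "source": source})
            next_image_id += 1
        for ann in coco["annotations"]:
            new_image_id = id_map[ann["image_id"]]
            key = (new_image_id, tuple(ann["bbox"]))
            if key in seen:
                continue  # drop exact duplicate label
            seen.add(key)
            merged["annotations"].append({**ann, "id": next_ann_id,
                                          "image_id": new_image_id,
                                          "category_id": 1})
            next_ann_id += 1
    return merged


def assign_split(source, file_name, val_fraction=0.1):
    """Deterministically assign an image to 'train' or 'val' by hashing its name."""
    digest = hashlib.sha256(f"{source}/{file_name}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "val" if bucket < val_fraction else "train"
```

Hashing the file name instead of shuffling keeps the assignment stable when new datasets are added later, so an image never drifts from one split to another between releases.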
The dataset is openly hosted on LILA BC: lila.science/datasets/community-fish-detection-dataset.
17 contributors
17 datasets processed
2M images, 935K annotations
The Model
We trained two single-class detectors on the CFD corpus and release both as open weights:
RF-DETR · Default
Permissive license
A DETR-style transformer detector trained on the CFD corpus and released under a permissive open-source license (Apache 2.0). You can use, fine-tune and ship it inside commercial or closed-source projects without copyleft obligations, which is why we recommend RF-DETR as the default model.
YOLOv12x
Faster, AGPL-licensed
A real-time anchor-free detector, faster at inference and lighter to deploy on laptops and edge devices. It inherits Ultralytics' AGPL-3.0 license, a strong copyleft license that requires any work using it (including networked services) to be open-sourced under the same terms. Pick YOLOv12x when speed matters and you're comfortable with the AGPL obligations.
Training, evaluation and inference code is openly developed on github.com/filippovarini/community-fish-detector, including dataset preparation scripts, training configs and the held-out evaluation protocol.
79% Average Precision (AP@50) on held-out validation data
1 class: "fish", across every domain
Example detections across some of the imaging modalities CFD was trained on.
Evaluation against SOTA models
Comparison of CFD (YOLOv12x) with MegaFishDetector (MFD, Yang et al. 2023) on held-out evaluation datasets. Results in AP@50.
| Dataset | CFD | MFD |
|---|---|---|
| fathomnet* | 54 | 46 |
| deepfish* | 85 | 63 |
| noaa_puget* | 26 | 1 |
| fishtrack* | 52 | 16 |
| brackish | 66 | 1 |
| coralscapes | 32 | 1 |
| deep_vision | 88 | 23 |
| f4k | 67 | 1 |
| kakadu | 81 | 2 |
| marine_detect | 70 | 8 |
| river_herring | 83 | 2 |
| project_natick | 37 | 0 |
| roboflow_fish | 69 | 11 |
| salmon_cv | 94 | 11 |
| torsi | 79 | 12 |
| zebrafish | 99 | 65 |
* Dataset MFD was trained on.
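For readers unfamiliar with the metric, AP@50 counts a detection as correct when its intersection-over-union (IoU) with an unmatched ground-truth box is at least 0.5, then summarises the resulting precision-recall curve. The sketch below is a simplified single-class version that uses all-point interpolation. The function names and input layout are hypothetical, and the project's held-out evaluation protocol may differ in its details (COCO-style tooling, for instance, samples precision at 101 recall points).

```python
def iou_xywh(a, b):
    """IoU of two boxes given in COCO [x, y, width, height] format."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def ap50(predictions, ground_truth, iou_threshold=0.5):
    """Single-class average precision at a fixed IoU threshold.

    predictions:  list of (image_id, score, bbox)
    ground_truth: dict mapping image_id -> list of bbox
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    hits = []
    # Visit detections from most to least confident; each ground-truth box
    # can be claimed by at most one detection.
    for image_id, _score, box in sorted(predictions, key=lambda p: -p[1]):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(image_id, [])):
            if matched[image_id][idx]:
                continue
            iou = iou_xywh(box, gt_box)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_iou >= iou_threshold:
            matched[image_id][best_idx] = True
            hits.append(True)
        else:
            hits.append(False)
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / total_gt)
    # Replace each precision with the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

The function returns a value between 0 and 1; the table reports the same kind of score scaled to 0-100.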
Get started
Free to use, openly developed
The model, dataset and tooling are all open. Try the detector in your browser, pull the weights from GitHub, or download the dataset from LILA.
Try it out
Run CFD on your images in the Hugging Face Space.
💻 GitHub
Model weights, training code and documentation.
📊 Dataset
The CFD Dataset on LILA: 2M images, 935K annotations.
Questions? Contact us.
Citation
Cite our work
Varini, F. et al. (2026). Community Fish Detector. GitHub. github.com/filippovarini/community-fish-detector
@misc{varini2026cfd,
title = {Community Fish Detector: a community-built single-class fish detector
generalising across marine imaging domains},
author = {Varini, F. and contributors},
year = {2026},
howpublished = {\url{https://github.com/filippovarini/community-fish-detector}}
}
Model and dataset are openly available. Submit issues on GitHub; we usually respond within a week.
Contributors
Built by many hands
The Community Fish Detector exists thanks to the efforts of Filippo Varini, Dan Morris, Sonny Burniston, Océane Boulais, Kevin Barnard, Laura Chrobak, Alexander Merdian-Tarko, Devi Ayyagari, Mona Dhiflaoui, Joshua Chen, Gerard Calvo Bartra, Kalindi Fonda, Levi Veevee Cai, Giorgio De Pertis, Chris Jackett, Aditya Shirvalkar and Adrian Ibanez.
Get involved
We're building v2
We're looking for contributors to expand and improve the dataset, retrain the model and grow the working group. If you maintain a marine-imaging dataset, want to help curate one, or want to collaborate on the next iteration of the universal fish model, reach out.
Reach out to collaborate