Title
Labeled Patches Dataset for Semi-supervised YOLO Training on Cervical Cytology WSI [Dataset]
Author(s)
Keywords
Cervical cytology
Papanicolaou test
YOLO
Whole Slide Image
object detection
patch dataset
digital pathology
label validation
UNESCO classification
1203.04 Artificial Intelligence
2407.04 Cytology
2209.09 Infrared Radiation
2209.90 Digital Processing. Images
Publication date
2026
Publisher
Universidad de Salamanca
Citation
Cardona-Mendoza, A., Gil-González, A.B., et al. (2025). Labeled Patches Dataset for Semi-supervised YOLO Training on Cervical Cytology WSI. GREDOS Repository, University of Salamanca.
Abstract
[EN] This dataset contains labeled image patches extracted from Colombian Whole Slide Images (WSIs) of conventional Papanicolaou (Pap) tests. It supports the training and validation of object detection models (e.g., YOLO) for automated cervical cytology diagnosis. The patches (640×640 px JPGs) were extracted and labeled using a semi-automated pipeline that combines manual annotation in QuPath with automated patch extraction and YOLO label generation via Groovy and Python scripts. A web-based expert validation interface was used to verify label accuracy.
Description
Title Suggestion: Automated Cervical Cytology Screening: A Labeled WSI Patch Dataset for Deep Learning-Based Object Detection.
Abstract:
The automated detection of cellular structures using Deep Learning models represents a key strategy to optimize cervical cancer screening by reducing clinical workload and inter-observer variability. However, analyzing Whole Slide Images (WSI) presents critical challenges, including the scarcity of high-quality annotations, high morphological complexity, and significant class imbalance.
This dataset addresses these limitations by providing a collection of 640×640px image patches extracted from conventional Papanicolaou (Pap) tests from a Colombian clinical cohort. The data was processed through a semi-automated pipeline integrating manual annotation in QuPath, automated patch extraction, and YOLO label generation via Groovy and Python scripts. To ensure high diagnostic reliability, a web-based expert validation interface was implemented for final label verification. This resource supports the development of robust computer vision models aimed at enhancing precision and efficiency in digital cytological diagnosis.
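The label generation step described above can be illustrated with a minimal sketch. It assumes annotations arrive as pixel-space corner coordinates relative to the patch origin; the function name `to_yolo_line` and the clipping behavior are hypothetical and are not taken from the dataset's own Groovy or Python scripts. The output follows the standard YOLO text layout: class index, then box center and size normalized by the patch side.

```python
PATCH_SIZE = 640  # patch side in pixels, as stated in the dataset description


def to_yolo_line(class_id, x_min, y_min, x_max, y_max, size=PATCH_SIZE):
    """Convert a pixel-space box inside one patch to a YOLO label line.

    Hypothetical helper: the real pipeline may handle border boxes differently.
    """
    # Clip the box to the patch so annotations that cross the border stay valid.
    x_min, x_max = max(0.0, x_min), min(float(size), x_max)
    y_min, y_max = max(0.0, y_min), min(float(size), y_max)
    if x_max <= x_min or y_max <= y_min:
        return None  # the box lies entirely outside the patch
    # YOLO stores the box center and size, each divided by the image side.
    x_center = (x_min + x_max) / 2 / size
    y_center = (y_min + y_max) / 2 / size
    width = (x_max - x_min) / size
    height = (y_max - y_min) / size
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    print(to_yolo_line(0, 160, 160, 480, 480))
```

Clipping rather than discarding border boxes is one possible design choice; a pipeline could equally drop boxes whose visible fraction falls below some threshold.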
Associated with the publication
URI
DOI
10.71636/g7s3-1n94
Table of contents
1. File List:
✓ JPG patches (640x640)
✓ YOLO-format labels (.txt)
✓ Visually validated patches (.jpg with bounding boxes)
✓ Metadata snapshots (.csv)
✓ Logs of processed annotations (.txt)
2. Relationship between files, if important:
Each patch has a corresponding label file. Patches and labels are grouped by class. Visually validated images are derived from each patch-label pair.
3. File format:
✓ Images: .jpg
✓ Labels: .txt (YOLO format)
✓ Logs/Snapshots: .csv, .txt
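As a rough illustration of how the patch-label relationship and the YOLO text format listed above might be consumed, the sketch below parses label text back into pixel boxes and reports patches or labels that lack a partner. It assumes each patch and its label share a file stem, that stems are unique, and that the directory scanned holds only patches and labels (not the logs or visually validated images); the archive's actual folder layout is not documented here, and both function names are hypothetical.

```python
from pathlib import Path


def parse_yolo_labels(text, size=640):
    """Parse YOLO label text into (class_id, x_min, y_min, x_max, y_max) pixel boxes."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        # Undo the normalization: scale center and size back to pixels.
        xc, yc, w, h = (float(v) * size for v in parts[1:])
        boxes.append((class_id, xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2))
    return boxes


def find_unpaired(root):
    """Return stems of .jpg patches with no .txt label, and of labels with no patch.

    Assumes a patch and its label share a unique file stem (an assumption,
    not a documented property of the dataset).
    """
    root = Path(root)
    patches = {p.stem for p in root.rglob("*.jpg")}
    labels = {p.stem for p in root.rglob("*.txt")}
    return sorted(patches - labels), sorted(labels - patches)
```

Running `find_unpaired` on one class folder before training gives a quick consistency check; the pixel boxes returned by `parse_yolo_labels` are the same coordinates one would draw to reproduce an overlay similar to the visually validated images.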
Appears in collections
Files in this item
Name:
dataset_Cytolens_DemIA_GREDOS.rar
Embargoed until: 2030-12-31
Size:
76.20Mb
Format:
Unknown
Description:
Experimental data
Size:
5.413Kb
Format:
Unknown
Description:
readme












