dc.contributor.author: Canal-Alonso, Ángel
dc.contributor.author: Jiménez, Pedro
dc.contributor.author: Egido, Noelia
dc.contributor.author: Prieto Tejedor, Javier
dc.contributor.author: Corchado Rodríguez, Juan Manuel
dc.date.accessioned: 2023-10-03T10:10:22Z
dc.date.available: 2023-10-03T10:10:22Z
dc.date.issued: 2022
dc.identifier.uri: http://hdl.handle.net/10366/153123
dc.description.abstract: [EN] Next-generation sequencing (NGS) has revolutionized the field of genomics, allowing a detailed and precise look at DNA. As this technology advanced, the need arose for standardized file formats to represent, analyze and store the vast data sets produced. In this article, we review the key file formats used in NGS: FASTA, FASTQ, BED, GFF, and VCF. The FASTA format, one of the oldest, provides a basic representation of genomic and protein sequences, identifiable by unique headers. FASTQ is essential for NGS, as it stores both the sequence and the associated quality information. BED provides a tabular representation of genomic loci, while GFF details the localization and structure of genomic features in reference sequences. Finally, VCF has emerged as the predominant standard for documenting genetic variants, from simple SNPs to complex structural variants. The adoption and adaptation of these formats have been fundamental for progress in bioinformatics and genomics. They provide a foundation on which to build sophisticated analyses, from gene discovery and function prediction to the identification of disease-associated variants. With a clear understanding of these formats, researchers and practitioners are better equipped to harness the power and potential of next-generation sequencing. [es_ES]
dc.description.sponsorship: This study has been funded by the AIR Genomics project (with file number CCTT3/20/SA/0003), through the call 2020 R&D PROJECTS ORIENTED TO THE EXCELLENCE AND COMPETITIVE IMPROVEMENT OF THE CCTT by the Institute of Business Competitiveness of Castilla y León and FEDER fund [es_ES]
dc.language.iso: eng [es_ES]
dc.subject: Next-Generation sequencing [es_ES]
dc.subject: File format [es_ES]
dc.subject: Data sharing [es_ES]
dc.title: File formats used in next generation sequencing: A literature review [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.subject.unesco: 1203.17 Informática [es_ES]
dc.subject.unesco: 2410.07 Genética Humana [es_ES]
dc.relation.projectID: CCTT3/20/SA/0003 [es_ES]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es_ES]
dc.type.hasVersion: info:eu-repo/semantics/publishedVersion [es_ES]

