Repositório Digital de Publicações Científicas: BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language


Repositório Digital de Publicações Científicas da Universidade de Évora

Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/32260

Title:	BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language
Authors:	Consoli, Bernardo; Dias, Henrique; Vieira, Renata; Bordini, Rafael; Ulbrich, Ana
Issue Date:	Jun-2022
Publisher:	LREC
Citation:	Consoli, B., Dias, H., Ulbrich, A., Vieira, R., Bordini, R. (2022). BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), pages 5609–5616, Marseille, 20-25 June 2022. © European Language Resources Association (ELRA)
Abstract:	Computational medicine research requires clinical data for training and testing purposes, so the development of datasets composed of real hospital data is of utmost importance in this field. Most such data collections are in the English language, were collected in anglophone countries, and do not reflect other clinical realities, which increases the importance of national datasets for projects that hope to positively impact public health. This paper presents a new Brazilian Clinical Dataset containing over 70,000 admissions from 10 hospitals in two Brazilian states, composed of a sum total of over 2.5 million free-text clinical notes alongside data pertaining to patient information, prescription information, and exam results. This data was collected, organized, deidentified, and is being distributed via credentialed access for the use of the research community. In the course of presenting the new dataset, this paper will explore the new dataset’s structure, population, and potential benefits of using this dataset in clinical AI tasks.
URI:	http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.602.pdf http://hdl.handle.net/10174/32260
Type:	article
Appears in Collections:	CIDEHUS - Artigos em Livros de Actas/Proceedings

Files in This Item:

File	Description	Size	Format
2022.lrec-1.602.pdf		285.95 kB	Adobe PDF

Serviços de Ciência e Cooperação - Universidade de Évora