Saeed Niksaz,
Fahimeh Ghasemian
Automatic medical report generation is the task of producing grammatically correct and coherent reports from radiology images. The encoder-decoder is the most common architecture for report generation, but it has not yet achieved satisfactory performance because of the complexity of the task. This paper presents an approach for improving report generation that can be easily added to any encoder-decoder architecture. In this approach, in addition to the features extracted from the image, the text associated with the most similar image in the training data set is provided as input to the decoder. The decoder thus acquires additional knowledge for text production, which helps it produce better reports. To demonstrate the effectiveness of the proposed method, the technique was added to several different models for generating text from chest images. The evaluation results showed that the performance of all models improved. Different word embedding approaches, including BioBERT and GloVe, were also evaluated. Our results showed that BioBERT, a transformer-based language model, is the better approach for this task.
The study focuses on enhancing automatic medical report generation by supplying the decoder of an encoder-decoder architecture with the report text of the training image most similar to the input image, aiming to improve the coherence and accuracy of the generated reports.
In addition to using features extracted from the input image, the proposed approach provides the decoder with text related to the most similar image in the training dataset. This additional input offers the decoder more context, aiding in generating more accurate and coherent reports.
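The retrieval step described above can be sketched as a nearest-neighbour lookup over image feature vectors. This is a minimal illustration, not the authors' implementation: the source does not state how image similarity is measured, so cosine similarity is an assumption here, and the function names, the toy feature vectors, and the placeholder report strings are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def retrieve_most_similar_report(query_feat, train_feats, train_reports):
    """Return (report, index, similarity) for the training image closest to the query.

    Assumption: similarity is cosine similarity between encoder feature vectors.
    query_feat:    shape (d,)   features of the input image
    train_feats:   shape (n, d) features of the training images
    train_reports: list of n report strings paired with train_feats
    """
    q = l2_normalize(np.asarray(query_feat, dtype=np.float64))
    t = l2_normalize(np.asarray(train_feats, dtype=np.float64))
    sims = t @ q                      # cosine similarity to every training image
    best = int(np.argmax(sims))       # index of the most similar training image
    return train_reports[best], best, float(sims[best])


def build_decoder_inputs(query_feat, retrieved_report):
    """Bundle the two inputs the decoder receives: image features plus retrieved text.

    Hypothetical container; a real model would embed the text (for example with
    BioBERT or GloVe) before passing it to the decoder.
    """
    return {
        "image_features": np.asarray(query_feat, dtype=np.float64),
        "retrieved_text": retrieved_report,
    }


# Toy data: three "training images" with 3-dimensional features and placeholder reports.
train_feats = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
train_reports = ["report A", "report B", "report C"]

query_feat = np.array([0.9, 0.1, 0.0])
report, idx, sim = retrieve_most_similar_report(query_feat, train_feats, train_reports)
decoder_inputs = build_decoder_inputs(query_feat, report)
print(idx, report, round(sim, 3))
```

In practice the training features would be computed once by the image encoder and cached, so retrieval at inference time costs a single matrix-vector product before decoding begins.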
The implementation of this technique across various models for generating text from chest images demonstrated improved performance in all cases. The study also evaluated different word embedding methods, finding that BioBERT, a language model based on the transformer architecture, was particularly effective for this task.
BioBERT, a transformer-based language model, is well suited to biomedical text processing. In this study it served as a word embedding method and compared favorably with the other embeddings evaluated, including GloVe, which suggests that its representation of medical language contributes to more accurate and contextually relevant report generation.
This study introduces a novel method of enhancing encoder-decoder architectures by incorporating text from similar images, thereby improving the quality of generated medical reports. It also underscores the effectiveness of using advanced word embedding techniques like BioBERT in medical natural language processing tasks.