Automated Captioning of Image and Audio for Visually and Hearing Impaired

Generating captions and text descriptions of images gives visually and hearing impaired people greater access to the real world, reducing their social isolation and improving their well-being, employability, and educational experience. This thesis presents algorithmic advances in generating such captions and text descriptions from both image and audio data. The focus on algorithmic innovation aims to make the resulting platform efficient and adaptable to various types of visual and auditory information, so that it can serve as a versatile aid for people with visual impairments. The thesis addresses this aim in three main contribution chapters: image captioning, video captioning, and audio-visual video captioning. The research progresses in stages, starting with image captioning. This initial phase concentrates on developing algorithms that accurately interpret and describe still images, and it lays the foundation for the subsequent phase, video captioning.

Full text under timed embargo: the files will become accessible on 01.09.2024.
Title
(dc.title)
Automated Captioning of Image and Audio for Visually and Hearing Impaired
Author
(dc.contributor.author)
Özkan Çaylı
Thesis Advisor
(dc.contributor.advisor)
Volkan Kılıç
Publisher
(dc.publisher)
İzmir Katip Çelebi Üniversitesi Fen Bilimleri Enstitüsü
Type
(dc.type)
Master's Thesis
Record Entry Date
(dc.date.accessioned)
2024-03-01
Open Access Date
(dc.date.available)
2024-09-01
Publication Date
(dc.date.issued)
2024
Publication Language
(dc.language.iso)
eng
Subject Headings
(dc.subject)
Image Captioning
Subject Headings
(dc.subject)
Video Captioning
Uniform Resource Identifier
(dc.identifier.uri)
https://hdl.handle.net/11469/3884
All resources on this site are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.