Derinlemesine ayrılabilir evrişim ve LSTM ağları ile görüntülerden anlamsal ifade çıkarma

Şenel, Ezgisu

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.13091/5099

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	İşcan, Hazim	-
dc.contributor.author	Şenel, Ezgisu	-
dc.date.accessioned	2024-02-11T17:46:51Z	-
dc.date.available	2024-02-11T17:46:51Z	-
dc.date.issued	2023	-
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=weFMBHaUra8rsS5wi2bmHI6uXUY9TEjCQf7BWWOKLWTZ4yNv3i29dSQayD5Vrf3i	-
dc.identifier.uri	https://hdl.handle.net/20.500.13091/5099	-
dc.description.abstract	Bir görüntünün içeriğini, görüntünün bize neler anlattığını cümleler kurarak doğru bir şekilde ifade etmek insan beyni için her ne kadar kolay olsa da bir bilgisayar için bu işlemi yapmak oldukça zordur. Doğru ve iyi biçimlendirilmiş cümleler oluşturmak için, dilin hem sözdizimsel hem de anlamsal olarak anlaşılması gerekir. Bu konuda karşımıza çıkan en büyük zorluk sadece görüntülerde bulunan nesneleri değil, aynı zamanda bu nesnelerin birbirleriyle ilişkisini, nasıl bir ilişki içerisinde olduklarını ifade eden bir açıklama oluşturabilmektir. Derin öğrenme yaklaşımı ile ağlar, görüntülerdeki nesneleri, yüzleri, sahneleri ve diğer anlamsal bilgileri anlamak için büyük veri kümeleri üzerinde eğitilir. Görüntülerin anlamsal analizi, otomotiv, güvenlik, video gözetimi ve tıbbi görüntüleme gibi birçok alanda uygulanabilmektedir. Bu alan, daha doğru ve karmaşık analizler sağlayan yeni derin öğrenme modelleri ve büyük veri kümeleriyle sürekli olarak gelişmekte ve ilerlemektedir. Bu çalışmada, Flickr_8k veri setinde bulunan 8000 görüntünün, Xception modeli ile özellik çıkarımı yapılmıştır. Diğer bir yandan Flickr_8k'da bulunan görüntülere ait 5 açıklamadan, LSTM ile benzersiz sözlük yapısı ortaya çıkarılmıştır. Elde edilen bu iki veri transfer öğrenme yapılan modele verilerek görüntülerin doğal cümlelere çevrilmesi sağlanmıştır.	en_US
dc.description.abstract	Describing the content of an image accurately in sentences is easy for the human brain but very difficult for a computer. Producing accurate, well-formed sentences requires understanding the language both syntactically and semantically. The biggest challenge is to generate a description that captures not only the objects in an image but also how those objects relate to one another. In the deep learning approach, networks are trained on large datasets to understand objects, faces, scenes and other semantic information in images. Semantic analysis of images can be applied in many fields, such as automotive, security, video surveillance and medical imaging. The field is advancing continuously with new deep learning models and large datasets that enable more accurate and complex analysis. In this study, features were extracted from the 8000 images of the Flickr_8k dataset with the Xception model. In addition, a unique vocabulary structure was derived with LSTM from the 5 descriptions provided for each image in Flickr_8k. These two sets of data were fed to the transfer-learned model to translate the images into natural-language sentences.	en_US
dc.language.iso	tr	en_US
dc.publisher	Konya Teknik Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	en_US
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Derinlemesine ayrılabilir evrişim ve LSTM ağları ile görüntülerden anlamsal ifade çıkarma	en_US
dc.title.alternative	Semantic expression extraction from images with depthwise separable convolution and LSTM networks	en_US
dc.type	Master Thesis	en_US
dc.department	Enstitüler, Lisansüstü Eğitim Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı	en_US
dc.identifier.startpage	1	en_US
dc.identifier.endpage	54	en_US
dc.institutionauthor	Şenel, Ezgisu	-
dc.relation.publicationcategory	Tez	en_US
dc.identifier.yoktezid	840970	en_US
item.fulltext	No Fulltext	-
item.openairetype	Master Thesis	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.grantfulltext	none	-
item.cerifentitytype	Publications	-
item.languageiso639-1	tr	-
Appears in Collections:	Tez Koleksiyonu
