
OVERVIEW

• Image captioning is the process of generating a textual description of an image.
• It uses both natural language processing and computer vision to generate the captions.
DEEP LEARNING
• Deep learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, called artificial neural networks.
• In short, it is neural networks with multiple layers.
• Deep learning tackles the problem of image captioning far better than other programming paradigms.
PLATFORMS & TOOLS

• Kaggle platform
• Google Cloud Console platform
• Keras with TensorFlow as backend
• Pre-trained VGG model by Oxford
• Vim text editor
DATASET DESCRIPTION
• A good dataset to use when getting started with image captioning is the Flickr8k dataset.
• Flickr8k_Dataset.zip (1 GB): an archive of all photographs (6,000 + 2,000).
• Flickr8k_text.zip (2.2 MB): an archive of all text descriptions for the photographs (5 captions per image).
• The reason for using the Flickr8k dataset is that it is realistic and small enough to build models on a workstation using a CPU.
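As a minimal sketch of working with the caption archive, the snippet below parses Flickr8k-style caption lines (assuming the usual `Flickr8k.token.txt` format of `image.jpg#n<TAB>caption`; the helper name and the inline sample are ours, for illustration):

```python
from collections import defaultdict

def load_captions(text):
    """Parse Flickr8k-style lines '<image>.jpg#<n>\t<caption>'
    into a dict mapping each image to its list of captions."""
    captions = defaultdict(list)
    for line in text.strip().split("\n"):
        image_id, caption = line.split("\t")
        image_id = image_id.split("#")[0]  # drop the '#0'..'#4' caption index
        captions[image_id].append(caption.lower())
    return captions

# Two sample lines in the same format as the real file:
sample = (
    "1000268201.jpg#0\tA child in a pink dress .\n"
    "1000268201.jpg#1\tA girl going into a wooden building .\n"
)
caps = load_captions(sample)
print(len(caps["1000268201.jpg"]))  # 2
```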
CONCEPTS
• CNN (convolutional neural network), as used in VGG.
• Long short-term memory (LSTM) recurrent neural network.
• Text processing.
• BLEU score for textual evaluation.
• Linux and the command line.
• Google Cloud Console usage.
BRIEF APPROACH
(Diagram: pre-processed word sequences feed an RNN (LSTM); pre-processed image features feed a second branch; the two branches are combined in a merger layer.)
VGG (VISUAL GEOMETRY GROUP)
• Very deep convolutional networks for large-scale visual recognition.
• Convolutional networks (ConvNets) currently set the state of the art in visual recognition.
• Its authors created 16- and 19-layer models for the 2014 ImageNet (ILSVRC) competition.
• We have used the 16-layer VGG model in our project to extract the features of images.
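A sketch of this feature-extraction step in Keras: take the 16-layer VGG and drop its final classification layer, so the model outputs the 4096-dimensional `fc2` vector instead of class probabilities. (`weights=None` gives random weights and avoids the large download; real extraction would pass `weights='imagenet'`.)

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model

# Build VGG16 and re-wire it to stop at the second-to-last layer (fc2),
# so each image maps to a 4096-d feature vector.
base = VGG16(weights=None)
extractor = Model(inputs=base.input, outputs=base.layers[-2].output)

# One dummy 224x224 RGB image -> one 4096-d feature vector.
x = preprocess_input(np.zeros((1, 224, 224, 3), dtype="float32"))
features = extractor.predict(x, verbose=0)
print(features.shape)  # (1, 4096)
```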
CONVOLUTIONAL NEURAL NETWORK (CNN)
• CNNs are very similar to ordinary neural networks.
• ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture.
• CNNs operate over volumes (width × height × depth).
• They have filters and pooling layers because of the assumption that the input is always an image.
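To make "operating over volumes" concrete, here is a toy convolution in NumPy: a single 3-D filter slides over a height × width × channels volume, summing over all channels at each position (stride 1, no padding; the function name is ours):

```python
import numpy as np

def conv2d_valid(volume, kernel):
    """Convolve one (kh, kw, C) filter over an (H, W, C) volume,
    producing a 2-D feature map ('valid' padding, stride 1)."""
    H, W, C = volume.shape
    kh, kw, kc = kernel.shape
    assert kc == C, "filter depth must match input depth"
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise product of the filter with the local patch,
            # summed across height, width AND all channels.
            out[i, j] = np.sum(volume[i:i+kh, j:j+kw, :] * kernel)
    return out

img = np.arange(5 * 5 * 3, dtype=float).reshape(5, 5, 3)  # a 5x5 "RGB image"
filt = np.ones((3, 3, 3)) / 27.0                          # averaging filter
features = conv2d_valid(img, filt)
print(features.shape)  # (3, 3)
```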
RNN AND LSTM
• Recurrent neural networks are the state-of-the-art algorithm for sequential data, used by, among others, Apple's Siri and Google's voice search.
• An RNN has internal memory, which makes it well suited to machine-learning problems involving sequential data, because it can remember its inputs.
• In our model, text input sequences with a pre-defined length (34 words) are fed into an embedding layer that uses a mask to ignore padded values. This is followed by an LSTM layer with 256 memory units.
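The text branch, image branch, and merger layer described above could be wired up in Keras roughly as follows (a sketch, not the project's exact code: the vocabulary size is a placeholder, and the masked embedding, 34-word sequence length, and 256 LSTM units are taken from the slides):

```python
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, add
from tensorflow.keras.models import Model

VOCAB_SIZE = 7579   # placeholder; depends on the dataset's vocabulary
MAX_LEN = 34        # longest caption length in words, as in the slides

# Image branch: the 4096-d VGG feature vector, squeezed to 256 dims.
img_in = Input(shape=(4096,))
img_vec = Dense(256, activation="relu")(img_in)

# Text branch: padded word-index sequences; mask_zero=True makes the
# embedding ignore padded values, then an LSTM with 256 memory units.
txt_in = Input(shape=(MAX_LEN,))
txt_emb = Embedding(VOCAB_SIZE, 256, mask_zero=True)(txt_in)
txt_vec = LSTM(256)(txt_emb)

# Merger layer: combine the two 256-d branches, then predict the next word.
merged = add([img_vec, txt_vec])
out = Dense(VOCAB_SIZE, activation="softmax")(merged)

model = Model(inputs=[img_in, txt_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
print(model.output_shape)  # (None, 7579)
```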
BLEU SCORE
• BLEU, or the Bilingual Evaluation Understudy, is a score for comparing a candidate translation of text to one or more reference translations.
• In our model there is more than one possible caption for an image, so we evaluate our model using the BLEU score.
• A perfect match results in a score of 1.0, whereas a perfect mismatch results in a score of 0.0.
• NLTK provides an implementation of the BLEU score.
• We have used corpus_bleu to calculate the BLEU score for multiple sentences, such as a paragraph or a document.
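A minimal example of NLTK's `corpus_bleu`: each generated caption is scored against a list of reference captions (the sentences here are made up for illustration). A candidate that exactly matches one of its references scores 1.0.

```python
from nltk.translate.bleu_score import corpus_bleu

# One image with two reference captions (tokenized word lists).
references = [
    [["a", "dog", "runs", "on", "the", "beach"],
     ["a", "dog", "is", "running", "along", "the", "beach"]],
]
# The model's generated caption for that image.
candidates = [["a", "dog", "runs", "on", "the", "beach"]]

score = corpus_bleu(references, candidates)
print(round(score, 2))  # 1.0, since the candidate matches a reference exactly
```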
OUR EVALUATION
(Output 1 and Output 2: sample images with their generated captions.)
CONCLUSION

Here we can conclude that an RNN + VGG model can give good results for image captioning even after training on small datasets like Flickr8k.
