
Working with Natural Language Text: Tools and Techniques
Nestor Rychtyckyj
Advanced & Manufacturing
Engineering Systems
Ford Motor Company

Agenda
Introduction
Description of problem: Why is language
so important?
Dealing with Natural Language Text
Application Examples
Machine Translation
Future Directions
Conclusions

Natural Language Text is everywhere

Internet
Web sites
Blogs
Customer Feedback
Dealer Feedback
Lessons Learned
Corporate Knowledge
Warranty Claims
Internal documentation
Spoken Dialog systems

Dealing With Text Information


Search Engines (Google, askjeeves.com)
Excel
Commercial Text Mining Tools (WordStat, SAS
Text Miner, SMART Text Miner, etc.)
Open Source tools (WordNet, SenseClusters,
etc.)
Controlled Languages
Ontologies
Natural Language Processing
Semantic Web

Present Status
Mostly keyword-based
Very little intelligence, no background knowledge
or context
Limited natural language dialog interpretation
Most of the processing is left to the human user
Difficult to build computer systems that can
retrieve information in an intelligent manner

Future State
Semantic Web: information on the web is
organized using structured tagging based on
XML, RDF, OWL, SWRL
machine-processable data on the web
standard interface to data
rich knowledge representations through
ontologies
Allows for the development of systems that can
retrieve information in an intelligent manner

Semantic Web Architecture

Source: Tim Berners-Lee, 2000

Artificial Intelligence (AI)


Study of how to build human-level intelligence into
computer applications
Uses learning, representation of human knowledge,
understanding of language, vision, speech, etc.
Applies the built-in knowledge using inference and
reasoning
Has been very successful in limited problem domains,
less so for general applications
Integrated into many application areas including
manufacturing, planning, search, speech recognition,
financial analysis, games, customer analysis,
commercial fishing, etc.

Current use of AI in Manufacturing at Ford
AI applications for manufacturing
Bring appropriate knowledge about manufacturing to
the proper people at the right time
Improve manufacturing efficiency
Reduce workplace injuries through better up-front
ergonomics analysis
Make assembly build instructions available to
operators in other languages
Develop common framework for representing
knowledge and exchanging it between different
systems

Knowledge Sources in Manufacturing

Process Build Information


Required Tooling
Part Information
Ergonomics Analysis
Plant Layout Information
Assembly Visualization
Safety Concerns
Manufacturing Best Practices

Global Study Process Allocation System (GSPAS)
The allocation system used to assign manufacturing processes to plant operation resources.
Process sheets use STANDARD LANGUAGE (159 verbs), such as insert, select, grasp, load.

Global Study Process Allocation System (GSPAS)
Global System to handle Manufacturing Costing,
Process and Labor Management for vehicle
assembly.
Standard Language and AI are an integral part of
GSPAS.
Launched in North America and Europe in 1998
to support the Focus program.
Currently deployed for almost all car and truck
manufacturing at Vehicle Operations assembly
plants world-wide.

Step by Step Instructions

Process sheets specify the operations, tasks, parts and


Standard Language
Controlled language where the grammar and
syntax are restricted.
Developed at Ford Body & Assembly to describe
the vehicle assembly process.
Contains information about tools, parts and work
required to build a vehicle.
Contains over 5000 words and 1000 abbreviations
that can be used by the process engineers.
Standard Language is checked by an Artificial
Intelligence (AI) system.

Examples of Standard Language
1. ALIGN-AND-SEAT DOOR TRIM PROTECTOR
2. FIRMLY PRESS SEALER INTO JOINT TO
AFFECT A POSITIVE SEAL
3. APPLY DAUB OF SEALER TO THE JOINT OF
THE CENTER FLOOR PAN AND FRONT
FLOOR PAN AT ROCKER PANEL
4. PUSH SEAT REARWARD TO EXPOSE
FRONT ATTACHMENTS


Standard Language Rules


Imperative form
Sentence must start with verb clause followed by
noun phrase.
Only one Standard Language (main action) verb
per sentence.
Some prepositions have special meaning
(using, with).
Size modifiers may follow nouns (bumper
large).
Free form is allowed for certain verbs (verify that...)
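Two of the rules above (start with a verb clause, only one Standard Language verb per sentence) can be illustrated with a small checker. This is only a minimal sketch: the verb set is a hypothetical subset built from verbs that appear in this deck, the adverb set is assumed, and the actual AI checker in GSPAS is not described at this level of detail in the slides.

```python
# Hypothetical subset of the Standard Language verbs; the full list
# (159 verbs, per the deck) is not reproduced in the slides.
SL_VERBS = {"INSERT", "SELECT", "GRASP", "LOAD", "OBTAIN", "LOOSEN",
            "APPLY", "ALIGN", "SECURE", "PUSH", "FEED", "PRESS"}
# Assumed adverbs that may open the leading verb clause.
ADVERBS = {"FIRMLY"}


def check_sentence(sentence: str) -> list[str]:
    """Return a list of rule violations (empty if the sentence passes)."""
    words = [w.strip(".,;") for w in sentence.upper().split()]
    words = [w for w in words if w]
    if not words:
        return ["empty sentence"]

    problems = []
    # Rule: imperative form, the sentence starts with a verb clause.
    i = 0
    while i < len(words) and words[i] in ADVERBS:
        i += 1
    if i == len(words) or words[i] not in SL_VERBS:
        problems.append("sentence must start with a Standard Language verb")

    # Rule: only one Standard Language (main action) verb per sentence.
    if sum(w in SL_VERBS for w in words) > 1:
        problems.append("more than one Standard Language verb")
    return problems


print(check_sentence("FIRMLY PRESS SEALER INTO JOINT"))
print(check_sentence("GRASP AND LOAD BUMPER"))
```

A word-list check like this cannot tell a verb from the same word used as a noun, which is one reason a real checker needs a parser and an ontology rather than a lookup table.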

Standard Language Process Sheet
Process Sheet Written in Standard Language from CAP (Focus) deck
TITLE: ASSEMBLE IMMERSION HEATER TO ENGINE
10 OBTAIN ENGINE BLOCK HEATER ASSEMBLY FROM STOCK
20 LOOSEN HEATER ASSEMBLY TURNSCREW USING POWER TOOL
30 APPLY GREASE TO RUBBER O-RING AND CORE OPENING
40 INSERT HEATER ASSEMBLY INTO RIGHT REAR CORE PLUG HOSE
50 ALIGN SCREW HEAD TO TOP OF HEATER
TOOL 20 1 P AAPTCA TSEQ RT ANGLE NUTRUNNER
TOOL 30 1 C COMM TSEQ GREASE BRUSH

Resulting Work Instructions Generated by DLMS For Line 20


LOOSEN HEATER ASSEMBLY TURNSCREW USING POWER TOOL
005 GRASP POWER TOOL (RT ANGLE NUTRUNNER) <01M4G1>
010 POSITION POWER TOOL (RT ANGLE NUTRUNNER) <01M4P2>
015 ACTIVATE POWER TOOL (RT ANGLE NUTRUNNER) <01M1P0>
020 REMOVE POWER TOOL (RT ANGLE NUTRUNNER) <01M4P0>
025 RELEASE POWER TOOL (RT ANGLE NUTRUNNER) <01M4P0>
.
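The relationship between the process sheet and the generated work instructions can be sketched in code. This is an illustration only: it assumes that the number after TOOL is the step the tool belongs to and that the text after TSEQ is the tool description, and it hard-codes the five-action pattern visible in the listing for line 20. How DLMS actually derives the instructions, and what the codes in angle brackets mean, is not stated in the slides, so the codes are left out.

```python
import re

SHEET = """\
10 OBTAIN ENGINE BLOCK HEATER ASSEMBLY FROM STOCK
20 LOOSEN HEATER ASSEMBLY TURNSCREW USING POWER TOOL
30 APPLY GREASE TO RUBBER O-RING AND CORE OPENING
TOOL 20 1 P AAPTCA TSEQ RT ANGLE NUTRUNNER
TOOL 30 1 C COMM TSEQ GREASE BRUSH
"""

STEP_RE = re.compile(r"^(\d+)\s+(.+)$")
# Assumption: "TOOL <step number> ... TSEQ <tool description>".
TOOL_RE = re.compile(r"^TOOL\s+(\d+)\s+.*?\bTSEQ\s+(.+)$")

# Assumption: fixed expansion pattern copied from the line 20 listing.
POWER_TOOL_ACTIONS = ["GRASP", "POSITION", "ACTIVATE", "REMOVE", "RELEASE"]


def parse_sheet(text):
    """Split a process sheet into {step: text} and {step: tool description}."""
    steps, tools = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if m := TOOL_RE.match(line):
            tools[int(m.group(1))] = m.group(2)
        elif m := STEP_RE.match(line):
            steps[int(m.group(1))] = m.group(2)
    return steps, tools


def expand_power_tool_step(step_no, steps, tools):
    """Expand a 'USING POWER TOOL' step into numbered work instructions."""
    if not steps[step_no].endswith("USING POWER TOOL"):
        return []
    tool = tools.get(step_no, "UNKNOWN TOOL")
    return [f"{(i + 1) * 5:03d} {action} POWER TOOL ({tool})"
            for i, action in enumerate(POWER_TOOL_ACTIONS)]


steps, tools = parse_sheet(SHEET)
for instruction in expand_power_tool_step(20, steps, tools):
    print(instruction)
```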


Natural Language Parsing


[Figure: parse tree for the sentence "Secure bracket using multiple motor nutrunner". Node labels shown: Verb Phrase, Verb (Secure), Noun Phrase, Noun (Bracket), Prepositional Phrase, Preposition (Using), Noun Phrase, Noun.]


Process for Natural Language Processing
Parse the text (sentence by sentence) into parse
tree structure
Bypass/ignore common words (articles, common
terms)
Stemming (get the root of the word)
Word lookup (synonyms, misspellings,
acronyms)
Word understanding (deeper-level ontologies)
Controlled languages with automated checking
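The middle steps of this process (sentence splitting, dropping common words, stemming, word lookup) can be sketched as a short pipeline. The stop-word list, the lexicon entries and the suffix rules below are all hypothetical placeholders, and the stemmer is deliberately crude; building the parse tree and the ontology-level word understanding are not covered.

```python
import re

# Hypothetical stop words and lookup table (synonyms, misspellings, acronyms).
STOP_WORDS = {"THE", "A", "AN", "OF", "TO"}
LEXICON = {"ASY": "ASSEMBLY", "ASSY": "ASSEMBLY", "NUTRUNER": "NUTRUNNER"}


def stem(word: str) -> str:
    """Very crude suffix stripping; a real system would use a proper stemmer."""
    if word.endswith("IES") and len(word) > 4:
        return word[:-3] + "Y"
    if word.endswith("S") and not word.endswith("SS") and len(word) > 3:
        return word[:-1]
    return word


def normalize(text: str) -> list[list[str]]:
    """Return one list of normalized tokens per sentence."""
    result = []
    for sentence in re.split(r"[.!?]+", text.upper()):
        tokens = re.findall(r"[A-Z0-9-]+", sentence)
        tokens = [t for t in tokens if t not in STOP_WORDS]   # bypass common words
        tokens = [stem(t) for t in tokens]                    # get the root
        tokens = [LEXICON.get(t, t) for t in tokens]          # word lookup
        if tokens:
            result.append(tokens)
    return result


print(normalize("Feed the wire assemblies through the hole. Secure brackets to the asy."))
```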

Parsing Information in Standard Language
Example of Standard Language parsing: Feed 2 150 mm wire assemblies through hole in liftgate panel
(S (VP (VERB FEED))
   (NP (SIMPLE-NP (QUANTIFIER 2)
                  (DIM (QUANTIFIER 150) (DIM-UNIT-1 MM))
                  (ADJECTIVE WIRE)
                  (NOUN ASSEMBLY)))
   (S-PP (SPREP THROUGH)
         (NP (SIMPLE-NP (NOUN HOLE)
                        (NPP (N-PREP in)
                             (NP (SIMPLE-NP (ADJECTIVE LIFTGATE) (ADJECTIVE OUTER) (NOUN PANEL))))))))
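A bracketed parse like the one above is easy to load into a nested data structure for later processing. The following is a minimal sketch, not the GSPAS implementation: `read_sexpr` and `leaves` are hypothetical helper names, and the parse string is copied from the slide.

```python
import re

PARSE = (
    "(S (VP (VERB FEED)) (NP (SIMPLE-NP (QUANTIFIER 2) "
    "(DIM (QUANTIFIER 150) (DIM-UNIT-1 MM)) (ADJECTIVE WIRE) (NOUN ASSEMBLY))) "
    "(S-PP (SPREP THROUGH) (NP (SIMPLE-NP (NOUN HOLE) (NPP (N-PREP in) "
    "(NP (SIMPLE-NP (ADJECTIVE LIFTGATE) (ADJECTIVE OUTER) (NOUN PANEL))))))))"
)


def read_sexpr(text):
    """Parse a bracketed string into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1  # consume the closing parenthesis
            return node
        return tok

    return read()


def leaves(tree, label):
    """Collect the words directly under every node with the given label."""
    found = []
    if isinstance(tree, list) and tree:
        if tree[0] == label:
            found.append(" ".join(x for x in tree[1:] if isinstance(x, str)))
        for child in tree[1:]:
            found.extend(leaves(child, label))
    return found


tree = read_sexpr(PARSE)
print(leaves(tree, "VERB"))   # the main action verb
print(leaves(tree, "NOUN"))   # head nouns, in sentence order
```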


Ontology used to represent knowledge

Individuals
Classes (with hierarchy); think sets
Properties (w/ hierarchy); not part of class
Equivalence
Property characteristics/restrictions
Complex classes
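The first few building blocks (classes with a hierarchy, individuals, properties) can be sketched with plain data classes. This is an illustration under stated assumptions, not an OWL model: the class names Thing, Tools and Parts are taken from the GSPAS ontology slide, while the PowerTools subclass, the hierarchy between them and the property names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """A class in the hierarchy; think of it as a named set."""
    name: str
    parent: "OntClass | None" = None

    def is_a(self, other: "OntClass") -> bool:
        """True if this class is `other` or one of its descendants."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


@dataclass
class Individual:
    """A concrete member of a class, with property values attached."""
    name: str
    cls: OntClass
    properties: dict = field(default_factory=dict)


# Assumed hierarchy for illustration.
thing = OntClass("Thing")
tools = OntClass("Tools", thing)
parts = OntClass("Parts", thing)
power_tools = OntClass("PowerTools", tools)  # hypothetical subclass

nutrunner = Individual("RT ANGLE NUTRUNNER", power_tools,
                       {"part_of_speech": "noun"})  # hypothetical property

print(nutrunner.cls.is_a(tools))   # membership is inherited up the hierarchy
print(nutrunner.cls.is_a(parts))
```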


GSPAS Ontology
[Figure: GSPAS ontology diagram. Node labels shown: Thing, Tools, Parts, Lexical Nodes, Operations, Intervening Concept Nodes, HAMMER. Attributes: Size, Part of Speech, Subsystem-id, etc.]


GSPAS Knowledge Base


Ergonomics Analysis
Check the assembly work instructions to
determine what type of physical action is being
described
Check the assembly work instruction to
determine what object is manipulated
Check the associated parts and tools for part
weight and tool properties
Flag potential ergonomics concerns at the
process level and at the work allocation level
Knowledge can be represented as a business
rule
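The checks above can be expressed as a simple business rule. The sketch below is hypothetical throughout: the list of strenuous verbs, the weight threshold and the `Part` record are invented for illustration and are not Ford's actual ergonomics criteria. It works on a single instruction and matches single-word part names only; flagging at the work allocation level would need data the slides do not show.

```python
from dataclasses import dataclass

# Hypothetical values, for illustration only.
STRENUOUS_VERBS = {"LIFT", "PUSH", "PULL", "CARRY"}
MAX_PART_WEIGHT_KG = 10.0


@dataclass
class Part:
    name: str
    weight_kg: float


def ergonomic_flags(instruction: str, parts: list[Part]) -> list[str]:
    """Flag an instruction whose action and object suggest physical strain."""
    words = instruction.upper().split()
    # Step 1: what type of physical action is being described?
    if not words or words[0] not in STRENUOUS_VERBS:
        return []
    verb = words[0]
    flags = []
    for part in parts:
        # Step 2: is this part the object being manipulated?
        # Step 3: check the part weight against the threshold.
        if part.name.upper() in words and part.weight_kg > MAX_PART_WEIGHT_KG:
            flags.append(f"{verb} {part.name}: {part.weight_kg} kg "
                         f"exceeds {MAX_PART_WEIGHT_KG} kg limit")
    return flags


print(ergonomic_flags("PUSH SEAT REARWARD TO EXPOSE FRONT ATTACHMENTS",
                      [Part("SEAT", 25.0), Part("BRACKET", 0.5)]))
```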

Machine Translation
The Spirit is willing but the flesh is weak
"The vodka is tempting, but the meat's a bit
suspect".
The alcohol is arranged, but the meat is weak.
This kind of spirit is wants, but the flesh and
blood is weak.
The spirit is willing, but the flesh is impossible
The spirit puts out the flag and does, the flesh
omits but.

Machine Translation
Use of computers to translate from one
language to another
Examples: Babelfish
Translation accuracy is highly dependent on the
quality of the source text
Use proper grammar, punctuation, shorter
sentences, active voice to improve quality
Customize translation systems for each
application domain

Problem Description
Need to translate assembly build instructions from
English to the language used at the assembly plants
A single vehicle may require several thousand process
sheets to describe the assembly process
A large number of assembly instructions are frequently
modified
Large volume of translations precludes the use of human
translators
Specialized terminology requires technical glossaries
MT performance can be improved greatly by improving
the source text

Application Description
Machine Translation is integrated into the manufacturing
process planning system known as GSPAS
(Global Study Process Allocation System)
The translation process is fully automated and does not
require human intervention
Translation occurs automatically after a process sheet is
validated by the AI system and before it is released to the
assembly plants.
We currently translate build instructions for 26 different
vehicle lines in 5 languages (we also have a separate
glossary for Mexican Spanish)
Data is read in from an Oracle database, processed
through the translation system and the output is then
written out to the Oracle database


Machine Translation
Source: Process build instructions in English
Target: Process build instructions in Spanish, German,
Portuguese, Dutch & Turkish
Translate both controlled language and embedded free-form text
Example: SECURE BUMPER BRACKET {FOR LHS
ONLY} TO VEHICLE BODY USING POWER TOOL
Utilize customized SYSTRAN translation engine,
automotive and Ford-specific terminology glossaries and
embedded tagging
Future plans include additional parsing and tagging
information to improve translation accuracy
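Since controlled language and embedded free-form text are handled differently, a first step is to separate the two. The sketch below assumes that free-form comments are always delimited by a matched pair of curly braces, as in the example on this slide; `split_segments` is a hypothetical helper, and the embedded tagging actually passed to the translation engine is not shown in the slides.

```python
import re

# Assumption: free-form comments are enclosed in non-nested curly braces.
COMMENT_RE = re.compile(r"\{([^{}]*)\}")


def split_segments(text: str) -> list[tuple[str, str]]:
    """Split a sentence into ('controlled', ...) and ('freeform', ...) parts."""
    segments = []
    pos = 0
    for m in COMMENT_RE.finditer(text):
        controlled = text[pos:m.start()].strip()
        if controlled:
            segments.append(("controlled", controlled))
        segments.append(("freeform", m.group(1).strip()))
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(("controlled", tail))
    return segments


for kind, segment in split_segments(
        "SECURE BUMPER BRACKET {FOR LHS ONLY} TO VEHICLE BODY USING POWER TOOL"):
    print(kind, "|", segment)
```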

Machine Translation
Implementation in GSPAS
Worked with SYSTRAN & AppTek to customize their
translation software for our requirements.
Developed technical dictionaries that contain Ford
terminology with the correct translation for each
language pair.
Developed and integrated the translation process
into GSPAS.
Developed a system to check and improve the
source text prior to translation.

Translation Statistics
Language pairs being translated:
English/German, English/Spanish,
English/Dutch, English/Portuguese,
English/Spanish (Mexican), English/Turkish
Ford-specific terminology in Standard Language:
over 5000 words, 13,000 noun phrases, over
1000 abbreviations and acronyms.
Typically translate over 200,000 records each
month
Over 10,000,000 records already translated.


GSPAS Translation Process


Standard Language Translation Issues
Sentence structure is not grammatical English (ROBOT
APPLY 50 MM TAPE-STRIPE)
Ford terminology is complex and must be explicitly
translated as an entire phrase (INSULATION ASSEMBLY
BODY PILLAR)
Use of abbreviations, misspellings, acronyms (ABS,
A.B.S)
Use of compound verbs (PICK-AND-SPOON)
Inverted phrase structure with modifiers (BODY PANEL
LRG)
Embedded comments (LOAD BUMPER {LOWER} TO
VEHICLE)

Standard Language Translation


Use of slang (shotgun)
Articles are seldom used (HAMMER HAMMER).
Need to handle British English as well as
American English. (terminology, use, spellings)
Source text is incorrectly written and not
understandable.
Punctuation is rarely used.
Standard Language is always evolving and
needs to be maintained.

Uses of AI Technology
Apply natural language processing (NLP) along
with knowledge representation and reasoning to
improve the source text
Analyze the source text; utilize the ontology to
identify terminology
Convert the source text to a more translatable
form by adding articles, replacing abbreviations,
improving grammar and punctuation
Utilize XML tagging and ontology lookup to
improve the structure of free-form source text


Improving Translation Quality


Process the source text prior to translation
(Standard Language pre-processor).
Add articles before the nouns.
Adjust the word order to deal with size modifiers
coming after nouns.
Replace acronyms, synonyms with original
expanded text (ASY -> ASSEMBLY)
Verify that punctuation is correct.
Pre-process the embedded comments to
improve translation quality.
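Three of these steps (expanding abbreviations, moving a trailing size modifier in front of its noun phrase, adding articles) can be sketched as a small pre-processor. This is a simplified illustration, not the GSPAS pre-processor: only ASY -> ASSEMBLY comes from the slide, while the LRG expansion, the preposition list and the size-modifier list are assumptions. It treats the first word as the verb, starts a new noun phrase at each preposition, and does not handle embedded comments.

```python
# ASY -> ASSEMBLY is from the deck; the LRG expansion is assumed.
ABBREVIATIONS = {"ASY": "ASSEMBLY", "LRG": "LARGE"}
SIZE_MODIFIERS = {"LARGE", "SMALL"}                       # hypothetical
PREPOSITIONS = {"TO", "INTO", "FROM", "USING", "WITH"}    # hypothetical


def preprocess(sentence: str) -> str:
    """Rewrite a Standard Language sentence into a more translatable form."""
    words = [ABBREVIATIONS.get(w, w)
             for w in sentence.upper().rstrip(".").split()]
    if not words:
        return ""
    verb, rest = words[0], words[1:]

    # Split the remainder into noun phrases at each preposition.
    preps, phrases = [None], [[]]
    for w in rest:
        if w in PREPOSITIONS:
            preps.append(w)
            phrases.append([])
        else:
            phrases[-1].append(w)

    out = [verb]
    for prep, phrase in zip(preps, phrases):
        if prep:
            out.append(prep)
        if not phrase:
            continue
        # Move a size modifier that follows the noun to the front.
        if len(phrase) > 1 and phrase[-1] in SIZE_MODIFIERS:
            phrase = [phrase[-1]] + phrase[:-1]
        out.append("THE")  # add an article before the noun phrase
        out.extend(phrase)
    return " ".join(out) + "."  # ensure terminal punctuation


print(preprocess("LOAD BODY PANEL LRG TO VEHICLE ASY"))
```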

Issues with Machine Translation Quality
Localization issues (even with technical
terminology): Spanish in Spain, Mexico,
Argentina, etc.
Ensure that system correctly displays
special characters (umlaut, accents etc.)
Have additional space available on screen
as target languages require more room
than English.

Conclusions
Machine Translation is a cost-effective way to
translate information with high quality if you are
willing to customize the application to your
requirements
Machine Translation is not an out-of-the-box
solution
Machine Translation accuracy can be greatly
improved by controlling and improving the
quality of the source text


Where are we going?


Intelligent search w/ context and understanding
Sharing of knowledge through ontologies
Growth of user-defined knowledge
(folksonomies)
Intelligent Dialog Systems: integration of
speech recognition w/ intelligent engines
(Sync)
Automate the process of information retrieval


Questions?

