
Linking Primary Texts to Electronic Dictionaries

COST Workshop Connecting Textual Corpora and Dictionaries

Samuel Läubli 1,3
Sabine Tittel 2
Martin-Dietrich Glessgen 1

1 Institute of Romance Studies, University of Zurich
2 Dictionnaire étymologique de l'Ancien Français (DEAF), Heidelberg Academy of Sciences and Humanities
3 Institute of Computational Linguistics, University of Zurich

April 26, 2013

Contents
1. Introduction
2. Concept & Requirements
3. Interface: Back-End, Front-End
4. Plan of Action
5. Conclusion

Samuel Läubli | 2/23

1. Introduction


2. Concept & Requirements


Connecting Phoenix2 and DEAFél

Current State: Phoenix2

Phoenix2: Earlier Concept

Current State: DEAF Writing System

Aim
What do we want to do?
Include references to DocLing texts in DEAFél (attestations)

DocLing chHM 130; DocLing chMe 195; ...
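A reference list like the one above could be turned into structured records before being attached to a dictionary entry. The following is only a minimal sketch: the "corpus, siglum, number" pattern is inferred from the two examples on this slide, and the names `Attestation` and `parse_attestations` are hypothetical, not part of Phoenix2 or the DEAF writing system.

```python
import re
from typing import List, NamedTuple


class Attestation(NamedTuple):
    corpus: str   # e.g. "DocLing"
    siglum: str   # e.g. "chMe"
    number: int   # e.g. 195


# Pattern inferred from the slide's two examples; real sigla may look different.
_REF = re.compile(r"^(?P<corpus>\S+)\s+(?P<siglum>[A-Za-z]+)\s+(?P<number>\d+)$")


def parse_attestations(text: str) -> List[Attestation]:
    """Split a semicolon-separated reference list into structured records."""
    records = []
    for part in text.split(";"):
        match = _REF.match(part.strip())
        if match is None:
            continue  # skip empty parts and placeholders such as "..."
        records.append(
            Attestation(match["corpus"], match["siglum"], int(match["number"]))
        )
    return records


refs = parse_attestations("DocLing chHM 130; DocLing chMe 195; ...")
```

Each record keeps the corpus name separate from the siglum, so external (DocLing) attestations stay distinguishable from other sources.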

DocLing: Charte chMe 195


Date: Octobre 1266
Type de document: charte: affranchissement
Auteur: Jean seigneur de Joinville et sénéchal de Champagne
...
44 Cil de Moustier pourront amener en la vile totes fames par mariaige qui n a \37 veront suite ne reclain d autre seignour et autre fames non fors mes fames de cors
45 Et li home de Moustier ne porront marier lour fillies se mes homes non de ma propre terre ou ceus de la juree
46 Les genz de Moustier ne poent faire lour fy clers se par moi non Et cil de \38 Moustier peuvent faire mairiage aus genz de la ter re mon frere de Vauquelour, selonc l atiremant
...

Requirements
Phoenix2
- Adapt to DEAF lemmatization policy
- Lemmatize texts
- Serve texts / occurrences

DEAF Writing System


- Enhance writing system with GUIs to integrate DocLing attestations
- Fetch texts / occurrences

INTERFACE


3. Interface
Back-End | Front-End

Back-End

We need some kind of interface to do this

Back-End: SOAP Service


We decided to implement a SOAP service
- Protocol specification for exchanging data via RPC/HTTP
- Official W3C recommendation
- Uses XML as a transport format
- Fully platform independent

The Phoenix2 SOAP service provides two functions:
- getOccurrences(Lemma)
- getOccurrenceDetails(OccurrenceID)

Phoenix2

DEAFél

The Phoenix2 SOAP service thus enables the following functionality:
- getOccurrences(Lemma): show all occurrences, given a lemma
- getOccurrenceDetails(OccurrenceID): show meta information, given the ID (a numeric identifier) of an occurrence
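To illustrate what a call to one of these two functions might look like on the wire, here is a minimal sketch that builds a SOAP 1.1 request envelope with Python's standard library. The operation and parameter names are taken from the slide. The service namespace `urn:example:ph2deafel`, the helper name `build_request`, and the sample values are assumptions. The authoritative element names and namespaces are those defined in the service's WSDL and XSD.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace; the real one is defined in the WSDL.
SERVICE_NS = "urn:example:ph2deafel"


def build_request(operation: str, parameter: str, value: str) -> bytes:
    """Build a SOAP envelope with one operation element and one parameter."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    ET.SubElement(call, f"{{{SERVICE_NS}}}{parameter}").text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# Sample values are made up for illustration.
occurrences_request = build_request("getOccurrences", "Lemma", "mariage")
details_request = build_request("getOccurrenceDetails", "OccurrenceID", "42")
```

In practice a SOAP client library would generate such envelopes from the WSDL. The sketch only makes the message structure visible.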

Try it yourself
- SOAP Endpoint (document/literal): http://sa.muel.tv/test/soap/ph2deafel.wsdl
- Short WSDL Documentation: http://sa.muel.tv/test/soap/doc/wsdl.html
- XML Schema Definitions (XSD): http://sa.muel.tv/test/soap/doc/xsd.html
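For readers who want to experiment, the sketch below prepares (but does not send) an HTTP POST carrying a SOAP 1.1 envelope, using only Python's standard library. Several things here are assumptions: the helper name `prepare_soap_post` is hypothetical, the endpoint `http://example.org/soap/endpoint` is a placeholder (the actual service address is declared inside the WSDL, not given on this slide), and the empty `SOAPAction` value may need to be replaced by whatever the WSDL specifies.

```python
import urllib.request


def prepare_soap_post(
    endpoint: str, envelope: bytes, soap_action: str = ""
) -> urllib.request.Request:
    """Wrap a serialized SOAP 1.1 envelope in an HTTP POST request object."""
    headers = {
        "Content-Type": "text/xml; charset=utf-8",
        # SOAP 1.1 expects a quoted SOAPAction value; empty is an assumption.
        "SOAPAction": f'"{soap_action}"',
    }
    return urllib.request.Request(
        endpoint, data=envelope, headers=headers, method="POST"
    )


# Placeholder endpoint and payload, for illustration only.
request = prepare_soap_post("http://example.org/soap/endpoint", b"<Envelope/>")
# To actually send it: urllib.request.urlopen(request).read()
```

Keeping request preparation separate from sending makes it easy to inspect the headers and body before contacting a live service.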
Front-End
The DEAF develops a number of graphical user interfaces (GUIs) which
- build upon the DEAF's electronic dictionary writing system
- allow for an integration of DocLing material

No complete blend: DocLing material will continue to be recognizable as external material


4. Plan of Action

Release
Our joint work is planned for release in three steps:
1. All materials without semantic structure
2. All materials with a dedicated semantic structure for DocLing entries
3. Full integration of the DocLing entries into the DEAF article structure

Milestones
Phoenix2
- Migrate old lemmata
- Implement SOAP service
- Lemmatize texts

DEAF Writing System


- Implement new GUIs
- Adapt publication format (web edition)

First version due in autumn 2013


5. Conclusion

Benefits
Both DocLing and Phoenix2 benefit from our cooperation:

Phoenix2
The vocabulary of the DocLing texts is embedded into its natural context of the Old French language and, via the DEAF's etymological discussion, into the broader context of the Romance languages.

DEAF
A considerable number of digital source texts are added to the dictionary. This new source material will strengthen the foundation of the semantic structure of the DEAF articles and enhance its quality.

Questions?
Feedback is always very welcome


Thank You
These slides are available at www.cl.uzh.ch/people/team/laeubli.html
