Gary F. Simons

Chief Research Officer


SIL: A Metaschema Language

N.B. This is work in progress. The materials presented on this page are not yet in final form.


SIL is a Semantic Interpretation Language used to define the meaning of the elements and attributes in an XML markup schema in terms of the concepts defined in a formal semantic schema (such as an RDF schema or an OWL ontology). SIL is implemented as an XML document type, and a document instance is called a metaschema. It is meta- for two reasons: it transcends the markup schema by mapping it into a more abstract level of representation, and it embodies a change from one kind of schema to another.
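As a rough illustration of the idea, the following Python sketch maps the elements of a small XML fragment onto concepts in a semantic schema and emits subject-property-value triples. It does not use actual SIL syntax: the element names (`entry`, `hw`, `pos`, `gloss`), the `lex:` concept names, the two lookup tables, and the sample entry are all invented for this example, and a real metaschema is an XML document processed by the SILC compiler rather than a pair of Python dictionaries.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; not taken from any of the dictionaries on this page.
SOURCE = """<dictionary>
  <entry id="e1"><hw>apple</hw><pos>n</pos><gloss>a fruit</gloss></entry>
</dictionary>"""

# Hypothetical stand-in for a metaschema: each markup element is given
# a meaning in terms of a class or property of an assumed semantic schema.
ELEMENT_TO_CLASS = {"entry": "lex:LexicalEntry"}
ELEMENT_TO_PROPERTY = {
    "hw": "lex:headword",
    "pos": "lex:partOfSpeech",
    "gloss": "lex:gloss",
}


def interpret(xml_text):
    """Return (subject, property, value) triples for the mapped elements."""
    triples = []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        cls = ELEMENT_TO_CLASS.get(elem.tag)
        if cls is None:
            continue  # element has no class-level interpretation
        subject = "#" + elem.get("id")
        triples.append((subject, "rdf:type", cls))
        for child in elem:
            prop = ELEMENT_TO_PROPERTY.get(child.tag)
            if prop is not None:
                triples.append((subject, prop, (child.text or "").strip()))
    return triples


for triple in interpret(SOURCE):
    print(triple)
```

The point of the sketch is only that the interpretation is driven by a declarative mapping kept separate from the source document, which is the role a metaschema plays.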

The metaschema language has been developed as part of the EMELD project. EMELD, for Electronic Metastructures for Endangered Language Data, is a project that seeks to build infrastructure for assisting field linguists in the task of documenting and describing endangered languages. EMELD is promoting XML markup as best practice for language documentation and description, but is not mandating any one markup schema. Metaschemas are being developed as part of the solution for achieving interoperability across language resources when those resources use different markup schemas. Using a metaschema that formally defines the meaning of the markup, a language resource can be translated into its semantic interpretation, which in turn can be loaded into a pooled knowledge store that supports queries across all the resources.
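The interoperability scenario can be sketched in the same spirit. In the Python example below, two resources use different markup for the same kind of information; each has its own mapping into a shared set of concepts, and the resulting triples are pooled into one store that a single query can search. Everything here is an assumption made for illustration: the markup of both resources, the mapping format, the `lex:` property names, and the in-memory set standing in for a knowledge store.

```python
import xml.etree.ElementTree as ET

# Two invented resources that mark up the same information differently.
RESOURCE_A = '<lexicon><entry id="a1"><hw>alpha</hw><gloss>first</gloss></entry></lexicon>'
RESOURCE_B = '<dict><word key="b1"><form>beta</form><meaning>second</meaning></word></dict>'

# Hypothetical per-resource mappings into one shared semantic vocabulary.
MAPPING_A = {
    "entry": "entry",
    "id_attr": "id",
    "props": {"hw": "lex:headword", "gloss": "lex:gloss"},
}
MAPPING_B = {
    "entry": "word",
    "id_attr": "key",
    "props": {"form": "lex:headword", "meaning": "lex:gloss"},
}


def interpret(xml_text, mapping, base):
    """Translate one resource into triples, using its own mapping."""
    triples = set()
    for elem in ET.fromstring(xml_text).iter(mapping["entry"]):
        subject = base + "#" + elem.get(mapping["id_attr"])
        for child in elem:
            prop = mapping["props"].get(child.tag)
            if prop is not None:
                triples.add((subject, prop, (child.text or "").strip()))
    return triples


# Pool the interpretations of both resources into one store.
store = interpret(RESOURCE_A, MAPPING_A, "a") | interpret(RESOURCE_B, MAPPING_B, "b")


def headwords(pooled):
    """Query the pooled store without reference to the original markup."""
    return sorted(value for (_, prop, value) in pooled if prop == "lex:headword")


print(headwords(store))
```

Because the query is phrased against the shared concepts rather than against either markup schema, adding a third resource would require only a new mapping, not a change to the query.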

Documentation

For a basic overview, see the presentation slides for a paper presented at the joint international conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing, 29 May to 2 June 2003, Athens, GA.

For detailed documentation, see SIL: A Metaschema Language for the Semantic Interpretation of XML Markup in Documents.

Implementation

The DTD for SIL, the Semantic Interpretation Language:

The XSLT script for SILC, the Semantic Interpretation Language Compiler:

A DOS batch file for interpreting a source document:

Examples

The examples come from the domain of lexicography. The sample files below are the letter A from three dictionaries. The following RDF Schema is used for the target semantic schema:

First sample, Sikaiana of Solomon Islands (by William Donner):

Second sample, Limbu of Nepal (by Boyd Michailovsky):

Third sample, Sindarin of Middle-Earth (by Didier Willis):


Author: Gary Simons
Last revised: 9 July 2003
Page URL: https://scholars.sil.org/gary_f_simons/workpaper/metaschema/