(Open Document Text) odt > TEI (odt_tei.xsl)

Abandon all hope, ye who enter here! (of understanding)

© 2010, École nationale des chartes, CeCILL-C license (LGPL, compatible with French law)

This transformation takes OpenDocument XML as input (e.g. from OpenOffice.org) and produces generic TEI. The emphasis is on the robustness of the filter, so as to reduce the restyling needed in the word processor (e.g. when user styles are corrupted). The tool was developed for several scholarly editions (manuscript texts, dictionaries), working on files as sent to the printer (no document templates, little styling). In the long term it aims to recover texts produced by digitization with formatting. The XML output is optimized for later normalization (grouping, hierarchies, at character and paragraph level). The transformation can be plugged in directly as an OpenOffice export filter, but this is not the recommended usage: the process is sometimes slow and may fail without explanation, especially on large files. The output is better exploited by feeding it to other filters (regular expressions, XSLT).

<xsl:param…
Name of the file being processed. cf. [t6]
$filename=
Allow generation of div tags as CDATA. cf. [t10], [t12]
$CDATA=true()

Space-separated list of style names equivalent to paragraph-level elements.

cf. [t9]
$el-p=" bibl byline closer dateline desc docAuthor docDate epigraph label opener p postscript salute signed trailer witness"
Character-level elements. cf. [t1]
$el-c=" abbr add actor affiliation age bibl c caption cb char code corr date del desc distinct email emph foreign gloss hi ident idno index l label measure mentioned name num q quote s seg sic stamp stage support term time title "

Space-separated list of style names equivalent to elements that can contain paragraphs.

cf. [t9]
$el-quote=" argument note quote stage view "
<xsl:variable…
$lf
$label="ÀÁÂÃÄÅÆÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöùúûüýÿþ .()/\?"
$key="aaaaaaeeeeeiiiidnooooouuuuybbaaaaaaaceeeeiiiionooooouuuuyyb-------"
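The $label and $key pair is passed to XPath translate() to turn a style name into a plain ASCII key (it feeds the rend attribute of <seg>). For readers who post-process outside XSLT, here is a minimal Python sketch of XPath 1.0 translate() semantics. The function name xpath_translate and the shortened tables are illustrative assumptions, not part of the stylesheet.

```python
def xpath_translate(text: str, source: str, target: str) -> str:
    """Mimic XPath 1.0 translate(): each char of `source` maps to the char
    at the same position in `target`; chars of `source` with no counterpart
    are deleted; the first occurrence of a repeated char wins."""
    table = {}
    for i, ch in enumerate(source):
        if ch in table:
            continue  # first occurrence wins, as in XPath
        table[ch] = target[i] if i < len(target) else ""
    return "".join(table.get(ch, ch) for ch in text)


# Shortened, illustrative tables (the stylesheet's $label and $key are longer).
LABEL = "ÀÉàé .()"
KEY = "aeae----"

print(xpath_translate("Édition (À part)", LABEL, KEY))
```

Unlike Python's str.maketrans, this version tolerates tables of unequal length, which is how translate() behaves in XPath.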
TODOs
source
<xsl:template name="class">
<!-- style name, may be passed as a parameter -->
<xsl:param name="style-name" select="
@text:style-name | @class | @draw:style-name | @draw:text-style-name
"/>
<!-- handle on the style to inspect -->
<xsl:variable name="style" select="key('style', $style-name)"/>
<!-- automatic style name -->
<xsl:variable name="l" select="substring($style-name, 1, 1)"/>
<xsl:variable name="style_auto" select="
boolean( ( $l = 'T' or $l = 'P') and translate(substring($style-name, 2), '1234567890', '') = '')
"/>
<!-- get the name of a semantic style, despite automatic derivations -->
<xsl:variable name="style_class">
<xsl:choose>
<!-- probably not an automatic style -->
<xsl:when test="not($style_auto)">
<xsl:value-of select="$style-name"/>
</xsl:when>
<!-- automatic style, take the parent -->
<xsl:when test="$style/@style:parent-style-name">
<xsl:value-of select="$style/@style:parent-style-name"/>
</xsl:when>
</xsl:choose>
</xsl:variable>
<xsl:value-of select="$style_class"/>
</xsl:template>
(l. 122) [class]
$style-name, $style/@style:parent-style-name, $style_class
  • $style-name: style name, may be passed as a parameter
Find a semantic class name.
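The rule applied by the class template can be restated outside XSLT. The sketch below is a hedged Python rendering: a name made of T or P followed only by digits is treated as automatic, and its parent style is used instead. The parents mapping is a hypothetical stand-in for key('style', ...)/@style:parent-style-name, and the function names are illustrative.

```python
import re

# 'T' or 'P' followed only by digits (possibly none), mirroring $style_auto.
_AUTO = re.compile(r"[TP][0-9]*")


def is_automatic_style(name: str) -> bool:
    """True when the name looks like a word-processor generated style."""
    return _AUTO.fullmatch(name) is not None


def semantic_class(name: str, parents: dict) -> str:
    """Return the style name itself, or its parent when it is automatic.

    `parents` is a hypothetical mapping: style name -> parent style name.
    """
    if not is_automatic_style(name):
        return name
    return parents.get(name, "")  # no parent known: empty class, as in the XSLT


print(semantic_class("P13", {"P13": "Quotations"}))
```

Note that a user style literally named P or T would also be treated as automatic, since the test accepts an empty digit run.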

Character formatting (char level)

source
<xsl:template match="text:span">
<xsl:variable name="class">
<xsl:call-template name="class"/>
</xsl:variable>
<!-- handle on the style to inspect -->
<xsl:variable name="style" select="
key('style', @text:style-name | @class | @draw:style-name | @draw:text-style-name)
"/>
<!-- interpret a style as an element -->
<xsl:variable name="style_el">
<xsl:choose>
<xsl:when test="$class = 'Emphasis'">emph</xsl:when>
<xsl:when test="$class = 'Strong_20_Emphasis'">distinct</xsl:when>
<xsl:when test="$class = 'Bullet_20_Symbols'">glyph</xsl:when>
<xsl:when test="$class = 'Page_20_Number'">num</xsl:when>
<xsl:when test="$class = 'Line_20_numbering'">num</xsl:when>
<xsl:when test="$class = 'Q'">q</xsl:when>
<xsl:when test="$class = 'Citation'">quote</xsl:when>
<xsl:when test="$class = 'Example'">eg</xsl:when>
<xsl:when test="$class = 'Definition'">gloss</xsl:when>
<xsl:when test="$class = 'Source_20_Text'">code</xsl:when>
<xsl:when test="$class = 'Teletype'">code</xsl:when>
<xsl:when test="$class = 'exposant'">sup</xsl:when>
<xsl:when test="$class = 'Example'">q</xsl:when>
<xsl:when test="$class = 'Variable'">term</xsl:when>
<xsl:when test="$class = 'pet-_20_cap-_20_notes'">name</xsl:when>
<xsl:when test="$class = 'User_20_Entry'">code</xsl:when>
<!-- other OO styles that are ignored -->
<xsl:when test="$class = 'Footnote_20_Symbol'"/>
<xsl:when test="$class = 'Placeholder'"/>
<xsl:when test="$class = 'Rubies'"/>
<!-- reuse the style name as the element name -->
<xsl:when test="
not(contains($class, ' ') or contains($class, '_20_'))
">
<xsl:value-of select="$class"/>
</xsl:when>
</xsl:choose>
</xsl:variable>
<xsl:choose>
<!-- italicized spaces -->
<xsl:when test="normalize-space(.)='' and not(*)">
<xsl:value-of select="."/>
</xsl:when>
<!-- <style> without extra formatting -->
<xsl:when test="
$style_el != '' and contains( $el-c, concat(' ', $style_el, ' '))
">
<xsl:element name="{$style_el}">
</xsl:element>
</xsl:when>
<!-- Looks like a semantic style -->
<xsl:when test="$style_el != ''">
<seg rend="{translate($class, $label, $key)}">
</seg>
</xsl:when>
<!-- stacking of formatting -->
<xsl:otherwise>
<!-- superscript/subscript -->
<xsl:variable name="position">
<xsl:choose>
<!-- No <sup> around a note reference -->
<xsl:when test="not(text()[normalize-space(.) != ''])">
</xsl:when>
<xsl:when test="
contains($style//@style:text-position, 'sub') or starts-with($style//@style:text-position, '-')
">
<xsl:choose>
<xsl:when test="translate(., '0123456789aeiou', '') = ''">
<xsl:value-of select="translate(., '0123456789aeiou', '₀₁₂₃₄₅₆₇₈₉ₐₑᵢₒᵤ')"/>
</xsl:when>
<xsl:otherwise>
<sub>
</sub>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:when test="starts-with($style//@style:text-position,'0%')">
</xsl:when>
<!-- style:text-position="33% 100%" -->
<xsl:when test="contains($style//@style:text-position, 'super')">
<xsl:choose>
<!-- reduce some common superscripts; avoid mapping too many letters,
a complete word would display badly 0123456789
                '0123456789abcdefghijklmnoprstuvwxyz0123456789',
                '⁰¹²³⁴⁵⁶⁷⁸⁹ªᵇᶜᵈᵉᶠᵍʰⁱʲᵏˡᵐⁿºᵖʳˢᵗᵘᵛʷˣʸᶻ⁰¹²³⁴⁵⁶⁷⁸⁹' -->
<xsl:when test="translate(., 'ao', '') = ''">
<xsl:value-of select=" translate(., 'ao', 'ªº')"/>
</xsl:when>
<xsl:otherwise>
<sup>
</sup>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:when test="$style//@style:text-position !=''">
<sup>
</sup>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<!-- bold, italic, small caps -->
<xsl:variable name="bi">
<xsl:choose>
<xsl:when test="
$style//@fo:font-weight='bold' or $style//@font-weight-complex='bold'
">
<ident>
<xsl:copy-of select="$position"/>
</ident>
</xsl:when>
<xsl:when test="$style//@fo:font-variant = 'small-caps'">
<xsl:choose>
<!-- Small caps in superscript, drop -->
<xsl:when test="$style//@style:text-position !=''">
<xsl:copy-of select="$position"/>
</xsl:when>
<!-- No lowercase letter: small caps are invisible, probably not needed -->
<xsl:when test="
translate(., 'aàbcdeéèfghijklmnopqrstuvwxyz', '') = .
">
<xsl:copy-of select="$position"/>
</xsl:when>
<!-- Roman numerals -->
<xsl:when test="translate(., 'ivxlcdm1234567890,.() ', '') = ''">
<num>
<xsl:copy-of select="$position"/>
</num>
</xsl:when>
<xsl:otherwise>
<name>
<xsl:copy-of select="$position"/>
</name>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<!-- letter-spaced text -->
<xsl:when test="
$style//@fo:letter-spacing != '' and $style//@fo:letter-spacing != 'normal' and (number(translate($style//@fo:letter-spacing, 'abcdefghijklmnopqrstuvwxyz', '')) &gt; 0.03 )
">
<phr>
<xsl:copy-of select="$position"/>
</phr>
</xsl:when>
<xsl:when test="
$style//@fo:font-style='italic' or $style//@font-style-complex='italic'
">
<xsl:choose>
<xsl:when test="translate(., '.()', '') = 'sic'">
<sic>
<xsl:value-of select="."/>
</sic>
</xsl:when>
<xsl:otherwise>
<emph>
<xsl:copy-of select="$position"/>
</emph>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<!--
              <xsl:copy-of select="$position"/>
              -->
<xsl:choose>
<xsl:when test="$style//@style:text-position !=''">
<xsl:copy-of select="$position"/>
</xsl:when>
<xsl:otherwise>
<!-- Looks like a non-significant segment -->
<xsl:copy-of select="$position"/>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<!-- underline -->
<xsl:variable name="u">
<xsl:choose>
<xsl:when test="ancestor::text:a">
<xsl:copy-of select="$bi"/>
</xsl:when>
<xsl:when test="
not($style//@style:text-underline) and not($style//@style:text-underline-style)
">
<xsl:copy-of select="$bi"/>
</xsl:when>
<xsl:when test="
$style//@style:text-underline and $style//@style:text-underline = 'none'
">
<xsl:copy-of select="$bi"/>
</xsl:when>
<xsl:when test="
$style//@style:text-underline-style and $style//@style:text-underline-style = 'none'
">
<xsl:copy-of select="$bi"/>
</xsl:when>
<xsl:otherwise>
<title>
<xsl:copy-of select="$bi"/>
</title>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<!-- Colors, notably for enclosing highlights -->
<xsl:choose>
<!-- no color style inside links or headings -->
<xsl:when test="ancestor::text:a or ancestor::text:h">
<xsl:copy-of select="$u"/>
</xsl:when>
<xsl:when test="
$style//@fo:background-color != '#ffffff' and $style//@fo:background-color != 'transparent'
">
<xsl:element name="
bg_{substring-after( $style//@fo:background-color, '#')}
">
<xsl:copy-of select="$u"/>
</xsl:element>
</xsl:when>
<xsl:when test="$style//@fo:color != '#000000'">
<xsl:element name="col_{substring-after( $style//@fo:color, '#')}">
<xsl:copy-of select="$u"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="$u"/>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 186) text:span
<emph>, <ident>, <name>, <num>, <phr>, <seg>, <sic>, <sub>, <sup>, <title>, <{$style_el}> , "emph", "distinct", "glyph", "num", "num", "q", "quote", "eg", "gloss", "code", "code", "sup", "q", "term", "name", "code", $position, $position, $position, $position, $position, $position, $position, $position, $position, $bi, $bi, $bi, $bi, $bi, $u, $u, $u, $u, $class, ., translate(., '0123456789aeiou', '₀₁₂₃₄₅₆₇₈₉ₐₑᵢₒᵤ'), translate(., 'ao', 'ªº'), .

Interpret styles and local formatting as short TEI elements, chosen to make later grouping easier.

Grouping sample

  • View: Le Quixotte au xixe siècle
  • odt xml : <text:span text:style-name="T18">Le </text:span> <text:span text:style-name="T3">Quixotte</text:span> <text:span text:style-name="T18"> au </text:span> <text:span text:style-name="T15">xix</text:span> <text:span text:style-name="T7">e</text:span> <text:span text:style-name="T18"> siècle</text:span>
  • odt_tei.xsl : <title>Le </title><title><hi>Quixotte</hi></title><title> au </title><title><num>xix</num></title><title>ᵉ</title><title> siècle</title>
  • s/<\/([^>]+)>( *)<\1>/$2/
  • <title>Le <hi>Quixotte</hi> au <num>xix</num>ᵉ siècle</title>
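The substitution in the sample above can be tried outside sed or Perl. The following Python sketch is one possible equivalent, assuming the rule is applied to every match in the string; the function name merge_adjacent is illustrative.

```python
import re

# Python equivalent of s/<\/([^>]+)>( *)<\1>/$2/ applied to every match.
ADJACENT = re.compile(r"</([^>]+)>( *)<\1>")


def merge_adjacent(xml: str) -> str:
    """Join a closing tag with an identical opening tag that follows it,
    keeping any spaces found between the two tags."""
    return ADJACENT.sub(r"\2", xml)


raw = (
    "<title>Le </title><title><hi>Quixotte</hi></title><title> au </title>"
    "<title><num>xix</num></title><title>ᵉ</title><title> siècle</title>"
)
print(merge_adjacent(raw))
```

The pattern only merges tags without attributes, because the opening tag must repeat the closing name exactly; elements such as <seg rend="..."> are left alone.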

Nesting order of generated elements

  • Highlighting: <bg_{color-code}>
  • Style: <{style}>
  • Color: <col_{color-code}>
  • Underline: <title>
  • Bold: <ident> (cf. <html5:b>: keywords)
  • Letter spacing: <phr> (phrases)
  • Italic: <hi>
  • Small caps: <name>, <num> (numbers, especially roman numerals)
  • Superscript and subscript: Unicode, <hi rend="sup">, <hi rend="sub">
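The Unicode reduction of subscripts and superscripts can be illustrated in Python. This sketch follows the active tests in the text:span template (digits and a few vowels for subscripts, only a and o for superscripts) and assumes the fallback element simply wraps the text; the function names are illustrative.

```python
SUB_SRC = "0123456789aeiou"
SUB_TABLE = str.maketrans(SUB_SRC, "₀₁₂₃₄₅₆₇₈₉ₐₑᵢₒᵤ")
SUP_SRC = "ao"
SUP_TABLE = str.maketrans(SUP_SRC, "ªº")


def subscript(text: str) -> str:
    """Use Unicode subscripts when every char has one, else wrap in <sub>."""
    if text and all(ch in SUB_SRC for ch in text):
        return text.translate(SUB_TABLE)
    return f"<sub>{text}</sub>"  # assumption: the element wraps the text


def superscript(text: str) -> str:
    """Only the ordinal indicators are reduced; anything else gets <sup>."""
    if text and all(ch in SUP_SRC for ch in text):
        return text.translate(SUP_TABLE)
    return f"<sup>{text}</sup>"  # assumption: the element wraps the text


print(subscript("10"), superscript("o"), superscript("e"))
```

Keeping the superscript table this small avoids whole words rendered in superscript glyphs, which tend to display badly.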
source
<xsl:template match="text:s">
<xsl:choose>
<xsl:when test="@text:c">
<space>
<xsl:value-of select="substring( ' ' ,1, @text:c - 2)"/>
</space>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 407) text:s
<space>, substring( ' ' ,1, @text:c - 2)
Spacing
source
<xsl:template match="text:tab">
<xsl:choose>
<!-- no para indent -->
<xsl:when test="
not(preceding-sibling::node()[normalize-space(.) != ''])
"/>
</xsl:choose>
</xsl:template>
(l. 421) text:tab
"    "
source
<xsl:template match="text:a">
<ref>
<xsl:attribute name="target">
<xsl:choose>
<xsl:when test="starts-with(@xlink:href, '#')">
<xsl:value-of select="translate(@xlink:href, ' ', '_')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="@xlink:href"/>
</xsl:otherwise>
</xsl:choose>
</xsl:attribute>
</ref>
</xsl:template>
(l. 430) text:a
<ref>, @target, translate(@xlink:href, ' ', '_'), @xlink:href
Links
source
<xsl:template match="
draw:frame/draw:text-box/text:p[draw:frame/draw:image|draw:image]
">
<xsl:apply-templates select="draw:image|draw:frame"/>
<head>
<xsl:apply-templates select="
node()[local-name() != 'image' and local-name() != 'frame']
"/>
</head>
</xsl:template>
(l. 460) draw:frame/draw:text-box/text:p[draw:frame/draw:image|draw:image]
<head>

Images. TODO: find the image name and the alternative text. <draw:frame draw:style-name="fr2" draw:name="Nom" text:anchor-type="as-char" svg:width="14.97cm" svg:height="6.219cm" draw:z-index="2"> <draw:image xlink:href="Pictures/10000000000006E900000301284AF6AA.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad"/> <svg:title>Alternative</svg:title> </draw:frame>

<text:p text:style-name="P33">

<draw:frame draw:style-name="fr1" draw:name="Cadre3" text:anchor-type="paragraph" svg:width="12.698cm" draw:z-index="2"> <draw:text-box fo:min-height="9.523cm"> <text:p text:style-name="Caption"><draw:frame draw:style-name="fr2" draw:name="images3" text:anchor-type="paragraph" svg:x="0.004cm" svg:y="0.002cm" svg:width="12.698cm" style:rel-width="100%" svg:height="9.523cm" style:rel-height="scale" draw:z-index="3"><draw:image xlink:href="../../../../elec/conferences/src/knoch-mund/olgiati.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" draw:filter-name="&lt;Tous les formats&gt;"/></draw:frame>© Mirta Olgiati</text:p> </draw:text-box> </draw:frame> </text:p>

source
<xsl:template match="draw:image">
<xsl:variable name="url">
<xsl:choose>
<xsl:when test="
$filename != '' and contains(@xlink:href, $filename)
">
<xsl:value-of select="$filename"/>
<xsl:value-of select="substring-after(@xlink:href, $filename)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="@xlink:href"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<graphic url="{$url}"/>
</xsl:template>
(l. 466) draw:image
<graphic>, $filename, substring-after(@xlink:href, $filename), @xlink:href
source
<xsl:template match="text:line-break">
<xsl:variable name="class">
<xsl:for-each select="ancestor::text:p[1]">
<xsl:call-template name="class"/>
</xsl:for-each>
</xsl:variable>
<xsl:choose>
<!-- Line breaks in headings often only make sense on paper -->
<xsl:when test="ancestor::text:h">
</xsl:when>
<xsl:when test="$class='Subtitle'">
</xsl:when>
<xsl:otherwise>
<lb/>
<xsl:value-of select="$lf"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 482) text:line-break
<lb>, $lf
Line break
source
<xsl:template match="text:page-number">
<pb n="{.}"/>
</xsl:template>
(l. 502) text:page-number
<pb>

Paragraph formatting (paragraph level)

source
<xsl:template match="text:p">
<xsl:variable name="style" select="
key('style', @text:style-name | @class | @draw:style-name | @draw:text-style-name)
"/>
<!-- interpret a style as an element.
Caution: do not insert line breaks carelessly, so that grouping
with sed stays easy
    -->
<xsl:variable name="class">
<xsl:call-template name="class"/>
</xsl:variable>
<xsl:variable name="xml">
<xsl:choose>
<!-- empty paras maybe -->
<xsl:when test=".='' and not(*)"/>
<!-- styles that cannot contain paragraphs -->
<xsl:when test="$class = 'Preformatted_20_Text' or $class = 'eg'">
<eg>
<!-- keep line breaks -->
<xsl:value-of select="$lf"/>
<xsl:call-template name="bloc"/>
</eg>
</xsl:when>
<xsl:when test="
$class != '' and contains( $el-p, concat(' ', $class, ' '))
">
<xsl:element name="{$class}">
<xsl:call-template name="bloc"/>
</xsl:element>
</xsl:when>
<xsl:when test="$class = 'Salutation'">
<xsl:value-of select="$lf"/>
<salute>
<xsl:call-template name="bloc"/>
</salute>
</xsl:when>
<xsl:when test="$class = 'bibl' or $class = 'Biblio'">
<xsl:value-of select="$lf"/>
<bibl>
<xsl:call-template name="bloc"/>
</bibl>
</xsl:when>
<xsl:when test="$class = 'Signature'">
<xsl:value-of select="$lf"/>
<signed>
<xsl:call-template name="bloc"/>
</signed>
</xsl:when>
<xsl:when test="$class = 'Sender'">
<xsl:value-of select="$lf"/>
<byline>
<xsl:call-template name="bloc"/>
</byline>
</xsl:when>
<xsl:when test="starts-with($class, 'Date')">
<xsl:value-of select="$lf"/>
<dateline>
<xsl:call-template name="bloc"/>
</dateline>
</xsl:when>
<xsl:when test="$class = 'Auteur'">
<xsl:value-of select="$lf"/>
<byline>
<xsl:call-template name="bloc"/>
</byline>
</xsl:when>
<xsl:when test="$class = 'Title'">
<xsl:value-of select="$lf"/>
<head>
<xsl:call-template name="bloc"/>
</head>
</xsl:when>
<!-- styles containing paragraphs to be grouped -->
<xsl:when test="
$class != '' and contains( $el-quote, concat(' ', $class, ' '))
">
<xsl:element name="{$class}">
<xsl:value-of select="$lf"/>
<p>
<xsl:call-template name="bloc"/>
</p>
</xsl:element>
</xsl:when>
<xsl:when test="$class = 'l'">
<lg>
<xsl:value-of select="$lf"/>
<l>
<xsl:call-template name="bloc"/>
</l>
</lg>
</xsl:when>
<xsl:when test="$class = 'List_20_Heading'">
<dl>
<xsl:value-of select="$lf"/>
<label>
<xsl:call-template name="bloc"/>
</label>
</dl>
</xsl:when>
<xsl:when test="$class = 'List_20_Contents'">
<dl>
<xsl:value-of select="$lf"/>
<item>
<xsl:call-template name="bloc"/>
</item>
</dl>
</xsl:when>
<xsl:when test="$class = 'Quotations'">
<quote>
<xsl:value-of select="$lf"/>
<p>
<xsl:call-template name="bloc"/>
</p>
</quote>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$lf"/>
<p>
<xsl:variable name="rend">
<xsl:choose>
<xsl:when test="$class ='Textformatvorlage'"/>
<xsl:when test="
starts-with($class, 'WW') or starts-with($class, 'Normal') or contains($class, 'Web')
"/>
<!-- Normal paragraph -->
<xsl:when test="$class = 'Text_20_body'"/>
<xsl:when test="normalize-space($class) != ''">
<xsl:call-template name="_20_">
<xsl:with-param name="string" select="$class"/>
</xsl:call-template>
</xsl:when>
</xsl:choose>
<xsl:variable name="left" select="$style//@fo:margin-left"/>
<xsl:choose>
<xsl:when test="$left and not(starts-with($left, '0'))"> indent</xsl:when>
<xsl:when test="$style//@fo:text-align = 'end'"> right</xsl:when>
<xsl:when test="$style//@fo:text-align = 'center'"> center</xsl:when>
</xsl:choose>
<xsl:choose>
<!-- Normal paragraph entirely in bold? -->
<xsl:when test="
$style//@fo:font-weight='bold' or $style//@font-weight-complex='bold'
">
b</xsl:when>
</xsl:choose>
</xsl:variable>
<xsl:if test="normalize-space($rend)">
<xsl:attribute name="rend">
<xsl:value-of select="normalize-space($rend)"/>
</xsl:attribute>
</xsl:if>
<xsl:call-template name="bloc"/>
</p>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="border">
<xsl:call-template name="border"/>
</xsl:variable>
<xsl:choose>
<xsl:when test="$border != ''">
<figure>
<xsl:copy-of select="$xml"/>
<xsl:value-of select="$lf"/>
</figure>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="$xml"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 546) text:p
<bibl>, <byline>, <dateline>, <dl>, <eg>, <figure>, <head>, <item>, <l>, <label>, <lg>, <p>, <quote>, <salute>, <signed>, @rend, <{$class}> , " indent", " right", " center", " b", $xml, $xml, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, $lf, normalize-space($rend), $lf

Paragraphs: the XML is prepared for later grouping steps.

  • View
    Citation
    Quotations
  • odt
    <text:p text:style-name="Quotations">Citation</text:p>
    <text:p text:style-name="P13">Quotations</text:p>
  • odt_tei.xsl
    <quote><p>Citation</p></quote>
    <quote><p>Quotations</p></quote>
  • s/<\/([^>]+)>\t(\n+)<\1>/$2/g
  • <quote>
      <p>Citation</p>
      <p>Quotations</p>
    </quote>
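The paragraph-level grouping shown above can be sketched in Python as well. The pattern here is an assumption: it allows optional spaces or tabs before the newlines, rather than the single literal tab of the rule as printed, so that it matches the sample output.

```python
import re

# Assumption: optional spaces or tabs, then one or more newlines,
# separate two blocks carrying the same element name.
BLOCKS = re.compile(r"</([^>]+)>[ \t]*(\n+)<\1>")


def merge_blocks(xml: str) -> str:
    """Merge consecutive identical containers, keeping the line breaks."""
    return BLOCKS.sub(r"\2", xml)


raw = "<quote><p>Citation</p></quote>\n<quote><p>Quotations</p></quote>"
print(merge_blocks(raw))
```

Because the stylesheet emits one container per paragraph with a line break in between, a single pass is enough to obtain one <quote> holding both <p> elements.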

Grouping

  • Border: <figure>
  • Preformatted Text: <eg>
  • Definition list: list/(label+,item)+. List Heading: <label>; List Contents: <item>
  • Quotations: <quote>
source
<xsl:template name="bloc">
<xsl:variable name="tei">
<xsl:choose>
<xsl:when test="text()">
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="*"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<!--
    <xsl:choose>
      <xsl:when test="function-available('exslt:node-set')">
        <xsl:apply-templates select="exslt:node-set($tei)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="$tei"/>
      </xsl:otherwise>
    </xsl:choose>
    -->
<xsl:copy-of select="$tei"/>
</xsl:template>
(l. 713) [bloc]
$tei
Called by every block element; can allow further processing (exslt:node-set function).

Structure

source
<xsl:template match="text:h">
<!-- never mind empty headings, otherwise nesting could go wrong -->
<xsl:variable name="level" select="@text:outline-level"/>
<!-- quick-and-dirty generation of <div> -->
<xsl:if test="
$CDATA and not(ancestor::table:table | ancestor::text:list | ancestor::text:note)
">
<xsl:variable name="prev">
<xsl:choose>
<xsl:when test="preceding::text:h[1]">
<xsl:value-of select="preceding::text:h[1]/@text:outline-level"/>
</xsl:when>
</xsl:choose>
</xsl:variable>
<!-- close previously opened sections; the line break is needed for some odd reason -->
<xsl:if test="$prev">
<xsl:value-of select="$lf"/>
<xsl:variable name="close">&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</xsl:variable>
<xsl:value-of disable-output-escaping="yes" select="substring($close, 1, (1+$prev - $level) * 6)"/>
</xsl:if>
<!-- open sections -->
<xsl:variable name="open">&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;</xsl:variable>
<xsl:value-of select="$lf"/>
<xsl:value-of disable-output-escaping="yes" select="substring($open, 1, 5)"/>
<xsl:value-of disable-output-escaping="yes" select="substring($open, 1, ($level - $prev - 1) * 5)"/>
</xsl:if>
<xsl:variable name="xml">
<xsl:value-of select="$lf"/>
<head type="h{$level}">
<xsl:variable name="start-value" select="
/office:document/office:document-styles/office:styles/text:outline-style[@style:name='Outline']/text:outline-level-style[@text:level='1']/@text:start-value
"/>
<xsl:if test="$start-value">
<xsl:variable name="n">
<xsl:number count="text:h[@text:outline-level = $level]" level="any"/>
</xsl:variable>
<xsl:attribute name="n">
<xsl:value-of select="$n - 1 + $start-value"/>
</xsl:attribute>
</xsl:if>
<xsl:call-template name="bloc"/>
</head>
</xsl:variable>
<xsl:variable name="border">
<xsl:call-template name="border"/>
</xsl:variable>
<xsl:choose>
<xsl:when test="$border != ''">
<figure>
<xsl:copy-of select="$xml"/>
<xsl:value-of select="$lf"/>
</figure>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="$xml"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 742) text:h
<figure>, <head>, @n, "0", $xml, $xml, preceding::text:h[1]/@text:outline-level, $lf, substring($close, 1, (1+$prev - $level) * 6), $lf, substring($open, 1, 5), substring($open, 1, ($level - $prev - 1) * 5), $lf, $n - 1 + $start-value, $lf
Headings, with division structuring
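The substring() arithmetic used to emit <div> tags can be hard to follow. The Python sketch below restates it under stated assumptions: a missing previous heading yields no closing tag and a single opening tag, negative lengths yield nothing, and the final closing mirrors the office:text template. The function names are illustrative and the line breaks emitted by the stylesheet are omitted.

```python
from typing import List, Optional


def div_tags(prev: Optional[int], level: int) -> str:
    """Tags emitted before a heading of `level`, given the previous level."""
    # (1 + prev - level) closing tags; a negative count emits nothing
    closing = 0 if prev is None else max(0, 1 + prev - level)
    # always one <div>, plus one per skipped level
    opening = 1 + (0 if prev is None else max(0, level - prev - 1))
    return "</div>" * closing + "<div>" * opening


def wrap_headings(levels: List[int]) -> str:
    """Emit the div skeleton for a sequence of heading levels."""
    out, prev = [], None
    for level in levels:
        out.append(div_tags(prev, level))
        out.append(f'<head type="h{level}"/>')
        prev = level
    if prev is not None:
        out.append("</div>" * prev)  # closing done once, at the end of the body
    return "".join(out)


print(wrap_headings([1, 2, 2, 3, 1]))
```

In this sketch the tags balance when the first heading is at level 1; a document starting at a deeper level would get more closing than opening tags.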
source
<xsl:template match="text:list-header">
<head>
</head>
</xsl:template>
(l. 798) text:list-header
<head>
Heading inside a list
source
<xsl:template match="office:text">
<body>
<xsl:apply-templates select="*"/>
<xsl:if test="$CDATA">
<!-- Beware if the last heading is empty -->
<xsl:variable name="prev" select=".//text:h[position() = last()]/@text:outline-level"/>
<xsl:variable name="close">&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</xsl:variable>
<xsl:value-of select="$lf"/>
<xsl:value-of disable-output-escaping="yes" select="substring($close, 1, $prev * 6)"/>
</xsl:if>
</body>
</xsl:template>
(l. 804) office:text
<body>, $lf, substring($close, 1, $prev * 6)
Text body; closes the divisions opened by headings
source
<xsl:template match="text:table-of-content | text:alphabetical-index ">
<xsl:value-of select="$lf"/>
<divGen>
<xsl:attribute name="type">
<xsl:choose>
<xsl:when test="self::text:table-of-content">toc</xsl:when>
<xsl:when test="self::text:alphabetical-index">index</xsl:when>
<xsl:otherwise>
<xsl:value-of select="local-name()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:attribute>
</divGen>
</xsl:template>
(l. 817) text:table-of-content | text:alphabetical-index
<divGen>, @type, "toc", "index", $lf, local-name()
Generated section
source
<xsl:template match="office:document">
<xsl:apply-templates select="office:document-content"/>
</xsl:template>
(l. 834) office:document

Caution: do not match the root, or the template will also match anything passed in as a node-set: <xsl:template match="/"/>

source
<xsl:template match="office:document-content">
<xsl:if test="function-available('date:date-time')">
<xsl:comment>
<xsl:value-of select="date:date-time()"/>
</xsl:comment>
</xsl:if>
<TEI>
<!-- TODO: take the metadata -->
<teiHeader>
<fileDesc>
<titleStmt>
<title/>
</titleStmt>
<publicationStmt>
<p/>
</publicationStmt>
<sourceDesc>
<p/>
</sourceDesc>
</fileDesc>
</teiHeader>
<xsl:apply-templates select="*"/>
</TEI>
</xsl:template>
(l. 837) office:document-content
<TEI>, <fileDesc>, <p>, <publicationStmt>, <sourceDesc>, <teiHeader>, <title>, <titleStmt>, date:date-time()
source
<xsl:template match="office:meta">
<teiHeader>
<xsl:apply-templates select="*"/>
</teiHeader>
</xsl:template>
(l. 862) office:meta
<teiHeader>
Structure
source
<xsl:template match="office:body">
<text>
<xsl:apply-templates select="*"/>
</text>
</xsl:template>
(l. 868) office:body
<text>
Container
source
<xsl:template match="text:section">
</xsl:template>
(l. 874) text:section
Pass through
source
<xsl:template match="draw:frame">
<xsl:choose>
<xsl:when test="ancestor::draw:frame">
</xsl:when>
<xsl:otherwise>
<figure>
</figure>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 878) draw:frame
<figure>
Text frames
source
<xsl:template match="draw:text-box">
</xsl:template>
(l. 890) draw:text-box
source
<xsl:template match="draw:plugin">
<ptr target="{@xlink:href}"/>
</xsl:template>
(l. 893) draw:plugin
<ptr>
source
<xsl:template match="
office:scripts | office:font-face-decls | text:sequence-decls | office:forms | office:automatic-styles | text:soft-page-break | office:settings | office:styles | office:master-styles
"/>
(l. 899) office:scripts | office:font-face-decls | text:sequence-decls | office:forms | office:automatic-styles | text:soft-page-break | office:settings | office:styles | office:master-styles
Drop

Indexing

source
<xsl:template match="
text:alphabetical-index-mark | text:alphabetical-index-mark-start | text:user-index-mark
">
<term rend="index">
<xsl:if test="@text:key1">
<xsl:attribute name="type">
<xsl:value-of select="@text:key1"/>
</xsl:attribute>
</xsl:if>
<xsl:if test="@text:key2">
<xsl:attribute name="subtype">
<xsl:value-of select="@text:key2"/>
</xsl:attribute>
</xsl:if>
<xsl:attribute name="key">
<xsl:choose>
<xsl:when test="@text:string-value">
<xsl:value-of select="@text:string-value"/>
</xsl:when>
<xsl:when test="self::text:alphabetical-index-mark-start">
<xsl:variable name="id" select="@text:id"/>
<xsl:value-of select="
following-sibling::node()[following-sibling::text:alphabetical-index-mark-end[@text:id=$id]]
"/>
</xsl:when>
</xsl:choose>
</xsl:attribute>
</term>
</xsl:template>
(l. 908) text:alphabetical-index-mark | text:alphabetical-index-mark-start | text:user-index-mark
<term>, @type, @text:key1, @text:key2, @text:string-value, following-sibling::node()[following-sibling::text:alphabetical-index-mark-end[@text:id=$id]]

Format an index mark, to be called in text:alphabetical-index-mark; the attribute protects the element from being collapsed during simplification.

source
<xsl:template match="text:alphabetical-index-mark-end"/>
(l. 933) text:alphabetical-index-mark-end

Notes

source
<xsl:template match="text:span[text:note][not(*[2])]" priority="3">
</xsl:template>
(l. 940) text:span[text:note][not(*[2])]
Pass through note references
source
<xsl:template match="text:note">
<note>
<!-- keep the number -->
<xsl:if test="text:note-citation">
<xsl:attribute name="n">
<xsl:value-of select="normalize-space(text:note-citation)"/>
</xsl:attribute>
</xsl:if>
<xsl:attribute name="type">
<xsl:value-of select="@text:note-class"/>
</xsl:attribute>
<xsl:apply-templates select="text:note-body"/>
</note>
</xsl:template>
(l. 947) text:note
<note>, @n, normalize-space(text:note-citation), @text:note-class

<text:span class="footnote_20_symbol"><text:note text:id="ftn144" text:note-class="footnote"><text:note-citation>144</text:note-citation><text:note-body> <text:p fo:text-align="justify" class="standard"><text:span><text:s/>nitebantur] nitebanter.</text:span></text:p></text:note-body></text:note></text:span>

source
<xsl:template match="text:note-body">
<xsl:choose>
<xsl:when test="count(*)=1">
<xsl:apply-templates select="*/node()"/>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 961) text:note-body
source
<xsl:template match="office:annotation">
<note place="margin" resp="{dc:creator}">
<xsl:comment>
<xsl:value-of select="dc:date"/>
</xsl:comment>
<xsl:choose>
<xsl:when test="count(text:*)=1">
<xsl:apply-templates select="text:*/node()"/>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates select="text:*"/>
</xsl:otherwise>
</xsl:choose>
</note>
</xsl:template>
(l. 972) office:annotation
<note>, dc:date
The yellow notes (annotations)

Lists and tables

source
<xsl:template match="text:list">
<xsl:variable name="style-name" select="@text:style-name"/>
<xsl:variable name="level" select="count(ancestor-or-self::text:list)"/>
<!-- handle on the style to inspect -->
<xsl:variable name="list-level" select="
key('list-style', $style-name)/*[@text:level=$level]
"/>
<xsl:variable name="list">
<xsl:value-of select="$lf"/>
<list>
<xsl:choose>
<xsl:when test="
local-name($list-level) = 'list-level-style-bullet'
">
<xsl:attribute name="type">ul</xsl:attribute>
</xsl:when>
<xsl:when test="
local-name($list-level) = 'list-level-style-number'
">
<xsl:attribute name="type">ol</xsl:attribute>
</xsl:when>
</xsl:choose>
<xsl:if test="name($list-level)">
<xsl:attribute name="rend">
<xsl:value-of select="
($list-level/@text:bullet-char | $list-level/@style:num-format)
"/>
</xsl:attribute>
</xsl:if>
<xsl:apply-templates select="*"/>
<xsl:value-of select="$lf"/>
</list>
</xsl:variable>
<!-- Grab the first item to see whether it is bordered -->
<xsl:variable name="stylename" select="
*//@text:style-name | *//@class | *//@draw:style-name | *//@draw:text-style-name
"/>
<xsl:variable name="border">
<xsl:for-each select="text:list-item[1][count(*) = 1]/*[1]">
<xsl:call-template name="border"/>
</xsl:for-each>
</xsl:variable>
<xsl:choose>
<xsl:when test="$border != ''">
<!-- No space before a border, to help future cleaning -->
<figure>
<xsl:copy-of select="$list"/>
<xsl:value-of select="$lf"/>
</figure>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="$list"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 994) text:list
<figure>, <list>, @type, "ul", "ol", $list, $list, $lf, ($list-level/@text:bullet-char | $list-level/@style:num-format), $lf, $lf
List
source
<xsl:template name="border">
<!-- style name, may be passed as a parameter -->
<xsl:param name="style-name" select="
@text:style-name | @class | @draw:style-name | @draw:text-style-name
"/>
<!-- handle on the style to inspect -->
<xsl:variable name="style" select="key('style', $style-name)"/>
<!-- automatic style name (T or P followed by digits only) -->
<xsl:variable name="l" select="substring($style-name, 1, 1)"/>
<xsl:variable name="style_auto" select="
boolean( ( $l = 'T' or $l = 'P') and translate(substring($style-name, 2), '1234567890', '') = '')
"/>
<xsl:choose>
<!-- border explicitly cancelled -->
<xsl:when test="$style//@fo:border = 'none'"/>
<xsl:when test="$style//@fo:border">
<xsl:value-of select="$style//@fo:border"/>
</xsl:when>
<!-- automatic style: take the parent -->
<xsl:when test="$style/@style:parent-style-name">
<xsl:variable name="class" select="key('style', $style/@style:parent-style-name)"/>
<xsl:value-of select="$class//@fo:border"/>
</xsl:when>
</xsl:choose>
</xsl:template>
(l. 1039) [border]
$style//@fo:border, $class//@fo:border
  • $style-name: style name, may be passed as a parameter
source
<xsl:template match="text:list-item">
<xsl:value-of select="$lf"/>
<item>
<xsl:choose>
<!-- No need to wrap a single paragraph inside an item. Retrieve the item style? -->
<xsl:when test="count(*) = 1">
<xsl:for-each select="*[1]">
<xsl:variable name="class">
<xsl:call-template name="class"/>
</xsl:variable>
<xsl:if test="$class != ''">
<xsl:attribute name="rend">
<xsl:value-of select="$class"/>
</xsl:attribute>
</xsl:if>
</xsl:for-each>
<xsl:apply-templates select="*/node()"/>
</xsl:when>
<xsl:otherwise>
<!-- several children: process them as they are -->
<xsl:apply-templates/>
</xsl:otherwise>
</xsl:choose>
</item>
</xsl:template>
(l. 1061) text:list-item
<item>, @rend, $lf, $class
List item
source
<xsl:template match="table:table">
<xsl:value-of select="$lf"/>
<table>
<!--
table:name="Tableau1" table:style-name="Tableau1"
-->
<xsl:apply-templates select="node()"/>
<xsl:value-of select="$lf"/>
</table>
</xsl:template>
(l. 1086) table:table
<table>, $lf, $lf
table
source
<xsl:template match="table:table-column"/>
(l. 1097) table:table-column
Columns: nothing is output
source
<xsl:template match="table:table-row | table:row">
<xsl:param name="role"/>
<xsl:value-of select="$lf"/>
<row>
<xsl:if test="$role != ''">
<xsl:attribute name="role">
<xsl:value-of select="$role"/>
</xsl:attribute>
</xsl:if>
<!--
      <xsl:apply-templates select="@*"/>
      table:style-name="Tableau1.2"
      -->
<!-- process the cells of the row -->
<xsl:apply-templates/>
<xsl:value-of select="$lf"/>
</row>
</xsl:template>
(l. 1099) table:table-row | table:row
<row>, @role, $lf, $role, $lf
  • $role: role of the row, e.g. "head" when passed by table:table-header-rows
Rows
source
<xsl:template match="table:table-cell | table:cell">
<xsl:value-of select="$lf"/>
<cell>
<xsl:apply-templates select="@*"/>
<xsl:choose>
<!-- empty cell; seen in real files: non-breaking spaces (?) -->
<xsl:when test="translate(normalize-space(.), '&#160; ', '') = ''"/>
<!-- a single paragraph -->
<xsl:when test="count(*)=1">
<xsl:apply-templates select="*/node()"/>
</xsl:when>
<xsl:otherwise>
<!-- several children: process them as they are -->
<xsl:apply-templates/>
<xsl:value-of select="$lf"/>
</xsl:otherwise>
</xsl:choose>
</cell>
</xsl:template>
(l. 1117) table:table-cell | table:cell
<cell>, $lf, $lf
Cells
source
<xsl:template match="@office:value-type"/>
(l. 1135) @office:value-type
source
<xsl:template match="@table:number-columns-spanned">
<xsl:if test="number(.) &gt; 1">
<xsl:attribute name="cols">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:if>
</xsl:template>
(l. 1136) @table:number-columns-spanned
@cols, .
source
<xsl:template match="@table:number-rows-spanned">
<xsl:if test="number(.) &gt; 1">
<xsl:attribute name="rows">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:if>
</xsl:template>
(l. 1143) @table:number-rows-spanned
@rows, .
source
<xsl:template match="@table:style-name">
<!-- TODO: check against real cases -->
</xsl:template>
(l. 1150) @table:style-name
source
<xsl:template match="table:covered-table-cell"/>
(l. 1154) table:covered-table-cell
Empty cell resulting from a merge
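A minimal sketch of how the cell templates above combine on a horizontally merged cell. The input fragment is invented for illustration, and the result assumes that plain text is copied through unchanged; line breaks inserted through $lf and namespace declarations are left out.

```xml
<!-- Hypothetical ODF input: a cell spanning two columns. ODF follows it
     with a <table:covered-table-cell/> placeholder, which is dropped. -->
<table:table-cell
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:value-type="string"
    table:number-columns-spanned="2"
    table:number-rows-spanned="1"><text:p>Total</text:p></table:table-cell>

<!-- Expected TEI, under the assumptions stated above:
     <cell cols="2">Total</cell>
     - office:value-type is silently dropped
     - the row span of 1 is not greater than 1, so no @rows is written
     - the single text:p wrapper is unwrapped -->
```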
source
<xsl:template match="table:table-header-rows">
<xsl:apply-templates>
<xsl:with-param name="role">head</xsl:with-param>
</xsl:apply-templates>
</xsl:template>
(l. 1156) table:table-header-rows
thead

Links and cross-references

source
<xsl:template match="
text:bookmark-start[starts-with(@text:name, '_Toc')] | text:bookmark-end[starts-with(@text:name, '_Toc')] | text:bookmark[starts-with(@text:name, '_toc')]
"/>
(l. 1174) text:bookmark-start[starts-with(@text:name, '_Toc')] | text:bookmark-end[starts-with(@text:name, '_Toc')] | text:bookmark[starts-with(@text:name, '_toc')]

some text <text:bookmark-start text:name="_Toc177196146"/>Préface<text:bookmark-end text:name="_Toc177196146"/> <text:bookmark text:name="_toc1604"/>

source
<xsl:template match="text:bookmark-start | text:bookmark-end"/>
(l. 1176) text:bookmark-start | text:bookmark-end
source
<xsl:template match="text:bookmark">
<xsl:choose>
<xsl:when test="contains(@text:name, 'RefHeading')"/>
<xsl:otherwise>
<anchor>
<xsl:attribute name="xml:id">
<xsl:value-of select="translate(@text:name, ' ', '_')"/>
</xsl:attribute>
</anchor>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 1178) text:bookmark
<anchor>, @xml:id, translate(@text:name, ' ', '_')
Anchor
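To illustrate the three bookmark cases handled above, here is a small invented input with the result each bookmark should give. The bookmark names are hypothetical, apart from _toc1604, which is taken from the sample earlier in this section.

```xml
<!-- Hypothetical ODF input -->
<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <text:bookmark text:name="chap 1"/>
  <text:bookmark text:name="__RefHeading__12_345"/>
  <text:bookmark text:name="_toc1604"/>
</text:p>

<!-- Expected result for each bookmark (namespace declarations aside):
     "chap 1"               -> <anchor xml:id="chap_1"/>   spaces become "_"
     "__RefHeading__12_345" -> nothing                     name contains RefHeading
     "_toc1604"             -> nothing                     generated table-of-contents mark -->
```

The last case relies on the template with a predicate taking priority over the plain text:bookmark match, which is the default behaviour in XSLT 1.0.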
source
<xsl:template match="text:bookmark-end"/>
(l. 1190) text:bookmark-end

Miscellaneous

source
<xsl:template name="_20_">
<xsl:param name="string"/>
<xsl:choose>
<xsl:when test="contains($string, '_20_')">
<xsl:value-of select="substring-before($string, '_20_')"/>
<!-- separator listed in the output summary of this template -->
<xsl:text>_</xsl:text>
<xsl:call-template name="_20_">
<xsl:with-param name="string" select="substring-after($string, '_20_')"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select=" $string "/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 1200) [_20_]
"_", substring-before($string, '_20_') , $string
  • $string :
Restore OOo style names (_20_ is how OOo encodes a space in a style name)
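The recursion used by this named template can be tried in isolation. The stand-alone stylesheet below is a sketch, not the original source: it assumes, as the output summary above suggests, that each _20_ is replaced by a single underscore. The driver template and the sample value Footnote_20_Symbol are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical driver: works with any XML input document -->
  <xsl:template match="/">
    <xsl:call-template name="_20_">
      <xsl:with-param name="string" select="'Footnote_20_Symbol'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Replace every "_20_" by "_" (assumed separator), one occurrence
       per recursive call -->
  <xsl:template name="_20_">
    <xsl:param name="string"/>
    <xsl:choose>
      <xsl:when test="contains($string, '_20_')">
        <xsl:value-of select="substring-before($string, '_20_')"/>
        <xsl:text>_</xsl:text>
        <xsl:call-template name="_20_">
          <xsl:with-param name="string"
              select="substring-after($string, '_20_')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$string"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With the sample value, the transformation prints Footnote_Symbol. XSLT 1.0 has no string replace function, which is why the substitution is written as a recursive named template.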
source
<xsl:template name="color">
<xsl:param name="code"/>
<xsl:variable name="hex" select="translate($code, 'abcdef', 'ABCDEF')"/>
<xsl:choose>
<xsl:when test="$hex = 'FFFFFF'">white</xsl:when>
<xsl:when test="$hex = 'FF0000'">red</xsl:when>
<xsl:when test="$hex = '00FF00'">green</xsl:when>
<xsl:when test="$hex = '0000FF'">blue</xsl:when>
<xsl:when test="$hex = 'FFFF00'">yellow</xsl:when>
<xsl:when test="$hex = '00FFFF'">cyan</xsl:when>
<xsl:when test="$hex = 'FF00FF'">magenta</xsl:when>
<xsl:when test="$hex = '808080'">gray</xsl:when>
<xsl:when test="$hex = '000000'">black</xsl:when>
<xsl:otherwise>
<!-- prefix listed in the output summary of this template -->
<xsl:text>_</xsl:text>
<xsl:value-of select="$hex"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
(l. 1217) [color]
"white", "red", "green", "blue", "yellow", "cyan", "magenta", "gray", "black", "_", $hex
  • $code :
Interpret a few colour codes
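A possible way to try the color template on its own is a small driver that imports the stylesheet. This is a sketch under assumptions: the stylesheet is saved as odt_tei.xsl next to the driver, and the code is passed as six hexadecimal digits without a leading "#", which is what the comparisons in the template imply. The driver itself is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed location of the documented stylesheet -->
  <xsl:import href="odt_tei.xsl"/>
  <xsl:output method="text"/>

  <!-- Hypothetical driver: has higher import precedence than any
       template in odt_tei.xsl, so it runs instead of the normal
       conversion, with any XML input document -->
  <xsl:template match="/">
    <xsl:call-template name="color">
      <!-- lower case on purpose: the template upper-cases a-f first -->
      <xsl:with-param name="code" select="'ff0000'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

With this value the template should print red. Codes outside the nine listed colours fall through to the xsl:otherwise branch.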
source
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
(l. 1239) node()|@*
node()|@*
By default, copy everything