2011-10-06 2 views

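Adding anchors to HTML using a list of regexes

I'm trying to add anchors to my HTML output. The HTML is produced by transforming XML with XSLT 2.0. I need to be able to pass a list of regular expressions into my stylesheet and have every match of any regex in the list turned into an anchor. I have code that works for a single regex, but when I run it over a list of regexes I get multiple copies of the same paragraph. I'm by no means an expert in XSLT 2.0, and I don't know whether it can be done this way at all. I could also use C# if that is easier, though I'm not sure it would be a better solution.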

Here is the code that works for a single regex:

<xsl:template match="text()" mode="content"> 
    <xsl:variable name="text"> 
     <xsl:value-of select="."></xsl:value-of> 
    </xsl:variable> 
    <!-- 
     IndexTerms is a parameter passed into the stylesheet; it is a list of regular expressions separated by semicolons 
    --> 
    <xsl:for-each select="tokenize($IndexTerms, ';')"> 
     <xsl:call-template name="IndexTerm"> 
      <xsl:with-param name="matchedRegex"> 
       <xsl:text>(.*)(</xsl:text> 
       <xsl:value-of select="."></xsl:value-of> 
       <xsl:text>)(.*)</xsl:text> 
      </xsl:with-param> 
      <xsl:with-param name="text"> 
       <xsl:value-of select="$text"></xsl:value-of> 
      </xsl:with-param> 
     </xsl:call-template> 
    </xsl:for-each> 
</xsl:template> 

<xsl:template name="IndexTerm"> 
    <xsl:param name="matchedRegex"> 
     <xsl:text>asdf</xsl:text> 
    </xsl:param> 
    <xsl:param name="text"></xsl:param> 
    <xsl:analyze-string select="$text" regex="{$matchedRegex}" flags="m"> 
     <xsl:matching-substring> 
      <xsl:call-template name="IndexTerm"> 
       <xsl:with-param name="text"> 
        <xsl:value-of select="regex-group(1)"></xsl:value-of> 
       </xsl:with-param> 
       <xsl:with-param name="matchedRegex"> 
        <xsl:value-of select="$matchedRegex"></xsl:value-of> 
       </xsl:with-param> 
      </xsl:call-template> 
       <xsl:element name="a"> 
        <xsl:attribute name="class"> 
         <xsl:text>IndexAnchor</xsl:text> 
        </xsl:attribute> 
        <xsl:value-of select="regex-group(2)"></xsl:value-of> 
       </xsl:element> 
       <xsl:value-of select="regex-group(3)"></xsl:value-of> 
     </xsl:matching-substring> 
     <xsl:non-matching-substring> 
      <xsl:value-of select="."></xsl:value-of> 
     </xsl:non-matching-substring> 
    </xsl:analyze-string> 
</xsl:template> 

Sample input, using the regex input "Digital Television, Internet?":

<body> 
<sec sec-type="intro"> 
    <title>INTRODUCTION</title> 
    <p>Digital Television is the most advanced version of Television 
     technology improved in the last century. Digital TV provides 
     customers more choices and interactivity. New technology called 
     Internet Protocol-based Television (IPTV) uses digital TV technology 
     and transmits it over IP based networks (Driscol, 2008), 
      (<xref ref-type="bibr" rid="r15">Moawad, 2008</xref>). IPTV is a 
     technique that transmits TV and video content over a network that 
     uses the IP networking protocol. With increasing the number of 
     users, performance becomes more important in order to provide 
     interest in video content applications and relative services. The 
     requirement for new video applications on traditional broadcast 
     networks (cable, terrestrial transmitters, and satellite) opens a 
     new perspective for the developed use of IP networks to satisfy the 
     new service demands (Driscol, 
     2008</p> 
    <sec> 
     <title>More Introducing</title> 
     <p>Internet Protocol Television, IPTV, Telco TV, or broadband TV is 
      delivering high quality broadcast television and/or on-demand video 
      and audio content over a broadband network. On the other hand, IPTV 
      is a mechanism applied to deliver old TV channels, movies, and 
      video-on-demand contents over a private network. The official 
      definition approved by the International Telecommunication Union 
      focus group on IPTV (ITU-T FG IPTV) is as: &#x201C;IPTV is 
      defined as multimedia services such as 
      television/video/audio/text/graphics /data delivered over IP based 
      networks managed to provide the required level of quality of service 
      and experience, security, interactivity and reliability&#x201D; 
      (Driscol, 2008, 
      pp.2).</p> 
    </sec> 
</sec> 

The sample output would be:

<body> 
<h1>INTRODUCTION</h1> 
<p><a class="IndexAnchor">Digital Television</a> is the most advanced version of Television 
    technology improved in the last century. Digital TV provides 
    customers more choices and interactivity. New technology called 
    <a class="IndexAnchor">Internet</a> Protocol-based Television (IPTV) uses digital TV technology 
    and transmits it over IP based networks (Driscol, 2008), 
     (Moawad, 2008). IPTV is a 
    technique that transmits TV and video content over a network that 
    uses the IP networking protocol. With increasing the number of 
    users, performance becomes more important in order to provide 
    interest in video content applications and relative services. The 
    requirement for new video applications on traditional broadcast 
    networks (cable, terrestrial transmitters, and satellite) opens a 
    new perspective for the developed use of IP networks to satisfy the 
    new service demands (Driscol, 
    2008</p> 
<h2>More Introducing</h2> 
    <p><a class="IndexAnchor">Internet</a> Protocol Television, IPTV, Telco TV, or broadband TV is 
     delivering high quality broadcast television and/or on-demand video 
     and audio content over a broadband network. On the other hand, IPTV 
     is a mechanism applied to deliver old TV channels, movies, and 
     video-on-demand contents over a private network. The official 
     definition approved by the International Telecommunication Union 
     focus group on IPTV (ITU-T FG IPTV) is as: &#x201C;IPTV is 
     defined as multimedia services such as 
     television/video/audio/text/graphics /data delivered over IP based 
     networks managed to provide the required level of quality of service 
     and experience, security, interactivity and reliability&#x201D; 
     (Driscol, 2008, 
     pp.2).</p> 


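So you have a list of regular expression patterns, separated by semicolons, in a string parameter. What does some sample input look like, and what corresponding output do you want to create with XSLT?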

Answers


Frankly, instead of separating the patterns with semicolons, I would use the bar "|", which is the regular expression language's own character for separating alternatives:

<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:xs="http://www.w3.org/2001/XMLSchema" 
    version="2.0" 
    exclude-result-prefixes="xs"> 

    <xsl:param name="patterns" as="xs:string" select="'Digital Televisions?|Internet'"/> 

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@*, node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="{$patterns}">
            <xsl:matching-substring>
                <a class="IndexAnchor">
                    <xsl:value-of select="."/>
                </a>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet> 

That way you can feed the complete parameter straight to xsl:analyze-string. Does that help? If you need to convert a semicolon-separated list into a bar-separated one, use <xsl:param name="patterns" as="xs:string" select="string-join(tokenize($yourParam, ';'), '|')"/>.
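
As a minimal sketch of that conversion, the two top-level declarations below could replace the patterns parameter in the stylesheet above. The parameter name IndexTerms comes from the question; its default value and the filter on empty tokens are assumptions added for illustration. The filter is there because a trailing or doubled semicolon would otherwise leave an empty alternative in the joined pattern, and XSLT 2.0 treats a regex that can match a zero-length string as an error in xsl:analyze-string.

    <!-- Semicolon-separated list as described in the question; the default value is only an example -->
    <xsl:param name="IndexTerms" as="xs:string" select="'Digital Televisions?;Internet'"/>

    <!-- Drop empty or whitespace-only tokens (an assumption, see above), then join the rest with "|" -->
    <xsl:variable name="patterns" as="xs:string"
        select="string-join(tokenize($IndexTerms, ';')[normalize-space(.) ne ''], '|')"/>

With the example default, $patterns evaluates to Digital Televisions?|Internet, the same value the stylesheet above uses.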

I did not use modes, and I have not seen the other transformations you want, but you should of course be able to use the template I provided with a mode if you need one.
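
A sketch of what that might look like, assuming the mode is the "content" mode from the question and that $patterns is declared as in the stylesheet above:

    <!-- The same text() template as above, restricted to the question's "content" mode -->
    <xsl:template match="text()" mode="content">
        <xsl:analyze-string select="." regex="{$patterns}">
            <xsl:matching-substring>
                <!-- Each match of any alternative becomes an anchor -->
                <a class="IndexAnchor">
                    <xsl:value-of select="."/>
                </a>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <!-- Text between matches is copied through unchanged -->
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

This template only fires for text nodes reached through xsl:apply-templates with mode="content" (or mode="#current" from inside a template already running in that mode), so the templates handling the surrounding elements have to pass the mode along. Because each text node is analyzed once against the combined pattern, it is also output only once.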


Works, thank you very much.
