XSLT를 사용하여 HTML 파일의 특성을 제거하려고합니다. HTML 파일은 다음과 같습니다 : 그것은 개념의 증거이기 때문에XSLT 서식 지정 HTML 입력
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
<title>CB Comfy Bike</title>
<meta name="atg:string,index:$repositoryId" content="prod10005" />
<meta name="atg:date:creationDate" content="955050507" />
<meta name="atg:date:startDate" content="978325200" />
<meta name="atg:date:endDate" content="1009861200" />
<meta name="atg:string:$url"
content="atgrep:/ProductCatalog/frame-product/prod10005?locale=en_US" />
<meta name="atg:string,index:$baseUrl"
content="atgrep:/ProductCatalog/frame-product/prod10005" />
<meta name="atg:string:$repository.repositoryName" content="ProductCatalog" />
<meta name="atg:string:$itemDescriptor.itemDescriptorName" content="frame-product" />
<meta name="atg:string:childSKUs.$repositoryId" content="sku20007" />
<meta name="atg:string:childSKUs.$itemDescriptor.itemDescriptorName" content="bike-sku" />
<meta name="atg:date:childSKUs.creationDate" content="955068027" />
<meta name="atg:float:childSKUs.listPrice" content="400.0" />
<meta name="atg:float:childSKUs.salePrice" content="300.0" />
<meta name="atg:boolean:childSKUs.onSale" content="false" />
<meta name="atg:string:parentCategory.$repositoryId" content="cat55551" />
<meta name="atg:date:parentCategory.creationDate" content="956950321" />
<meta name="atg:string,docset:ancestorCategories.$repositoryId" content="cat10002" />
<meta name="atg:string,docset:ancestorCategories.$repositoryId" content="cat10003" />
<meta name="atg:string,docset:ancestorCategories.$repositoryId" content="cat55551" />
</head>
<body>
<div class="atg:role:displayName" id="0"> CB Comfy Bike </div>
<div class="atg:role:longDescription" id="1"> This bike is just right, whether you are a
commuter or want to explore the fire roads. The plush front suspension will smooth out
the roughest bumps and the big disc brakes provide extra stopping power for those big
downhills. </div>
<div class="atg:role:keywords" id="2"> mountain_bike comfort_bike </div>
<div class="atg:role:childSKUs.displayName" id="3"> CB Comfy Bike Medium </div>
<div class="atg:role:childSKUs.listPrice" id="4"> 400.0 </div>
<div class="atg:role:childSKUs.description" id="5"> Medium </div>
<div class="atg:role:parentCategory.displayName" id="6"> Mountain Bikes </div>
</body>
</html>
나는 각 사업부에 대한 새 태그에서 찾고 있어요는, 내가 아직 이름 convensions에 집중하지 않았습니다. 그러나 어떻게 div 태그 사이의 differenciate 모르겠다. 내가 지금까지 가지고이 XSLT는 다음과 같습니다
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="head"/>
<xsl:template match="body">
<xsl:copy>
<xsl:value-of select="div/text()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
반환 :
<?xml version="1.0" encoding="utf-8"?>
<html>
<body> CB Comfy Bike </body>
</html>
어떻게 내가 가지고있는 문제는 구별되는이
<?xml version="1.0" encoding="UTF-8"?>
<root>
<tag1>CB Comfy Bike</tag1>
<tag2>This bike is just right, whether you are a
commuter or want to explore the fire roads. The plush front suspension will smooth out
the roughest bumps and the big disc brakes provide extra stopping power for those big
downhills.</tag2>
<tag3>mountain_bike comfort_bike</tag3>
<tag4>CB Comfy Bike Medium</tag4>
<tag5>400.0</tag5>
<tag6>Medium</tag6>
<tag7>Mountain Bikes</tag7>
</root>
처럼 무언가로 입력을 켜 것 Div 태그
일부 XSLT 프로세서는 임의의 HTML을 처리하는 데 문제가있을 수 있습니다. 가장 중요한 것은 XML 명세에서 많은 명명 된 엔티티가 없다는 것입니다. 내가 틀렸다면 누군가가 나를 바로 잡을 수 있습니까? – Sprague