2014-10-14 4 views
-1

XML::Twig beginner script

I am starting to study XML parsing with XML::Twig and have run into 2 small problems. The structure of my XML (a collection of bibliographic records) is the following:

<?xml version="1.0" encoding="UTF-8"?> 

<collection xmlns="http://www.loc.gov/MARC21/slim"> 
<!-- FIRST INCREMENTAL --> 
<!-- INSTANCE:sfxudn --> 
<record> 
    <leader>-----nas-a2200000z--4500</leader> 
    <controlfield tag="008">140922uuuuuuuuuxx-uu-|------u|----|eng-d</controlfield> 
    <datafield tag="010" ind1="" ind2=""> 
    <subfield code="a">01015589</subfield> 
    </datafield> 
    <datafield tag="245" ind1="" ind2="0"> 
    <subfield code="a">Publishers weekly</subfield> 
    </datafield> 
    <datafield tag="260" ind1="" ind2=""> 
    <subfield code="a">New York, NY</subfield> 
    <subfield code="b">Reed Business Information</subfield> 
    </datafield> 
    <datafield tag="022" ind1="" ind2=""> 
    <subfield code="a">0000-0019</subfield> 
    </datafield> 
    <datafield tag="776" ind1="" ind2=""> 
    <subfield code="x">2150-4008</subfield> 
    </datafield> 
    <datafield tag="090" ind1="" ind2=""> 
    <subfield code="a">954921332001</subfield> 
    </datafield> 
    <datafield tag="866" ind1="" ind2=""> 
    <subfield code="a">Available from 1997. </subfield> 
    <subfield code="s">1000000000001224</subfield> 
    <subfield code="t">1000000000000630</subfield> 
    <subfield code="x">EBSCOhost Business Source Complete:Full Text</subfield> 
    <subfield code="z">1000000000125212</subfield> 
    </datafield> 
</record> 

....more records... 
    </collection> 

I would like to manipulate it as follows:

1) Add one constant line (with constant content) at a precise position inside "/collection/record/datafield[@tag='866']", right after "subfield[@code='a']". In other words, this:

<datafield tag="866" ind1="" ind2=""> 
    <subfield code="a">Available from 1997. </subfield> 
    <subfield code="s">1000000000001224</subfield> 
    <subfield code="t">1000000000000630</subfield> 
    <subfield code="x">EBSCOhost Business Source Complete:Full Text</subfield> 
    <subfield code="z">1000000000125212</subfield> 
    </datafield> 

should be transformed into

<datafield tag="866" ind1="" ind2=""> 
    <subfield code="a">Available from 1997. </subfield> 
    ****add the following line with "code" attribute in alphabetical order, after "a" and before "s"**** 
    <subfield code="i">DEFAULT</subfield> 
    <subfield code="s">1000000000001224</subfield> 
    <subfield code="t">1000000000000630</subfield> 
    <subfield code="x">EBSCOhost Business Source Complete:Full Text</subfield> 
    <subfield code="z">1000000000125212</subfield> 
    </datafield> 

2) Find ALL and only the record titles (the content of "/collection/record/datafield[@tag='245']/subfield[@code='a']") for which:

a) the value of "/collection/record/datafield[@tag='866']/subfield[@code='x']" is equal to "Elsevier SD Freedom Collection:Full Text", and
b) "/collection/record/datafield[@tag='866']/subfield[@code='a']" is completely absent or, if present, is empty. I.e.:

<datafield tag="866" ind1="" ind2=""> 
    <subfield code="s">1000000000000992</subfield> 
    <subfield code="t">1000000000000473</subfield> 
    <subfield code="x">Elsevier SD Freedom Collection:Full Text</subfield> 
    <subfield code="z">1000000000043233</subfield> 
    </datafield> 

or

<datafield tag="866" ind1="" ind2=""> 
    <subfield code="a"></subfield> 
    <subfield code="s">1000000000000992</subfield> 
    <subfield code="t">1000000000000473</subfield> 
    <subfield code="x">Elsevier SD Freedom Collection:Full Text</subfield> 
    <subfield code="z">1000000000043233</subfield> 
    </datafield> 

Thanks a lot for any reply,

fabianope

+2

Work through the Twig tutorial and write some code. – toolic

+0

What do you want to do with the records you find? Also, what have you tried? –

Answers

1

To get you started, here is a "naive" solution, meaning that it loads the entire document into memory. If you need to, there are ways to avoid this by using twig_roots.

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

my $coll= "colls.xml";

my $tag_nb= 866;
my $new_subfield= { code => 'i', content => 'DEFAULT' };

# the handler is called each time a matching datafield has been fully parsed
my $trigger= qq{datafield[\@tag="$tag_nb"]};

# parse the whole file, then print the updated document to STDOUT
XML::Twig->new(twig_handlers => {
          $trigger => sub{ add_subfield(@_, $new_subfield); }
          },
         pretty_print => 'indented',
        )
       ->parsefile($coll)
       ->print;

# insert the new subfield, then put the subfields back in code order
sub add_subfield {
    my($t, $datafield, $subfield)= @_;
    $datafield->insert_new_elt(first_child => subfield
               => { code => $subfield->{code}, },
                $subfield->{content}
          );
    # sorting on the code attribute moves "i" between "a" and "s"
    $datafield->sort_children_on_att('code');
}
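
For a file too large to load entirely, the twig_roots option mentioned above could be used roughly as follows. This is only a sketch: the name add_default_subfield_streaming is made up for illustration, and it assumes that twig_roots accepts the same attribute condition as the handler trigger. Only the 866 datafields are built in memory, and everything else is copied to the output as it is read.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

# Hypothetical helper (not part of the original answer): adds
# <subfield code="i">DEFAULT</subfield> to every datafield 866 of $file
# and prints the resulting document to the currently selected filehandle.
sub add_default_subfield_streaming {
    my ($file) = @_;

    XML::Twig->new(
        # only elements matching these triggers are built in memory
        twig_roots => {
            'datafield[@tag="866"]' => sub {
                my ($t, $datafield) = @_;
                $datafield->insert_new_elt(
                    first_child => subfield => { code => 'i' }, 'DEFAULT'
                );
                $datafield->sort_children_on_att('code');
                $datafield->print;     # output the modified datafield
                $datafield->delete;    # then release the memory it used
            },
        },
        # the rest of the document is printed unchanged while parsing
        twig_print_outside_roots => 1,
    )->parsefile($file);

    return;
}
```

Calling add_default_subfield_streaming("colls.xml") should write the updated collection to STDOUT. The output is not re-indented here, since the parts outside the roots are printed exactly as they appear in the input.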
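
The second part of the question (listing the titles of records whose 866 datafield has a given subfield x and a missing or empty subfield a) is not covered by the script above. A possible sketch, processing one record at a time, is given here. The name titles_without_coverage and its two parameters are hypothetical, and it assumes that each record holds its title in datafield 245, subfield a.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

# Hypothetical helper: returns the 245$a titles of the records that have a
# datafield 866 whose subfield x equals $provider and whose subfield a is
# absent or contains nothing but whitespace.
sub titles_without_coverage {
    my ($file, $provider) = @_;
    my @titles;

    XML::Twig->new(
        twig_handlers => {
            record => sub {
                my ($t, $record) = @_;

                foreach my $df ($record->children('datafield[@tag="866"]')) {
                    # first_child_text returns '' when there is no such child
                    my $source   = $df->first_child_text('subfield[@code="x"]');
                    my $coverage = $df->first_child_text('subfield[@code="a"]');
                    next unless $source eq $provider && $coverage !~ /\S/;

                    my $df245 = $record->first_child('datafield[@tag="245"]');
                    push @titles, $df245
                        ? $df245->first_child_text('subfield[@code="a"]')
                        : '';
                    last;    # one matching 866 per record is enough
                }
                $t->purge;   # the record is done, free its memory
            },
        },
    )->parsefile($file);

    return @titles;
}
```

It could be called as titles_without_coverage("colls.xml", "Elsevier SD Freedom Collection:Full Text"), printing the returned list one title per line.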