2016-11-21 4 views
<div id="ext-gen392" class="x-panel-body"> 
    <div class="identify multiline"> 
     <div class="item"> 
      <span class="larger-text">1566 GREENE AVENUE, Brooklyn 11237</span> 
     </div> 
     <div class="item" style="display:none"> 
      <span class="label">Alternate address from NYC Dept of City Planning:</span> 
      <br>1566 GREENE AVENUE 
     </div> 
     <div class="item"> 
      <span style="background-color:#FFE094;" class="legend-color"></span><span class="label" style="font-style: italic;">&nbsp;Residential: Multi-Family Walk-up</span> 
     </div> 
     <div class="item" style="clear:both;"> 
      <span class="label">Owner:</span> BAEZ, IGNACIO 
     </div> 
     <div class="item"> 
      <span class="label">Block:</span> 3303 <span class="label">Lot:</span> 22 
     </div> 
     <div class="item"> 
      <span class="label">Property Characteristics:</span> 
      <ul style="list-style-type: none; padding-left: 0;"> 
       <li><span class="label">Lot Area:</span> 1,950 sq ft (19.5' x 100')</li> 
       <li><span class="label"># of Buildings:</span> 1 <span class="label">Year 
        built:</span> 1920 (Year built is an estimate)</li> 
       <li><span class="label">Building frontage:</span> 19.5' <span class="faded-text">(Building frontage along the street measured in feet.)</span></li> 
       <li><span class="label"># of floors:</span> 3 <span class="label">Building 
        Area:</span> 3,303 sq ft</li> 
       <li><span class="label">Total Units:</span> 3 <span class="label"> 
        Residential Units:</span> 3</li> 
       <li><span class="label">Primary zoning:</span> R6 <span class="label">Commercial Overlay:</span> 
        None</li> 
       <li><span class="label">Floor Area Ratio:</span> 1.69 
        <br> 
        <span class="label">Max. Allowable Residential FAR:</span> 2.43 
        <br> 
        <span class="label">Max. Allowable Commercial FAR:</span> 0 
        <br> 
        <span class="label">Max. Allowable Facility FAR:</span> 4.8 
        <!--REMOVED MAX FAR UNTIL WE FIGURE OUT HOW TO ADD DIFFT FAR VARS FROM PLUTO13--> 
        <!--<span class="label">Max. FAR:</span> 0 --> 
        <span class="faded-text"> 
         <br> 
         The Maximum Allowable Floor Area Ratios are exclusive of bonuses for plazas, plaza-connected open areas, arcades or other amenities. 
         <br> 
         FAR may depend on street widths or other characteristics. Contact <a href="http://www1.nyc.gov/site/planning/zoning/about-zoning.page" target="_blank">City Planning Dept.</a> for latest information.</span></li> 
      </ul> 
     </div> 
     <div class="item"> 
      <span class="label">MORE INFO:</span> 
      <ul> 
       <li><span class="label">Zoning Map#:</span> <a href="http://www1.nyc.gov/assets/planning/download/pdf/zoning/zoning-maps/map13b.pdf" target="_blank"> 
        13b</a> (<a href="http://www1.nyc.gov/site/planning/zoning/zoning-maps.page" target="_blank">how to read</a> NYC zoning maps)</li> 
       <li><span class="label">Historical Zoning Maps:</span> <a href="http://www1.nyc.gov/assets/planning/download/pdf/zoning/zoning-maps/historical-zoning-maps/maps13b.pdf" target="_blank"> 
        13b</a></li> 

       <li><a href="http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?boro=3&amp;block=3303&amp;lot=22" target="_blank">NYC Dept. of Buildings</a></li> 


       <li><a href="http://a836-acris.nyc.gov/bblsearch/bblsearch.asp?borough=3&amp;block=3303&amp;lot=22" target="_blank">Property transaction records</a> (<b>NB:</b> buildings w/condos may not show transaction results)</li> 

       <li><a href="http://webapps.nyc.gov:8084/CICS/fin1/find001i?FFUNC=C&amp;FBORO=3&amp;FBLOCK=3303&amp;FLOT=22" target="_blank">NYC Dept. of Finance Assessment Roll</a></li> 
       <li><a href="https://hpdonline.hpdnyc.org/HPDonline/provide_address.aspx" target="_blank">NYC HPD data</a></li><!--?p1=3&p2=street number =&p3=street name--> 
       <li><a href="http://gis.nyc.gov/doitt/nycitymap/template?z=8&amp;p=1008264,195724&amp;a=ZOLA&amp;c=ZOLA&amp;s=l:Brooklyn,3303,22,PLUTO" target="_blank">NYC Planning's ZoLa application</a></li> <!--http://gis.nyc.gov/doitt/nycitymap/template?z=8&p=988783,211983&a=ZOLA&c=ZOLA&s=a:365,FIFTH+AVENUE,MANHATTAN--> 
       <li><a href="http://maps.nyc.gov/taxmap/map.htm?searchType=BblSearch&amp;featureTypeName=EVERY_BBL&amp;featureName=3033030022" target="_blank">NYC Digital Tax Map</a></li> 
<!--    <li><a href="http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?boro=3&block=3303&lot=22" target="_blank">NYC Dept. of Buildings</a></li> 
       <li><a href="http://a836-acris.nyc.gov/bblsearch/bblsearch.asp?borough=3&block=3303&lot=22" target="_blank">Property transaction records</a></li> 
       <li><a href="http://webapps.nyc.gov:8084/CICS/fin1/find001i?FFUNC=C&FBORO=3&FBLOCK=3303&FLOT=22" target="_blank">NYC Dept. of Finance Assessment Roll</a></li> 
       <li><a href="http://gis.nyc.gov/taxmap/map.htm?searchType=FeatureSearch&featureTypeName=TAX_LOT_POLYGON&featureName=3033030022" target="_blank">NYC Digital Tax Map</a></li>--> 
       <li><a href="http://www.nyc.gov/html/dcp/html/subcats/zoning.shtml" target="_blank"> 
        NYC zoning guide</a></li> 
       <li><a href="http://www.oasisnyc.net/watershed/watershed.aspx" target="_blank">NYC 
        Watershed Resources</a></li> 
      </ul> 
     </div> 
     <div class="item"> 
      <span class="label">OASIS shortcut to this property:</span> 
      <br> 
      <a href="http://www.oasisnyc.net/map.aspx?zoomto=lot:3033030022">http://www.oasisnyc.net/map.aspx?zoomto=lot:3033030022</a> 
     </div> 
     <div class="item"> 
      <span class="faded-text">Source: MapPLUTO Tax 
       Block &amp; Tax Lot files from the New York City Department of City Planning, 
       2016 (ver. 16v1).</span> 
     </div> 
<!--  <div class="item" style="width: 95%; margin: 10px 0 5px 4px;"> 
      <span style="display:block;padding: 1px; color: #000066; background-color: #dddddd; border-bottom: solid 1px #aabbdd;"> 
       NYC Department of City Planning Census Factfinder 
      </span> 
      Find all census tracts within 
      <select id="selTaxLotRadius" style="font-size:1.1em" > 
       <option>0.25</option> 
       <option>0.5</option> 
       <option>1</option> 
      </select> 
      mile(s) 
      <input type="button" value="Go" style="font-size:1.1em;font-weight:bold;" onclick="var sel=document.getElementById('selTaxLotRadius');CUR.IdentifyLotTemplate.goToNycFF('1566 GREENE AVENUE','3', sel.options[sel.selectedIndex].value);" /> 
     </div>--> 
<!--  <div class="item"> 
      <div style="width: 95%; margin: 10px 0 5px 4px;"> 
       <div style="padding: 1px; color: #000066; background-color: #dddddd; border-bottom: solid 1px #aabbdd;"> 
        <a href="http://local.yahoo.com/" style="text-decoration: none;" 
         target="newWin"><span style="color: #ff0000; font-weight: bold;">YAHOO!</span> <span style="color: #000066;"> 
          Local</span></a> search results for this 
        address:</div> 
       <div style="padding-left: 4px;"> 

        <div style="margin-top: 4px; color: #888888; font-style: italic;"> 
         &nbsp;Know of something that's missing? <a href="http://listings.local.yahoo.com/csubmit/index.php" 
          target="newWin">Add it to YAHOO!</a></div> 
       </div> 
      </div> 
     </div>--> 
    </div> 
</div> 

Scraping data from a website using IMPORTXML

I am scraping a website to collect data about properties. I am trying to get the owner name, and eventually every other text value that follows a <span class="label">. This is my query expression: normalize-space(//span[(@class='label') and contains(., 'Owner:')]/following-sibling::text()). When I evaluate the expression with FirePath it returns the correct string, but in Google Sheets the returned value is empty. Any suggestions?
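For reference, the node this expression targets is the text that trails the label span. The following is a minimal Python sketch of that selection, assuming a well-formed copy of the "Owner" item from the markup above. ElementTree is used purely for illustration (it is not what IMPORTXML runs), and `text_after_label` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the "Owner" item from the markup above (assumption:
# the real page may not parse as XML without cleanup).
FRAGMENT = (
    '<div class="item" style="clear:both;"> '
    '<span class="label">Owner:</span> BAEZ, IGNACIO '
    "</div>"
)


def text_after_label(fragment, label):
    """Return the whitespace-normalized text that follows a label span."""
    root = ET.fromstring(fragment)
    for span in root.iter("span"):
        span_text = "".join(span.itertext())
        if span.get("class") == "label" and label in span_text:
            # ElementTree stores the text following an element in .tail,
            # which plays the role of following-sibling::text() here.
            return " ".join((span.tail or "").split())
    return ""


owner = text_after_label(FRAGMENT, "Owner:")
print(owner)
```

The whitespace collapsing with `split()` and `join()` stands in for `normalize-space()`.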


The XPath query looks right; I can confirm that with xmllint in HTML mode. But the page is not valid XML... Can Google Sheets handle HTML that is not XML? – Markus


Yes, Google Sheets can handle HTML with the IMPORTXML function. –


Can you post the URL of the page? – Markus

Answers


You can do this by modifying your query URL a bit. For example, I found that the endpoint with the raw data you want looks like this:

http://www.oasisnyc.net/service.svc/lot/3033030022?layerstoselect=

You can then convert your original URL (in A1) to the correct endpoint with this formula:

="http://www.oasisnyc.net/service.svc/lot/"&REGEXEXTRACT(A1,"lot:(\d+)")&"?layerstoselect="

If you pull in the data with =transpose(IMPORTDATA(B1)), you will see all the fields in a column. Depending on how you want to arrange the data, you can then use ARRAYFORMULA with a bit of regex to clean the fields up and split them apart as needed. For example, if you want the headers and the data in separate columns, you can enter this:

=arrayformula(regexreplace({iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),"(\w+):"))),iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),":""?(.*)""?")))},"""","")) 


If you want the result laid out in rows instead, wrap the whole thing in TRANSPOSE:

=transpose(arrayformula(regexreplace({iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),"(\w+):"))),iferror(ARRAYFORMULA(REGEXEXTRACT(transpose(IMPORTDATA(B1)),":""?(.*)""?")))},"""",""))) 
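The same two steps can be sketched outside the spreadsheet. This is a minimal Python illustration, not a tested client for the OASIS service: the function names are hypothetical, and `sample_cells` is made-up stand-in data, because the actual response format of the endpoint is not shown here. The regular expressions are the ones used in the formulas above.

```python
import re

ENDPOINT = "http://www.oasisnyc.net/service.svc/lot/{bbl}?layerstoselect="


def endpoint_from_shortcut(shortcut_url):
    """Mirror of the REGEXEXTRACT formula: pull the digits after 'lot:'."""
    match = re.search(r"lot:(\d+)", shortcut_url)
    if match is None:
        raise ValueError("no 'lot:<digits>' part found in URL")
    return ENDPOINT.format(bbl=match.group(1))


def split_field(cell):
    """Mirror of the ARRAYFORMULA: split one cell into (header, value)."""
    key = re.search(r"(\w+):", cell)
    value = re.search(r':"?(.*)"?', cell)
    # Like IFERROR in the formula, fall back to an empty string on no match.
    header = key.group(1) if key else ""
    text = value.group(1) if value else ""
    # Like REGEXREPLACE(..., """", ""), drop any remaining double quotes.
    return header.replace('"', ""), text.replace('"', "")


url = endpoint_from_shortcut(
    "http://www.oasisnyc.net/map.aspx?zoomto=lot:3033030022"
)

# Hypothetical cells, standing in for what IMPORTDATA might return.
sample_cells = ["Block:3303", 'Lot:"22"']
pairs = [split_field(cell) for cell in sample_cells]
print(url)
print(pairs)
```

One caveat: IMPORTDATA splits its input on commas, so a value that itself contains a comma may end up spread over two cells. Neither the formulas nor this sketch tries to rejoin such values.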



Nice detective work. Interestingly, the owner name does not seem to be complete in your table. – Markus
