2013-04-09

How to find a string between two markers

I am using Python for text processing. Basically, I want to extract the content between two landmarks in the report shown below. How can I design a regex that returns the text between "Find intent vulnerabilities" and "Print intent summary"? Thanks.

    Find component vulnerabilities
    ****************************************************************************************** 
    [email protected]_for_class[org.apache.cordova.BatteryListener$1*org/apache/cordova/BatteryListener/execute(Ljava/lang/String;Lorg/json/JSONArray;Lorg/apache/cordova/api/CallbackContext;)@70!] is nil 
    [email protected]_for_class[org.apache.cordova.CordovaWebView$1*org/apache/cordova/CordovaWebView/setup()@124!] is nil 
    [email protected]_for_class[org.apache.cordova.NetworkManager$1*org/apache/cordova/NetworkManager/initialize(Lorg/apache/cordova/api/CordovaInterface;Lorg/apache/cordova/CordovaWebView;)@57!] is nil 
    [email protected]_for_class[org.apache.cordova.Device$1*org/apache/cordova/Device/initTelephonyReceiver()@29!] is nil 
    Protected Receiver: org.apache.cordova.BatteryListener$1*org/apache/cordova/BatteryListener/execute(Ljava/lang/String;Lorg/json/JSONArray;Lorg/apache/cordova/api/CallbackContext;)@70!, 0 
    Protected Receiver: org.apache.cordova.CordovaWebView$1*org/apache/cordova/CordovaWebView/setup()@124!, 0 
    Possible Malicious Broadcast Injection: org.apache.cordova.NetworkManager$1*org/apache/cordova/NetworkManager/initialize(Lorg/apache/cordova/api/CordovaInterface;Lorg/apache/cordova/CordovaWebView;)@57, 0 
    Possible Malicious Broadcast Injection: org.apache.cordova.Device$1*org/apache/cordova/Device/initTelephonyReceiver()@29, 0 

    Find intent vulnerabilities 
    ****************************************************************************************** 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebView/showWebPage(Ljava/lang/String;ZZLjava/util/HashMap;)@147, Source Line: 664, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebView/showWebPage(Ljava/lang/String;ZZLjava/util/HashMap;)@201, Source Line: 676, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebViewClient/shouldOverrideUrlLoading(Landroid/webkit/WebView;Ljava/lang/String;)@83, Source Line: 131, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebViewClient/shouldOverrideUrlLoading(Landroid/webkit/WebView;Ljava/lang/String;)@161, Source Line: 142, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebViewClient/shouldOverrideUrlLoading(Landroid/webkit/WebView;Ljava/lang/String;)@239, Source Line: 153, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebViewClient/shouldOverrideUrlLoading(Landroid/webkit/WebView;Ljava/lang/String;)@368, Source Line: 185, hasExtras=true, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/CordovaWebViewClient/shouldOverrideUrlLoading(Landroid/webkit/WebView;Ljava/lang/String;)@544, Source Line: 209, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Service Hijacking: org/apache/cordova/api/LegacyContext/bindService(Landroid/content/Intent;Landroid/content/ServiceConnection;I)@22, Source Line: 142, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Activity Hijacking: org/apache/cordova/api/LegacyContext/startActivity(Landroid/content/Intent;)@20, Source Line: 82, hasExtras=false, hasRead=false, hasWrite=false 
    Possible Service Hijacking: org/apache/cordova/api/LegacyContext/startService(Landroid/content/Intent;)@20, Source Line: 136, hasExtras=false, hasRead=false, hasWrite=false 

    Print intent summary 
    ****************************************************************************************** 
    ************************** 
    org/apache/cordova/Capture/captureAudio()@8 
    invoke-direct {v0,v1},android/content/Intent/<init> ; <init>(Ljava/lang/String;)V 
    Explicit: false 
    Destination Type: 
    Done: false 
    ************************** 

Answers

Do you need to use a regex at all? It would probably be simpler to use str.find or str.index to locate each marker, then use a slice to extract the content between them.

import re
re.search(r"(?s)Find intent vulnerabilities\n(.*?)Print intent summary\n", text).group(1)  # (?s) lets . match newlines
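Note that the one-liner raises AttributeError when nothing matches (re.search returns None), and it expects a newline immediately after each marker. Below is a slightly more tolerant sketch, assuming the report lines may be indented or carry trailing spaces as in the question. `PATTERN`, `intent_section`, and the sample `report` are hypothetical names and data used only for illustration.

```python
import re

# [^\n]* swallows anything left on the marker line (e.g. trailing spaces).
# ^[ \t]* lets the end marker be indented; MULTILINE makes ^ match at line starts.
PATTERN = re.compile(
    r"Find intent vulnerabilities[^\n]*\n(.*?)^[ \t]*Print intent summary",
    re.DOTALL | re.MULTILINE,
)


def intent_section(text):
    """Return the text between the two markers, or None if they are not found."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


# Shortened, made-up stand-in for the real report.
report = (
    "    Find intent vulnerabilities \n"
    "    ****\n"
    "    Possible Activity Hijacking: a\n"
    "\n"
    "    Print intent summary \n"
)
section = intent_section(report)
# Drop blank lines and indentation from the captured block.
lines = [line.strip() for line in section.splitlines() if line.strip()]
print(lines)
```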

A book might help if you are a beginner. Check out "Regular Expressions Cookbook" by Jan Goyvaerts and Steven Levithan. – pinkdawn

Using a regex is overkill here. Why not use index and slicing?

>>> l = len('test123') 
>>> s = '  test123 somestuff here test456 ' 
>>> s.index('test123') 
2 
>>> s.index('test456') 
25 
>>> s[2+l:25] 
' somestuff here ' 