2013-06-10 10 views

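Regex match between words in Python

I'm new to Python and am trying to construct a regex that matches between specific words. I only need to extract all the data for ip=1.0.8.0 statistic=rtt.mean predictions from the following: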

ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700,nyc-inap:32.537077,iad-cgnt:0.000000,hkg-pccw:98.157281,ord-tata:6.058292,sjc-l3:57.089664,nyc-cgnt:36.489616,pvg-cu2:1039.978803,bgl-rel:115.671650,nyc-bgp:94.454690,pvg-cu1:377.429628,las-level3:0.000000,nyc-tgl:119.197070,atl-inap:42.021698 

ip=1.0.8.0 statistic=rtt.match_length predictions=iad-mci:13.000000,ord-cgnt:16.000000,nyc-inap:16.000000,iad-cgnt:20.000000,hkg-pccw:16.000000,ord-tata:16.000000,sjc-l3:16.000000,nyc-cgnt:16.000000,pvg-cu2:16.000000,bgl-rel:13.000000,nyc-bgp:16.000000,pvg-cu1:16.000000,las-level3:20.000000,nyc-tgl:16.000000,atl-inap:16.000000 

ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400 

ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424,nyc-inap:4266.160056,iad-cgnt:49566.161227,hkg-pccw:5166.165995,ord-tata:3066.158230,sjc-l3:5466.160068,nyc-cgnt:3366.161192,pvg-cu2:5166.160410,bgl-rel:1566.160768,nyc-bgp:3666.159675,pvg-cu1:2766.160713,las-level3:251466.160789,nyc-tgl:3966.159966,atl-inap:4866.167164 

My REST call returned a long string in the format shown above. What should the regular expression look like? Should I use re.findall or re.match?

Answers


I'd suggest not using a regex here. Plain string methods are not only faster but, in most cases, more robust as well.

In [10]: text = '''ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700,nyc-inap:32.537077,iad-cgnt:0.000000,hkg-pccw:98.157281,ord-tata:6.058292,sjc-l3:57.089664,nyc-cgnt:36.489616,pvg-cu2:1039.978803,bgl-rel:115.671650,nyc-bgp:94.454690,pvg-cu1:377.429628,las-level3:0.000000,nyc-tgl:119.197070,atl-inap:42.021698 
ip=1.0.8.0 statistic=rtt.match_length predictions=iad-mci:13.000000,ord-cgnt:16.000000,nyc-inap:16.000000,iad-cgnt:20.000000,hkg-pccw:16.000000,ord-tata:16.000000,sjc-l3:16.000000,nyc-cgnt:16.000000,pvg-cu2:16.000000,bgl-rel:13.000000,nyc-bgp:16.000000,pvg-cu1:16.000000,las-level3:20.000000,nyc-tgl:16.000000,atl-inap:16.000000 
ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400 
ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424,nyc-inap:4266.160056,iad-cgnt:49566.161227,hkg-pccw:5166.165995,ord-tata:3066.158230,sjc-l3:5466.160068,nyc-cgnt:3366.161192,pvg-cu2:5166.160410,bgl-rel:1566.160768,nyc-bgp:3666.159675,pvg-cu1:2766.160713,las-level3:251466.160789,nyc-tgl:3966.159966,atl-inap:4866.167164''' 

In [13]: ip = '1.0.8.0' 

In [14]: result = filter(lambda s: s.startswith('ip={0} statistic=rtt.mean predictions'.format(ip)), text.split('\n')) 

In [15]: list(result) 
Out[15]: ['ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400'] 
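Building on the startswith filter above, here is a minimal sketch of how the matching line could be split further into a dictionary of predictions, still without a regex. The function name extract_predictions and its parameters are hypothetical, and the field layout (ip=... statistic=... predictions=site:value,...) is assumed from the sample data rather than from any documented format.

```python
def extract_predictions(text, ip, statistic="rtt.mean"):
    """Return {site: value} for the first line matching ip and statistic.

    Hypothetical helper; assumes each line looks like
    'ip=<ip> statistic=<name> predictions=<site>:<float>,...'.
    """
    prefix = "ip={0} statistic={1} predictions=".format(ip, statistic)
    for line in text.splitlines():
        line = line.strip()  # the sample lines carry trailing spaces
        if line.startswith(prefix):
            pairs = line[len(prefix):].split(",")
            # Each pair is 'site:value'; split on the last colon only.
            return {site: float(value)
                    for site, value in (p.rsplit(":", 1) for p in pairs)}
    return {}


# Shortened excerpt of the response shown above.
sample = (
    "ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700\n"
    "ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775\n"
)
start_ip = "1.0.8.0"  # any variable can be passed in place of a literal
print(extract_predictions(sample, start_ip))
```

Because the IP is an ordinary function argument, it can come from any variable, which is what the follow-up comment asks about.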
Thanks. :) My problem, though, is that the ip isn't constant; it changes on every call. How can I use a variable in the filter, say start_ip instead of a fixed ip? Could you show me how to do that?

@wannabe_geek I've updated my answer. – kirelagin

Thank you so much for your help!

>>> text = '''ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700,nyc-inap:32.537077,iad-cgnt:0.000000,hkg-pccw:98.157281,ord-tata:6.058292,sjc-l3:57.089664,nyc-cgnt:36.489616,pvg-cu2:1039.978803,bgl-rel:115.671650,nyc-bgp:94.454690,pvg-cu1:377.429628,las-level3:0.000000,nyc-tgl:119.197070,atl-inap:42.021698 

ip=1.0.8.0 statistic=rtt.match_length predictions=iad-mci:13.000000,ord-cgnt:16.000000,nyc-inap:16.000000,iad-cgnt:20.000000,hkg-pccw:16.000000,ord-tata:16.000000,sjc-l3:16.000000,nyc-cgnt:16.000000,pvg-cu2:16.000000,bgl-rel:13.000000,nyc-bgp:16.000000,pvg-cu1:16.000000,las-level3:20.000000,nyc-tgl:16.000000,atl-inap:16.000000 

ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400 

ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424,nyc-inap:4266.160056,iad-cgnt:49566.161227,hkg-pccw:5166.165995,ord-tata:3066.158230,sjc-l3:5466.160068,nyc-cgnt:3366.161192,pvg-cu2:5166.160410,bgl-rel:1566.160768,nyc-bgp:3666.159675,pvg-cu1:2766.160713,las-level3:251466.160789,nyc-tgl:3966.159966,atl-inap:4866.167164''' 
>>> import re 
>>> re.findall(r'ip=1\.0\.8\.0 statistic=rtt\.mean predictions.*', text) 
['ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400'] 
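On the re.findall versus re.match part of the question: re.match only tries to match at the very beginning of the string, so it finds nothing when the wanted line is not the first one, whereas re.search and re.findall scan the whole string. A small sketch, using a shortened excerpt of the data and escaping the dots so that they match literally:

```python
import re

# Shortened excerpt of the response shown above.
text = (
    "ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204\n"
    "ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084\n"
)
pattern = r"ip=1\.0\.8\.0 statistic=rtt\.mean predictions.*"

# re.match anchors at position 0, where the rtt.std_dev line starts.
at_start = re.match(pattern, text)

# re.search and re.findall look for the pattern anywhere in the string.
first = re.search(pattern, text)
every = re.findall(pattern, text)

print(at_start)        # None, because the first line is rtt.std_dev
print(first.group(0))  # the rtt.mean line
print(every)           # a list holding that same line
```

Since "." does not match a newline by default, the trailing ".*" stops at the end of the matched line.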
What if my IP isn't hard-coded but is held in a variable such as start_ip that changes every time? Can I insert a variable into the regex?

@wannabe_geek Yes, a regex is just an ordinary string, so you can substitute a variable into it, for example: r'ip={0}.*'.format('1.0.8.0') – jamylak
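The substitution described in this comment could be sketched as follows. Two things here are additions rather than part of the original comment: re.escape, so that the dots in the IP match literally instead of acting as wildcards, and re.MULTILINE, so that each line is matched on its own. The helper name find_lines is hypothetical, and the second line of the test string is made up purely to show the effect of escaping.

```python
import re


def find_lines(text, start_ip, statistic="rtt.mean"):
    """Return every line for the given IP and statistic (hypothetical helper)."""
    # re.escape turns '1.0.8.0' into '1\.0\.8\.0' so '.' is not a wildcard.
    pattern = r"^ip={0} statistic={1} predictions=.*$".format(
        re.escape(start_ip), re.escape(statistic))
    # re.MULTILINE lets ^ and $ match at each line boundary.
    return re.findall(pattern, text, flags=re.MULTILINE)


text = (
    "ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084\n"
    # Made-up line: an unescaped '1.0.8.0' pattern would match it too.
    "ip=1a0b8c0 statistic=rtt.mean predictions=iad-mci:1.000000\n"
)
start_ip = "1.0.8.0"
print(find_lines(text, start_ip))
```

With the escaping in place only the first line is returned; without it, the made-up second line would slip through as well.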
