Improving the performance of Perl code

The following program runs correctly, but it takes a practically infinite amount of time on large data.
INPUT.txt. In reality I have up to 1000 lines, each with anywhere from 1 to 100 elements.

10 
6 
9 
7 
9 11 
3 4 
1 9 
5 12 
1 11 
5 11 
9 12 
10 5 8 
7 4 1 
and so on... 
last: 1 2 3 4 5 6 7 . . .any number of elements (100 in my case).

matrix.txt (TAB DELIMITED)

1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 1 1 1 1 1 
1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 0 1 0 0 1 1 
1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 
1 0 1 1 1 1 0 1 1 1 1 0 1 1 0 1 1 0 1 1 1 1 0 1 0 1 1 1 
1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 
1 0 1 1 1 1 0 1 1 1 1 0 1 1 0 0 1 0 1 1 1 1 1 1 0 0 1 1 
1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 0 1 0 1 1 1 1 1 1 1 0 
and so on... up to 25000 lines

OUTPUT.TXT. Each value is the sum of the elements of matrix.txt taken at the index positions given on one line of INPUT.txt.
The actual sums may differ from this hypothetical sample output.

1 1 1 1 1 0 1 1 1 2 2 2 2 2 . . .columns upto number of lines in input.txt 
1 1 1 1 1 1 1 1 1 2 2 2 2 2 
1 0 0 1 1 1 1 1 1 2 2 2 2 2 
1 1 1 0 1 0 0 1 1 2 2 2 2 2 
1 1 1 1 1 1 1 1 0 2 2 2 2 2 
1 1 1 0 1 0 1 1 1 1 2 2 2 2 
1 1 1 1 1 0 1 1 1 2 2 2 2 2 
1 1 1 1 1 0 0 1 1 1 2 2 2 2 
0 1 1 1 1 1 1 1 0 2 2 2 2 2
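
To make the description concrete, here is a minimal sketch on made-up data (the three index lines and the single matrix row are illustrative, not taken from the real files). Each output column is the sum of the matrix-row elements at the 1-based positions listed on the corresponding INPUT.txt line. The helper name row_sums is hypothetical.

```perl
use strict;
use warnings;
use List::Util 'sum';

# Hypothetical helper: for one matrix row, return one sum per index line.
# Index lines are 1-based, so subtract 1 to address Perl arrays.
sub row_sums {
    my ($row, $index_lines) = @_;
    return map {
        my @positions = map { $_ - 1 } @$_;
        sum(@{$row}[@positions]);
    } @$index_lines;
}

# Made-up data: three index lines and one matrix row of six elements.
my @index_lines = ([1], [2, 3], [1, 5, 6]);
my @row         = (1, 0, 1, 1, 0, 1);

print join(' ', row_sums(\@row, \@index_lines)), "\n";    # prints "1 1 2"
```

With the real files, one such line of sums is produced for every row of matrix.txt, with one column per line of INPUT.txt.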

Code: have a look at the code; it should help you understand what is going on.

use List::Util 'sum'; 
my @indexes = do { 
    open my $fh, '<', "INPUT.txt"; 
    map { [map {$_ - 1} split ' '] } <$fh> 
}; 
open my $infh, '<', "matrix.txt"; 
open OUT, '>', "output.txt"; 
while (<$infh>) { 
    my @vals = split ' '; 
    print OUT join(' ', map {sum(@vals[@$_])} @indexes), "\n"; 
} 
close OUT;

Is there another way to do this in less time?

File availability:
INPUT : https://www.dropbox.com/s/48ikhnfs7gzk8vm/input.txt?dl=0
MATRIX : https://www.dropbox.com/s/ebxi608eday9z1e/matrix.txt?dl=0

Source

2014-11-26 BioDeveloper

Is there a way you could drop a link to your input.txt (and matrix.txt)? – osirisgothra

@osirisgothra Link to the input: https://www.dropbox.com/s/48ikhnfs7gzk8vm/input.txt?dl=0 Matrix: https://www.dropbox.com/s/ebxi608eday9z1e/matrix.txt?dl=0 – BioDeveloper

Have you profiled your program? If not, try [Devel::NYTProf](https://metacpan.org/pod/Devel::NYTProf) to see where the bottleneck is. Also, your problem looks like a proper job for [PDL](https://metacpan.org/pod/PDL). It is hard to read, but if you expect similar tasks to come up it will be worth your while. It has 'Module::Compile' as a dependency, and the most recent version of that module fails its tests on many systems. To get a version that works on most systems, install it with something like 'cpanm Module::Compile@0.30'. –

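Do you have a better idea of which 'bit' is costing you performance?

The reason I ask: there is a sort of holy trinity of performance bottlenecks:

CPU - the actual work being done on the processor.
'Active' memory - your memory profile versus the size of available RAM (and how much you are swapping).
IO - transferring data to and from disk.

You can often trade one off against another. For example, you can increase CPU efficiency by generating a lookup table, at the cost of memory.

Operations like map are the kind of thing I would start looking at closely. Things like map/sort/grep are very powerful, but may well be using a less than optimal algorithm.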
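
One way to check whether the map/sum expression really is the hot spot is to time it against an explicit loop with the core Benchmark module. This is only a sketch on random made-up data (25 index lines over a 100-element row); sum_with_map and sum_with_loop are hypothetical names, and no particular outcome is implied.

```perl
use strict;
use warnings;
use List::Util 'sum';
use Benchmark 'cmpthese';

# Made-up stand-ins for one matrix row and the parsed index lines.
my @vals    = map { int rand 2 } 1 .. 100;
my @indexes = map { [ map { int rand 100 } 1 .. 1 + int rand 10 ] } 1 .. 25;

# The original formulation: array slice plus List::Util::sum.
sub sum_with_map {
    return join ' ', map { sum(@vals[@$_]) } @indexes;
}

# The same result with explicit loops and no intermediate slice.
sub sum_with_loop {
    my @out;
    for my $idx (@indexes) {
        my $total = 0;
        $total += $vals[$_] for @$idx;
        push @out, $total;
    }
    return join ' ', @out;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { map_sum => \&sum_with_map, loop => \&sum_with_loop });
```

If the two variants come out close, the cost is more likely in reading and splitting the lines than in the summing itself, and a profiler run would be the next step.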

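If you are CPU bound, you can use multithreading or forking to get more CPU. On the face of it, processing 'matrix.txt' has no dependencies between lines (each line is standalone), so it looks like a good candidate for parallel processing.

I would be thinking of using Parallel::ForkManager wrapped around the while loop. The downside is that the output order is no longer deterministic, which needs addressing. The following will work, but you will get a random output order, which is almost certainly not what you want: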

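use strict;
use warnings;

use List::Util 'sum';
use Fcntl qw(:flock);

use Parallel::ForkManager;

# Run at most 10 child processes at any one time.
my $mgr = Parallel::ForkManager->new(10);

# Parse INPUT.txt into a list of 0-based index lists.
my @indexes = do {
    open my $fh, '<', "INPUT.txt" or die "INPUT.txt: $!";
    map {
        [ map { $_ - 1 } split ' ' ]
    } <$fh>;
};
open my $infh, '<', "matrix.txt" or die "matrix.txt: $!";
open my $out_fh, '>', "output.txt" or die "output.txt: $!";
while (<$infh>) {
    $mgr->start and next;    # parent: move on to the next line
    my @vals = split ' ';
    # Concatenate the newline; a comma here would silently drop it.
    my $output_line = join(' ', map { sum(@vals[@$_]) } @indexes) . "\n";
    flock($out_fh, LOCK_EX);
    print {$out_fh} $output_line;
    $mgr->finish;            # child: done with this line, exit
}
$mgr->wait_all_children;     # parent: wait for every child to finish
close $out_fh;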

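Note:

This can start up to 10 children at once, which means up to 10 processors are used at the same time for your 'join/map/sum' operation.

(Of course, if you are IO bound this will not help.) For example, using threads: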

use warnings; 
use strict; 

use List::Util 'sum'; 

use threads; 
use Thread::Queue; 

my $line_q = Thread::Queue -> new(); 
my $output_q = Thread::Queue -> new(); 

# Parse INPUT.txt into a list of 0-based index lists.
my @indexes = do {
    open my $fh, '<', "INPUT.txt" or die "INPUT.txt: $!";
    map {
        [ map { $_ - 1 } split ' ' ]
    } <$fh>;
};


sub generate_output {
    while (my $item = $line_q -> dequeue()) {
        print "processing $item \n";
        my ($line_num, @vals) = split (' ', $item);
        $output_q -> enqueue($line_num.":". join(' ', map {sum(@vals[@$_])} @indexes). "\n");
    }
}

sub coalesce_output {
    open my $out_fh, '>', "output.txt" or die "output.txt: $!";
    my $current_line = 1;    # $. starts counting at 1
    my %lines;
    while (my $item = $output_q -> dequeue) {
        my ($line_num, $output_line) = split (":", $item, 2);
        if ($line_num == $current_line) {    # compare, do not assign
            print {$out_fh} $output_line;
            $current_line++;
        }
        else {
            # Arrived early: hold it until its turn comes.
            $lines{$line_num} = $output_line;
        }
        # Flush any buffered lines that are now next in sequence.
        while (defined $lines{$current_line}) {
            print {$out_fh} $lines{$current_line};
            delete $lines{$current_line};
            $current_line++;
        }
    }
}




open my $infh, '<', "matrix.txt"; 

my @workers; 
for (1..10) { 
    push (@workers, threads -> create (\&generate_output)); 
} 

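my $coalescer = threads -> create (\&coalesce_output);

while (my $line = <$infh>) {
    # Prefix each row with its line number, separated by whitespace.
    $line_q -> enqueue ("$. $line");
}

# Signal the workers that no more input is coming, then wait for them.
$line_q -> end();
foreach my $thr (@workers) {
    $thr -> join();
}

# All results are queued; let the output thread drain them and finish.
$output_q -> end();
$coalescer -> join();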

For synchronising IO, however, I find threading a very good approach. Spin up 10 workers to do the sum work in parallel, and a single 'output' thread to write the data in the correct order.

So something like:

use warnings; 
use strict; 

use List::Util 'sum'; 

use threads; 
use Thread::Queue; 

my $line_q = Thread::Queue->new(); 
my $output_q = Thread::Queue->new(); 

my @indexes = do { 
    open my $fh, '<', "INPUT.txt"; 
    map { 
     [ map { $_ - 1 } split ' ' ] 
    } <$fh>; 
}; 


sub generate_output { 
    while (my $item = $line_q->dequeue()) { 

     #print "processing $item \n"; 
     my ($line_num, @vals) = split(' ', $item); 
     $output_q->enqueue($line_num . ":" 
       . join(' ', map { sum(@vals[@$_]) } @indexes) 
       . "\n"); 
    } 
} 

sub coalesce_output { 
    open my $out_fh, '>', "output.txt"; 
    my $current_line = 1; 
    my %lines; 
    while (my $item = $output_q->dequeue) { 

     my ($line_num, $output_line) = split(":", $item); 

     #  print "Got $line_num ($current_line) $item\n"; 
     if ($line_num == $current_line) {

      # print "printing $current_line = $output_line\n"; 
      print {$out_fh} $output_line; 
      $current_line++; 
     } 
     else { 
      $lines{$line_num} = $output_line; 
     } 
     while (defined $lines{$current_line}) { 

    # print "printing (while) $current_line = $lines{$current_line}\n"; 
      print {$out_fh} $lines{$current_line}; 
      delete $lines{$current_line}; 
      $current_line++; 
     } 
    } 
} 


open my $infh, '<', "matrix.txt"; 

my @workers; 
for (1 .. 40) { 
    push(@workers, threads->create(\&generate_output)); 
} 

threads->create(\&coalesce_output); 

while (my $line = <$infh>) { 
    $line_q->enqueue("$. $line"); 
} 

$line_q->end(); 
foreach my $thr (@workers) { 
    $thr->join(); 
} 

$output_q->end(); 
foreach my $thr (threads -> list) { $thr -> join(); }

Which produces (and more):

1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
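
The reordering step in coalesce_output can be looked at in isolation. The sketch below is single-threaded and uses a hypothetical helper, reorder, fed with made-up "line_number:text" items that arrive out of order. It buffers anything that arrives early and releases lines strictly in sequence, which is the behaviour the output thread relies on.

```perl
use strict;
use warnings;

# Hypothetical helper: take "N:text" items in arrival order and return the
# texts in line-number order, starting from line 1.
sub reorder {
    my @items = @_;
    my $current_line = 1;
    my (%pending, @ordered);
    for my $item (@items) {
        my ($line_num, $text) = split /:/, $item, 2;
        $pending{$line_num} = $text;
        # Release every buffered line that is now next in sequence.
        while (defined $pending{$current_line}) {
            push @ordered, delete $pending{$current_line};
            $current_line++;
        }
    }
    return @ordered;
}

# Made-up arrival order, as if several workers finished at different times.
print "$_\n" for reorder('2:b', '3:c', '1:a', '5:e', '4:d');
```

The buffer only ever holds lines that finished ahead of their turn, so its size is bounded by how far the fastest worker runs ahead of the slowest one.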

How much this helps depends somewhat on what your limiting factor is. A quick and dirty test gives:

Started at 1417007048, 
finished at 1417007064 
Took:16s

versus:

Started at 1417007118 
finished at 1417007161 
Took:43s
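
Timings in this form can be produced with nothing more than the built-in time function. A minimal sketch, where timed_run is a hypothetical wrapper and the code reference stands in for whichever version of the program is being measured:

```perl
use strict;
use warnings;

# Hypothetical wrapper: run a code reference and report wall-clock seconds.
sub timed_run {
    my ($code) = @_;
    my $start = time;
    $code->();
    my $end = time;
    print "Started at $start\nfinished at $end\nTook:", $end - $start, "s\n";
    return $end - $start;
}

# Stand-in workload; replace with the real processing loop.
timed_run(sub { my $x = 0; $x += $_ for 1 .. 1_000_000; });
```

The built-in time only has one-second resolution; for shorter runs the core Time::HiRes module provides a drop-in time with sub-second precision.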


Source

2014-11-26 11:38:22 Sobrique

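One thing to try is the math- and matrix-oriented modules on CPAN. Some of them use native code (C-based Perl extensions), which should be faster. Here is a (dated) primer on them:

http://www.perlmonks.org/?node_id=284324

Source

2014-11-26 11:41:11 Doggerel

Here is a PDL version that exploits the fact that you are basically doing matrix multiplication with selection vectors (I have not thoroughly verified all of the output). This version assumes the matrix always has 100 elements per row. If that is not the case, the zeroes call must be changed accordingly.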

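For inputs of size (1,000 x 100) and (25,000 x 100), it is about 2x faster. Reading the whole matrix into memory and then enabling parallel processing speeds up the computation itself, but the overall runtime comes out about the same. In case you are curious about the approximate runtime floor, an optimized C version is about 4x faster than the original version. All timings are of course tied to my machine, but I would expect similar ratios on most computers. I also used this as an excuse to learn PDL, so I make no claim that this is optimal PDL.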

use strict; 
use warnings; 

use PDL; 

my $indexes = PDL::long(do { 
    open(my $fh, '<', 'INPUT.txt') or die; 
    # The first map is if you allow duplicates in the index list (i.e. 2 2 is a valid row) 
    # map { my $p = zeroes(100); $p->slice($_)++ foreach (map {$_ - 1} split /\t/); $p } <$fh> 
    map { zeroes(100)->dice([map {$_ - 1} split /\t/])++ } <$fh> 
})->xchg(0, 1); 

open(my $input, '<', 'matrix.txt') or die; 
open(my $output, '>', 'output.txt') or die; 

while(<$input>) { 
    my $vals = PDL::long(split(/\t/)); 
    print $output join("\t", ($vals x $indexes)->list) . "\n"; 
}
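
The idea behind the PDL version can be shown without PDL. In this plain-Perl sketch (made-up data, hypothetical helper names), an index line becomes a 0/1 selection vector as long as a matrix row, and the dot product of the row with that vector equals the sum of the selected elements. Stacking the vectors as columns turns the per-row work into one vector-matrix product, which is the kind of operation a matrix library can run in native code.

```perl
use strict;
use warnings;
use List::Util 'sum';

# Hypothetical helper: turn 1-based positions into a 0/1 vector of length $n.
sub selection_vector {
    my ($n, @positions) = @_;
    my @vec = (0) x $n;
    $vec[$_ - 1] = 1 for @positions;
    return \@vec;
}

# Dot product of two equal-length array references.
sub dot {
    my ($a_ref, $b_ref) = @_;
    return sum(map { $a_ref->[$_] * $b_ref->[$_] } 0 .. $#$a_ref);
}

my @row       = (1, 0, 1, 1, 0, 1);          # one made-up matrix row
my @positions = (1, 5, 6);                   # one made-up index line

my $direct  = sum(@row[ map { $_ - 1 } @positions ]);
my $via_dot = dot(\@row, selection_vector(scalar @row, @positions));

print "direct=$direct dot=$via_dot\n";       # prints "direct=2 dot=2"
```

Note that a plain 0/1 vector only matches the direct sum when an index line has no repeated positions; with repeats the vector would need to hold counts instead, which is what the commented-out alternative in the PDL code addresses.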

Source

2014-11-27 06:43:34

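Some memory (and possibly space) savings: the first 'PDL::long' should be 'indx', and the second could get away with 'short'. –

Thanks for the tip! I had not seen indx in the online documentation, but now that I know about it I can see it is the right type. As for short, I was not sure what the data range in his matrix was. (Given that lack of knowledge I probably should have used double, but I defaulted to integers because that is all there was in the sample.) –

Actually, the two data types need to match. In fact the only thing really used as indx is the array passed to dice; after that it is a counter. (I would think byte is the most it needs, but since it is multiplied with the second data set, the types should match to reduce type-casting.) –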
