Benchmarking quantized TensorFlow models on Android

I've been benchmarking TensorFlow models on an Exynos 7420 with benchmark_model. I'd like to test the quantization speedups described in Pete Warden's blog, but I haven't been able to compile benchmark_model with the quantization deps yet, as they break a number of things.

I followed the instructions in this stack overflow thread.

In the cc_binary rule of tensorflow/tools/benchmark/BUILD:

deps = [
    ":benchmark_model_lib",
    "//tensorflow/contrib/quantization/kernels:quantized_ops",
],

In tensorflow/contrib/quantization/kernels/BUILD:

deps = [ 
    "//tensorflow/contrib/quantization:cc_array_ops", 
    "//tensorflow/contrib/quantization:cc_math_ops", 
    "//tensorflow/contrib/quantization:cc_nn_ops", 
    #"//tensorflow/core", 
    #"//tensorflow/core:framework", 
    #"//tensorflow/core:lib", 
    #"//tensorflow/core/kernels:concat_lib_hdrs", 
    #"//tensorflow/core/kernels:conv_ops", 
    #"//tensorflow/core/kernels:eigen_helpers", 
    #"//tensorflow/core/kernels:ops_util", 
    #"//tensorflow/core/kernels:pooling_ops", 
    "//third_party/eigen3", 
    "@gemmlowp//:eight_bit_int_gemm", 
],

Then I ran:

bazel build -c opt --cxxopt='-std=gnu++11' --crosstool_top=//external:android/crosstool --cpu=armeabi-v7a --host_crosstool_top=@bazel_tools//tools/cpp:toolchain tensorflow/tools/benchmark:benchmark_model --verbose_failures

Along with all the other instructions in the linked post, this succeeds except at the link step, which fails to link against pthread.

I tried removing -lpthread from tf_copts() in tensorflow/tensorflow.bzl, and similarly from tensorflow/tools/proto_text/BUILD and tensorflow/cc/BUILD.

def tf_copts(): 
    return (["-fno-exceptions", "-DEIGEN_AVOID_STL_ARRAY"] + 
      if_cuda(["-DGOOGLE_CUDA=1"]) + 
      if_android_arm(["-mfpu=neon"]) + 
      select({"//tensorflow:android": [ 
        "-std=c++11", 
        "-DMIN_LOG_LEVEL=0", 
        "-DTF_LEAN_BINARY", 
        "-O2", 
        ], 
        "//tensorflow:darwin": [], 
        "//tensorflow:ios": ["-std=c++11",], 
        #"//conditions:default": ["-lpthread"]})) 
        "//conditions:default": []}))

I still get the link error:

external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: error: cannot find -lpthread 
collect2: error: ld returned 1 exit status

Any help is much appreciated. I'm pretty stuck.

Env:

Ubuntu 14.04
android_ndk_r11c
android-sdk-linux r24.4.1
Python 2.7.12 :: Continuum Analytics
tensorflow stock commit #4462
./configure with no GCP, HDFS, or GPU support

Source

2016-09-21 Dwight Crow

Transcribing the GitHub answer from Andrew Harp of the TF team. Thanks!!!

None of the above changes were necessary.

Run git pull --recurse-submodules (or clone with --recursive) to get @gemmlowp and the other libs.
Then edit //tensorflow/core:android_tensorflow_lib as follows to get quantization working for benchmark_model (or any target that depends on android_tensorflow_lib):

diff --git a/tensorflow/core/BUILD b/tensorflow/core/BUILD 
--- a/tensorflow/core/BUILD 
+++ b/tensorflow/core/BUILD 
@@ -713,8 +713,11 @@ cc_library(
 # binary size (by packaging a reduced operator set) is a concern.
 cc_library(
     name = "android_tensorflow_lib",
-    srcs = if_android([":android_op_registrations_and_gradients"]),
-    copts = tf_copts(),
+    srcs = if_android([":android_op_registrations_and_gradients",
+        "//tensorflow/contrib/quantization:android_ops",
+        "//tensorflow/contrib/quantization/kernels:android_ops",
+        "@gemmlowp//:eight_bit_int_gemm_sources"]),
+    copts = tf_copts() + ["-Iexternal/gemmlowp"],
     linkopts = ["-lz"],
     tags = [
         "manual",

This builds and works great. Interestingly, quantization produces a graph about 1/4 the size, but inference runs 4-5x slower than with the unquantized graph. It seems the quantized ops are still being optimized.
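
The roughly 1/4 graph size is what you would expect if 32-bit float weights are stored as 8-bit values instead. Below is a minimal sketch of that arithmetic, assuming a simple min/max linear quantization. The function names and sample weights are made up for illustration and are not TensorFlow's API or its exact scheme.

```python
import struct


def quantize_8bit(weights):
    """Map floats linearly onto 0..255 using the observed min/max range."""
    lo, hi = min(weights), max(weights)
    # Step between adjacent 8-bit codes; fall back to 1.0 for constant input.
    scale = (hi - lo) / 255.0 or 1.0
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, lo, scale


def dequantize_8bit(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [-1.5, -0.25, 0.0, 0.4, 2.0, 3.1]

float_bytes = struct.pack("<%df" % len(weights), *weights)  # float32 storage
codes, lo, scale = quantize_8bit(weights)                   # 8-bit storage
restored = dequantize_8bit(codes, lo, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(float_bytes), len(codes))  # 24 bytes as float32 vs 6 bytes as codes
print(max_error <= scale / 2 + 1e-9)
```

The storage drops by a factor of four (ignoring the two floats kept for the range), and each restored weight is off by at most half a quantization step. Smaller storage says nothing about speed, which depends on how well the quantized kernels are optimized for the target CPU.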

Source

2016-09-23 04:18:53

Glad it's working now. Yes, we're still optimizing the quantized ops, so please don't take the current speed as the ultimate speed.
