2014-04-17 7 views
0

Can someone tell me why the max work items on my GPU are fewer than on my CPU, and what the compute units mean? CPU: Intel Core i7 2.2 GHz. GPU: AMD Radeon HD 6700M. clinfo output for the CPU and GPU devices:



Number of platforms:        2 
    Platform Profile:        FULL_PROFILE 
    Platform Version:        OpenCL 1.2 AMD-APP (1084.2) 
    Platform Name:         AMD Accelerated Parallel Processing
    Platform Vendor:        Advanced Micro Devices, Inc. 
    Platform Extensions:       cl_khr_icd cl_amd_event_callback cl_amd_offline_devices cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing
    Platform Profile:        FULL_PROFILE 
    Platform Version:        OpenCL 1.2 
    Platform Name:         Intel(R) OpenCL 
    Platform Vendor:        Intel(R) Corporation 
    Platform Extensions:       cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing


    Platform Name:         AMD Accelerated Parallel Processing
Number of devices:        2 
    Device Type:         CL_DEVICE_TYPE_GPU 
    Device ID:          4098 
    Max compute units:        6 
    Max work items dimensions:      3 
    Max work items[0]:       256 
    Max work items[1]:       256 
    Max work items[2]:       256 
    Max work group size:       256 
    Preferred vector width char:     16 
    Preferred vector width short:     8 
    Preferred vector width int:     4 
    Preferred vector width long:     2 
    Preferred vector width float:     4 
    Preferred vector width double:     0 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      4 
    Native vector width double:     0 
    Max clock frequency:       725Mhz 
    Address bits:         32 
    Max memory allocation:       536870912 
    Image support:         Yes 
    Max number of images read arguments:   128 
    Max number of images write arguments:   8 
    Max image 2D width:       16384 
    Max image 2D height:       16384 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     16 
    Max size of kernel argument:     1024 
    Alignment (bits) of base address:    2048 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          No 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        Yes 
    Round to +ve and infinity:     Yes 
    IEEE754-2008 fused multiply-add:    Yes 
    Cache type:         None 
    Cache line size:        0 
    Cache size:         0 
    Global memory size:       2147483648 
    Constant buffer size:       65536 
    Max number of constant args:     8 
    Local memory type:        Scratchpad 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  64 
    Error correction support:      0 
    Unified memory for Host and Device:   0 
    Profiling timer resolution:     1 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      No 
    Queue properties: 
    Out-of-Order:        No 
    Profiling :         Yes 
    Platform ID:         02843864 
    Name:           Turks 
    Vendor:          Advanced Micro Devices, Inc. 
    Driver version:        1084.2 (VM) 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 AMD-APP (1084.2) 
    Extensions:         cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_dx9_media_sharing


    Device Type:         CL_DEVICE_TYPE_CPU 
    Device ID:          4098 
    Max compute units:        8 
    Max work items dimensions:      3 
    Max work items[0]:       1024 
    Max work items[1]:       1024 
    Max work items[2]:       1024 
    Max work group size:       1024 
    Preferred vector width char:     16 
    Preferred vector width short:     8 
    Preferred vector width int:     4 
    Preferred vector width long:     2 
    Preferred vector width float:     8 
    Preferred vector width double:     4 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      8 
    Native vector width double:     4 
    Max clock frequency:       2195Mhz 
    Address bits:         32 
    Max memory allocation:       1073741824 
    Image support:         Yes 
    Max number of images read arguments:   128 
    Max number of images write arguments:   8 
    Max image 2D width:       8192 
    Max image 2D height:       8192 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     16 
    Max size of kernel argument:     4096 
    Alignment (bits) of base address:    1024 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          Yes 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        Yes 
    Round to +ve and infinity:     Yes 
    IEEE754-2008 fused multiply-add:    Yes 
    Cache type:         Read/Write 
    Cache line size:        64 
    Cache size:         32768 
    Global memory size:       2147483648 
    Constant buffer size:       65536 
    Max number of constant args:     8 
    Local memory type:        Global 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  1 
    Error correction support:      0 
    Unified memory for Host and Device:   1 
    Profiling timer resolution:     466 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      Yes 
    Queue properties: 
    Out-of-Order:        No 
    Profiling :         Yes 
    Platform ID:         02843864 
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Vendor:          GenuineIntel 
    Driver version:        1084.2 (sse2,avx) 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 AMD-APP (1084.2) 
    Extensions:         cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing


    Platform Name:         Intel(R) OpenCL 
Number of devices:        1 
    Device Type:         CL_DEVICE_TYPE_CPU 
    Device ID:          32902 
    Max compute units:        8 
    Max work items dimensions:      3 
    Max work items[0]:       1024 
    Max work items[1]:       1024 
    Max work items[2]:       1024 
    Max work group size:       1024 
    Preferred vector width char:     1 
    Preferred vector width short:     1 
    Preferred vector width int:     1 
    Preferred vector width long:     1 
    Preferred vector width float:     1 
    Preferred vector width double:     1 
    Native vector width char:      16 
    Native vector width short:      8 
    Native vector width int:      4 
    Native vector width long:      2 
    Native vector width float:      8 
    Native vector width double:     4 
    Max clock frequency:       2200Mhz 
    Address bits:         32 
    Max memory allocation:       536838144 
    Image support:         Yes 
    Max number of images read arguments:   480 
    Max number of images write arguments:   480 
    Max image 2D width:       16384 
    Max image 2D height:       16384 
    Max image 3D width:       2048 
    Max image 3D height:       2048 
    Max image 3D depth:       2048 
    Max samplers within kernel:     480 
    Max size of kernel argument:     3840 
    Alignment (bits) of base address:    1024 
    Minimum alignment (bytes) for any datatype: 128 
    Single precision floating point capability 
    Denorms:          Yes 
    Quiet NaNs:         Yes 
    Round to nearest even:      Yes 
    Round to zero:        No 
    Round to +ve and infinity:     No 
    IEEE754-2008 fused multiply-add:    No 
    Cache type:         Read/Write 
    Cache line size:        64 
    Cache size:         262144 
    Global memory size:       2147352576 
    Constant buffer size:       131072 
    Max number of constant args:     480 
    Local memory type:        Global 
    Local memory size:        32768 
    Kernel Preferred work group size multiple:  128 
    Error correction support:      0 
    Unified memory for Host and Device:   1 
    Profiling timer resolution:     466 
    Device endianess:        Little 
    Available:          Yes 
    Compiler available:       Yes 
    Execution capabilities: 
    Execute OpenCL kernels:      Yes 
    Execute native function:      Yes 
    Queue properties: 
    Out-of-Order:        Yes 
    Profiling :         Yes 
    Platform ID:         00401218 
    Name:            Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz
    Vendor:          Intel(R) Corporation 
    Driver version:        3.0.1.15216 
    Profile:          FULL_PROFILE 
    Version:          OpenCL 1.2 (Build 80752) 
    Extensions:         cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing

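Why do I see three OpenCL devices, two of type CPU and one of type GPU? Does that mean the CPU performs better than the GPU?

Is the second CPU device the Intel CPU itself or the integrated GPU? I have two display adapters: AMD Radeon HD 6700M Series and Intel HD Graphics Family.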

+0

Flagging this thread as one that belongs on [su](http://superuser.com/). –

+0

What does max compute units: 6 or 8 mean? Does it mean the number of cores in the Core i7, and only 6 for the GPU? – user1848223

+0

Any help, please? – user1848223

Answers

2

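"How many cores/processing elements/hardware threads does my GPU have?" is a question that new GPGPU users ask very often. My usual response is: "Why do you care?" There is no way to query the number of processing elements a device has using the OpenCL API. What exactly constitutes a processing element or a compute unit varies widely from one architecture to another.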

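In practice, the number of processing elements on a device does not really matter, because that metric is a poor predictor of the device's performance. If you really need to know how fast a device is for a particular application, benchmark the application itself (or a micro-benchmark with characteristics similar to your application).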
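The benchmarking advice above can be sketched as a small timing harness. This is only an illustration: `benchmark`, `workload_a`, `workload_b`, and the `device_a`/`device_b` labels are hypothetical names, and the two workloads are plain Python stand-ins. In a real OpenCL program, the callable would enqueue your kernel on one device and wait for it to finish before returning.

```python
import time


def benchmark(run, repeats=5, warmup=1):
    """Return the best wall-clock time in seconds of run() over `repeats` timed calls."""
    for _ in range(warmup):
        run()  # untimed warm-up call, so one-time setup costs are not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-ins for "run my application on device A / device B".
def workload_a():
    return sum(i * i for i in range(200_000))


def workload_b():
    return sum(i * i for i in range(20_000))


timings = {
    "device_a": benchmark(workload_a),
    "device_b": benchmark(workload_b),
}
fastest = min(timings, key=timings.get)
print(timings, "fastest:", fastest)
```

Taking the minimum over several runs reduces noise from other processes; whichever device wins on your own workload is the one to use, regardless of how many compute units it reports.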

To answer your other question: your system has two OpenCL implementations that can use the CPU, Intel's and AMD's. Both platforms therefore report the CPU as an available OpenCL device.
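To make the device count concrete, here is a minimal sketch that transcribes the platform and device names from the clinfo listing in the question into a plain Python dictionary (the variable names are illustrative, and no OpenCL library is involved). It shows that three device entries map to only two pieces of hardware.

```python
# Platform -> devices, transcribed from the clinfo listing in the question.
platforms = {
    "AMD Accelerated Parallel Processing": [
        ("GPU", "Turks"),
        ("CPU", "Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz"),
    ],
    "Intel(R) OpenCL": [
        ("CPU", "Intel(R) Core(TM) i7-2670QM CPU @ 2.20GHz"),
    ],
}

# Every (platform, device) pair is listed as a separate OpenCL device.
entries = [
    (platform, kind, name)
    for platform, devices in platforms.items()
    for kind, name in devices
]

# Distinct pieces of hardware behind those entries.
hardware = {name for _, _, name in entries}

# Platforms that expose the CPU.
cpu_platforms = [platform for platform, kind, _ in entries if kind == "CPU"]

print(len(entries))    # 3 device entries
print(len(hardware))   # 2 physical devices
print(cpu_platforms)
```

The same CPU appears once per platform, so neither CPU entry corresponds to the Intel HD Graphics adapter; no Intel GPU device appears in this listing.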

+0

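I am tired of this question too. But I think we will have to deal with it for a long time... And it is actually a logical question. People still come from the CPU world, where they control each "thread" manually, and they want to know exactly how many of those threads there are. Even when a GPU has millions of parallel cores.... – DarkZeros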