2016-10-09

Hadoop socket timeout error: I am trying to run TeraSort on Hadoop, and it fails with the timeout error shown below.

[[email protected] mapreduce]$ hadoop jar $(ls hadoop-mapreduce-examples-2*.jar) teragen 100000000 /terasort/in 
16/10/08 21:30:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/10/08 21:30:17 INFO client.RMProxy: Connecting to ResourceManager at master/10.90.110.160:8032 
16/10/08 21:30:33 INFO terasort.TeraSort: Generating 100000000 using 2 
16/10/08 21:30:33 INFO mapreduce.JobSubmitter: number of splits:2 
16/10/08 21:30:34 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1475979237007_0002 
16/10/08 21:30:34 INFO impl.YarnClientImpl: Submitted application application_1475979237007_0002 
16/10/08 21:30:34 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1475979237007_0002/ 
16/10/08 21:30:34 INFO mapreduce.Job: Running job: job_1475979237007_0002 
16/10/08 21:38:25 INFO mapreduce.Job: Job job_1475979237007_0002 running in uber mode : false 
16/10/08 21:38:25 INFO mapreduce.Job: map 0% reduce 0% 
16/10/08 21:38:25 INFO mapreduce.Job: Job job_1475979237007_0002 failed with state FAILED due to: Application application_1475979237007_0002 failed 2 times due to Error launching appattempt_1475979237007_0002_000002. Got exception: org.apache.hadoop.net.ConnectTimeoutException: Call From master.someplace.net/69.172.201.153 to 69.172.201.153:35751 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=69.172.201.153/69.172.201.153:35751]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) 
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1407) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) 
    at com.sun.proxy.$Proxy32.startContainers(Unknown Source) 
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96) 
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119) 
    at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=69.172.201.153/69.172.201.153:35751] 
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534) 
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495) 
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) 
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) 
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) 
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1529) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1446) 
    ... 9 more 
. Failing the application. 
16/10/08 21:38:25 INFO mapreduce.Job: Counters: 0 

I checked all three nodes and they appear to be working correctly:

Live datanodes (3): 

Name: 10.90.110.160:50010 (master.hadoop.mids.lulz.bz) 
Hostname: 69.172.201.153 
Decommission Status : Normal 
Configured Capacity: 105554829312 (98.31 GB) 
DFS Used: 831488 (812 KB) 
Non DFS Used: 5449568256 (5.08 GB) 
DFS Remaining: 100104429568 (93.23 GB) 
DFS Used%: 0.00% 
DFS Remaining%: 94.84% 
Configured Cache Capacity: 0 (0 B) 
Cache Used: 0 (0 B) 
Cache Remaining: 0 (0 B) 
Cache Used%: 100.00% 
Cache Remaining%: 0.00% 
Xceivers: 1 
Last contact: Sat Oct 08 21:47:42 CDT 2016 


Name: 10.90.110.169:50010 (slave2.hadoop.mids.lulz.bz) 
Hostname: 69.172.201.153 
Decommission Status : Normal 
Configured Capacity: 105554829312 (98.31 GB) 
DFS Used: 831488 (812 KB) 
Non DFS Used: 5448441856 (5.07 GB) 
DFS Remaining: 100105555968 (93.23 GB) 
DFS Used%: 0.00% 
DFS Remaining%: 94.84% 
Configured Cache Capacity: 0 (0 B) 
Cache Used: 0 (0 B) 
Cache Remaining: 0 (0 B) 
Cache Used%: 100.00% 
Cache Remaining%: 0.00% 
Xceivers: 1 
Last contact: Sat Oct 08 21:47:42 CDT 2016 


Name: 10.90.110.165:50010 (slave1.hadoop.mids.lulz.bz) 
Hostname: 69.172.201.153 
Decommission Status : Normal 
Configured Capacity: 105554829312 (98.31 GB) 
DFS Used: 831488 (812 KB) 
Non DFS Used: 5448441856 (5.07 GB) 
DFS Remaining: 100105555968 (93.23 GB) 
DFS Used%: 0.00% 
DFS Remaining%: 94.84% 
Configured Cache Capacity: 0 (0 B) 
Cache Used: 0 (0 B) 
Cache Remaining: 0 (0 B) 
Cache Used%: 100.00% 
Cache Remaining%: 0.00% 
Xceivers: 1 
Last contact: Sat Oct 08 21:47:42 CDT 2016 

Please help me find a solution. I am completely lost here... Thanks in advance!

Answer


I think the system is using the default timeout period while the DFSClient communicates with the datanodes. Increasing dfs.datanode.socket.write.timeout and dfs.socket.timeout may help.

Add (or change) the configuration below in hdfs-site.xml to increase the timeouts:

<property> 
    <name>dfs.datanode.socket.write.timeout</name> 
    <value>2000000</value> 
</property> 

<property> 
    <name>dfs.socket.timeout</name> 
    <value>2000000</value> 
</property> 
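If you prefer to apply the change from a script rather than editing the XML by hand, a minimal sketch is below. It assumes the usual layout of an hdfs-site.xml `<configuration>` file; the file path in the comment is an assumption and will differ per install.

```python
# Sketch: insert or update properties in an hdfs-site.xml-style document.
# In practice you would load your real file, commonly found at
# $HADOOP_HOME/etc/hadoop/hdfs-site.xml, with ET.parse(path) instead of
# parsing the sample string, then write it back and restart HDFS.
import xml.etree.ElementTree as ET

def set_property(root, name, value):
    """Update an existing <property> with this name, or append a new one."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            prop.find("value").text = value
            return
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value

def get_property(root, name):
    """Return the value of a named <property>, or None if absent."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# Demonstrate on a minimal, empty configuration.
root = ET.fromstring("<configuration></configuration>")
set_property(root, "dfs.datanode.socket.write.timeout", "2000000")
set_property(root, "dfs.socket.timeout", "2000000")
print(ET.tostring(root, encoding="unicode"))
```

Remember that the file must be updated on every node in the cluster, and the daemons restarted, for the new timeouts to take effect.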

Also, the logs show the system is trying to connect to 69.172.201.153. Is that the correct IP?
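One way to see the problem is to compare each datanode's service IP against what its reported hostname resolves to. In the dfsadmin report above, all three nodes report the same public hostname (69.172.201.153) even though they serve on distinct 10.90.110.x addresses, which usually points to an /etc/hosts or DNS misconfiguration. A small sketch of that check, using the values copied from the report (in real use the third column would come from `socket.gethostbyname`):

```python
# Sketch: flag datanodes whose reported hostname does not resolve back to
# the IP the datanode actually serves on. The tuples below are copied from
# the dfsadmin report in the question; the resolved IPs are what the
# reported hostname points at.
def find_mismatches(report):
    """report: list of (service_ip, reported_hostname, resolved_ip).
    Returns (service_ip, hostname) for every node whose hostname
    resolves to a different address than the node serves on."""
    return [(ip, host) for ip, host, resolved in report if resolved != ip]

report = [
    ("10.90.110.160", "69.172.201.153", "69.172.201.153"),  # master
    ("10.90.110.169", "69.172.201.153", "69.172.201.153"),  # slave2
    ("10.90.110.165", "69.172.201.153", "69.172.201.153"),  # slave1
]
print(find_mismatches(report))
```

All three nodes mismatch here, which is consistent with the ResourceManager timing out while trying to reach an AM container on 69.172.201.153 instead of the node's cluster-internal address.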


Thank you, that worked!!!