Map-reduce program cannot load a CSV file into an HBase table

I am running the command below to load a CSV file into an HBase table with a map-reduce program. The job completes successfully, but when I scan the HBase table it shows 0 rows. Here is the console log from the run:
[[email protected] ~]$ HADOOP_CLASSPATH='hbase classpath' hadoop jar Desktop/bulk.jar /user/hadoop/3.csv /user/hadoop/load bulk
13/06/07 15:59:00 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-cdh3u1--1, built on 07/18/2011 15:17 GMT
13/06/07 15:59:00 INFO zookeeper.ZooKeeper: Client environment:host.name=01HW394491
13/06/07 15:59:00 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_0
13/06/07 15:59:00 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
13/06/07 15:59:00 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre
hbase.mapreduce.inputtable
13/06/07 15:59:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/06/07 15:59:02 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=172.29.179.59:2181 sessionTimeout=180000 watcher=hconnection
13/06/07 15:59:02 INFO zookeeper.ClientCnxn: Opening socket connection to server /172.29.179.59:2181
13/06/07 15:59:02 INFO zookeeper.ClientCnxn: Socket connection established to 01HW394491/172.29.179.59:2181, initiating session
13/06/07 15:59:02 INFO zookeeper.ClientCnxn: Session establishment complete on server 01HW394491/172.29.179.59:2181, sessionid = 0x13f1e28c4b4000a, negotiated timeout = 180000
13/06/07 15:59:03 INFO mapred.JobClient: Running job: job_201306071546_0001
13/06/07 15:59:04 INFO mapred.JobClient: map 0% reduce 0%
13/06/07 15:59:11 INFO mapred.JobClient: map 100% reduce 0%
13/06/07 15:59:18 INFO mapred.JobClient: map 100% reduce 33%
13/06/07 15:59:19 INFO mapred.JobClient: map 100% reduce 100%
13/06/07 15:59:19 INFO mapred.JobClient: Job complete: job_201306071546_0001
13/06/07 15:59:19 INFO mapred.JobClient: Counters: 21
13/06/07 15:59:19 INFO mapred.JobClient: Job Counters
13/06/07 15:59:19 INFO mapred.JobClient: Launched reduce tasks=1
13/06/07 15:59:19 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=5499
13/06/07 15:59:19 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/06/07 15:59:19 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/06/07 15:59:19 INFO mapred.JobClient: Rack-local map tasks=1
13/06/07 15:59:19 INFO mapred.JobClient: Launched map tasks=1
13/06/07 15:59:19 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=7561
13/06/07 15:59:19 INFO mapred.JobClient: FileSystemCounters
13/06/07 15:59:19 INFO mapred.JobClient: FILE_BYTES_READ=159
13/06/07 15:59:19 INFO mapred.JobClient: HDFS_BYTES_READ=63
13/06/07 15:59:19 INFO mapred.JobClient: FILE_BYTES_WRITTEN=127600
13/06/07 15:59:19 INFO mapred.JobClient: Map-Reduce Framework
13/06/07 15:59:19 INFO mapred.JobClient: Reduce input groups=0
13/06/07 15:59:19 INFO mapred.JobClient: Combine output records=0
13/06/07 15:59:19 INFO mapred.JobClient: Map input records=0
13/06/07 15:59:19 INFO mapred.JobClient: Reduce shuffle bytes=6
13/06/07 15:59:19 INFO mapred.JobClient: Reduce output records=0
13/06/07 15:59:19 INFO mapred.JobClient: Spilled Records=0
13/06/07 15:59:19 INFO mapred.JobClient: Map output bytes=0
13/06/07 15:59:19 INFO mapred.JobClient: Combine input records=0
13/06/07 15:59:19 INFO mapred.JobClient: Map output records=0
13/06/07 15:59:19 INFO mapred.JobClient: SPLIT_RAW_BYTES=63
13/06/07 15:59:19 INFO mapred.JobClient: Reduce input records=0
13/06/07 15:59:19 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=172.29.179.59:2181 sessionTimeout=180000 watcher=hconnection
13/06/07 15:59:19 INFO zookeeper.ClientCnxn: Opening socket connection to server /172.29.179.59:2181
13/06/07 15:59:19 INFO zookeeper.ClientCnxn: Socket connection established to 01HW394491/172.29.179.59:2181, initiating session
13/06/07 15:59:19 INFO zookeeper.ClientCnxn: Session establishment complete on server 01HW394491/172.29.179.59:2181, sessionid = 0x13f1e28c4b4000c, negotiated timeout = 180000
13/06/07 15:59:19 WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs://01HW394491:9000/user/hadoop/load/_SUCCESS
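One detail worth checking in the command above: `HADOOP_CLASSPATH='hbase classpath'` uses single quotes, which sets the variable to the literal six-character string `hbase classpath` rather than to the output of the `hbase classpath` command, so the HBase jars may not be on the classpath at all. Command substitution with backticks or `$(...)` is needed; a sketch of the corrected invocation (same jar and arguments as above):

```shell
# $(...) runs `hbase classpath` and puts its output (the HBase jars)
# on the Hadoop classpath, instead of the literal string 'hbase classpath'
HADOOP_CLASSPATH=$(hbase classpath) hadoop jar Desktop/bulk.jar /user/hadoop/3.csv /user/hadoop/load bulk
```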
This is the driver class with all of the configuration. I am running the program in distributed mode, on Cloudera CDH3u1.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
/**
* HBase bulk import example<br>
* Data preparation MapReduce job driver
* <ol>
* <li>args[0]: HDFS input path
* <li>args[1]: HDFS output path
* <li>args[2]: HBase table name
* </ol>
*/
public class Driver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        //conf.set("hbase.table.name", "bulk");
        conf.set("hbase.mapreduce.inputtable", args[2]);
        conf.set("hbase.zookeeper.quorum", "172.29.179.59");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        //conf.set("hbase.master", "172.29.179.59:60000");
        //conf.set("hbase.zookeeper.quorum", "ibm-r1-node2.apache-nextgen.com");
        HBaseConfiguration.addHbaseResources(conf);

        Job job = new Job(conf, "HBase Bulk Import Example");
        job.setJarByClass(HBaseKVMapper.class);
        job.setMapperClass(HBaseKVMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(KeyValue.class);
        job.setInputFormatClass(TableInputFormat.class);

        HTable hTable = new HTable(args[2]);
        // HTable hTable = new HTable("bulkdata");

        // Auto configure partitioner and reducer
        HFileOutputFormat.configureIncrementalLoad(job, hTable);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        /*
         * FileInputFormat.addInputPath(job, new
         * Path("hdfs://localhost:9000/user/685536/input1.csv"));
         *
         * FileOutputFormat.setOutputPath(job, new
         * Path("hdfs://localhost:9000/user/685536/outputs12348"));
         */
        System.out.println(TableInputFormat.INPUT_TABLE);

        job.waitForCompletion(true);

        // Load generated HFiles into table
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
        loader.doBulkLoad(new Path(args[1]), hTable);
        // loader.doBulkLoad(new
        //     Path("hdfs://localhost:9000/user/685536/outputs12348"), hTable);
    }
}
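One likely cause, judging from the log: the driver sets `TableInputFormat` as the job's input format, but the input here is a plain CSV file on HDFS. `TableInputFormat` scans an existing HBase table (the one named by `hbase.mapreduce.inputtable`) rather than reading the HDFS input path, which would explain `Map input records=0` in the counters. A hedged sketch of the change to the driver (a fragment, not a complete class; it assumes `HBaseKVMapper` expects `LongWritable`/`Text` input as a text-line mapper would):

```java
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Read the CSV as plain text lines from the HDFS path given in args[0].
// TableInputFormat would instead try to scan an HBase table and ignore
// the FileInputFormat path entirely, yielding zero map input records.
job.setInputFormatClass(TextInputFormat.class);

// conf.set("hbase.mapreduce.inputtable", args[2]) is then unnecessary:
// that property only configures TableInputFormat.
```

The target table still comes into play through `HFileOutputFormat.configureIncrementalLoad(job, hTable)` and the final `doBulkLoad` call, so nothing else in the driver needs to change for this sketch.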
This is the class with which I am inserting values into the HBase table. The table itself is a simple one created with `create 'bulk', 'field'`. When I scan it, no data shows up. The same program runs successfully in pseudo-distributed mode.
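The mapper class is not shown, but for a table created with `create 'bulk', 'field'` it presumably splits each CSV line into a row key plus values under the `field` column family. As a hypothetical illustration of just the parsing step (plain Java with no Hadoop types; the class name, `parseCsv`, and the sample row are illustrative assumptions, not the asker's actual code):

```java
import java.util.Arrays;

public class CsvParseSketch {
    // Split one CSV line into columns; the first column would serve as the
    // row key, and the rest would become KeyValue(rowKey, "field", qualifier,
    // value) entries in a real HBaseKVMapper implementation.
    static String[] parseCsv(String line) {
        return line.split(",", -1); // -1 keeps trailing empty columns
    }

    public static void main(String[] args) {
        String[] cols = parseCsv("row1,alice,42");
        System.out.println("rowKey=" + cols[0]
                + ", values=" + Arrays.toString(Arrays.copyOfRange(cols, 1, cols.length)));
    }
}
```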
Hard to say without looking at the code. Could you post it? – Tariq
Nothing shows up in the output, so there is no point in posting it here. The point is just that, given the input file and the code I wrote, the program runs and the output is written to the specified directory. – smttsp
Hi, I have added the code as well. – user1651008