java - Sequence Files in Hadoop


How are these sequence files generated? I saw a link about sequence files here:

http://wiki.apache.org/hadoop/sequencefile 

Are these written using the default Java serializer? And how do I read a sequence file?

Sequence files are generated by MapReduce tasks and can be used as a common format to transfer data between MapReduce jobs.

You can read them in the following manner:

Configuration config = new Configuration();
Path path = new Path(PATH_TO_YOUR_FILE);
SequenceFile.Reader reader = new SequenceFile.Reader(FileSystem.get(config), path, config);
// The file header records the key and value classes, so instantiate them reflectively.
WritableComparable key = (WritableComparable) reader.getKeyClass().newInstance();
Writable value = (Writable) reader.getValueClass().newInstance();
// next() fills key and value in place and returns false at end of file.
while (reader.next(key, value)) {
    // perform some operation on key and value
}
reader.close();
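On the serializer question: the reader snippet above casts keys and values to Writable, which is Hadoop's own serialization interface rather than java.io.Serializable. A Writable type writes its fields to a DataOutput and reads them back from a DataInput. The following is only a rough sketch of that contract in plain Java. SimpleWritable is a hypothetical stand-in that mirrors the two methods of org.apache.hadoop.io.Writable so the code compiles without Hadoop, and PageView is an invented record type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

    // Hypothetical stand-in mirroring the two methods of
    // org.apache.hadoop.io.Writable (assumption: used here only so the
    // sketch runs without Hadoop on the classpath).
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical record type, for illustration only.
    static class PageView implements SimpleWritable {
        String url = "";
        int hits;

        // No-arg constructor so an empty instance can be created and then filled.
        PageView() { }

        PageView(String url, int hits) {
            this.url = url;
            this.hits = hits;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(url);   // field order here...
            out.writeInt(hits);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            url = in.readUTF();  // ...must match the order used in write()
            hits = in.readInt();
        }
    }

    // Serializes any SimpleWritable into a byte array.
    static byte[] toBytes(SimpleWritable w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a PageView by filling an empty instance from the bytes.
    static PageView fromBytes(byte[] bytes) throws IOException {
        PageView copy = new PageView();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = toBytes(new PageView("/index.html", 42));
        PageView copy = fromBytes(bytes);
        System.out.println(copy.url + " " + copy.hits);
    }
}
```

This is also why the reader code creates empty key and value objects first: next() fills an existing instance through readFields instead of building a new object per record.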

You can also generate sequence files using SequenceFile.Writer.

The classes used in the example are the following:
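A minimal sketch of the writing side, assuming the hadoop-core 1.x API (the SequenceFile.createWriter overload that takes a FileSystem) and Text keys with IntWritable values. The class name, the sample words and the output path are made up for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileWriteSketch {

    // Appends a few (Text, IntWritable) records and returns how many were written.
    static int writeSample(Path path, Configuration config) throws IOException {
        FileSystem fs = FileSystem.get(config);
        // The key and value classes are stored in the file header, which is
        // what lets a reader call getKeyClass() and getValueClass() later.
        SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, config, path, Text.class, IntWritable.class);
        int count = 0;
        try {
            String[] words = {"alpha", "beta", "gamma"};  // sample data (assumption)
            for (int i = 0; i < words.length; i++) {
                writer.append(new Text(words[i]), new IntWritable(i));
                count++;
            }
        } finally {
            writer.close();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical output location; with a default Configuration this
        // resolves to the local file system.
        Path path = new Path("/tmp/sample.seq");
        System.out.println(writeSample(path, new Configuration()) + " records written");
    }
}
```

A file written this way can be read back with the reader loop shown earlier, since the reader picks up the key and value classes from the header.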

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;

They are contained within the hadoop-core Maven dependency:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
</dependency>
