Build Loader

To build the loader, run the following command.

> sbt "project loader" "clean" "assembly"

This produces s2graph-loader-assembly-X.X.X-SNAPSHOT.jar under loader/target/scala-2.xx/.

Source Data Storage Options

For bulk loading, source data can be either in HDFS or a Kafka queue.

For source data in HDFS

Run the following steps from under loader/.

1. subscriber.TransferToHFile

Transfers edges in TSV format directly into HFiles.

The parameters for this job are as follows.

parameter index    note
0                  input path in HDFS
1                  output path in HDFS
2                  zkQuorum for the target HBase
3                  tableName to be used in the target HBase
4                  dbUrl for s2graph core
5                  maxHFilePerRegionServer
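
As a rough illustration of how the six positional parameters line up, the sketch below assembles an invocation and prints it instead of running it. It assumes the job is submitted with spark-submit, which this section does not state. Every path, host, table name, the JDBC-style dbUrl, and the value 10 are hypothetical placeholders. The jar name keeps the elided version and Scala directory from the build step, so substitute the real file name.

```shell
#!/bin/sh
# Sketch only: prints the command instead of executing it.
# Assumption: the job is launched via spark-submit (not stated in this document).
# JAR keeps the elided version from the build step; substitute the real file name.
JAR="loader/target/scala-2.xx/s2graph-loader-assembly-X.X.X-SNAPSHOT.jar"

# Positional parameters in index order 0..5 (all values are hypothetical):
#   0 input path in HDFS, 1 output path in HDFS, 2 zkQuorum,
#   3 tableName, 4 dbUrl (JDBC form is an assumption), 5 maxHFilePerRegionServer
set -- \
  "/user/example/edges.tsv" \
  "/user/example/hfiles" \
  "zk1.example.com" \
  "s2graph_example" \
  "jdbc:mysql://db.example.com:3306/graph" \
  "10"

# Assemble and print the full command line for review.
CMD="spark-submit --class subscriber.TransferToHFile $JAR $*"
echo "$CMD"
```

Printing the command first makes it easy to check the argument order against the table before submitting a real job.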

2. distcp the HFiles to the production HBase cluster (optional)

3. chmod the HFiles

4. complete the bulk load

> hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles #{pathToHFile} #{hbase table name}
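
Steps 2 through 4 could be chained roughly as follows. This is a dry-run sketch that only prints each command. The cluster URIs, paths, and table name are hypothetical, and the permission mode 777 is purely illustrative: the right mode depends on which user HBase runs as.

```shell
#!/bin/sh
# Dry-run sketch of steps 2-4: each command is printed, not executed.
# All URIs, paths, and the table name below are hypothetical placeholders.
SRC="hdfs://staging-nn.example.com:8020/user/example/hfiles"
DST="hdfs://prod-nn.example.com:8020/user/example/hfiles"
TABLE="s2graph_example"

# Prints its arguments; replace the body with "$@" to actually execute.
run() { echo "$@"; }

# Step 2 (optional): copy the HFiles to the production cluster.
run hadoop distcp "$SRC" "$DST"

# Step 3: relax permissions so the HBase user can move the files.
# Assumption: mode 777 is illustrative only.
run hdfs dfs -chmod -R 777 "$DST"

# Step 4: hand the HFiles over to HBase to complete the load.
run hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles "$DST" "$TABLE"
```

If step 2 is skipped, point the chmod and load commands at the output path of the TransferToHFile job instead of the copied path.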

