Quick Start with Docker
- Install Docker and VirtualBox (optional for OS X).
- Register the IP of the docker-machine in the /etc/hosts file (optional for OS X).
  - ex) 192.168.99.100 default
- Open a Docker terminal and go to the project's root directory.
- Build a Docker image of S2Graph in the project's root directory:
  sbt "project s2rest_play" docker:publishLocal
- Run the MySQL and HBase containers first.
  - Change directory to dev-support and build the images:
    cd dev-support
    docker-compose build
  - Start MySQL and HBase at the same time:
    docker-compose up -d graph_mysql
- Run the graph container.
  - In the Docker terminal, go to the dev-support folder and run:
    docker-compose up -d
  - You will find the default image running on VirtualBox.
  - If you are on Linux, use your machine name.
- Optionally, run the REST server with Play on your local machine.
  sbt "project s2rest_play" run -Dhost=default
  - You will see the Play application running on your local machine.
- Run the test cases.
  sbt test -Dhost=default
  - This will use HBase and MySQL on the default image in VirtualBox.
S2Graph needs to connect to MySQL when it starts, so MySQL and HBase must be running before you start it.
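Because S2Graph needs MySQL to be up before it starts, it can help to wait for the database before launching the graph container. The following is only a sketch: retry is a hypothetical helper, not part of S2Graph. The usage line in the comment assumes that nc is installed and supports -z, and that MySQL is reachable on its default port 3306 on the host named default. Check the compose file in dev-support for the actual port mapping.

```shell
# retry ATTEMPTS DELAY CMD [ARG...]
# Run CMD until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between failed attempts. Returns 0 on the first success and
# 1 if every attempt fails. (Hypothetical helper, not part of S2Graph.)
retry() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    "$@" && return 0
    attempts=$((attempts - 1))
    # Only sleep if another attempt is still to come.
    [ "$attempts" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# Possible usage (assumes nc and port 3306, see the note above):
#   retry 30 2 nc -z default 3306 && docker-compose up -d
```

The helper only checks that the given command eventually succeeds; an open port does not guarantee that MySQL has finished initializing.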
Quick Start with Vagrant
S2Graph comes with a Vagrantfile that lets you spin up a virtual environment for testing and development. (To set up S2Graph directly in your local environment, see Quick Start in Your Local Environment.)
You will need VirtualBox and Vagrant installed on your system.
With everything ready, let's get started by running the following commands:
> git clone https://github.com/apache/incubator-s2graph s2graph
> cd s2graph
> vagrant up
> vagrant ssh
// in the virtual environment..
> cd s2graph
> activator "project s2rest_play" run
Finally, join the mailing list by sending a message to [email protected]!
Quick Start in Your Local Environment
This section describes how to set up S2Graph on your local machine on top of HBase and MySQL.
S2Graph consists of a number of modules.
- S2Core is the core library of common classes for storing and retrieving data as edges and vertices.
- Root Project is the Play-Framework-based REST API.
- Spark contains Spark-related common classes.
- Loader contains Spark applications for data migration. They load data into the HBase back end as graph representations (using the S2Core library), either by consuming events from Kafka or by copying straight from HDFS.
- Asynchbase is a fork of https://github.com/OpenTSDB/asynchbase. We added some functionality to GetRequest that has yet to be merged into the original project. Some of the tweaks are listed here:
- rpcTimeout
- setFilter
- column pagination
- retryAttempCount
- timestamp filtering
There are some prerequisites for running S2Graph:
- An SBT installation
  - If you are on a Mac, run the following. (Otherwise, check the SBT documentation.)
    > brew install sbt
- An Apache HBase installation
  - If you are on a Mac, run the following. (Otherwise, check the HBase documentation.)
    > brew install hbase
  - Run
    > start-hbase.sh
  - Please note that we currently support the latest stable version of Apache HBase, 1.0.1, with Apache Hadoop 2.7.0. If you are using CDH, check out feature/cdh5.3.0. We are working on providing a profile for HBase/Hadoop versions soon.
- S2Graph currently supports MySQL for metadata storage.
  - If you are on a Mac, run the following. (Otherwise, check the MySQL documentation.)
    > brew install mysql
  - Run
    > mysql.server start
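To confirm the prerequisites above before continuing, a small check like the following may help. It is a sketch, not part of S2Graph: missing_tools is a hypothetical helper that prints every command name it cannot find on the PATH. It does not verify versions or whether the services are actually running.

```shell
# missing_tools NAME [NAME...]
# Print, one per line, each command name that is not found on the PATH.
# Prints nothing when every command is available.
# (Hypothetical helper, not part of S2Graph.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Possible usage with the commands named in the list above:
#   missing_tools sbt start-hbase.sh mysql
```

An empty result suggests the installations are on your PATH; any name printed points at a prerequisite that still needs attention.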
With the prerequisites set up correctly, check out the project:
> git clone https://github.com/apache/incubator-s2graph s2graph
> cd s2graph
Create necessary MySQL tables by running the provided SQL file:
> mysql -uroot < ./s2core/migrate/mysql/schema.sql
Now, go ahead and build the project:
> sbt compile
You are ready to run S2Graph!
> sbt "project s2rest_play" run
Finally, join the mailing list by sending a message to [email protected]!
Your First S2Graph
As a toy problem, let's try to create the backend for a simple timeline of a new social network service. (Think of a simplified version of Facebook's Timeline. 😜) You will be able to manage "friends" and "posts" of a user with simple S2Graph queries.
First, we need a name for the new service.
Why don't we call it Kakao Favorites?
curl -XPOST localhost:9000/graphs/createService -H 'Content-Type: Application/json' -d '
{"serviceName": "KakaoFavorites", "compressionAlgorithm" : "gz"}
'
Make sure the service is created correctly.
curl -XGET localhost:9000/graphs/getService/KakaoFavorites
Next, we will need some friends.
In S2Graph, relationships are defined as Labels.
Create a friends label with the following createLabel API call:
curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: Application/json' -d '
{
  "label": "friends",
  "srcServiceName": "KakaoFavorites",
  "srcColumnName": "userName",
  "srcColumnType": "string",
  "tgtServiceName": "KakaoFavorites",
  "tgtColumnName": "userName",
  "tgtColumnType": "string",
  "isDirected": "false",
  "indices": [],
  "props": [],
  "consistencyLevel": "strong"
}
'
Check the label:
curl -XGET localhost:9000/graphs/getLabel/friends
Now that the label friends is ready, we can store friend entries.
Entries of a label are called edges, and you can add edges with the edges/insert API:
curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: Application/json' -d '
[
  {"from":"Elmo","to":"Big Bird","label":"friends","props":{},"timestamp":1444360152477},
  {"from":"Elmo","to":"Ernie","label":"friends","props":{},"timestamp":1444360152478},
  {"from":"Elmo","to":"Bert","label":"friends","props":{},"timestamp":1444360152479},
  {"from":"Cookie Monster","to":"Grover","label":"friends","props":{},"timestamp":1444360152480},
  {"from":"Cookie Monster","to":"Kermit","label":"friends","props":{},"timestamp":1444360152481},
  {"from":"Cookie Monster","to":"Oscar","label":"friends","props":{},"timestamp":1444360152482}
]
'
Query friends of Elmo with the getEdges API:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
{
  "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Elmo"}],
  "steps": [
    {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}
  ]
}
'
Now query friends of Cookie Monster:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
{
  "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Cookie Monster"}],
  "steps": [
    {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}
  ]
}
'
Users of Kakao Favorites will be able to post URLs of their favorite websites.
We will need a new label, post, for this data:
curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: Application/json' -d '
{
  "label": "post",
  "srcServiceName": "KakaoFavorites",
  "srcColumnName": "userName",
  "srcColumnType": "string",
  "tgtServiceName": "KakaoFavorites",
  "tgtColumnName": "url",
  "tgtColumnType": "string",
  "isDirected": "true",
  "indices": [],
  "props": [],
  "consistencyLevel": "strong"
}
'
Now, insert some posts of our users:
curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: Application/json' -d '
[
  {"from":"Big Bird","to":"www.kakaocorp.com/en/main","label":"post","props":{},"timestamp":1444360152477},
  {"from":"Big Bird","to":"github.com/kakao/s2graph","label":"post","props":{},"timestamp":1444360152478},
  {"from":"Ernie","to":"groups.google.com/forum/#!forum/s2graph","label":"post","props":{},"timestamp":1444360152479},
  {"from":"Grover","to":"hbase.apache.org/forum/#!forum/s2graph","label":"post","props":{},"timestamp":1444360152480},
  {"from":"Kermit","to":"www.playframework.com","label":"post","props":{},"timestamp":1444360152481},
  {"from":"Oscar","to":"www.scala-lang.org","label":"post","props":{},"timestamp":1444360152482}
]
'
Query posts of Big Bird:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
{
  "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Big Bird"}],
  "steps": [
    {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]}
  ]
}
'
So far, we have designed label schemas for the user relation data, friends and post, and stored some sample edges.
While doing so, we have also prepared ourselves for our timeline query!
The following two-step query will return URLs for Elmo's timeline, which are posts of Elmo's friends:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
{
  "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Elmo"}],
  "steps": [
    {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]},
    {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]}
  ]
}
'
Also try Cookie Monster's timeline:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
{
  "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Cookie Monster"}],
  "steps": [
    {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]},
    {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]}
  ]
}
'
The above example is by no means a full-blown social network timeline, but it gives you an idea of how to represent, store, and query relations with S2Graph.
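The timeline queries in this walkthrough all share one shape: a single source vertex followed by one step per label. As a rough sketch of that pattern, the hypothetical helper build_query below (not part of S2Graph) prints a getEdges request body for a given user and list of labels. It hard-codes the KakaoFavorites service, the userName column, the out direction, and the offset and limit values used in this tutorial, and it does no JSON escaping, so it assumes names without quotes or backslashes.

```shell
# build_query USER LABEL [LABEL...]
# Print a getEdges request body that starts at USER and follows each
# LABEL outward, one step per label, in the order given.
# (Hypothetical helper; service, column, offset and limit are the
# values used in this tutorial. No JSON escaping is performed.)
build_query() {
  user=$1
  shift
  steps=""
  for label in "$@"; do
    step=$(printf '{"step": [{"label": "%s", "direction": "out", "offset": 0, "limit": 10}]}' "$label")
    if [ -z "$steps" ]; then
      steps=$step
    else
      steps="$steps, $step"
    fi
  done
  printf '{"srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id": "%s"}], "steps": [%s]}\n' "$user" "$steps"
}

# Possible usage against a locally running server:
#   build_query Elmo friends post | curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d @-
```

Passing only friends reproduces the one-step friend lookup, while friends post reproduces the two-step timeline query.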
We also provide a simple script, script/test.sh, so that you can check whether everything is set up correctly.
> sh script/test.sh