Quick Start with Docker

  1. Install Docker and VirtualBox (optional, for OS X).
  2. Register the IP of the docker-machine in the /etc/hosts file (optional, for OS X).
    • e.g. 192.168.99.100 default
  3. Open a Docker terminal and go to the project's root directory.
  4. Build a Docker image of S2Graph in the project's root directory:
    • sbt "project s2rest_play" docker:publishLocal
  5. Run the MySQL and HBase containers first.
    • Change directory to dev-support: cd dev-support
    • docker-compose build
    • docker-compose up -d graph_mysql (this runs MySQL and HBase at the same time)
  6. Run the graph container.
    • In the Docker terminal, go to the dev-support folder, then run docker-compose up -d
    • You will find the default image running on VirtualBox.
    • If you are on Linux, use your machine name instead.
  7. Run the REST server with Play on your local machine (optional).
    • sbt "project s2rest_play" run -Dhost=default
    • You will see the Play application running on your local machine.
  8. Run the test cases.
    • sbt test -Dhost=default (this uses HBase and MySQL on the default image in VirtualBox)

S2Graph needs a MySQL connection when it starts, so MySQL and HBase must be running before you start it.
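
Because of this start-up ordering, it can help to wait until both services accept connections before launching S2Graph. The following is a minimal sketch, not part of the project: the helper names are hypothetical, and the ports are assumptions (3306 is the MySQL default, 2181 is the default ZooKeeper port that HBase clients connect to). Adjust them to match your dev-support compose file.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


def wait_for_services(services, retries=30, delay=1.0):
    """Poll until every (host, port) pair accepts connections.

    Returns the pairs that are still unreachable after all retries;
    an empty list means everything is up.
    """
    pending = list(services)
    for attempt in range(retries):
        pending = [(h, p) for (h, p) in pending if not port_is_open(h, p)]
        if not pending:
            break
        if attempt < retries - 1:
            time.sleep(delay)
    return pending


# Example call (assumed host name "default" and assumed ports):
# missing = wait_for_services([("default", 3306), ("default", 2181)])
```

An empty return value means both ports answered. A non-empty list names the services that are still down, which is usually a hint that the containers from step 5 have not finished starting.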

Quick Start with Vagrant

S2Graph comes with a Vagrantfile that lets you spin up a virtual environment for test and development purposes. (To set up S2Graph directly in your local environment instead, please refer to Quick Start in Your Local Environment.)

You will need VirtualBox and Vagrant installed on your system.

With everything ready, let's get started by running the following commands:

> git clone https://github.com/apache/incubator-s2graph s2graph
> cd s2graph
> vagrant up
> vagrant ssh

// in the virtual environment..
> cd s2graph
> activator "project s2rest_play" run

Finally, join the mailing list by sending a message to [email protected]!

Quick Start in Your Local Environment

This section describes how to set up S2Graph on your local machine on top of HBase and MySQL.

S2Graph consists of a number of modules.

  1. S2Core is the core library for common classes to store and retrieve data as edges and vertices.
  2. Root Project is the Play-Framework-based REST API.
  3. Spark contains Spark-related common classes.
  4. Loader contains Spark applications for data migration. They load data into the HBase back end as graph representations (using the S2Core library) by either consuming events from Kafka or copying directly from HDFS.
  5. Asynchbase is a fork of https://github.com/OpenTSDB/asynchbase. We added a few features to GetRequest that have not yet been merged into the original project. Some of the tweaks are:
    • rpcTimeout
    • setFilter
    • column pagination
    • retryAttempCount
    • timestamp filtering

There are some prerequisites for running S2Graph:

  1. An SBT installation
    • > brew install sbt if you are on a Mac. (Otherwise, check out the SBT documentation.)
  2. An Apache HBase installation
    • > brew install hbase if you are on a Mac. (Otherwise, check out the HBase documentation.)
      • Run > start-hbase.sh.
    • Please note that we currently support the latest stable version of Apache HBase (1.0.1) with Apache Hadoop 2.7.0. If you are using CDH, check out feature/cdh5.3.0. We are working on providing a profile for HBase/Hadoop versions soon.
  3. A MySQL installation (S2Graph currently supports MySQL for metadata storage)
    • > brew install mysql if you are on a Mac. (Otherwise, check out the MySQL documentation.)
    • Run > mysql.server start.

With the prerequisites set up correctly, check out the project:

> git clone https://github.com/apache/incubator-s2graph s2graph
> cd s2graph

Create necessary MySQL tables by running the provided SQL file:

> mysql -uroot < ./s2core/migrate/mysql/schema.sql

Now, go ahead and build the project:

> sbt compile

You are ready to run S2Graph!

> sbt "project s2rest_play" run

Finally, join the mailing list by sending a message to [email protected]!

Your First S2Graph

As a toy problem, let's try to create the backend for a simple timeline of a new social network service. (Think of a simplified version of Facebook's Timeline.) You will be able to manage "friends" and "posts" of a user with simple S2Graph queries.

  1. First, we need a name for the new service.

    Why don't we call it Kakao Favorites?

    curl -XPOST localhost:9000/graphs/createService -H 'Content-Type: Application/json' -d '
    {"serviceName": "KakaoFavorites", "compressionAlgorithm" : "gz"}
    '
    

    Make sure the service is created correctly.

    curl -XGET localhost:9000/graphs/getService/KakaoFavorites
    
  2. Next, we will need some friends.

    In S2Graph, relationships are defined as Labels.

    Create a friends label with the following createLabel API call:

    curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: Application/json' -d '
    {
     "label": "friends",
     "srcServiceName": "KakaoFavorites",
     "srcColumnName": "userName",
     "srcColumnType": "string",
     "tgtServiceName": "KakaoFavorites",
     "tgtColumnName": "userName",
     "tgtColumnType": "string",
     "isDirected": "false",
     "indices": [],
     "props": [],
     "consistencyLevel": "strong"
    }
    '
    

    Check the label:

    curl -XGET localhost:9000/graphs/getLabel/friends
    

    Now that the label friends is ready, we can store friend entries.

    Entries of a label are called edges, and you can add edges with the edges/insert API:

    curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: Application/json' -d '
    [
     {"from":"Elmo","to":"Big Bird","label":"friends","props":{},"timestamp":1444360152477},
     {"from":"Elmo","to":"Ernie","label":"friends","props":{},"timestamp":1444360152478},
     {"from":"Elmo","to":"Bert","label":"friends","props":{},"timestamp":1444360152479},
    
     {"from":"Cookie Monster","to":"Grover","label":"friends","props":{},"timestamp":1444360152480},
     {"from":"Cookie Monster","to":"Kermit","label":"friends","props":{},"timestamp":1444360152481},
     {"from":"Cookie Monster","to":"Oscar","label":"friends","props":{},"timestamp":1444360152482}
    ]
    '
    

    Query the friends of Elmo with the getEdges API:

    curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
    {
       "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Elmo"}],
       "steps": [
         {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}
       ]
    }
    '
    

    Now query friends of Cookie Monster:

    curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
    {
       "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Cookie Monster"}],
       "steps": [
         {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}
       ]
    }
    '
    
  3. Users of Kakao Favorites will be able to post URLs of their favorite websites.

    We will need a new label post for this data:

    curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: Application/json' -d '
    {
     "label": "post",
     "srcServiceName": "KakaoFavorites",
     "srcColumnName": "userName",
     "srcColumnType": "string",
     "tgtServiceName": "KakaoFavorites",
     "tgtColumnName": "url",
     "tgtColumnType": "string",
     "isDirected": "true",
     "indices": [],
     "props": [],
     "consistencyLevel": "strong"
    }
    '
    

    Now, insert some posts of our users:

    curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: Application/json' -d '
    [
     {"from":"Big Bird","to":"www.kakaocorp.com/en/main","label":"post","props":{},"timestamp":1444360152477},
     {"from":"Big Bird","to":"github.com/kakao/s2graph","label":"post","props":{},"timestamp":1444360152478},
     {"from":"Ernie","to":"groups.google.com/forum/#!forum/s2graph","label":"post","props":{},"timestamp":1444360152479},
     {"from":"Grover","to":"hbase.apache.org/forum/#!forum/s2graph","label":"post","props":{},"timestamp":1444360152480},
     {"from":"Kermit","to":"www.playframework.com","label":"post","props":{},"timestamp":1444360152481},
     {"from":"Oscar","to":"www.scala-lang.org","label":"post","props":{},"timestamp":1444360152482}
    ]
    '
    

    Query posts of Big Bird:

    curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
    {
       "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Big Bird"}],
       "steps": [
         {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]}
       ]
    }
    '
    
  4. So far, we have designed label schemas for the user relation data, friends and post, and stored some sample edges.

    While doing so, we have also prepared ourselves for our timeline query!

    The following two-step query will return URLs for Elmo's timeline, which are posts of Elmo's friends:

    curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
    {
       "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Elmo"}],
       "steps": [
         {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]},
         {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]}
       ]
    }
    '
    

    Also try Cookie Monster's timeline:

    curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d '
    {
       "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Cookie Monster"}],
       "steps": [
         {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]},
         {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]}
       ]
    }
    '
    

The above example is by no means a full-blown social network timeline, but it gives you an idea of how to represent, store and query relations with S2Graph.
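
To make the two-step timeline query concrete, here is a small in-memory sketch of the same idea: follow friends edges from a user, then follow post edges from each friend. It illustrates the traversal only and is not how S2Graph is implemented. The function names are hypothetical, the friends label is treated as outgoing edges only, and the ordering of results on a real server may differ.

```python
from collections import defaultdict

# Sample edges copied from the walkthrough (timestamps and props omitted).
FRIENDS = [
    ("Elmo", "Big Bird"), ("Elmo", "Ernie"), ("Elmo", "Bert"),
    ("Cookie Monster", "Grover"), ("Cookie Monster", "Kermit"),
    ("Cookie Monster", "Oscar"),
]
POSTS = [
    ("Big Bird", "www.kakaocorp.com/en/main"),
    ("Big Bird", "github.com/kakao/s2graph"),
    ("Ernie", "groups.google.com/forum/#!forum/s2graph"),
    ("Grover", "hbase.apache.org/forum/#!forum/s2graph"),
    ("Kermit", "www.playframework.com"),
    ("Oscar", "www.scala-lang.org"),
]


def build_index(edges):
    """Map each source vertex to its targets, in insertion order."""
    index = defaultdict(list)
    for src, tgt in edges:
        index[src].append(tgt)
    return index


def traverse(start, steps, limit=10):
    """Follow one adjacency index per step, starting from a single vertex."""
    frontier = [start]
    for index in steps:
        next_frontier = []
        for vertex in frontier:
            # Take at most `limit` edges per vertex, like "limit" in a step.
            next_frontier.extend(index.get(vertex, [])[:limit])
        frontier = next_frontier
    return frontier


friends, posts = build_index(FRIENDS), build_index(POSTS)
elmo_timeline = traverse("Elmo", [friends, posts])
cookie_timeline = traverse("Cookie Monster", [friends, posts])
```

With the sample data, elmo_timeline holds the two URLs posted by Big Bird and the one posted by Ernie (Bert has no posts), and cookie_timeline holds one URL each from Grover, Kermit and Oscar.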

We also provide a simple script under script/test.sh so that you can check that everything is set up correctly.

> sh script/test.sh
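
If you want to script the calls from the walkthrough rather than hand-edit JSON, the request bodies are easy to generate. The sketch below only builds the JSON for edges/insert and getEdges in the shape shown in the curl examples; the helper names are hypothetical and nothing here talks to the server.

```python
import json


def edge(src, tgt, label, timestamp, props=None):
    """One element of the edges/insert request array."""
    return {"from": src, "to": tgt, "label": label,
            "props": props or {}, "timestamp": timestamp}


def get_edges_body(service, column, vertex_id, labels, limit=10):
    """Body for getEdges: one step per label, all in the "out" direction."""
    return {
        "srcVertices": [
            {"serviceName": service, "columnName": column, "id": vertex_id}
        ],
        "steps": [
            {"step": [{"label": label, "direction": "out",
                       "offset": 0, "limit": limit}]}
            for label in labels
        ],
    }


# Elmo's timeline: friends first, then their posts.
timeline = get_edges_body("KakaoFavorites", "userName", "Elmo",
                          ["friends", "post"])
payload = json.dumps(timeline)
print(payload)
```

Saved under a file name of your choice (body.py is used here purely as an example), the output can be piped into the same curl call used above: python body.py | curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d @-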
