Querying Data With GraphQL & Ballerina

Written by lafernando | Published 2021/02/11


Introduction

GraphQL has become a prominent technology for implementing data APIs. It provides a convenient and intuitive approach for querying data. Let’s look at a sample use case using the Ballerina programming language and see how GraphQL compares to traditional approaches such as REST-style HTTP APIs.

Use Case: E-commerce Data Query

Let’s take a typical e-commerce scenario of processing orders in an online store. The entity-relationship diagram below shows a typical representation that can be used in a relational database. This is of course a simplified representation of a real-life implementation. 
One approach for exposing such a data set would be to create a service with operations for each database table. This would be similar to the following.
  • getOrder(id): OrderInfo
  • getCustomer(id): CustomerInfo
  • getShipper(id): ShipperInfo
In this manner, we get rather granular access to the data, where we can query each table’s records as needed. However, each call returns a full table record. If a table has a large number of fields, this may result in a larger message being sent to the client, even if the application does not use most of those fields. This is an over-fetching scenario, but generally it’s not a big problem in this type of situation.
Now, let’s say the application needs to look up order details as well as information about the customer. It then needs to invoke two separate service operations, “getOrder” and “getCustomer”. The following sequence diagram shows this interaction.
Note that the service operations are done sequentially, since we need the “OrderInfo” to find the “customerId” required by the “getCustomer” operation. So for the application, this means two network round trips to look up both order and customer information. This becomes worse as we need more information, such as also looking up the “shipper” information at the same time. For applications such as mobile apps, where this communication happens over a high-latency network such as the Internet, it would hamper the user experience. So ideally, we need to cut back on the number of service calls made when a user interacts with an application.
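The sequential dependency described above can be illustrated with a small, language-neutral sketch in Python. It is not the article's Ballerina service: the in-memory tables, the `OrderService` class, and the id values are hypothetical stand-ins, and each method call simply counts as one simulated network round trip.

```python
# Hypothetical in-memory stand-ins for the ORDERS and CUSTOMER tables.
ORDERS = {2: {"id": 2, "customerId": 7, "shipperId": 3,
              "date": "2021/01/25", "notes": "Street pickup"}}
CUSTOMERS = {7: {"id": 7, "name": "Nimal Perera",
                 "address": "No 22, Galle Road, Colombo 02"}}


class OrderService:
    """Table-per-operation service; every call is one simulated round trip."""

    def __init__(self):
        self.round_trips = 0

    def get_order(self, order_id):
        self.round_trips += 1
        return dict(ORDERS[order_id])

    def get_customer(self, customer_id):
        self.round_trips += 1
        return dict(CUSTOMERS[customer_id])


service = OrderService()
# The second call cannot start until the first returns the customerId.
order = service.get_order(2)
customer = service.get_customer(order["customerId"])
print(service.round_trips, customer["name"])
```

Adding the shipper lookup would require a third call, so the round-trip count grows with every related entity the application needs.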
To reduce the number of service calls, we can have an operation such as “getFullOrderInfo”, which loads all the data connected to an order and sends it at once. This would definitely solve our multiple-request problem. But if we have only this single operation, we will receive lots of unwanted data even when we just want to look up something like the order date. This is a potentially problematic over-fetching scenario. To properly fix this situation, we would need separate operations for all the possible combinations, such as “getOrderAndCustomer”, “getOrderAndShipper”, etc. This is obviously not a practical solution for a service developer. If only there were an approach where the application could dynamically tell the service which parts of the data set it requires. This is exactly what GraphQL does.
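To see why one operation per combination does not scale, the following Python sketch enumerates the operations a service would need for a set of optional related entities. The `operation_names` helper and the generated names are illustrative assumptions; the count doubles with each related entity, and this does not even account for selecting individual fields.

```python
from itertools import combinations


def operation_names(base, related):
    """List one operation name per subset of the related entities."""
    names = []
    for size in range(len(related) + 1):
        for combo in combinations(related, size):
            suffix = "And".join(part.capitalize() for part in combo)
            names.append(f"get{base}And{suffix}" if combo else f"get{base}")
    return names


names = operation_names("Order", ["customer", "shipper"])
print(names)
# A third related entity ("items" is hypothetical) doubles the count again.
print(len(operation_names("Order", ["customer", "shipper", "items"])))
```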

Solution: GraphQL

In GraphQL, we can define an object graph in our service, where a client can query the specific fields of an object. These fields can be queried at any nested level. Optionally, we can pass in parameters for these fields as well. A definition of these objects for our use case is shown below.
type Query {
   order(id: Int): Order
}
type Order {
   id: Int
   notes: String
   date: String
   customer: Customer
   shipper: Shipper
}
type Customer {
   name: String
   address: String
}
type Shipper {
   name: String
   phone: String
}
The above is written in the GraphQL schema format used to define object types. GraphQL “Query” is a special object type, which must exist in the schema. This is the root-level object that a user will query. So in GraphQL queries, we provide the fields inside the “Query” object to look up the required data.
Provided that we have a GraphQL service with the schema above, we would send the following query to get a similar effect to our earlier “getOrder” operation.
{
   order(id: 1) {
       notes,
       date
   }
}
Here, we instruct our service to look up the “order” field from the root query object and pass in ‘1’ as the value for the parameter “id”. This field returns an object type, so we need to list all the fields we require from the object; above, we request “notes” and “date”. If we only need to look up the date field, our GraphQL query would be the following.
{
   order(id: 1) {
       date
   }
}
We can drill into more fields and get their values as well. The following query looks up full order information, including the customer and shipper information.
{
   order(id: 2) {
       notes,
       date,
       customer {
           name,
           address
       },
       shipper {
           name,
           phone
       }
   }
}
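The idea that the client names the fields it wants, at any nesting level, can be sketched in a few lines of Python. This is only a conceptual illustration, not a GraphQL parser or the Ballerina runtime: the selection is written by hand as a nested dictionary (an assumption of this sketch), with `None` marking a leaf field, and the sample record reuses values from the article's test data.

```python
# Sample object graph for one order (values taken from the article's test run).
ORDER = {
    "notes": "Street pickup",
    "date": "2021/01/25",
    "customer": {"name": "Nimal Perera",
                 "address": "No 22, Galle Road, Colombo 02"},
    "shipper": {"name": "UPS", "phone": "(408)275-4415"},
}


def select(obj, selection):
    """Return only the requested fields, recursing into nested selections."""
    result = {}
    for field, sub_selection in selection.items():
        value = obj[field]
        if sub_selection is None:
            result[field] = value                         # leaf field
        else:
            result[field] = select(value, sub_selection)  # nested object
    return result


# Equivalent in spirit to: { order(id: 2) { date } }
date_only = select(ORDER, {"date": None})
# Equivalent in spirit to: { order(id: 2) { notes, customer { name } } }
notes_and_customer = select(ORDER, {"notes": None, "customer": {"name": None}})
print(date_only)
print(notes_and_customer)
```

The response mirrors the shape of the selection, which is why a single request can replace several fixed-shape service operations.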
Now that we understand the basics of how GraphQL works, let’s take a look at how to do the actual implementation using some code. We will use Ballerina for this task. It is a programming language that has GraphQL as part of its built-in language-level services support. 

Implementation: Ballerina GraphQL Services

In Ballerina, the GraphQL object structure is modeled using services. A Ballerina GraphQL service contains resource methods that map to the fields of the GraphQL objects and work as resolver functions to provide their data. The GraphQL schema is automatically derived from this service structure and its resources.
NOTE: GraphQL support is available from Ballerina Swan Lake release and onwards.
The following code shows a simple GraphQL service we can write in Ballerina.
import ballerina/graphql;

service graphql:Service /query on new graphql:Listener(8080) {

   resource function get name() returns string {
       return "Jack";
   }

}
The code above exposes a GraphQL service at the endpoint http://localhost:8080/query. Its GraphQL schema is similar to the following.
type Query {
   name: String
}
We can send the following GraphQL query to look up the exposed “name” field in the root query object.
{
   name
}
Let’s run the Ballerina code above for a sample test run.
$ bal run demo.bal
Compiling source
    demo.bal
Running executable
[ballerina/http] started HTTP/WS listener 0.0.0.0:8080
A GraphQL request can be executed by sending an HTTP request similar to the following.
$ curl -X POST -H "Content-type: application/json" -d '{"query":"{name}"}' http://localhost:8080/query
{"data":{"name":"Jack"}}
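The curl command above shows that a GraphQL request is an ordinary HTTP POST whose JSON body carries the query text under a "query" key. As a rough client-side sketch in Python (standard library only), the same request could be assembled as follows. The helper name `build_graphql_request` is an assumption of this sketch, and the actual send is left commented out because it needs the service above to be running.

```python
import json
import urllib.request


def build_graphql_request(url, query):
    """Wrap a GraphQL query string in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request("http://localhost:8080/query", "{name}")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it (requires the Ballerina service to be listening on port 8080):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```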
The resource functions here can be provided with parameters to correlate with the GraphQL field parameters as well. Also, in the case of returning objects in fields, the resource method can return a service object to represent this. Let’s see how we implement our order information query scenario using a Ballerina service. 
We start with the Ballerina GraphQL service implementation, which represents the GraphQL root “Query” object fields. 
import ballerina/graphql;

service graphql:Service /query on new graphql:Listener(8080) {

   resource function get 'order(int id) 
                          returns Order|error => loadOrder(id);

}
Here, we have a single resource function “order”, which takes in the “id” parameter and returns an instance of the “Order” service class. The “loadOrder” function and the “Order” service class are implemented in the following way.
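function loadOrder(int id) returns Order|error {
   // The interpolated ${id} is passed to the database as a query parameter;
   // each result row is mapped to an OrderData record.
   stream<record{}, error> rs = dbClient->query(`SELECT id, customerId,
                                                 shipperId, date, notes
                                                 FROM ORDERS WHERE id = 
                                                 ${id}`, OrderData);
   // Read the first row (if any), then release the result set.
   var rec = check rs.next();
   check rs.close();
   if !(rec is ()) {
       return new Order(<OrderData> rec["value"]);
   } else {
       return error(string `Invalid order: ${id}`);
   }
}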

service class Order {

   private OrderData data;

   function init(OrderData data) {
       self.data = data;
   }

   resource function get notes() returns string {
       return self.data.notes;
   }

   resource function get date() returns string {
       return self.data.date;
   }

   resource function get customer() returns Customer|error {
       return check loadCustomer(self.data.customerId);
   }

   resource function get shipper() returns Shipper|error {
       return check loadShipper(self.data.shipperId);
   }

}
Here, we execute the required SQL query to load the “Order” table data and populate the “Order” object. Note that we do not load the “customer” and “shipper” information right away; rather, these are loaded lazily, if and when they are required by the incoming GraphQL query.
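The lazy-loading behavior can be made visible with a small Python analogue of the service classes. This is a sketch under stated assumptions, not the Ballerina runtime: the `resolve` helper, the dictionary-based selection, the in-memory tables, and the `customer_lookups` counter are all hypothetical. Each method plays the role of a resource function, and the customer "query" runs only when the selection asks for the customer field.

```python
# Hypothetical in-memory stand-ins for the database tables.
ORDERS = {2: {"customerId": 7, "notes": "Street pickup", "date": "2021/01/25"}}
CUSTOMERS = {7: {"name": "Nimal Perera",
                 "address": "No 22, Galle Road, Colombo 02"}}

customer_lookups = []  # records every simulated CUSTOMER table query


class Customer:
    def __init__(self, data):
        self._data = data

    def name(self):
        return self._data["name"]

    def address(self):
        return self._data["address"]


def load_customer(customer_id):
    customer_lookups.append(customer_id)
    return Customer(CUSTOMERS[customer_id])


class Order:
    def __init__(self, data):
        self._data = data

    def notes(self):
        return self._data["notes"]

    def date(self):
        return self._data["date"]

    def customer(self):
        # Runs only if the incoming selection includes "customer".
        return load_customer(self._data["customerId"])


def resolve(obj, selection):
    """Call the resolver method for each selected field (None marks a leaf)."""
    result = {}
    for field, sub_selection in selection.items():
        value = getattr(obj, field)()
        result[field] = value if sub_selection is None else resolve(value, sub_selection)
    return result


order = Order(ORDERS[2])
only_notes = resolve(order, {"notes": None})
lookups_after_first = len(customer_lookups)
with_customer = resolve(order, {"notes": None, "customer": {"name": None}})
lookups_after_second = len(customer_lookups)
print(lookups_after_first, lookups_after_second)
```

A query that asks only for notes never touches the customer table, so the cost of a request follows what the client actually selected.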
The “loadCustomer” function shown below is used in the “customer” resource function to load the customer information from the database and populate a “Customer” object. 
function loadCustomer(int id) returns Customer|error {
   stream<record{}, error> rs = dbClient->query(`SELECT id, name, address
                                                 FROM CUSTOMER WHERE id = 
                                                 ${id}`, CustomerData);
   var rec = check rs.next();
   check rs.close();
   if !(rec is ()) {
       return new Customer(<CustomerData> rec["value"]);
   } else {
       return error(string `Invalid customer: ${id}`);
   }
}

service class Customer {

   private CustomerData data;

   function init(CustomerData data) {
       self.data = data;
   }

   resource function get name() returns string {
       return self.data.name;
   }

   resource function get address() returns string {
       return self.data.address;
   }

}
Similarly, the “shipper” resource function is implemented to query the corresponding GraphQL object field. The full source code for the scenario above can be found here.
Let’s do a test run using our full Ballerina service implementation. We are using a MySQL database to provide the data. Let’s create and populate the database first. Navigate to the “ordersvc” directory which contains the Ballerina package and the database script. 
$ mysql -u root -p < db.sql
Let’s execute the default module of our Ballerina package in the following manner. 
$ bal run .
Compiling source
        laf/ordersvc:0.1.0
Creating balo
        target/balo/laf-ordersvc-any-0.1.0.balo
Running executable
[ballerina/http] started HTTP/WS listener 0.0.0.0:8080
Let’s send some GraphQL requests to the service. 
$ curl -X POST -H "Content-type: application/json" -d '{ "query": "{ order(id: 2) { notes, date, customer { name, address }, shipper { name, phone } } }" }' 'http://localhost:8080/query'
{
    "data": {
        "order": {
            "notes": "Street pickup",
            "date": "2021/01/25",
            "customer": {
                "name": "Nimal Perera",
                "address": "No 22, Galle Road, Colombo 02"
            },
            "shipper": {
                "name": "UPS",
                "phone": "(408)275-4415"
            }
        }
    }
}
$ curl -X POST -H "Content-type: application/json" -d '{ "query": "{ order(id: 1) { notes, customer { name, address } } }" }' 'http://localhost:8080/query'
{
    "data": {
        "order": {
            "notes": "Doorstep delivery",
            "customer": {
                "name": "Jack Smith",
                "address": "No 10, N 1st St, San Jose"
            }
        }
    }
}
Ballerina GraphQL services also support GraphQL introspection. For example, the following query can be executed to look up the types available in the service.
$ curl -X POST -H "Content-type: application/json" -d '{ "query": "{ __schema { types { name } } }" }' 'http://localhost:8080/query'
{
    "data": {
        "__schema": {
            "types": [{
                "name": "Order"
            }, {
                "name": "__TypeKind"
            }, {
                "name": "__Field"
            }, {
                "name": "Query"
            }, {
                "name": "__Type"
            }, {
                "name": "Customer"
            }, {
                "name": "Shipper"
            }, {
                "name": "__InputValue"
            }, {
                "name": "String"
            }, {
                "name": "Int"
            }, {
                "name": "__Schema"
            }]
        }
    }
}
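An introspection response like the one above mixes the service's own types with GraphQL's built-in scalars and the introspection types, whose names start with two underscores. As a small client-side sketch in Python, the service-defined types can be separated out as follows. The `service_types` helper is an assumption of this sketch, and the response text is the output shown above in compact form.

```python
import json

# Introspection response from the article, in compact form.
RESPONSE_TEXT = """
{"data": {"__schema": {"types": [
  {"name": "Order"}, {"name": "__TypeKind"}, {"name": "__Field"},
  {"name": "Query"}, {"name": "__Type"}, {"name": "Customer"},
  {"name": "Shipper"}, {"name": "__InputValue"}, {"name": "String"},
  {"name": "Int"}, {"name": "__Schema"}]}}}
"""

# Scalar types defined by the GraphQL specification.
BUILT_IN_SCALARS = {"Int", "Float", "String", "Boolean", "ID"}


def service_types(response):
    """Drop introspection types ("__" prefix) and built-in scalars."""
    names = [t["name"] for t in response["data"]["__schema"]["types"]]
    return sorted(n for n in names
                  if not n.startswith("__") and n not in BUILT_IN_SCALARS)


types = service_types(json.loads(RESPONSE_TEXT))
print(types)
```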

Summary

GraphQL is a technology that makes data querying tasks more efficient and intuitive for users. Here, we have looked at how it addresses data over-fetching and the network latency problems caused by multiple round trips in a services-based solution. Ballerina provides built-in support for implementing GraphQL services quickly and easily, so the developer can concentrate on the business logic.
For more information on Ballerina and its GraphQL support, check out the following resources:

Written by lafernando | Software architect and evangelist @ WSO2 Inc.
Published by HackerNoon on 2021/02/11