GraphQL Best Practice Guide

This quick guide introduces you to some best practices that you should employ when building solutions that consume a GraphQL API.

Introduction

This guide outlines some useful tips and tricks for working with any GraphQL API (not just the Marketplacer APIs). It should prove equally useful for developers who have worked with other types of API (e.g. REST) and for developers new to consuming APIs.


1. Understand the basics 🧑‍🎓

Understanding the foundational basics of GraphQL will set you up for success in moving forward with your build.

GraphQL is significantly different from other API design patterns (e.g. REST), so you should familiarize yourself with the core concepts. The official GraphQL site is a great place to start.

At a minimum you should be familiar with:

  • What is a Graph (nodes & edges)
  • Queries
  • Mutations
  • GraphQL Types
  • Pagination
  • Error responses
  • Authentication & Authorization

2. Read the API docs 📖

Knowledge is power! Familiarizing yourself with the specific API provider’s docs is critical if you want a low-friction build.

I know, I know - no one likes to read docs, and it’s always more enticing just to jump in. However, you will save yourself a lot of time and effort (and that of any support teams you need to contact) if you take some time to read (some of!) the documentation offered by the API provider.

Most providers (Marketplacer included) offer a quick start guide for their APIs; these are a good place to begin.

For consumers of the Marketplacer APIs, we’d suggest you read the following as a minimum:


3. Pick the best fit query or mutation 🧰

The right tool for the right job is a mantra used in many disciplines. Choosing the best fit query or mutation when building your solution is no different.

Pick the correct tool for the job! There may be more than 1 way to query the data you need; however, depending on your use-case, 1 approach may be more suitable than another (this circles back to tip #2).

Use-case

I want to retrieve a single Product (aka GraphQL Object Type) from the API. Should I:

  • Use a query that can return multiple products, and apply filters to obtain the individual product
  • Use a node query that is designed just to return 1 Object type based on its ID

The answer may vary depending on whether you have access to the ID of the product. If you do, then a node query may be the best fit.

In the case of Marketplacer we offer a range of queries (in particular those related to Products / Adverts) that have very different intended use-cases. Understanding the capabilities of these seemingly similar queries is important to building the best fit solution for your needs.


4. Provide variables as arguments 📥

Hard-coding values is never a good idea. Indeed, variables are usually one of the first concepts you learn about when starting out with software development - GraphQL is no different.

Observe the following 2 queries that both retrieve an order with an ID of 12345:

# Recommended ✅
query GetOrder($id : ID!){
  node(id: $id)
  {
    ... on Order {
      __typename
      id
      totalCents
      createdAt
    }
  }
}

# Not recommended ❌
query GetOrder{
  node(id: "12345")
  {
    ... on Order {
      __typename
      id
      totalCents
      createdAt
    }
  }
}

With the recommended approach you can see that we are passing the required order ID as a variable named $id (we need to populate this variable with the value 12345 when we call the query).

Using variables has many advantages, not least: reusability, readability and supportability.
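
As a rough sketch of how the variable gets populated at call time, the request body could be assembled as shown below. This assumes the common GraphQL-over-HTTP convention of a JSON body with query and variables keys; the helper name build_request_body is purely illustrative and not part of any library.

```python
import json

# The query text stays constant; only the variables change between calls.
GET_ORDER = """
query GetOrder($id: ID!) {
  node(id: $id) {
    ... on Order {
      __typename
      id
      totalCents
      createdAt
    }
  }
}
"""


def build_request_body(query: str, variables: dict) -> str:
    """Serialize a query and its variables into a JSON request body.

    Hypothetical helper for illustration only.
    """
    return json.dumps({"query": query, "variables": variables})


# The same query text can be reused for any order ID.
body = build_request_body(GET_ORDER, {"id": "12345"})
```

Because the ID travels separately from the query text, the same query string can be logged, cached or reused without ever being edited.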


5. Use pagination 📄

Pagination allows you to consume result sets in a manageable way, ultimately increasing the responsiveness of your queries and the scalability of your integration.

The benefits of pagination are almost too numerous to mention, so much so that at Marketplacer we mandate its use when you can expect to return more than 1 result from a query. Benefits include, but are not limited to:

  • Faster query response times
  • Scalability
  • Lower client memory usage

How it works

Pagination allows you to retrieve a dataset by defining a page size (this is the maximum number of results permitted to be returned by each request). You then make requests to the API to obtain each “page” of data until you have processed the entire result set.

Each paginated response will typically identify:

  • The total size of the result set
  • Whether there is another page of data to return
  • The cursor that lets you move to the next page (if using cursors)

You can read more about how to paginate with the Marketplacer GraphQL APIs here.


6. Optimize Queries 🚀

With great power comes great responsibility. GraphQL gives you the power to tailor your queries to bring back the exact data you need - but do you really need everything?

The following 2 issues with REST APIs were identified by the creators of GraphQL (Facebook), and it was largely in response to these issues that Facebook created GraphQL:

  • Over-fetching: REST APIs often give you more data than you need
    • I just want the Product ID and Price, instead I get every attribute of a product
  • Under-fetching: REST APIs often require you to make multiple requests to different resource types to build the complete dataset you need

The power of GraphQL is that you can tailor your queries to return just the data you need. This includes retrieving fields for nested / related objects:

query GetOrder($id : ID!){
  node(id: $id)
  {
    ... on Order {
      __typename
      id
      totalCents
      createdAt
      invoices
      {
        nodes
        {
          id
          legacyId
          seller{
            businessName
          }
          lineItems{
            id
            variantId
            variantBarcode
          }
        }
      }
    }
  }
}

In this query not only are we bringing back the Order object, but also the following objects:

  • Invoices
    • LineItems
    • Seller

Also as you can see we have selected only particular fields for each object - just the ones we need.

However, like any power, it can be abused: the more data you bring back, the larger the potential “cost” of that query, and the less optimal it becomes.

Optimizing queries is a bigger topic than just trimming the data you are requesting, and also speaks to the following concepts, which are covered elsewhere in this guide:


7. Name all operations 📛

Naming your operations can assist with understanding the intent of the query or mutation, as well as being beneficial when debugging or tracing requests.

The following 2 queries do exactly the same thing; however, 1 is named and 1 is anonymous:

# Recommended ✅
query GetInvoices{
  invoices {
    nodes {
      id
      legacyId
      createdAt
      seller {
        businessName
      }
      shipments {
        id
        dispatchedAt
      }
      lineItems {
        id
      }
    }
  }
}

# Not Recommended ❌
query {
  invoices {
    nodes {
      id
      legacyId
      createdAt
      seller {
        businessName
      }
      shipments {
        id
        dispatchedAt
      }
      lineItems {
        id
      }
    }
  }
}

As you can see from the recommended approach, we have a query named GetInvoices. While this may appear obvious in this case, more complex queries servicing more intricate use cases could arguably have more meaningful names.

Benefits of naming your operations (queries and mutations in this case) include:

  • They clarify the intent of your operation
  • They allow you to combine more than 1 operation in a single query document (you can only have 1 anonymous query per document)
  • They assist with tracing and debugging: you can use the operation name to help identify your queries and mutations in traces, logs etc.
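
To illustrate the last two points, here is a small sketch of a document holding two named operations, with the one to run selected through the operationName key of the request body (the key commonly used for this in GraphQL-over-HTTP requests). The second operation, GetInvoiceSellers, and the helper build_named_request are made up for this example.

```python
# One document holding two named operations; anonymous operations
# could not share a document like this.
DOCUMENT = """
query GetInvoices {
  invoices { nodes { id legacyId } }
}

query GetInvoiceSellers {
  invoices { nodes { id seller { businessName } } }
}
"""


def build_named_request(document: str, operation_name: str) -> dict:
    """Build a request body that selects one operation by name.

    Hypothetical helper for illustration only.
    """
    return {"query": document, "operationName": operation_name}


payload = build_named_request(DOCUMENT, "GetInvoiceSellers")

# Including the operation name in log output makes requests easy to trace.
log_line = f"graphql operation={payload['operationName']}"
```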

8. Use Aliases 🥸

Do you want to return fields using a different name? Want to query the same thing twice but using different arguments? If the answer is yes, then use aliases.

The following query returns an Advert object using the field names prescribed by the API:

query ($id: ID!) {
  node(id: $id) {
    ... on Advert {
      id
      title
      externalId
      description
    }
  }
}

Assuming we pass a valid value for $id, then we’d get something like this returned:

{
  "data": {
    "node": {
      "id": "QWR2ZXJ0LTEwMDA5MjkwOA==",
      "title": "Mobile Phone",
      "externalId": 2233668,
      "description": "Powerful phone"
    }
  }
}

Using aliases you can “rename” any of the selected fields; this renaming is then reflected in the response. The reworked query (using aliases) is shown below:

query ($id: ID!) {
  node(id: $id) {
    ... on Advert {
      marketplacerId: id
      productName: title
      sys_id: externalId
      description
    }
  }
}

The response would look as follows:

{
  "data": {
    "node": {
      "marketplacerId": "QWR2ZXJ0LTEwMDA5MjkwOA==",
      "productName": "Mobile Phone",
      "sys_id": 2233668,
      "description": "Powerful phone"
    }
  }
}

Here you can see that by using aliases we’re returning the same data, but the field names have changed.

We can extend the use of aliases when we want to return multiple sets of the same object, but based on providing different arguments. In the following example we want to return the variants of an advert, but we want separate results for those that can be displayed, and those that cannot. The following example will not work, as we cannot select variants twice with different arguments under the same response name:

query ($id: ID!) {
  node(id: $id) {
    ... on Advert {
      id
      title
      description
      descriptionHtml
      variants(displayable: true) { 
        nodes {
          id
          barcode
          published
        }
      }
      variants(displayable: false) { 
        nodes {
          id
          barcode
          published
        }
      }
    }
  }
}

To get round this - that’s right, you guessed it - we use aliases:

query ($id: ID!) {
  node(id: $id) {
    ... on Advert {
      id
      title
      description
      descriptionHtml
      displayableVariants: variants(displayable: true) { 
        nodes {
          id
          barcode
          published
        }
      }
      nonDisplayableVariants: variants(displayable: false) {
        nodes {
          id
          barcode
          published
        }
      }
    }
  }
}

9. Use Fragments 🧩

Fragments allow you to break otherwise long, complex queries into more manageable sections that can be reused.

In the following query we are not using fragments (again, I’ve omitted pagination so we can focus on the concept at hand):

query publishedProducts {
  advertSearch {
    adverts {
      nodes {
        id
        title
        description
        variants {
          nodes {
            id
            barcode
            published
          }
        }
      }
    }
  }
}

However we may want to adopt a fragment to represent the variant fields we are interested in:

query publishedProducts {
  advertSearch {
    adverts {
      nodes {
        id
        title
        description
        variants {
          nodes {
            ...VariantPartial
          }
        }
      }
    }
  }
}
fragment VariantPartial on Variant {
  id
  barcode
  published
}

Here you can see we have implemented a fragment called VariantPartial to split-out the variant component from the main query.

As this was a fairly simple example, we’ve arguably made the query more complex than is needed (you may not implement a fragment if this is all you were doing).

To make this a little more realistic (i.e. a scenario where you may use fragments) we can adapt the query to return both displayable and non-displayable variants - this is the same example we saw in #8 where we introduced aliases (indeed, we continue to use aliases):

query publishedProducts {
  advertSearch {
    adverts {
      nodes {
        id
        title
        description
        displayableVariants: variants(displayable: true) {
          nodes {
            ...VariantPartial
          }
        }
        nonDisplayableVariants: variants(displayable: false) {
          nodes {
            ...VariantPartial
          }
        }
      }
    }
  }
}
fragment VariantPartial on Variant {
  id
  barcode
  published
}

This is a more realistic use of fragments given the benefits of reuse that we are seeing. Just imagine a query that’s considerably more complex, and fairly soon the benefits of fragments are very apparent.


10. Understand and handle errors 😵

Errors happen - get over it! But we need to be able to handle those errors gracefully when they occur.

The use of HTTP

Both REST and GraphQL are agnostic of the transport used to carry them - i.e. there is no need to use HTTP with either REST or GraphQL. With that being said, the vast majority of both those API types run over HTTP, and it is with this view of reality that we move on to talking about errors.

Differences with REST

REST APIs (running over HTTP) typically make use of the underlying HTTP Status codes to convey either success or failure states of the request, for example:

  • 201 Created: is usually returned when you create a resource using a POST request
  • 204 No Content: is often used when returning a success response from updating a resource using a PUT request
  • 404 Not Found: is used when the resource you are accessing is not found
  • 400 Bad Request: is used when the request is malformed

The list goes on; however, the main takeaway is that REST uses these codes to report on error (and success) conditions, and will usually provide further information in the response payload.

But why are we talking about REST?

We have covered REST (albeit at a high level) as a primer for our conversation on GraphQL - it’s one of the differences between the 2 API types that often catches out developers experienced with REST, but new to GraphQL.

Error Responses

So how does GraphQL return error responses? For the purposes of this conversation GraphQL will return error responses in an errors collection as part of the JSON payload response.

The example below shows the JSON response when the query we submitted is itself invalid:

{
  "errors": [
    {
      "message": "Field 'lowestPrce' doesn't exist on type 'Advert'",
      "locations": [
        {
          "line": 8,
          "column": 4
        }
      ],
      "path": [
        "query ad",
        "node",
        "... on Advert",
        "lowestPrce"
      ],
      "extensions": {
        "code": "undefinedField",
        "typeName": "Advert",
        "fieldName": "lowestPrce"
      }
    }
  ]
}

Here you can see the message contains the nature of the issue at hand, which in this case was the misspelling of the field lowestPrice (we submitted lowestPrce).

In another example, the mutation is well-formed (syntactically correct), but there is some kind of issue with the processing of the request:

{
  "data": {
    "orderCreate": null
  },
  "errors": [
    {
      "message": "Validation failed: Count on hand must be greater than or equal to 0",
      "locations": [
        {
          "line": 2,
          "column": 2
        }
      ],
      "path": [
        "orderCreate"
      ],
      "extensions": {
        "variant_id": "VmFyaWFudC05OTk4NA==",
        "count_on_hand": {
          "VmFyaWFudC05OTk4NA==": 1
        }
      }
    }
  ]
}

In this particular scenario we were attempting to create an order where there wasn’t sufficient stock for the item we wanted to purchase. Again you would need to interrogate the errors collection to understand the nature of the error.

In both these cases we received an HTTP 200 OK response; the errors need to be interrogated via the payload.

The take-away point here is that you will need to handle errors in GraphQL in a potentially different way to that of REST based APIs.
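
One possible way to handle this in client code is sketched below: treat the HTTP status and the errors collection as two separate checks. The names GraphQLError and unwrap are hypothetical, and the sample payload is a trimmed copy of the orderCreate response shown above.

```python
class GraphQLError(Exception):
    """Raised when a response payload carries an errors collection."""

    def __init__(self, errors: list):
        self.errors = errors
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        super().__init__(messages)


def unwrap(status_code: int, payload: dict) -> dict:
    """Return the data from a GraphQL response, or raise if anything went wrong."""
    # Transport-level problems still surface as HTTP status codes.
    if status_code != 200:
        raise RuntimeError(f"HTTP transport error: {status_code}")
    # A 200 OK response can still carry errors in the payload.
    errors = payload.get("errors")
    if errors:
        raise GraphQLError(errors)
    return payload.get("data") or {}


response = {
    "data": {"orderCreate": None},
    "errors": [
        {
            "message": "Validation failed: Count on hand must be greater than or equal to 0",
            "path": ["orderCreate"],
        }
    ],
}

try:
    unwrap(200, response)
    outcome = "ok"
except GraphQLError as exc:
    outcome = str(exc)
```

Raising on any error is a deliberately strict design choice for this sketch; a response can contain both data and errors, so some integrations may prefer to keep the partial data and report the errors alongside it.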


11. Adopt caching strategies 🗃️

Slow moving data by its nature does not change much - do you really need to request it again?

This tip centers round understanding the nature of the data that you’re requesting via the API:

  • Does it change rapidly - e.g. the stock position of a popular item
  • Does it move slowly - e.g. Enumeration values that you could use for filtering queries

In the case of slow moving data it makes sense to cache that data client side and use it from there. Not only is this faster, it also cuts down on unnecessary calls to the API. Indeed, in the case where APIs are rate limited, or have some other “cost” attached, this makes a lot of sense.

This tip really could apply to any type of API, (not just GraphQL), but it’s one that makes the list as it really is low hanging fruit.
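
For slow moving data, a client-side cache can be very simple. The sketch below is one possible approach (a time-to-live cache), not a prescribed one; the class name TTLCache, the one-hour lifetime and the enumeration values are all illustrative assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache values client-side for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch only when stale."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._store[key] = (now, value)
        return value


# Stand-in fetch function that counts how often the "API" is called.
calls = {"count": 0}


def fetch_enum_values() -> list:
    calls["count"] += 1
    return ["PAID", "SHIPPED", "REFUNDED"]  # illustrative values only


cache = TTLCache(ttl_seconds=3600)
first = cache.get_or_fetch("order_statuses", fetch_enum_values)
second = cache.get_or_fetch("order_statuses", fetch_enum_values)  # served from cache
```

The right lifetime depends entirely on how quickly the data changes; rapidly changing data such as stock positions is usually a poor candidate for this kind of cache.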


12. Use a GraphQL client 🔌

You can construct your GraphQL API calls as raw HTTP POST requests, but why make life difficult for yourself? There’s a range of GraphQL clients that do a lot of the heavy lifting for you.

As mentioned previously, while GraphQL is agnostic from HTTP, it is usually implemented on top of it. Breaking that down further, calls to a GraphQL endpoint are really nothing more than HTTP POST requests with a specifically-formed request body.
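
To make that concrete, here is a sketch of such a raw request built with the Python standard library. The endpoint URL, the token and the use of a bearer Authorization header are placeholder assumptions; check your provider's documentation for the real endpoint and authentication scheme.

```python
import json
import urllib.request

# Placeholder endpoint and token; substitute the values from your provider.
ENDPOINT = "https://example.com/graphql"
API_TOKEN = "replace-me"


def build_post_request(query: str, variables: dict) -> urllib.request.Request:
    """Construct (but do not send) a GraphQL POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme, for illustration only.
            "Authorization": f"Bearer {API_TOKEN}",
        },
        method="POST",
    )


request = build_post_request(
    "query GetOrder($id: ID!) { node(id: $id) { id } }",
    {"id": "12345"},
)
# Sending it would be: urllib.request.urlopen(request)
```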

You can of course choose to implement your solution by making calls in this way, but you’re doing yourself a disservice.

That’s where GraphQL clients come in.

Clients come in all shapes and sizes, with feature sets ranging from the relatively basic, to much more elaborate.

What they all share in common is the general goal of abstracting away from the user (in this case the developer) some of the lower-level nuances of making GraphQL calls work.

Features of GraphQL clients include, but are not limited to:

  • Code generation: By ingesting the self-documenting GraphQL schema, some clients can generate language-specific implementations of GraphQL types
  • Error handling: A simplified and consistent way to deal with errors - see #10 above
  • Batching and deduplicating requests: This reduces the number of calls to the API, and is particularly beneficial if you are making multiple requests in a short period of time

The client you choose will be driven by the programming language that you are using, so we can’t cover them all here - the point is that if you’re consuming GraphQL APIs, you should invest some time in learning to use a suitable client, and avoid the raw construction of calls.


13. Stay informed about changes 🪵

APIs change, and in the case of GraphQL APIs this change is often more of a continuous evolution, as opposed to versioned releases.

Versioning of GraphQL APIs is less common than, say, with REST, so it’s important to understand that the schema you’re using can shift beneath your feet (i.e. you don’t usually get a clean jump from 1 version to another).

However, because GraphQL is self-documenting (changes appear immediately in the schema exposed via introspection), you can stay across these changes easily. Indeed, some GraphQL clients (see #12) assist with this.

Deprecations

One special mention in this change story is the ability in GraphQL to mark fields as having been deprecated. This means that the field in question has either been replaced by an alternative (and is no longer needed), or it is just no longer needed!

It is at the discretion of the API provider how long they maintain deprecated fields, but the long-term strategy should always be eventual removal of those fields - another good reason to stay across the ever-evolving GraphQL API.

In the case of Marketplacer, we monitor usage of deprecated fields, and would of course never remove any that were being used - this type of change is always carefully managed.
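
Where introspection is available, deprecated fields can be listed programmatically. The sketch below uses the standard introspection fields (isDeprecated and deprecationReason, requested with includeDeprecated: true); the helper deprecated_fields and the sample data, including the field name oldPrice, are made up for illustration.

```python
# Standard introspection query; pass the type you care about as $typeName.
DEPRECATION_QUERY = """
query DeprecatedFields($typeName: String!) {
  __type(name: $typeName) {
    fields(includeDeprecated: true) {
      name
      isDeprecated
      deprecationReason
    }
  }
}
"""


def deprecated_fields(introspection_data: dict) -> dict:
    """Map each deprecated field name to its deprecation reason."""
    type_info = introspection_data.get("__type") or {}
    return {
        field["name"]: field.get("deprecationReason")
        for field in type_info.get("fields") or []
        if field.get("isDeprecated")
    }


# Made-up response data, shaped like the data returned by the query above.
sample = {
    "__type": {
        "fields": [
            {"name": "id", "isDeprecated": False, "deprecationReason": None},
            {"name": "oldPrice", "isDeprecated": True,
             "deprecationReason": "Use lowestPrice instead"},
        ]
    }
}

report = deprecated_fields(sample)
```

Running a check like this on a schedule, and comparing the result against the fields your integration selects, is one way to notice a deprecation before the field is removed.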

Change Logs

Marketplacer, like many API providers, also provides a change log that details not just what the change is, but why the change was made, as well as the use-case for introducing it. In short, the change log provides more context around the change.

We advise all consumers of our APIs to subscribe to our Change Log.

In general, staying across the evolving changes of any API (GraphQL or otherwise) is critical in ensuring your integrations:

  • Continue to work!
  • Leverage any new, innovative features the provider has added

14. Play nice: Fair use / rate limits 📈

If rate limits and/or a fair use policy exist, then adhering to them is not so much “best practice” as a mandatory requirement.

This one is a bit of a no-brainer: rate limits aren’t optional - you need to adhere to them, otherwise the performance of your integration will suffer. Fair use policies are possibly a little more subjective and open to interpretation.

The point really is that you should make your integration as efficient as possible. Indeed, many of the tips we’ve already covered feed in to this idea:

  • Understand the basics
  • Read the API docs
  • Pick the query or mutation that closest fits your use-case
    • Pick the most efficient way to integrate - this could include using something like webhooks as opposed to GraphQL API calls
  • Use pagination
    • Attempting to bring back “all data” is not only inefficient, but also resource intensive
  • Optimize queries
    • Essentially the same point as above, just a different lens
  • Understand errors
    • E.g. retrying certain error conditions is pointless and only generates redundant traffic
  • Caching strategies
    • Avoid redundant calls to the API
  • Use a GraphQL client
    • Some clients (e.g. Apollo) allow you to batch calls, which could potentially be a more efficient way to consume resources

Rate limits can change and could potentially become more aggressive, as can fair use policies. By ensuring that you don’t come anywhere near triggering these, you place your integration in a more advantageous position.
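
One simple way to stay clear of a rate limit is to pace requests on the client side. The sketch below spaces calls evenly; the class name Throttle and the figure of 2 calls per second are illustrative assumptions, not a statement of any provider's actual limits.

```python
import time
from typing import Callable


class Throttle:
    """Space out calls so that no more than max_per_second are made."""

    def __init__(self, max_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the next slot one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self._min_interval
        return delay


# Usage: call throttle.wait() immediately before each API request.
throttle = Throttle(max_per_second=2)
```

Setting the pace comfortably below the published limit leaves headroom if the limit is later tightened.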