GraphQL excels by encouraging us to think from the actual client requirements rather than from grand, sometimes prematurely abstract, server-side conceptualizations. It's essential to balance the perspectives of all stakeholders, foremost the clients, to evolve truly superb APIs; and APIs need to be truly superb to remain usable and extensible over time, as complexity will inevitably grow. While the GraphQL Schema lets clients see what's available, clients have to specify what they actually need, i.e. which fields in particular. This allows the server to measure the usage of every single field and eventually remove fields that are no longer needed, which is crucial to know, especially when migrating APIs. This way of thinking about APIs should also hold true when a service needs to report business errors. Let's explore the options GraphQL gives us.
Technical Errors
In GraphQL, a response can contain not only the `data`, but also a list of `errors` to report technical problems. For example, when a request is invalid, e.g. because it selects a field that doesn't exist, the service should be as helpful as possible, so the client can easily understand what to do to fix it, e.g. by responding:
```json
{
  "errors": [
    {
      "message": "The type 'Order' doesn't have a field 'foo'. Please check the Schema.",
      "locations": [{"line": 4, "column": 5}],
      "extensions": {
        "description": "Field 'foo' in type 'Order' is undefined",
        "queryPath": ["order", "foo"]
      }
    }
  ],
  "data": null
}
```
If, on the other hand, there is a problem within a service, it should indicate to the client that either a) a retry is the appropriate reaction to a temporary glitch:
```json
{
  "errors": [
    {
      "message": "We have a scheduled downtime. Please retry your request later.",
      "extensions": {
        "retryAfter": "2023-05-11T17:53:13.523Z"
      }
    }
  ],
  "data": null
}
```
Or b) there's obviously a bug in the service, in which case it's nice to tell the clients whether and how they can get it fixed:
```json
{
  "errors": [
    {
      "message": "Sorry, we have a bug in retrieving the Customer. If you need to see our progress in fixing it, please follow the link below.",
      "extensions": {
        "queryPath": ["order", "customer"],
        "bugfixUrl": "https://bugs.example.org/error-instance/0305a282-9941-4714-83a8-08e5a3353a12"
      }
    }
  ],
  "data": {
    "order": {
      "id": "2022/183435",
      "orderDate": "2023-05-11T17:53:13.523Z",
      "customer": null
    }
  }
}
```
Note the `queryPath` extension, which allows us to link an error to a specific field within the requested data. GraphQL allows for partial results, i.e. if only part of the request can't be fulfilled, e.g. `customer`, the `data` is filled with what can be retrieved, while the `errors` explain the missing parts.
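On the client side, linking a missing field back to its error could look roughly like the following. This is only a sketch in Python: the helper names `errors_by_path` and `explain_missing` are made up for illustration, and it assumes the service puts the path into the `queryPath` extension as in the example response, with no lists along the path.

```python
def errors_by_path(response):
    """Index error messages by their queryPath extension (as a tuple)."""
    index = {}
    for error in response.get("errors") or []:
        path = (error.get("extensions") or {}).get("queryPath")
        if path is not None:
            index[tuple(path)] = error["message"]
    return index


def explain_missing(response, path):
    """Return the error message for a field that came back as null, if any."""
    value = response.get("data")
    for key in path:
        if value is None:
            break
        value = value.get(key)
    if value is not None:
        return None  # the field has data, so there is nothing to explain
    return errors_by_path(response).get(tuple(path))


# The partial result from the example above.
response = {
    "errors": [
        {
            "message": "Sorry, we have a bug in retrieving the Customer. "
                       "If you need to see our progress in fixing it, "
                       "please follow the link below.",
            "extensions": {"queryPath": ["order", "customer"]},
        }
    ],
    "data": {
        "order": {
            "id": "2022/183435",
            "orderDate": "2023-05-11T17:53:13.523Z",
            "customer": None,
        }
    },
}

customer_problem = explain_missing(response, ["order", "customer"])
id_problem = explain_missing(response, ["order", "id"])
```

Here `customer_problem` holds the bug message, while `id_problem` is `None`, because the `id` was retrieved successfully.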
These examples are technical errors: automated handling (other than retrying) is not reasonable. They can happen anytime and everywhere, so they don't have to be part of the Schema.
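Retrying is the one reaction a client can reasonably automate. A minimal sketch in Python, assuming the service reports the `retryAfter` extension as an ISO 8601 UTC timestamp like in the downtime example; the function name `seconds_until_retry` is hypothetical:

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_retry(response: dict, now: datetime) -> Optional[float]:
    """Return how long to wait before retrying, or None if no error asks for it."""
    for error in response.get("errors") or []:
        retry_after = (error.get("extensions") or {}).get("retryAfter")
        if retry_after is None:
            continue
        # Older Python versions don't accept a trailing "Z" in fromisoformat(),
        # so turn it into an explicit UTC offset first.
        retry_at = datetime.fromisoformat(retry_after.replace("Z", "+00:00"))
        return max(0.0, (retry_at - now).total_seconds())
    return None


response = {
    "errors": [
        {
            "message": "We have a scheduled downtime. Please retry your request later.",
            "extensions": {"retryAfter": "2023-05-11T17:53:13.523Z"},
        }
    ],
    "data": None,
}

# Pretend the response arrived ten seconds before the announced time.
now = datetime(2023, 5, 11, 17, 53, 3, 523000, tzinfo=timezone.utc)
delay = seconds_until_retry(response, now)
```

The current time is passed in rather than read from the clock, which keeps the function easy to test.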
But there's also a different class of reasons for some data to be unavailable. Maybe an order is being blocked and hidden because it was considered to be fraudulent. Let's call such situations business errors. They are not exceptions, as they are part of the normal operations of the system; nobody should worry; everything's fine; no stacktraces in the logs, please.
If we report such errors via the same `errors` mechanism, the clients will ask us for the list of all possible errors, and whether we can introduce, for example, some `code` extension, so they can react properly instead of parsing the human-readable message, which could change at any time. But this would be out-of-band communication: clients would have to consult a second source beside the Schema, e.g. a wiki; and wikis, as the saying goes, are where information goes to die: they are bound to be outdated. We'd rather have business errors be directly part of the Schema.
Result Union
There is an excellent talk by Sasha Solomon that inspired me to rethink the whole topic and write this blog post. She shows exactly why we need business errors to be result types made visible in the Schema, and provides an ingenious way to do so: the valid response value and all possible business errors are wrapped into a Union result:
```graphql
type Query {
  order(id: String): OrderResult
}
union OrderResult = Order | OrderNotFound | OrderLocked
type Order {
  id: ID
  orderDate: Date
}
type OrderNotFound { # ①
  positive: Boolean
}
type OrderLocked {
  start: DateTime
  reason: LockReason
}
enum LockReason {
  FRAUD
  MANUAL
}
```
① The `positive` field is just a dummy value that is always `true`. I define it here because I want all errors to be Types and not Scalars, so I can add real fields later, and a Type must have at least one field.
A query could be:
```graphql
query order($id: String) {
  order(id: $id) {
    __typename
    ... on Order {
      id
      orderDate
    }
    ... on OrderLocked {
      reason
      start
    }
  }
}
```
The client needs to make sure to always add code that handles unknown `__typename` values, so new errors can be added to the Union without breaking anything. When we add an `OrderCancelled` response, the client can see that there's a new value for the `__typename`, which they probably have to add special handling for.
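Such client code could be sketched like this in Python. The function `describe_order_result` and its messages are purely illustrative; the important part is the last branch, which catches any `__typename` the client doesn't know yet.

```python
def describe_order_result(result: dict) -> str:
    """Turn an OrderResult union value into a message for the user."""
    typename = result.get("__typename")
    if typename == "Order":
        return f"Order {result['id']} placed on {result['orderDate']}"
    if typename == "OrderLocked":
        return f"Order locked since {result['start']} ({result['reason']})"
    if typename == "OrderNotFound":
        return "Order not found"
    # Fallback for members added to the Union later, e.g. OrderCancelled.
    return f"Order unavailable ({typename})"


known = describe_order_result(
    {"__typename": "OrderLocked", "reason": "FRAUD", "start": "2023-05-11T17:53:13.523Z"}
)
unknown = describe_order_result({"__typename": "OrderCancelled"})
```

An old client that receives the new `OrderCancelled` value ends up in the fallback branch instead of crashing.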
(Side note, just for the sake of completeness: a very similar solution can be achieved with GraphQL Interfaces; but I won't go into details, as it doesn't add a lot to the argument.)
For nested errors, the result Types can be nested, too:
```graphql
type Order {
  id: ID
  orderDate: Date
  customer: CustomerResult
}
union CustomerResult = Customer | CustomerBlocked
type CustomerBlocked {
  positive: Boolean
}
```
One Way Street
The Union-based solution is great: the Schema clearly documents what can possibly go wrong, business-wise, and what alternative data is available. But it doesn't mitigate one major problem: communication is one-way, server to client. The backend still doesn't see which clients handle which errors, as clients can react to any of the possible `__typename` values without selecting even a single detail field, e.g. for the `OrderNotFound`. One could say that the `__typename` is just the new `code` extension made part of the Schema, which is a major step, but the backend can't see that some error is no longer needed and can be removed, or that a specific client has started to handle a specific business error. The concept, as clean as it is, is somewhat impure around the edges. It regulates the communication from the server to the client, but doesn't improve the communication of the requirements from the client to the server, which is a central benefit of GraphQL.
Result Fields
In order to make the clients' requirements visible, one option is to declare the `OrderResult` from above to be a regular wrapper Type instead of a Union, i.e. allow clients to select what business errors they want to handle:
```graphql
type OrderResult {
  order: Order
  error_orderLocked: OrderLocked
  error_orderNotFound: OrderNotFound
}
```
Normally, exactly one of these fields is set, which is, sadly, not visible in the Schema; it must be documented as a convention.
If an error situation arises that a client didn't expect, e.g. they don't know that orders can be locked, they won't have selected the error field. To prevent the situation that the `order` field is `null` and the client has no idea why, we fall back to reporting the error via the classic, technical `errors` field. This allows us to add new business errors at any time, e.g. when we add the feature of cancelling orders, we add an `error_orderCancelled`; and existing clients get it as technical `errors`.
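The fallback rule can be sketched on the server side as follows. This is a simplified Python illustration, not tied to any GraphQL library: `resolve_order_result` is a hypothetical helper, and the set of selected field names is passed in directly, whereas a real server would read it from the incoming query.

```python
from typing import Optional


def resolve_order_result(selected: set, order: Optional[dict],
                         error: Optional[tuple]) -> tuple:
    """Return (data, errors) for an OrderResult wrapper.

    selected: names of the OrderResult fields the client asked for.
    error:    (code, details) of the business error, or None on success.
    """
    data = {name: None for name in selected}
    errors = []
    if error is None:
        if "order" in selected:
            data["order"] = order
        return data, errors
    code, details = error
    field = f"error_{code}"
    if field in selected:
        # The client declared that it can handle this business error.
        data[field] = details
    else:
        # The client doesn't know this error: report it like a technical one.
        errors.append({"message": f"Unhandled business error: {code}"})
    return data, errors


# A new client selects the error field and gets it as regular data ...
new_data, new_errors = resolve_order_result(
    {"order", "error_orderLocked"}, None, ("orderLocked", {"reason": "FRAUD"})
)
# ... while an old client, which only selects the order, gets a technical error.
old_data, old_errors = resolve_order_result(
    {"order"}, None, ("orderCancelled", {})
)
```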
If it's still possible to retrieve the `order` data and the new error is not critical, i.e. existing clients can safely ignore the fact that an order is, e.g., locked while reading, we can simply not report the error.
In both cases, if clients want to react to a new error, they can simply select the corresponding new error field. And on the other hand, if we see that no client selects an error field anymore, we can safely remove it.
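How the backend might spot removable error fields can be sketched in a few lines of Python. This assumes the service logs, for every incoming query, which `OrderResult` fields were selected; the helper `unused_fields` and the log format are hypothetical.

```python
from collections import Counter


def unused_fields(schema_fields, logged_selections):
    """Return the schema fields that no logged query has selected."""
    usage = Counter()
    for selection in logged_selections:
        usage.update(set(selection))  # count each field once per query
    return sorted(field for field in schema_fields if usage[field] == 0)


order_result_fields = ["order", "error_orderLocked", "error_orderNotFound"]
logged = [
    ["order"],
    ["order", "error_orderLocked"],
    ["order", "error_orderLocked"],
]
removable = unused_fields(order_result_fields, logged)
```

In this made-up log, only `error_orderNotFound` is never selected, so it is the only candidate for removal.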
Nesting works exactly like with the Union responses described above.
Breaking Changes
But introducing such a result wrapper Type (even if it's a Union) is a breaking change. We can't do that as long as clients still expect us to return the data directly. We could use the wrappers right from the beginning, every time, even before we know if we might need to add business errors later. But even a scalar field could need to be in a business error state: e.g. just the customer name could be blocked, i.e. we'd have to wrap it in a `CustomerNameResult`. Doing so for every field doesn't make sense.
So we need a strategy to migrate existing queries: define a new field with a new name returning the new wrapper type, while the old field still returns the flat data. Once the old field is not selected anymore, we can remove it. To have a standard naming convention for the wrapped fields, we could suffix the name with a word, e.g. append `Result` to the old `order` field to create `orderResult`. But this can easily be confusing, e.g. a mutation `createGame` would become `createGameResult`, which sounds like something very different. So I'd opt for a single underscore, e.g. `order_`; but that's not self-explanatory. And would we also do that for fields where we know from the beginning that they have some error states? It's not easy to make such an API consistent.
Flat
To prevent breaking changes by design, we could add the error fields directly to the return Type:
```graphql
type Query {
  order(id: String): Order
}
type Order {
  id: ID
  orderDate: Date
  # ...
  error_orderLocked: OrderLocked
  error_orderNotFound: OrderNotFound
}
```
Again, if some error occurs that the client didn't select, we fall back to reporting it like a technical error.
Nesting works naturally, i.e. the `Customer` type could have an `error_customerBlocked` field.
At first glance, this solution looks perfect, but it has a major drawback: all data fields must be nullable, because when a business error is reported, they have no value to return.
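What this means for a client can be sketched in Python. The helper `business_errors` is hypothetical and relies only on the `error_` prefix convention from the Schema above.

```python
def business_errors(obj: dict) -> dict:
    """Return the error_* fields of a Flat result object that are actually set."""
    prefix = "error_"
    return {
        name[len(prefix):]: value
        for name, value in obj.items()
        if name.startswith(prefix) and value is not None
    }


# A locked order: the error field is set, so every data field has to be null.
locked_order = {
    "id": None,
    "orderDate": None,
    "error_orderLocked": {"reason": "FRAUD", "start": "2023-05-11T17:53:13.523Z"},
    "error_orderNotFound": None,
}
found = business_errors(locked_order)
```

The client first checks `found` and only reads `id` and `orderDate` when it is empty; both have to be declared nullable just to make a response like `locked_order` possible.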
Extra Queries
GraphQL allows us to issue several queries at once; so we can add separate queries for the business errors. In order to link them to the original query, we use a naming convention `<query-path>_error_<code>`:
```graphql
type Query {
  order(id: String): Order
  order_error_orderLocked: OrderLocked
  order_error_orderNotFound: OrderNotFound
  order_customer_error_customerBlocked: CustomerBlocked
}
```
An actual query could be:
```graphql
query order($id: String) {
  order(id: $id) {
    id
    orderDate
    customer {
      name
    }
  }
  order_error_orderLocked {
    reason
    start
  }
  order_customer_error_customerBlocked {
    positive
  }
}
```
Mangling the query path into a field name may feel messy, but it works.
Also here, if some error occurs that the client didn't select, we fall back to reporting it like a technical error.
Note that the return type of `order` must be nullable now.
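The naming convention is mechanical enough to be handled by small helpers on both sides. A sketch in Python, with hypothetical function names, assuming that the field names along the query path don't contain underscores themselves (otherwise the path can't be split unambiguously):

```python
def error_field_name(query_path, code):
    """Build the extra query name for a business error at the given path."""
    return "_".join([*query_path, "error", code])


def parse_error_field(name):
    """Split an extra query name into (query_path, code); None for data fields."""
    path, separator, code = name.partition("_error_")
    if not separator:
        return None
    return path.split("_"), code


locked = error_field_name(["order"], "orderLocked")
blocked = error_field_name(["order", "customer"], "customerBlocked")
parsed = parse_error_field(blocked)
```

Here `locked` is `order_error_orderLocked`, matching the Schema above, and `parsed` gives back the path `["order", "customer"]` together with the code `customerBlocked`.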
Conclusion
So these are the five options we have, and all have their merits:
Option | Visible in Schema (server→client) | Visible in Query (client→server) | Extensible (no breaking changes) | Side effects |
---|---|---|---|---|
Technical Errors | 𐄂 | 𐄂 | ✓ | |
Result Union | ✓ | 𐄂 | 𐄂 | |
Result Fields | (✓) | ✓ | 𐄂 | |
Flat | (✓) | ✓ | ✓ | All data fields must be nullable |
Extra Queries | (✓) | ✓ | ✓ | The data field must be nullable |
(✓) means that, while the business errors are visible in the Schema, there is an extra convention on top that you'll need to know in order to truly understand the Schema.
No option is perfect, so the question remains: which one should I choose? Is it just a matter of taste? The elegance in the Schema when using Unions is compelling, but the drawbacks are severe. I'm especially hesitant to prematurely introduce 'best practices' like result wrapper Types; I've already had too many best practices proven to not actually be best in all cases. My personal conclusion is that using Flat error fields provides the best balance between extensibility and visibility.
What do you think? Let's discuss!
Blog author
Rüdiger zu Dohna
IT Consulting Expert