This is the 8th blog in a series of deep dives into Apollo GraphQL, from backend to frontend. A lot of the information comes from the Apollo GraphQL docs and the GraphQL docs, as well as their source code on Github — all tributes go to them. For my part, I would like to give you my “destructuring” of the original knowledge and my reflections on it, analysis of the code examples/source code, as well as some extra examples.
One aspect of optimising frontend performance is using a frontend cache, i.e. reducing the back-and-forth request/response lifecycles. Luckily, Apollo Client already has a native implementation that stores the results of its GraphQL queries in a normalised, in-memory cache.
The cache itself has a plug-and-play nature with sensible default behaviours, but it also offers fine-grained control by overriding its default configs. Essentially, you can:
- Specify custom primary key fields regarding how data is normalised
- Customise the storage and retrieval of individual fields
- Customise the interpretation of field arguments
- Define supertype-subtype relationships for fragment matching
- Define patterns for pagination
- Manage client-side local state
According to the docs, to customise cache behaviour, provide an options object to the `InMemoryCache` constructor.
Apollo Cache as an Abstraction Layer
You may wonder how exactly the Apollo cache mechanism works with all the details from above. You are more than welcome to go into the details, but from an architectural perspective, all you need to know is that the Apollo cache serves as an abstraction layer on top of a data store. It handles the incoming actions (mutations, queries, and subscriptions) and spits out a pre-processed response after the server's response data has gone through the caching process.
It may remind you of a tool like Redux, and you are right: just like a Redux store returns data based on actions, the Apollo cache returns data based on, e.g., queries & mutations. And the cache store itself is very similar to a Redux store.
Data normalisation and Custom Identifier
To understand how the Apollo cache stores data, we need to tap into data normalisation. According to Wikipedia, normalisation is the process of structuring a database, usually a relational database, in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. Through the use of relationships (primary keys, foreign keys) and constraints, we can enforce that only unique data gets added to the database.
The InMemoryCache normalises query results before saving them to the cache by:
- Creating a globally unique identifier for each object included in the response.
- Storing the objects by their unique identifiers in a flat lookup table in JSON-serializable format.
- Merging the fields of an incoming object into an existing object whenever both are stored under the same unique identifier.
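As a toy illustration of these steps (not Apollo's actual implementation), a flat lookup table keyed by identifier, with field-merging on collision, might look like this:

```javascript
// Toy sketch of normalisation: a flat lookup table keyed by each
// object's unique identifier; colliding identifiers merge their fields.
const table = {};

function writeObject(id, obj) {
  // merge incoming fields over any existing fields stored under the same id
  table[id] = { ...(table[id] || {}), ...obj };
}

writeObject('Todo:5', { __typename: 'Todo', id: '5', text: 'Buy milk' });
writeObject('Todo:5', { __typename: 'Todo', id: '5', completed: true });
// table['Todo:5'] now holds both `text` and `completed`
```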
The most important part is to calculate the unique identifier.
Assigning unique identifiers
By default, Apollo Client combines an object's `__typename` with its `id` (or `_id`) to create the identifier, separating the two values with a colon (`:`). If an object doesn’t specify a `__typename` or one of `id` or `_id`, InMemoryCache falls back to using the object’s path within its associated query (e.g., `ROOT_QUERY.allPeople.0` for the first record returned for an `allPeople` root query).
But of course, you can override the default behaviour.
Custom identifiers
To create a custom identifier, you define a `TypePolicy` for the type and include a `keyFields` field in the relevant `TypePolicy` objects, like so:
const cache = new InMemoryCache({
  typePolicies: {
    AllProducts: {
      keyFields: [],
    },
    Product: {
      keyFields: ["upc"],
    },
    Person: {
      keyFields: ["name", "email"],
    },
    Book: {
      keyFields: ["title", "author", ["name"]],
    },
  },
});
In the example above, the `Book` type uses a subfield as part of its primary key. The `["name"]` item indicates that the `name` field of the previous field in the array (`author`) is part of the primary key. The `Book`'s `author` field must be an object that includes a `name` field for this to be valid. The resulting identifier string for a `Book` object therefore has the following structure:
Book:{"title":"Fahrenheit 451","author":{"name":"Ray Bradbury"}}
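To make the keyFields idea concrete, here is a hypothetical sketch of how such an identifier could be computed. Note this is not Apollo's real implementation (which lives in @apollo/client and handles aliases, deeper nesting, and error cases far more carefully); the function name and structure are my own for illustration.

```javascript
// Hypothetical sketch: build a cache identifier from a keyFields
// specifier. A nested array (e.g. ["name"]) selects subfields of the
// preceding field, as in the Book example above.
function computeCacheId(object, keyFields) {
  const picked = {};
  for (let i = 0; i < keyFields.length; i++) {
    const spec = keyFields[i];
    if (Array.isArray(spec)) continue; // consumed by its parent field below
    const next = keyFields[i + 1];
    if (Array.isArray(next)) {
      // keep only the listed subfields of this field
      const sub = {};
      for (const subField of next) sub[subField] = object[spec][subField];
      picked[spec] = sub;
    } else {
      picked[spec] = object[spec];
    }
  }
  return `${object.__typename}:${JSON.stringify(picked)}`;
}

const book = {
  __typename: 'Book',
  title: 'Fahrenheit 451',
  author: { name: 'Ray Bradbury', born: 1920 },
};
computeCacheId(book, ['title', 'author', ['name']]);
// 'Book:{"title":"Fahrenheit 451","author":{"name":"Ray Bradbury"}}'
```

Note that `author.born` does not appear in the identifier, because the nested `["name"]` specifier keeps only the `name` subfield.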
Now that we understand the storage mechanism, the next step is how to interact with the cache. But before that, let's categorise the default cache behaviours based on the operations involved.
Operations where cache can or cannot automatically update
Apollo Client is smart enough to update the cache automatically for many operations, so you only need to update it manually for the rest.
In short, the cache can automatically update itself for queries, single mutations that update a single existing entity, and batch mutations that return the entire set of changed entities. Operations that involve adding, removing, or reordering entities cannot update automatically. Note that batch mutations that do not return the entire set of changed entities also cannot update automatically.
Take a common ToDo app as an example:
- GetTodoById (automatic)
- GetAllTodos (automatic)
- UpdateTodoById (automatic)
- UpdateTodos (automatic)
- AddTodo (no)
- DeleteTodo (no)
- GetAllTodosByFilter (no, as the todo order differs from GetAllTodos, and the dataset might come back different)
Reading and writing data to the cache
Apollo Client supports multiple strategies for interacting with cached data:
- `readQuery` / `writeQuery`: enables you to use standard GraphQL queries for managing both remote and local data.
- `readFragment` / `writeFragment`: enables you to access the fields of any cached object without composing an entire query to reach that object.
- `cache.modify`: enables you to manipulate cached data without using GraphQL at all.
In real life, you often see a combination of read + write, such as `readQuery` and `writeQuery` (or `readFragment` and `writeFragment`), to fetch the currently cached data and make selective modifications to it:
const query = gql`
  query MyTodoAppQuery {
    todos {
      id
      text
      completed
    }
  }
`;

const data = client.readQuery({ query });

const myNewTodo = {
  id: '6',
  text: 'Start using Apollo Client.',
  completed: false,
  __typename: 'Todo',
};

client.writeQuery({
  query,
  data: {
    todos: [...data.todos, myNewTodo],
  },
});
readFragment:
const todo = client.readFragment({
  id: 'Todo:5',
  fragment: gql`
    fragment MyTodo on Todo {
      id
      text
      completed
    }
  `,
});
Unlike `readQuery`, `readFragment` requires an `id` option. This option specifies the unique identifier for the object in your cache. If you don't know the identifier, you can use the utility function `cache.identify()` to obtain it.
writeFragment:
client.writeFragment({
  id: 'Todo:5',
  fragment: gql`
    fragment MyTodo on Todo {
      completed
    }
  `,
  data: {
    completed: true,
  },
});
cache.modify
The `modify` method of `InMemoryCache` enables you to directly modify the values of individual cached fields. But unlike `writeQuery` and `writeFragment`, `modify` circumvents any `merge` functions you've defined, which means that fields are always overwritten.
The `modify` method takes the following parameters:
- The ID of a cached object to modify.
- A map of modifier functions to execute, one for each field.
- Optional `broadcast` and `optimistic` boolean values to customise behaviour.
Example: Adding an item to a list:
const newComment = {
  __typename: 'Comment',
  id: 'abc123',
  text: 'Great blog post!',
};

cache.modify({
  fields: {
    comments(existingCommentRefs = [], { readField }) {
      const newCommentRef = cache.writeFragment({
        data: newComment,
        fragment: gql`
          fragment NewComment on Comment {
            id
            text
          }
        `,
      });

      // Quick safety check - if the new comment is already
      // present in the cache, we don't need to add it again.
      if (existingCommentRefs.some(
        ref => readField('id', ref) === newComment.id
      )) {
        return existingCommentRefs;
      }

      return [...existingCommentRefs, newCommentRef];
    },
  },
});
Garbage collection and cache eviction
According to the documentation, Apollo Client has a garbage collection mechanism whereby the default garbage collection strategy of the `gc` method is suitable for most applications, but you can still use methods like `evict` for more fine-grained control.
- `cache.gc`: removes all objects from the normalised cache that are not reachable: `cache.gc();`
- `cache.retain`: prevents an object (and its children) from being garbage collected, even if the object isn't reachable: `cache.retain('my-object-id');`
- `cache.release`: if you later want a retained object to be garbage collected, use the `release` method: `cache.release('my-object-id');`
- `cache.evict`: removes any normalised object from the cache: `cache.evict({ id: 'global-identifier' });`
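To build intuition for what `gc` does, here is a toy mark-and-sweep over a normalised table like the one described earlier. The table shape and function are my own sketch; Apollo's real `gc()` traverses its internal reference graph from the root query.

```javascript
// Toy sketch of reachability-based garbage collection: keep every object
// reachable from the roots (or explicitly retained), delete the rest.
function gcSketch(table, rootIds, retainedIds = new Set()) {
  const reachable = new Set();
  const visit = (id) => {
    if (reachable.has(id) || !table[id]) return;
    reachable.add(id);
    // follow { __ref: '...' } style references to child objects
    for (const value of Object.values(table[id])) {
      if (value && typeof value === 'object' && value.__ref) visit(value.__ref);
    }
  };
  rootIds.forEach(visit);
  retainedIds.forEach(visit);
  for (const id of Object.keys(table)) {
    if (!reachable.has(id)) delete table[id];
  }
}

const table = {
  ROOT_QUERY: { todo: { __ref: 'Todo:1' } },
  'Todo:1': { __typename: 'Todo', id: '1' },
  'Todo:2': { __typename: 'Todo', id: '2' }, // nothing references this
};
gcSketch(table, ['ROOT_QUERY']);
// 'Todo:2' is removed; 'Todo:1' survives because ROOT_QUERY references it
```

Passing an id in `retainedIds` mimics `cache.retain`: the object survives collection even when no root reaches it.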
Configuring the TypePolicy & FieldPolicy
TypePolicy
To customise how the cache interacts with specific types in your schema, you can provide an object mapping `__typename` strings to `TypePolicy` objects when you create a new `InMemoryCache` object. A `TypePolicy` object can include the following fields:
type TypePolicy = {
  keyFields?: KeySpecifier | KeyFieldsFunction | false;

  queryType?: true;
  mutationType?: true;
  subscriptionType?: true;

  fields?: {
    [fieldName: string]:
      | FieldPolicy<StoreValue>
      | FieldReadFunction<StoreValue>;
  };
};

type KeySpecifier = (string | KeySpecifier)[];

type KeyFieldsFunction = (
  object: Readonly<StoreObject>,
  context: {
    typename: string;
    selectionSet?: SelectionSetNode;
    fragmentMap?: FragmentMap;
  },
) => string | null | void;
FieldPolicy
Inside the typePolicies, you can supply field-level policies for each individual type (object) whose cache behaviour you'd like to configure.
A field policy can include:
- A `read` function that specifies what happens when the field's cached value is read.
- A `merge` function that specifies what happens when the field's cached value is written.
- An array of key arguments that help the cache avoid storing unnecessary duplicate data.
The most important parts are the read & merge functions. Going back to the cache-as-a-store idea from the beginning of this blog, a read policy defines how data goes out of the store, and merge defines how data gets stored.
When used together, the read and merge functions can serve as field-level middleware that can do whatever you want after the data comes in and before it goes out. This will be illustrated with Apollo Client pagination in a later blog.
Note that there is a list of helper functions passed into the read and merge functions, as documented in the `FieldPolicy` API reference below.
Read: when your client queries for an object whose field has a `FieldPolicy` defined, the field is populated with the `read` function's return value instead of the field's cached value. You can even define a `read` function for a field that isn't defined in your schema at all. For example:
const cache = new InMemoryCache({
  typePolicies: {
    Person: {
      fields: {
        userId() {
          return localStorage.getItem("loggedInUserId");
        },
      },
    },
  },
});
merge: a common use case for a `merge` function is to define how to write to a field that holds an array:
const cache = new InMemoryCache({
  typePolicies: {
    Agenda: {
      fields: {
        tasks: {
          merge(existing = [], incoming: any[]) {
            return [...existing, ...incoming];
          },
        },
      },
    },
  },
});
Note that existing is undefined the very first time this function is called for a given instance of the field, because the cache does not yet contain any data for the field; that is why we provide the `existing = []` default parameter. Also note that you can't push the incoming array directly onto the existing array; the merge function must instead return a new array.
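Because a merge function like the one above is just a pure function, you can reason about (and test) its behaviour entirely outside Apollo. A minimal sketch:

```javascript
// The concatenating merge from the Agenda.tasks example, as a standalone
// pure function: it never mutates its inputs and always returns a new array.
const mergeTasks = (existing = [], incoming) => [...existing, ...incoming];

// First write: existing is undefined, so the default [] kicks in.
let cached = mergeTasks(undefined, ['task A']);
// Subsequent writes append to what is already cached.
cached = mergeTasks(cached, ['task B', 'task C']);
// cached → ['task A', 'task B', 'task C']
```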
Key arguments: a `keyArgs` array indicates which arguments are key arguments, i.e. arguments that are used to calculate the field's value. Specifying this array can help reduce the amount of duplicate data in your cache. Otherwise, queries with the same request data but different variables will be regarded as different objects when saved to the cache.
Let’s say your schema’s `Query` type includes a `monthForNumber` field. This field returns the details of a particular month, given a provided `number` argument (January for `1`, and so on). The `number` argument is a key argument for this field, because it is used when calculating the field's result:
const cache = new InMemoryCache({
  typePolicies: {
    Query: {
      fields: {
        monthForNumber: {
          keyArgs: ["number"],
        },
      },
    },
  },
});
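A hypothetical sketch of why this reduces duplication: only the listed key arguments take part in the key under which the field's value is stored, so calls differing only in non-key arguments collapse into a single cache entry. (The function below and the key format are illustrative assumptions, not Apollo's real storage format.)

```javascript
// Hypothetical field-key builder: only the keyArgs participate in the key.
function fieldKey(fieldName, args, keyArgs) {
  const kept = {};
  for (const name of keyArgs) kept[name] = args[name];
  return `${fieldName}:${JSON.stringify(kept)}`;
}

// Two queries differing only in a non-key argument share one cache slot:
const a = fieldKey('monthForNumber', { number: 1, locale: 'en' }, ['number']);
const b = fieldKey('monthForNumber', { number: 1, locale: 'fr' }, ['number']);
// a === b → 'monthForNumber:{"number":1}'
```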
FieldPolicy API reference
Here are the `FieldPolicy` type and its related types in TypeScript:
type FieldPolicy<
  TExisting,
  TIncoming = TExisting,
  TReadResult = TExisting,
> = {
  keyArgs?: KeySpecifier | KeyArgsFunction | false;
  read?: FieldReadFunction<TExisting, TReadResult>;
  merge?: FieldMergeFunction<TExisting, TIncoming> | boolean;
};

type KeySpecifier = (string | KeySpecifier)[];

type KeyArgsFunction = (
  args: Record<string, any> | null,
  context: {
    typename: string;
    fieldName: string;
    field: FieldNode | null;
    variables?: Record<string, any>;
  },
) => string | KeySpecifier | null | void;

type FieldReadFunction<TExisting, TReadResult = TExisting> = (
  existing: Readonly<TExisting> | undefined,
  options: FieldFunctionOptions,
) => TReadResult;

type FieldMergeFunction<TExisting, TIncoming = TExisting> = (
  existing: Readonly<TExisting> | undefined,
  incoming: Readonly<TIncoming>,
  options: FieldFunctionOptions,
) => TExisting;

interface FieldFunctionOptions {
  cache: InMemoryCache;
  args: Record<string, any> | null;
  fieldName: string;
  field: FieldNode | null;
  variables?: Record<string, any>;
  isReference(obj: any): obj is Reference;
  toReference(
    objOrIdOrRef: StoreObject | string | Reference,
    mergeIntoStore?: boolean,
  ): Reference | undefined;
  readField<T = StoreValue>(
    nameOrField: string | FieldNode,
    foreignObjOrRef?: StoreObject | Reference,
  ): T;
  canRead(value: StoreValue): boolean;
  storage: Record<string, any>;
  mergeObjects<T extends StoreObject | Reference>(
    existing: T,
    incoming: T,
  ): T | undefined;
}
How Many Ways to Update Cache?
At this point, you might be confused by the many different ways to interact with the Cache object. So how many ways are there to update the cache for, e.g., an addTodo mutation?
- Option 1: use `readQuery` + `writeQuery`
- Option 2: use `cache.modify`
- Option 3: define a `FieldPolicy` for the mutation field in the `typePolicies`
Here are examples of options 2 and 3:
Option 2:
const [addTodo] = useMutation(ADD_TODO, {
  update(cache, { data: { addTodo } }) {
    cache.modify({
      fields: {
        todos(existingTodos = []) {
          const newTodoRef = cache.writeFragment({
            data: addTodo,
            fragment: gql`
              fragment AddTodo on Todo {
                id
                type
              }
            `,
          });
          return [...existingTodos, newTodoRef];
        },
      },
    });
  },
});
Option 3:
const cache = new InMemoryCache({
  typePolicies: {
    Mutation: {
      fields: {
        addTodo: {
          merge(_, incoming, { cache }) {
            cache.modify({
              fields: {
                todos(existing = []) {
                  return [...existing, incoming];
                },
              },
            });
            return incoming;
          },
        },
      },
    },
  },
});
So while options 1 & 2 are ad-hoc actions applied per individual operation, bottom-up, option 3 works top-down. There is no right or wrong either way, as long as it suits your need.
That’s about it for this one!
Happy Reading!