How We Built a Safe Public GraphQL API with Minimal Code Changes
Raz Tal
September 9, 2024

Our mission at PointFive to eliminate waste across cloud providers — and beyond — is more than just a task; it’s the driving force behind everything we do. This relentless pursuit of efficiency shapes our R&D and informs every solution we design and build. 

So, when the time came to expose our Go-powered GraphQL API to our users, we knew we needed the leanest approach possible, one that would be cost-effective to develop, easy to maintain, and seamless in operation. Our goal was to create a system where exposing new resolvers is effortless, while internal ones remain securely hidden without adding unnecessary overhead.

About our GraphQL Setup

At PointFive, we opted for a schema-first approach for our GraphQL API. 

In a schema-first approach, the GraphQL schema is defined upfront, allowing developers to describe their types, queries, and mutations using a declarative syntax. This approach offers flexibility, enabling the use of directives to easily manage visibility—whether to expose or hide certain parts of the schema.

By leveraging this flexibility, we adopted a directive-based system that allowed us to annotate our schema with custom directives such as @expose and @hide.

This setup provided us with a clean way to define what should be publicly accessible and what should remain internal, all while minimizing code changes and reducing maintenance overhead.

The Directives Approach

To control what is exposed or hidden in our graph, we implemented a directive-based approach with two key directives:

  1. @expose Directive: Explicitly exposes types, queries, and mutations. These are hidden by default; once one is exposed, its subcomponents are exposed automatically.
  2. @hide Directive: Subcomponents of an exposed type, query, or mutation (fields, enum values, and arguments) are exposed by default. We use @hide to keep specific elements private, so that sensitive or unnecessary data remains internal.

This dual-directive system allows us to manage our schema visibility efficiently, exposing exactly what we want while keeping everything else under wraps.

Here’s a simplified example of how we use these directives:

type Query {
  getUserData: User @expose
  getInternalData: InternalData
}

type User @expose {
  id: ID
  name: String
  email: String @hide
}

type InternalData {
  sensitiveInfo: String
}

In this example:

  • The getUserData query is exposed, making it available to the public API.
  • The getInternalData query is hidden by default, keeping it internal.
  • The User type exposes the id and name fields but hides the email field.

Why Not GraphQL Introspection Filtering?

At first glance, introspection filtering seemed like an ideal solution for hiding non-exposed types, queries, and mutations. For those unfamiliar with the concept, GraphQL introspection allows clients to query the schema itself, revealing the available types and operations. However, as we delved deeper, it became clear that introspection filtering was more complicated and flawed than it initially appeared.

One significant issue was that the GraphQL engine’s “did you mean” suggestions could inadvertently expose non-public queries. Even when a resolver wasn’t intended for exposure, similar-sounding queries could trigger suggestions, potentially revealing sensitive information. 
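To make the leak concrete, here is a minimal sketch of how a suggestion mechanism can reveal a field that introspection no longer lists. It uses plain edit distance as a stand-in; real GraphQL engines have their own suggestion logic, and the function names (levenshtein, suggest) and the distance threshold are assumptions made for illustration only.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the edit distance between a and b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(rb)]
}

// suggest mimics a "did you mean" hint (hypothetical helper): it returns every
// field known to the engine whose edit distance to the requested name is at
// most maxDist.
func suggest(requested string, knownFields []string, maxDist int) []string {
	var hints []string
	for _, f := range knownFields {
		if levenshtein(requested, f) <= maxDist {
			hints = append(hints, f)
		}
	}
	return hints
}

func main() {
	// The engine still knows the internal field, even if introspection hides it.
	engineFields := []string{"getUserData", "getInternalData"}
	// A near-miss guess is enough to get the hidden name back.
	fmt.Println(suggest("getInternalDat", engineFields, 2)) // [getInternalData]
}
```

As long as the hidden field is still part of the schema the engine executes against, any error path that consults that schema can echo its name back to the caller.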

This not only led to internal inconsistencies but also introduced security vulnerabilities that we couldn’t ignore. These issues, along with other complexities, made it evident that introspection filtering wasn’t the lean, efficient solution we needed.

That said, we did find a specific use for introspection filtering when managing directives — more on that later!

How We Implemented GraphQL Filtering

Recognizing the limitations of introspection filtering, we adopted a more decisive approach: manually refining the schema before it ever becomes public. This strategy centers around the use of @expose and @hide directives, which act as placeholders within our schema to dictate what should be publicly accessible and what should remain hidden. 

These directives don’t contain any logic themselves. Instead, they are embedded in the schema as “markers”, which we later read and modify accordingly before finalizing the schema for use by the GraphQL server.
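As a rough illustration of what a logic-free "marker" looks like, the sketch below shows a pass-through directive function together with possible directive declarations. Everything in it is an assumption: the Resolver type is a local stand-in for the resolver callback a gqlgen-style server would pass in, the list of directive locations is a guess, and markerDirective and constResolver are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// directiveSDL shows how the markers might be declared in the schema.
// The locations listed here are an assumption, not the article's actual schema.
const directiveSDL = `
directive @expose on OBJECT | FIELD_DEFINITION | INPUT_OBJECT | ENUM
directive @hide on FIELD_DEFINITION | ARGUMENT_DEFINITION | INPUT_FIELD_DEFINITION | ENUM_VALUE
`

// Resolver is a local stand-in for the "next" callback a server passes to a
// directive implementation.
type Resolver func(ctx context.Context) (any, error)

// markerDirective is a hypothetical no-op directive: it adds no behavior and
// simply hands control to the next resolver. All visibility decisions are made
// earlier, when the schema itself is filtered.
func markerDirective(ctx context.Context, obj any, next Resolver) (any, error) {
	return next(ctx)
}

// constResolver returns a Resolver that always yields v (illustration only).
func constResolver(v any) Resolver {
	return func(context.Context) (any, error) { return v, nil }
}

func main() {
	res, err := markerDirective(context.Background(), nil, constResolver("Ada"))
	fmt.Println(res, err) // Ada <nil>
}
```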

Though we identify the types, queries and mutations marked with @expose to determine what should be available externally, the process goes beyond simple exposure of the root elements. We refine those elements by removing internal directives not meant for public use and filtering out unnecessary arguments, fields, and types. This ensures that what we expose is both deliberate and secure, with all irrelevant or sensitive parts removed.

To further protect our API, we apply introspection filtering to hide internal directives like @expose and @hide, along with other sensitive schema components. This keeps the inner workings of our system hidden, safeguarding our internal logic and operations.
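The idea of hiding the marker directives from introspection can be sketched as a simple name filter. This is only an illustration of the concept: the article does not show its introspection filter, so publicDirectiveNames and internalDirectives are hypothetical, and a real implementation would hook into the server's introspection response rather than a plain string slice.

```go
package main

import "fmt"

// internalDirectives lists the marker directives that API consumers should
// never see in an introspection response (assumed names).
var internalDirectives = map[string]bool{"expose": true, "hide": true}

// publicDirectiveNames returns the directive names that may be listed to
// clients, preserving the declared order.
func publicDirectiveNames(declared []string) []string {
	visible := make([]string, 0, len(declared))
	for _, name := range declared {
		if !internalDirectives[name] {
			visible = append(visible, name)
		}
	}
	return visible
}

func main() {
	declared := []string{"include", "skip", "deprecated", "expose", "hide"}
	fmt.Println(publicDirectiveNames(declared)) // [include skip deprecated]
}
```

The declarations themselves stay in the schema, so the server can still parse and use the annotated schema; only what clients can list is reduced.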

While @expose governs what is broadly accessible, the @hide directive allows us to fine-tune visibility at a more granular level. Fields within exposed types, queries, or mutations are included by default, but for enums, inputs, and types, they must be exposed. The @hide directive lets us selectively conceal specific subcomponents that should remain internal, providing the flexibility to hide or reveal as needed.

Implementation in Go

Here’s a simplified look at how we implemented this approach in Go:

// Assumes the gqlgen graphql package, the gqlparser ast package, and the
// project's generated package are imported.
func newPublicExecutableSchema(c generated.Config, schema *ast.Schema) graphql.ExecutableSchema {
	// Build a reduced schema that contains only the publicly exposed parts.
	c.Schema = &ast.Schema{
		Query:    filterQueriesAndMutations(schema.Query),
		Mutation: filterQueriesAndMutations(schema.Mutation),
		// Directive declarations are kept as-is; they are hidden via introspection filtering.
		Directives:    schema.Directives,
		Types:         filterTypes(schema.Types),
		PossibleTypes: filterImplementsAndPossibleTypes(schema.PossibleTypes),
		Implements:    filterImplementsAndPossibleTypes(schema.Implements),
	}
	return generated.NewExecutableSchema(c)
}

  • The function constructs a new public executable schema by filtering out any fields, types, queries, and mutations that are not explicitly exposed.
  • Directives are left untouched here and hidden through introspection filtering instead; if we removed them at this stage, we would not be able to use them as intended.
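The article does not show filterImplementsAndPossibleTypes, so the following is only a guess at its shape, written against simplified stand-in types (Directive, Definition) rather than the real AST. It assumes the helper keeps only exposed definitions in each list and drops keys whose lists become empty.

```go
package main

import "fmt"

// Directive and Definition are simplified stand-ins for the schema AST types.
type Directive struct{ Name string }

type Definition struct {
	Name       string
	Directives []Directive
}

func hasDirective(ds []Directive, name string) bool {
	for _, d := range ds {
		if d.Name == name {
			return true
		}
	}
	return false
}

// isExposed applies the hidden-by-default rule for types.
func isExposed(def *Definition) bool {
	return hasDirective(def.Directives, "expose") && !hasDirective(def.Directives, "hide")
}

// filterImplementsAndPossibleTypes (assumed behavior) keeps only exposed
// definitions in each list and removes keys that end up with no entries.
func filterImplementsAndPossibleTypes(m map[string][]*Definition) map[string][]*Definition {
	filtered := make(map[string][]*Definition, len(m))
	for name, defs := range m {
		var kept []*Definition
		for _, def := range defs {
			if isExposed(def) {
				kept = append(kept, def)
			}
		}
		if len(kept) > 0 {
			filtered[name] = kept
		}
	}
	return filtered
}

func main() {
	user := &Definition{Name: "User", Directives: []Directive{{Name: "expose"}}}
	auditLog := &Definition{Name: "AuditLog"} // no @expose, so it stays internal
	possible := map[string][]*Definition{
		"Node":     {user, auditLog},
		"AuditLog": {auditLog},
	}
	for name, defs := range filterImplementsAndPossibleTypes(possible) {
		for _, d := range defs {
			fmt.Println(name, "->", d.Name) // Node -> User
		}
	}
}
```

Building a new map instead of editing the input keeps the original schema untouched, which matters if the same parsed schema also backs an internal endpoint.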

Directive-Based Exposure Control

To determine which Type should be exposed, we check if the “expose” directive exists and “hide” is not present, which gives us the default hidden behavior for Types.

func mustExposeTypesByDirectives(directives ast.DirectiveList) bool {
	// Returns true if the "expose" directive exists and "hide" is not present.
	return directives.ForName(middlewares.ExposeDirective) != nil && directives.ForName(middlewares.HideDirective) == nil
}

To determine which Field should be exposed, we only need to check that the “@hide” directive is not present, which gives us exposed-by-default behavior for Fields.

func shouldExposeFieldsByDirectives(directives ast.DirectiveList) bool {
	// Returns true if the "hide" directive is not present.
	return directives.ForName(middlewares.HideDirective) == nil
}

Note:

  • Types refers to Queries, Mutations, GraphQL Types, Inputs and Enums.
  • Fields refers to Type fields, Enum fields and Argument fields.

Filtering Queries and Mutations

To filter queries and mutations, we implemented the following logic:

func filterQueriesAndMutations(definitions *ast.Definition) *ast.Definition {
	// A schema without mutations has no Mutation definition to filter.
	if definitions == nil {
		return nil
	}
	collectedFields := make([]*ast.FieldDefinition, 0, len(definitions.Fields))
	for _, fd := range definitions.Fields {
		// Keep explicitly exposed fields, plus the __schema introspection field.
		if mustExposeTypesByDirectives(fd.Directives) || fd.Name == "__schema" {
			fd.Directives = filterDefinitionDirectives(fd.Directives)
			fd.Arguments = filterDefinitionArguments(fd.Arguments)
			collectedFields = append(collectedFields, fd)
		}
	}
	definitions.Fields = collectedFields
	if len(collectedFields) == 0 {
		definitions.Fields = nil
	}
	return definitions
}

  • This function iterates over every query or mutation in the root definition and checks its directives to decide whether it should be exposed. For each one it keeps, it also filters the directives and arguments.
  • Only fields marked for exposure are retained in the public schema.
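The two helpers called above, filterDefinitionDirectives and filterDefinitionArguments, are not shown in the article. The sketch below is one plausible reading, using simplified stand-in types instead of the real AST: it assumes the first strips the @expose and @hide markers and the second drops arguments marked with @hide. Treat both bodies as assumptions.

```go
package main

import "fmt"

// Directive and Argument are simplified stand-ins for the schema AST types.
type Directive struct{ Name string }

type Argument struct {
	Name       string
	Directives []Directive
}

const (
	ExposeDirective = "expose"
	HideDirective   = "hide"
)

// filterDefinitionDirectives (assumed behavior) removes the internal marker
// directives so they do not appear on the public definition.
func filterDefinitionDirectives(ds []Directive) []Directive {
	kept := make([]Directive, 0, len(ds))
	for _, d := range ds {
		if d.Name != ExposeDirective && d.Name != HideDirective {
			kept = append(kept, d)
		}
	}
	return kept
}

// filterDefinitionArguments (assumed behavior) drops arguments marked with
// @hide and strips marker directives from the arguments it keeps.
func filterDefinitionArguments(args []Argument) []Argument {
	kept := make([]Argument, 0, len(args))
	for _, a := range args {
		hidden := false
		for _, d := range a.Directives {
			if d.Name == HideDirective {
				hidden = true
				break
			}
		}
		if hidden {
			continue
		}
		a.Directives = filterDefinitionDirectives(a.Directives)
		kept = append(kept, a)
	}
	return kept
}

func main() {
	fieldDirectives := []Directive{{Name: ExposeDirective}, {Name: "deprecated"}}
	args := []Argument{
		{Name: "id"},
		{Name: "debug", Directives: []Directive{{Name: HideDirective}}},
	}
	fmt.Println(filterDefinitionDirectives(fieldDirectives)) // [{deprecated}]
	for _, a := range filterDefinitionArguments(args) {
		fmt.Println(a.Name) // id
	}
}
```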

Filtering Types

Similarly, we filter types using the following approach:

func filterTypes(types map[string]*ast.Definition) map[string]*ast.Definition {
	collectedTypes := make(map[string]*ast.Definition)
	for t, def := range types {
		// Skip anything that is not built in, not marked with @expose, and not a scalar.
		if !lo.Contains(builtInTypes, def.Name) && !mustExposeTypesByDirectives(def.Directives) && def.Kind != ast.Scalar {
			continue
		}
		switch def.Kind {
		case ast.Object, ast.Interface, ast.Union, ast.InputObject, ast.Scalar:
			// Drop fields marked with @hide.
			def.Fields = lo.Filter(def.Fields, func(fd *ast.FieldDefinition, _ int) bool {
				return shouldExposeFieldsByDirectives(fd.Directives)
			})
		case ast.Enum:
			// Drop enum values marked with @hide.
			def.EnumValues = lo.Filter(def.EnumValues, func(ev *ast.EnumValueDefinition, _ int) bool {
				return shouldExposeFieldsByDirectives(ev.Directives)
			})
		}
		collectedTypes[t] = def
	}
	return collectedTypes
}

  • This function ensures that only built-in types (e.g., Query, Mutation), scalars, and types with the @expose directive are kept. It also filters out any fields or enum values marked with @hide.
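To see both rules working together, here is a small self-contained sketch that applies them to the example schema from earlier in the post (Query, User, InternalData). It uses simplified stand-in types (Field, TypeDef) and hypothetical helper names instead of the real AST, so it illustrates the visibility rules rather than the module's actual code.

```go
package main

import "fmt"

// Field and TypeDef are simplified stand-ins for the parsed schema AST.
type Field struct {
	Name       string
	Directives []string
}

type TypeDef struct {
	Name       string
	Directives []string
	Fields     []Field
}

func has(directives []string, name string) bool {
	for _, d := range directives {
		if d == name {
			return true
		}
	}
	return false
}

// mustExpose: hidden by default, needs @expose and no @hide.
func mustExpose(directives []string) bool {
	return has(directives, "expose") && !has(directives, "hide")
}

// shouldExposeField: exposed by default, only @hide removes it.
func shouldExposeField(directives []string) bool {
	return !has(directives, "hide")
}

// sampleSchema mirrors the example schema shown earlier in the post.
func sampleSchema() (TypeDef, []TypeDef) {
	query := TypeDef{Name: "Query", Fields: []Field{
		{Name: "getUserData", Directives: []string{"expose"}},
		{Name: "getInternalData"},
	}}
	types := []TypeDef{
		{Name: "User", Directives: []string{"expose"}, Fields: []Field{
			{Name: "id"}, {Name: "name"}, {Name: "email", Directives: []string{"hide"}},
		}},
		{Name: "InternalData", Fields: []Field{{Name: "sensitiveInfo"}}},
	}
	return query, types
}

// publicView returns, per visible type, the field names that survive filtering.
func publicView(query TypeDef, types []TypeDef) map[string][]string {
	view := map[string][]string{}
	// Root fields (queries) follow the hidden-by-default rule.
	for _, f := range query.Fields {
		if mustExpose(f.Directives) {
			view[query.Name] = append(view[query.Name], f.Name)
		}
	}
	// Other types must be exposed; their fields are then visible unless hidden.
	for _, t := range types {
		if !mustExpose(t.Directives) {
			continue
		}
		for _, f := range t.Fields {
			if shouldExposeField(f.Directives) {
				view[t.Name] = append(view[t.Name], f.Name)
			}
		}
	}
	return view
}

func main() {
	view := publicView(sampleSchema())
	fmt.Println("Query:", view["Query"]) // Query: [getUserData]
	fmt.Println("User:", view["User"])   // User: [id name]
	_, ok := view["InternalData"]
	fmt.Println("InternalData visible:", ok) // InternalData visible: false
}
```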

You can explore our repository to check out the Go module, graphql-schema-filter, which you can download and integrate into your project if you use a similar GraphQL setup. Additionally, see the full working example under the examples/ folder to understand how to implement this in your own schema.

By filtering the schema at runtime, we create a public GraphQL schema that aligns with our visibility rules, effectively managing exposure and keeping our development process lean and secure.

To Conclude

We set out to create a system where users could easily expose new resolvers while still keeping internal ones hidden and secure. Though we explored introspection filtering, we ultimately decided to manually refine the schema before it goes public. This approach allowed us to maintain security, reduce complexity and ensure that our API remains efficient and easy to manage.

If you have ever had to make your GraphQL schema public, let us know!

Drop us a message on LinkedIn or X.
