04. Entities & @key — referential identity

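Everything Federation does builds on a single directive, @key. Misunderstand @key and federation as a whole looks like magic; understand it precisely and it feels almost too simple.

This document starts from why @key exists and follows the thread all the way to _entities in one pass.

One-line answer

@key is not a database primary key; it is referential identity in the GraphQL graph. It declares that other subgraphs may reference or extend this type, and all federation composition happens on top of that identity. At runtime the promise is carried out by a pair: the spec-defined Query._entities(representations:) field and, in Apollo's subgraph library, a per-type __resolveReference function.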

Why: why @key is needed

The problem: Product lives in the Product subgraph and Review lives in the Review subgraph. A client sends { productById(id: "p1") { reviews { rating } } }.

The router has to ask the Review subgraph for reviews. But which identifier of a Product must the Review subgraph receive in order to find them?

→ Every subgraph has to know an agreed-upon identifier. That identifier is @key.

# Product subgraph
type Product @key(fields: "id") {
  id: ID!
  name: String!
}
 
# Review subgraph
type Product @key(fields: "id") {
  id: ID!   # same key declaration
  reviews: [Review!]!
}

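→ Because both subgraphs declare @key(fields: "id"), the router can recognize that they describe the same Product and knows which fields to pass between them. Federation 2 does not strictly require identical keys in every subgraph, but the router must be able to reach each subgraph through some key it can obtain; if it cannot, composition fails.

Why it is not a primary key

Database PK: identifies a row within one table. An internal implementation detail.
federation @key: an identity agreed on across the whole graph. An external contract.

A PK may well be an AUTO_INCREMENT integer, but @key has to be something that can be exposed to clients and handled by other subgraphs. Usually a UUID, a slug, or a business key.

⚠️ Using the DB PK directly as @key is common, but the two mean different things. A PK is incidental; @key is intentional.

How: the four forms of @key

(1) Single-field key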

type Product @key(fields: "id") {
  id: ID!
  name: String!
}

→ The most common case. A single ID determines identity.

(2) Compound key

type Order @key(fields: "userId orderNumber") {
  userId: ID!
  orderNumber: Int!
  total: Float!
}

→ Identity is the combination of the two fields, e.g. an order is unique per (user, order number) pair.
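On the resolver side, a compound key means the representation carries both fields and the lookup has to match on both. A minimal sketch, assuming a hypothetical in-memory `orders` array in place of a real data source (the `orderResolvers` name is also made up for illustration):

```javascript
// Hypothetical in-memory store standing in for the Order subgraph's database.
const orders = [
  { userId: "u1", orderNumber: 1, total: 19.9 },
  { userId: "u1", orderNumber: 2, total: 5.0 },
  { userId: "u2", orderNumber: 1, total: 42.0 },
];

const orderResolvers = {
  Order: {
    // reference = { __typename: "Order", userId: "u1", orderNumber: 2 }
    __resolveReference(reference) {
      const found = orders.find(
        (o) =>
          o.userId === reference.userId &&
          o.orderNumber === reference.orderNumber
      );
      // null tells the caller that no such entity exists.
      return found ?? null;
    },
  },
};
```

Matching on only one of the two fields would return the wrong order for users who share an order number, which is why both comparisons are required.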

(3) Nested key

type Listing @key(fields: "host { id } id") {
  id: ID!
  host: User!
}
 
type User {
  id: ID!
}

→ A nested path works as well. Not used often.
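With a nested key, the representation mirrors the selection in the `fields` string, so `host` arrives as an object holding only `id`. The sketch below builds that shape by hand for this one key; `listingRepresentation` and the sample data are hypothetical, and a real router derives the shape from the composed schema.

```javascript
// Builds the representation for @key(fields: "host { id } id").
// Only the fields named in the key are copied; everything else is dropped.
function listingRepresentation(listing) {
  return {
    __typename: "Listing",
    id: listing.id,
    host: { id: listing.host.id },
  };
}

const listingRep = listingRepresentation({
  id: "l1",
  title: "Loft",
  host: { id: "u7", name: "Ada" },
});

console.log(JSON.stringify(listingRep));
```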

(4) Multiple keys (alternative identities)

type User @key(fields: "id") @key(fields: "email") {
  id: ID!
  email: String!
  name: String!
}

→ The same entity can be referenced by id or by email. Another subgraph uses whichever is convenient for it.
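A subgraph that declares several keys has to accept a representation built from any of them. A minimal sketch, assuming a hypothetical in-memory `users` array and a made-up `userResolvers` object:

```javascript
// Hypothetical user store for illustration only.
const users = [
  { id: "u1", email: "ada@example.com", name: "Ada" },
  { id: "u2", email: "alan@example.com", name: "Alan" },
];

const userResolvers = {
  User: {
    // The representation carries whichever key the calling side used:
    // { __typename: "User", id: "u1" } or { __typename: "User", email: "..." }
    __resolveReference(reference) {
      if (reference.id !== undefined) {
        return users.find((u) => u.id === reference.id) ?? null;
      }
      if (reference.email !== undefined) {
        return users.find((u) => u.email === reference.email) ?? null;
      }
      return null; // neither key is present
    },
  },
};
```

Checking `id` first is an arbitrary choice in this sketch; either order works as long as every declared key is handled.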

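Spec: the `_entities` meta field

Entity resolution in Federation goes through a field defined by the subgraph specification, Query._entities. Subgraph libraries map it onto per-type reference resolvers; in Apollo's JavaScript library that function is called __resolveReference.

`Query._entities(representations: [_Any!]!): [_Entity]!`

The router uses it when it sends a partial query to a subgraph. Signature: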

type Query {
  # defined by the spec; not written by hand
  _entities(representations: [_Any!]!): [_Entity]!
}
 
scalar _Any   # arbitrary JSON; here, an entity representation
union _Entity = Product | Review | User | ...  # union of every entity type in this subgraph

Example: router → Review subgraph

Client: { productById(id:"p1") { reviews { rating } } }

The router's step 2 (the query it sends to the Review subgraph):

query ReviewsForProducts($reps: [_Any!]!) {
  _entities(representations: $reps) {
    ... on Product {
      reviews { rating }
    }
  }
}

# variables:
{ "reps": [{ "__typename": "Product", "id": "p1" }] }

→ The Review subgraph resolves only the Product fields it knows about (reviews). That is all _entities is.
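What a subgraph does with an incoming `_entities` call can be modeled in a few lines: read `__typename`, hand the representation to that type's reference resolver, and return the results in the same order as the input. This is a simplified model and not the source of any real library; `referenceResolvers` and `resolveEntities` are hypothetical names.

```javascript
// One reference resolver per entity type known to this subgraph.
const referenceResolvers = {
  // Stub: the Review subgraph knows nothing about Product beyond its key.
  Product: (rep) => ({ id: rep.id }),
};

function resolveEntities(representations) {
  // map() preserves order, so result[i] corresponds to representations[i].
  return representations.map((rep) => {
    const resolve = referenceResolvers[rep.__typename];
    if (!resolve) {
      throw new Error(`No entity type named ${rep.__typename} in this subgraph`);
    }
    // Keep __typename so a "... on Product" fragment can match the result.
    return { __typename: rep.__typename, ...resolve(rep) };
  });
}

const entities = resolveEntities([
  { __typename: "Product", id: "p1" },
  { __typename: "Product", id: "p2" },
]);
```

After this step, ordinary field resolvers such as `Product.reviews` run against each returned object.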

`__resolveReference`: the subgraph's answering function

When _entities arrives, the Review subgraph calls __resolveReference for each representation.

// Review subgraph resolvers
const resolvers = {
  Product: {
    __resolveReference(reference, context) {
      // reference = { __typename: "Product", id: "p1" }
      return { id: reference.id };
      // or a DB lookup: return db.products.findOne(reference.id);
    },
    reviews(product, _, ctx) {
      return ctx.reviewsByProductId(product.id);
    },
  },
};

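→ __resolveReference answers the question “how do you interpret this reference?”
→ A DB lookup is usually unnecessary: taking reference.id and evaluating the subgraph's own fields (reviews) for that id is enough.

Why the spec is designed this way

The designers' (Apollo's) intent:

A subgraph does not need to know another subgraph's full type: Review knows nothing about Product beyond id.
The router only needs to send partial queries: efficient fan-out.
The downstream resolver chain behaves like ordinary resolvers: after __resolveReference, it is plain GraphQL execution.
DataLoader fits: __resolveReference is a natural place to batch. When references for N products arrive in one request, N+1 can be avoided.

What: a practical example (3 subgraphs)

A typical e-commerce layout: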

# === Product subgraph ===
type Product @key(fields: "id") {
  id: ID!
  name: String!
  price: Float!
  category: Category!
}
 
type Category @key(fields: "id") {
  id: ID!
  name: String!
}
 
type Query {
  productById(id: ID!): Product
  products(filter: ProductFilter): [Product!]!
}

# === Review subgraph ===
type Review @key(fields: "id") {
  id: ID!
  rating: Int!
  body: String!
  author: User!
  product: Product!
}
 
# Product is only *referenced* here; Review contributes just reviews and averageRating
type Product @key(fields: "id") {
  id: ID!
  reviews(first: Int): [Review!]!
  averageRating: Float!
}
 
# User is also a *reference*, used to link author
type User @key(fields: "id") {
  id: ID!
}

# === User subgraph ===
type User @key(fields: "id") {
  id: ID!
  name: String!
  email: String!
}
 
type Query {
  me: User
  userById(id: ID!): User
}

Client query and router behavior

query ProductPage {
  productById(id: "p1") {
    name
    price
    reviews(first: 3) {
      rating
      author { name }
    }
    averageRating
  }
}

The router's query plan:

Step 1. Product subgraph: { productById(id:"p1") { __typename id name price } }
Step 2. Review subgraph (given the step 1 result): { _entities(representations:[{__typename:"Product", id:"p1"}]) { ... on Product { reviews(first:3) { __typename id rating author { __typename id } } averageRating } } }
Step 3. User subgraph (given the authors from step 2): { _entities(representations:[{__typename:"User", id:"u1"}, ...]) { ... on User { name } } }
Step 4. Router merge: combines the three responses into the shape of the original query and returns it to the client.

→ Three sequential hops, so latency is roughly 3x a single call. This is federation's runtime cost.
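The plan above can be walked through with three fake subgraphs written as plain async functions. Everything in this sketch (object names, sample data, the `runPlan` function) is made up for illustration; a real router generates the plan from the supergraph schema and talks to subgraphs over HTTP.

```javascript
const productSubgraph = {
  async productById(id) {
    return { __typename: "Product", id, name: "Kettle", price: 30 };
  },
};

const reviewSubgraph = {
  // Stands in for _entities on the Review subgraph.
  async entities(reps) {
    const reviewsByProduct = {
      p1: [{ rating: 5, author: { __typename: "User", id: "u1" } }],
    };
    return reps.map((rep) => {
      const reviews = reviewsByProduct[rep.id] ?? [];
      const sum = reviews.reduce((total, r) => total + r.rating, 0);
      return { reviews, averageRating: reviews.length ? sum / reviews.length : 0 };
    });
  },
};

const userSubgraph = {
  // Stands in for _entities on the User subgraph.
  async entities(reps) {
    const names = { u1: "Ada" };
    return reps.map((rep) => ({ name: names[rep.id] }));
  },
};

async function runPlan(productId) {
  // Step 1: root field from the Product subgraph, including the key.
  const product = await productSubgraph.productById(productId);

  // Step 2: pass the key to the Review subgraph as a representation.
  const [reviewPart] = await reviewSubgraph.entities([
    { __typename: product.__typename, id: product.id },
  ]);

  // Step 3: gather all author references and resolve them in one call.
  const authors = reviewPart.reviews.map((review) => review.author);
  const userParts = await userSubgraph.entities(authors);
  authors.forEach((author, i) => Object.assign(author, userParts[i]));

  // Step 4: merge. A real router would also strip fields the client
  // did not request, such as __typename and id.
  return { productById: { ...product, ...reviewPart } };
}
```

Each `await` depends on the previous result, which is where the sequential latency comes from.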

What-if: common pitfalls

Pitfall 1: no `@key`

type Product {   # no @key
  id: ID!
  name: String!
}

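→ This type is a plain type, not an entity. Other subgraphs cannot reference it through _entities or contribute fields to it. Depending on how the other subgraphs use it, the result is a composition error or a disconnected graph.

Pitfall 2: `@key` differs between two subgraphs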

# Product subgraph
type Product @key(fields: "id") { id: ID! }
 
# Review subgraph
type Product @key(fields: "sku") { sku: String! }

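→ The two subgraphs share no field the router could use to move from one to the other, so composition fails. Use the same key in both, or declare both keys with multiple @key directives.

Pitfall 3: forgetting `__resolveReference`

→ If the Review subgraph contributes to Product but defines no __resolveReference, the outcome depends on the library. @apollo/subgraph falls back to returning the representation itself, which is enough for a stub whose field resolvers only need the key; a subgraph that has to load data first ends up with null fields or resolver errors at runtime.

Pitfall 4: fetching every field from the DB in `__resolveReference`

// bad example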
Product: {
  __resolveReference(ref) {
    return db.products.findOne(ref.id);  // fetches every field
  }
}

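→ The Review subgraph only needs to know reviews, yet this runs a full Product lookup once per representation, which is an N+1 pattern of queries nobody asked for. Usually return just { id: ref.id } and load lazily in each field resolver.

Pitfall 5: supplying only part of a compound key

representations: [{ __typename: "Order", userId: "u1" }]   # orderNumber missing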

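→ With a compound key, a partial representation cannot identify the entity. Composition does not catch this; at runtime the resolver typically finds nothing and returns null.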
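One way to make this failure loud instead of a silent null is to check the representation before resolving it. A minimal sketch for flat (non-nested) keys; `hasAllKeyFields`, `ORDER_KEY`, and `resolveOrderReference` are hypothetical helpers, not part of any federation library.

```javascript
// Returns true only if every field of a flat key is present.
function hasAllKeyFields(representation, keyFields) {
  return keyFields.every((field) => representation[field] !== undefined);
}

const ORDER_KEY = ["userId", "orderNumber"];

function resolveOrderReference(reference) {
  if (!hasAllKeyFields(reference, ORDER_KEY)) {
    throw new Error(
      `Order reference is missing key fields; expected ${ORDER_KEY.join(", ")}`
    );
  }
  return { userId: reference.userId, orderNumber: reference.orderNumber };
}
```

A router builds representations from the declared key, so a partial one most likely comes from a hand-written `_entities` call or a test fixture.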

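Insight: interesting angles

“@key lifts a data-modeling obligation up to the graph level”

In monolithic GraphQL, resolvers handled entity identity on their own, since joining on the DB PK was enough. Federation turned that implicit agreement into an explicit SDL declaration. As a result, integrity can be checked at build time, during composition, and the surface of cross-team agreement is fixed in the SDL.

→ Data modeling has become part of governance.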

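“@key resembles vertex identity in graph databases”

Modeling referential identity at the schema level is close to the vertex identity concept in graph databases such as Neo4j and ArangoDB. The resemblance is conceptual: both treat identity as a property of the graph rather than of a storage table.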

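“Why _entities starts with an underscore”

The GraphQL spec reserves names beginning with __ (two underscores) for introspection, e.g. __schema and __type. Federation is not part of the GraphQL spec, so it cannot use that prefix; it settles on a single underscore, which marks _entities, _service, and _Any as machinery and is unlikely to collide with user-defined fields. A small but clever decision.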

“DataLoader pairs well with _entities”

The router bundles N representations into a single _entities call. So when a subgraph's __resolveReference goes through a DataLoader, batching and caching really do kick in. Federation's answer to N+1 is, in the end, the DataLoader from chapter 03.
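The batching effect can be shown without any dependency. The sketch below uses a hand-rolled `createBatchLoader` as a small stand-in for DataLoader (the real library adds caching and more); `fetchProductsByIds`, `batchCalls`, and `productResolvers` are hypothetical. It assumes the subgraph library calls __resolveReference for all representations in the same tick.

```javascript
// Collects keys requested in the same tick and resolves them
// with a single call to batchFn(keys).
function createBatchLoader(batchFn) {
  let queue = [];
  return function load(key) {
    return new Promise((resolve, reject) => {
      queue.push({ key, resolve, reject });
      if (queue.length === 1) {
        // First key in this tick: schedule exactly one flush.
        queueMicrotask(async () => {
          const batch = queue;
          queue = [];
          try {
            const values = await batchFn(batch.map((item) => item.key));
            batch.forEach((item, i) => item.resolve(values[i]));
          } catch (err) {
            batch.forEach((item) => item.reject(err));
          }
        });
      }
    });
  };
}

// Hypothetical data source that counts how often it is hit.
let batchCalls = 0;
async function fetchProductsByIds(ids) {
  batchCalls += 1;
  return ids.map((id) => ({ id, name: `Product ${id}` }));
}

const loadProduct = createBatchLoader(fetchProductsByIds);

const productResolvers = {
  Product: {
    // Each reference goes through the loader instead of its own query.
    __resolveReference: (reference) => loadProduct(reference.id),
  },
};
```

Three references resolved in the same tick reach `fetchProductsByIds` as one call with three ids, instead of three separate lookups.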

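Summary

@key is not a primary key but referential identity: a declaration that this type can be referenced and extended from outside. At runtime the promise is carried out by the _entities(representations:) and __resolveReference pair. Once you know the four forms (single key, compound key, nested key, multiple keys), you have the federation entity model.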
