Enterprise Knowledge Graph

1 Clarifying Questions & Scope

Dimension	Clarification	Assumption
Data Sources	Where does graph data come from?	HR systems, AD/LDAP, ITSM, CMDB, ticketing systems
Entity Types	What kinds of nodes?	Person, Team, Role, Application, Permission, Document, Ticket
Query Patterns	What questions does the agent ask?	"Who manages X?", "Who has access to Y?", "What team owns Z?"
Freshness	How current must the graph be?	Real-time for org changes, daily for CMDB/app data
Scale	How large?	50K nodes/customer, 500K edges/customer, 350 customers

2 Back-of-Envelope Estimation

Scale Numbers

Total nodes: 17.5M (350 customers x 50K nodes avg)
Total edges: 175M (350 customers x 500K edges avg)
Query volume: 1M graph queries/day across all tenants
Latency target: <50ms for 2-hop traversal queries
Graph size per tenant: ~500 MB (50K nodes, 500K edges)
Total graph storage: ~175 GB
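
The totals above can be checked with a few lines of arithmetic. This is a sketch only: the 10x peak factor is an assumption that does not come from the estimates, and the per-element figure is simply what the stated 500 MB per tenant implies.

```python
# Back-of-envelope check of the scale numbers listed above.
CUSTOMERS = 350
NODES_PER_CUSTOMER = 50_000
EDGES_PER_CUSTOMER = 500_000
QUERIES_PER_DAY = 1_000_000
GRAPH_MB_PER_TENANT = 500

total_nodes = CUSTOMERS * NODES_PER_CUSTOMER               # 17.5M
total_edges = CUSTOMERS * EDGES_PER_CUSTOMER               # 175M
total_storage_gb = CUSTOMERS * GRAPH_MB_PER_TENANT / 1000  # decimal GB
avg_qps = QUERIES_PER_DAY / 86_400                         # seconds per day

# Assumption (not from the source): peak traffic is ~10x the daily average.
PEAK_FACTOR = 10
peak_qps = avg_qps * PEAK_FACTOR

# Average footprint per graph element (node or edge) implied by 500 MB/tenant.
bytes_per_element = GRAPH_MB_PER_TENANT * 1_000_000 / (
    NODES_PER_CUSTOMER + EDGES_PER_CUSTOMER
)

print(f"{total_nodes:,} nodes, {total_edges:,} edges, {total_storage_gb:.0f} GB")
print(f"avg {avg_qps:.1f} QPS, assumed peak {peak_qps:.0f} QPS")
print(f"~{bytes_per_element:.0f} bytes per node/edge")
```

The average query rate is modest (roughly a dozen per second), so the latency target, rather than throughput, is likely the harder constraint.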


3 High-Level Architecture

  ENTERPRISE KNOWLEDGE GRAPH
  ═══════════════════════════════════════════════════════════════════

  DATA SOURCES                SYNC PIPELINE               GRAPH STORE
  ┌──────────┐              ┌──────────────┐           ┌──────────────┐
  │ HR System│──webhook──-->│              │           │              │
  │ (Workday)│              │   Change     │           │   Neo4j      │
  ├──────────┤              │   Detection  │           │              │
  │ AD/LDAP  │──webhook──-->│      +       │──────────>│   Nodes:     │
  ├──────────┤              │   Conflict   │           │   Person     │
  │ ITSM     │──polling──-->│   Resolution │           │   Team       │
  │ (CMDB)   │              │      +       │           │   Role       │
  ├──────────┤              │   Validation │           │   Application│
  │ Ticketing│──polling──-->│              │           │   Permission │
  └──────────┘              └──────────────┘           │   Document   │
                                                       │   Ticket     │
                                                       └──────┬───────┘
                                                              │
                                                    ┌─────────v────────┐
                                                    │   Query Engine   │
                                                    │   Cypher queries │
                                                    │   Cached paths   │
                                                    └─────────┬────────┘
                                                              │
                                                    ┌─────────v────────┐
                                                    │   AI Agent       │
                                                    │   Context-aware  │
                                                    │   decisions      │
                                                    └──────────────────┘

4 Deep Dive 1: Graph Schema

Node Types

Node Type	Key Properties	Source System
Person	name, email, employee_id, department, location, title, status	HR (Workday), AD/LDAP
Team	name, team_id, type (engineering/ops/support), size	HR, ServiceNow
Role	name, role_id, level (viewer/editor/admin), scope	IAM, AD
Application	name, app_id, type (SaaS/internal), criticality, owner_team	CMDB
Permission	permission_id, type (read/write/admin), scope, expiry	IAM, AD, App-specific
Document	title, doc_id, type, created_by, space, last_modified	Confluence, SharePoint
Ticket	ticket_id, type, status, priority, assignee, created_date	ServiceNow, Jira

Edge Types (Relationships)

Edge Type	From → To	Properties
MANAGES	Person → Person	since_date
MEMBER_OF	Person → Team	role_in_team, since_date
HAS_ROLE	Person → Role	granted_date, granted_by
HAS_ACCESS_TO	Person/Role → Application	access_level, granted_date, expiry
OWNS	Team → Application	ownership_type (primary/secondary)
CREATED_BY	Document/Ticket → Person	created_date
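
One way to keep the sync pipeline from writing malformed relationships is to encode the edge table as data and validate every edge against it. The sketch below assumes edges arrive with their endpoint labels attached; the `Edge` class and `validate_edge` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass

# Allowed (source label, target label) pairs per edge type, copied from the table above.
EDGE_SCHEMA = {
    "MANAGES": {("Person", "Person")},
    "MEMBER_OF": {("Person", "Team")},
    "HAS_ROLE": {("Person", "Role")},
    "HAS_ACCESS_TO": {("Person", "Application"), ("Role", "Application")},
    "OWNS": {("Team", "Application")},
    "CREATED_BY": {("Document", "Person"), ("Ticket", "Person")},
}


@dataclass(frozen=True)
class Edge:
    edge_type: str
    from_label: str
    to_label: str


def validate_edge(edge: Edge) -> bool:
    """Return True if the edge type exists and its endpoints match the schema."""
    allowed = EDGE_SCHEMA.get(edge.edge_type)
    return allowed is not None and (edge.from_label, edge.to_label) in allowed


print(validate_edge(Edge("MEMBER_OF", "Person", "Team")))   # valid direction
print(validate_edge(Edge("MEMBER_OF", "Team", "Person")))   # reversed, rejected
```

Because direction is part of the schema, a reversed edge such as Team → Person for MEMBER_OF is rejected before it reaches the graph.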

Example Graph Path

  EXAMPLE: "Who can approve Jane's access to Salesforce?"
  ═══════════════════════════════════════════════════════

  (Jane)──MEMBER_OF──>(Engineering Team)
    │                       │
    │                       └──OWNS──>(Internal Tools)
    │
    └<──MANAGES──(Sarah Chen - Manager)
                      │
                      └──HAS_ROLE──>(Approver Role)
                                         │
                                         └──HAS_ACCESS_TO──>(Salesforce)

  Traversal: Jane ← MANAGES ← Sarah → HAS_ROLE → Approver
  Answer: "Sarah Chen (Jane's manager) can approve Salesforce access.
           She has the Approver role with admin-level access."

  ANOTHER EXAMPLE: "What systems does the Security team own?"
  ═══════════════════════════════════════════════════════

  (Security Team)──OWNS──>(Okta)
        │           └──>(CrowdStrike)
        │           └──>(Vault)
        │           └──>(PagerDuty)
        │
        └<──MEMBER_OF──(Alice - Lead)
        └<──MEMBER_OF──(Bob - Engineer)
        └<──MEMBER_OF──(Carol - Analyst)

5 Deep Dive 2: Sync Pipeline

Dual Sync Strategy

Webhooks (real-time): HR system fires webhook when employee is hired, transferred, or terminated. AD fires webhook on group membership changes. Graph updated within seconds.
Nightly batch (completeness): Full reconciliation job runs nightly. Pulls complete data from all sources. Detects any changes missed by webhooks. Resolves conflicts.
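
The nightly reconciliation amounts to diffing a full source snapshot against what the graph currently holds. A minimal sketch, assuming both sides can be loaded as dictionaries keyed by entity id; the `reconcile` function and the sample ids are hypothetical.

```python
def reconcile(source: dict[str, dict], graph: dict[str, dict]) -> dict[str, list[str]]:
    """Diff a full source snapshot against the graph's current nodes.

    Both arguments map entity id -> properties. Returns the ids to create,
    update, and tombstone so that the graph converges on the source.
    """
    to_create = sorted(source.keys() - graph.keys())
    to_tombstone = sorted(graph.keys() - source.keys())
    to_update = sorted(
        key for key in source.keys() & graph.keys() if source[key] != graph[key]
    )
    return {"create": to_create, "update": to_update, "tombstone": to_tombstone}


# e2 changed title (missed webhook), e3 left the company, e4 is a new hire.
source = {"e1": {"title": "Engineer"}, "e2": {"title": "Manager"}, "e4": {"title": "Analyst"}}
graph = {"e1": {"title": "Engineer"}, "e2": {"title": "Engineer"}, "e3": {"title": "Designer"}}
plan = reconcile(source, graph)
print(plan)
```

Anything a webhook already applied shows up as unchanged in the diff, so the batch job only acts on what the real-time path missed.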

Conflict Resolution

Source of Truth Hierarchy

When two sources disagree about the same fact, the source system of record wins:

Org structure (manager, department): HR system (Workday) is truth
Group memberships: AD/LDAP is truth
Application ownership: CMDB is truth
Permissions: IAM system is truth

If Workday says Jane reports to Sarah but ServiceNow says Jane reports to Mike, Workday wins. Always.
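
The hierarchy can be expressed as a lookup from fact category to authoritative source. The sketch below is illustrative: the category keys and source names are hypothetical identifiers, and returning `None` when the system of record has not reported is one possible policy, not something the hierarchy above specifies.

```python
from typing import Optional

# Fact category -> authoritative source, per the hierarchy above.
SOURCE_OF_TRUTH = {
    "org_structure": "workday",
    "group_membership": "ad_ldap",
    "app_ownership": "cmdb",
    "permissions": "iam",
}


def resolve(category: str, claims: dict[str, str]) -> Optional[str]:
    """Pick the value reported by the system of record for this category.

    `claims` maps source name -> claimed value. Returns None when the
    authoritative source has not reported, so the caller can keep the
    existing value or flag the fact for review.
    """
    authority = SOURCE_OF_TRUTH[category]
    return claims.get(authority)


# Workday and ServiceNow disagree about Jane's manager; Workday wins.
print(resolve("org_structure", {"workday": "Sarah", "servicenow": "Mike"}))
```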

Change Detection

Hash comparison: Hash each node's properties and compare with the stored hash. Write to the graph only when the hash differs. Since most records are unchanged between syncs, this cuts redundant writes by an estimated 90%.
Tombstoning: Deleted entities aren't removed immediately. Marked as `status: inactive` with deletion timestamp. Purged after 30 days. Allows rollback and audit.
Edge validation: When a Person node is deactivated (employee leaves), automatically deactivate their MANAGES, MEMBER_OF, HAS_ACCESS_TO edges. Cascade rules per edge type.
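
Hash comparison and tombstoning can be sketched as follows. The example assumes node properties are JSON-serializable; the `deleted_at` property name is hypothetical, and cascading edge deactivation is left out for brevity.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

PURGE_AFTER = timedelta(days=30)  # retention window from the text


def property_hash(props: dict) -> str:
    """Stable hash of a node's properties (key order does not matter)."""
    canonical = json.dumps(props, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_update(incoming: dict, stored_hash: str) -> bool:
    """True only when the incoming properties differ from what was stored."""
    return property_hash(incoming) != stored_hash


def tombstone(node: dict, now: datetime) -> dict:
    """Mark a node inactive instead of deleting it."""
    return {**node, "status": "inactive", "deleted_at": now.isoformat()}


def is_purgeable(node: dict, now: datetime) -> bool:
    """True once an inactive node has outlived the retention window."""
    if node.get("status") != "inactive":
        return False
    return now - datetime.fromisoformat(node["deleted_at"]) >= PURGE_AFTER


deleted_on = datetime(2024, 1, 1, tzinfo=timezone.utc)
node = tombstone({"id": "e1", "status": "active"}, deleted_on)
print(is_purgeable(node, deleted_on + timedelta(days=29)))  # still retained
print(is_purgeable(node, deleted_on + timedelta(days=30)))  # eligible for purge
```

Serializing with sorted keys matters: two sources that emit the same properties in a different order should produce the same hash.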

6 Deep Dive 3: Agent Integration

How the AI Agent Uses the Graph

1 Context Queries

When a user starts a conversation, the agent enriches context by querying the graph:

  User: jane@acme.com starts a conversation

  Agent queries graph:
  ─────────────────────────────────────────
  MATCH (p:Person {email: "jane@acme.com"})
  OPTIONAL MATCH (p)-[:MEMBER_OF]->(t:Team)
  OPTIONAL MATCH (m:Person)-[:MANAGES]->(p)
  OPTIONAL MATCH (p)-[:HAS_ACCESS_TO]->(a:Application)
  RETURN p, t, m, a

  Result enriches conversation context:
  "Jane is in Engineering, managed by Sarah,
   has access to Jira, GitHub, Salesforce."
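
A query with several OPTIONAL MATCH clauses returns one row per combination of matches, so a person with one team, one manager, and three applications comes back as three rows. The sketch below collapses such rows into a single context record. It assumes each row has already been reduced to plain names keyed by the RETURN variables; `summarize_context` is a hypothetical helper, not part of any driver API.

```python
def summarize_context(rows: list[dict]) -> dict:
    """Collapse the row-per-combination result into one context record.

    Each row is assumed to look like
    {"p": name, "t": team or None, "m": manager or None, "a": app or None}.
    """
    if not rows:
        return {}

    def distinct(key: str) -> list[str]:
        # Keep first-seen order while dropping duplicates and NULL matches.
        return list(dict.fromkeys(r[key] for r in rows if r.get(key) is not None))

    return {
        "person": rows[0]["p"],
        "teams": distinct("t"),
        "managers": distinct("m"),
        "applications": distinct("a"),
    }


rows = [
    {"p": "Jane", "t": "Engineering", "m": "Sarah", "a": "Jira"},
    {"p": "Jane", "t": "Engineering", "m": "Sarah", "a": "GitHub"},
    {"p": "Jane", "t": "Engineering", "m": "Sarah", "a": "Salesforce"},
]
context = summarize_context(rows)
print(context)
```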

2 Approval Routing (Traverse MANAGES)

Query: "Who should approve this request?" → Traverse MANAGES edges up the org chart until finding someone with the required approval authority.
Example: Jane requests admin access to production DB. Traverse: Jane → Sarah (Manager, can approve read-only) → VP Engineering (can approve admin). Route to VP.
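
The upward traversal can be sketched over an in-memory org chart. Everything here is illustrative: the `MANAGER_OF` map (the inverse of MANAGES), the authority levels, and the hop limit are assumptions, and a production version would run as a graph query instead.

```python
from typing import Optional

# Hypothetical org chart: report -> manager (the inverse of MANAGES).
MANAGER_OF = {"jane": "sarah", "sarah": "vp_eng"}

# Hypothetical approval authority per person, ordered by strength.
LEVELS = {"read-only": 1, "admin": 2}
CAN_APPROVE = {"sarah": "read-only", "vp_eng": "admin"}


def find_approver(requester: str, required: str, max_hops: int = 10) -> Optional[str]:
    """Walk up the management chain to the first person with enough authority."""
    seen = {requester}
    current = MANAGER_OF.get(requester)
    hops = 0
    while current is not None and hops < max_hops:
        if current in seen:  # guard against cycles caused by bad source data
            return None
        seen.add(current)
        if LEVELS.get(CAN_APPROVE.get(current, ""), 0) >= LEVELS[required]:
            return current
        current = MANAGER_OF.get(current)
        hops += 1
    return None


print(find_approver("jane", "read-only"))  # Sarah can approve read-only
print(find_approver("jane", "admin"))      # escalates to the VP
```

The cycle guard and hop limit are worth keeping even in a query-based version, since a bad sync can produce a management loop.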

3 Permission Checking (Traverse HAS_ACCESS_TO)

Query: "Does Jane have access to Salesforce?" → Traverse HAS_ACCESS_TO from Jane to Salesforce. Check access_level property.
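Transitive access: If there is no direct edge, check grants inherited through membership. Example: Jane is MEMBER_OF Sales Team and Sales Team HAS_ACCESS_TO Salesforce, so Jane has access through the team. This requires allowing Team as a source of HAS_ACCESS_TO, which the edge table lists only for Person and Role.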
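
A direct-then-inherited check might look like the sketch below. The edge data is hypothetical, and it assumes a grant may be attached to a person, a role, or a team. It returns the first grant found rather than the strongest, which is a simplification.

```python
from typing import Optional

# Hypothetical edge data; HAS_ACCESS_TO sources may be a person, role, or team.
MEMBER_OF = {"jane": {"sales_team"}}
HAS_ROLE = {"jane": set()}
HAS_ACCESS_TO = {
    ("sales_team", "salesforce"): "read",
    ("jane", "jira"): "write",
}


def access_level(person: str, app: str) -> Optional[tuple[str, str]]:
    """Return (access_level, granted_via) for the first grant found, else None.

    Direct grants are checked first, then grants inherited via roles and teams.
    """
    principals = [
        person,
        *sorted(HAS_ROLE.get(person, ())),
        *sorted(MEMBER_OF.get(person, ())),
    ]
    for principal in principals:
        level = HAS_ACCESS_TO.get((principal, app))
        if level is not None:
            return level, principal
    return None


print(access_level("jane", "jira"))        # direct grant
print(access_level("jane", "salesforce"))  # inherited through the team
```

Returning the principal that carries the grant lets the agent explain why access exists, not just whether it does.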

4 Smart Suggestions

Similar ticket routing: "5 similar tickets from Jane's team were resolved by the Database team" → Suggest routing to Database team.
Access recommendations: "90% of Jane's team members have access to Tableau. Jane doesn't." → Proactively suggest access request.
Expert finding: "Who on the Security team has the most resolved tickets for VPN issues?" → Graph traversal + ticket aggregation.
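
The access recommendation can be sketched as a coverage count over teammates. The 0.9 threshold mirrors the 90% example above; the function name and sample data are hypothetical.

```python
def recommend_access(
    person: str,
    team: list[str],
    access: dict[str, set[str]],
    threshold: float = 0.9,
) -> list[str]:
    """Suggest apps that at least `threshold` of the person's teammates have."""
    peers = [member for member in team if member != person]
    if not peers:
        return []
    mine = access.get(person, set())
    counts: dict[str, int] = {}
    for peer in peers:
        for app in access.get(peer, set()):
            counts[app] = counts.get(app, 0) + 1
    return sorted(
        app
        for app, n in counts.items()
        if n / len(peers) >= threshold and app not in mine
    )


team = ["jane", "ann", "ben", "cho", "dev"]
access = {
    "jane": {"github"},
    "ann": {"github", "tableau"},
    "ben": {"github", "tableau", "figma"},
    "cho": {"github", "tableau"},
    "dev": {"github", "tableau"},
}
print(recommend_access("jane", team, access))
```

Here every teammate has Tableau and Jane does not, so Tableau is suggested; Figma is held by only one of four peers and is ignored.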

7 Scaling & ML

Scaling Strategies

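Per-tenant graph partitioning: Each tenant gets their own Neo4j database, which makes cross-tenant traversal impossible by construction. If tenants instead share a database as labeled subgraphs, every query must filter on tenant_id to get the same isolation.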
Read replicas: Agent queries hit read replicas. Writes go to primary. Replication lag <1s.
Query caching: Cache frequent traversal results in Redis. "Jane's manager" doesn't change often — cache with TTL 1 hour. Cache invalidated on webhook updates.
Index strategy: Composite indexes on (tenant_id, email), (tenant_id, app_name), (tenant_id, team_name) for fast lookups.
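
The caching behavior (1-hour TTL plus explicit invalidation from the webhook path) can be illustrated with a small in-process stand-in for Redis. The class and key format are hypothetical; keys are scoped by tenant so one tenant's cached paths can never be served to another.

```python
import time


class TTLCache:
    """Minimal in-process stand-in for the Redis cache described above."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: dict[tuple[str, str], tuple[float, object]] = {}

    def get(self, tenant_id: str, key: str):
        entry = self._store.get((tenant_id, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[(tenant_id, key)]  # lazily evict expired entries
            return None
        return value

    def set(self, tenant_id: str, key: str, value) -> None:
        self._store[(tenant_id, key)] = (self._clock() + self._ttl, value)

    def invalidate(self, tenant_id: str, key: str) -> None:
        """Called from the webhook path when the underlying fact changes."""
        self._store.pop((tenant_id, key), None)


cache = TTLCache()
cache.set("acme", "manager:jane", "sarah")
print(cache.get("acme", "manager:jane"))
cache.invalidate("acme", "manager:jane")  # e.g. Workday reports a transfer
print(cache.get("acme", "manager:jane"))
```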

ML Enhancements

Graph embeddings: Use Node2Vec or GraphSAGE to create vector embeddings of nodes. Similar people/teams cluster together. Enables "find people similar to X" queries.
Link prediction: Predict missing edges. "Jane's team all have Tableau access except Jane" → Predict SHOULD_HAVE_ACCESS edge. Surface as recommendation.
Anomaly detection: Detect unusual patterns. "User has admin access to 50 applications" → likely over-provisioned. "Manager has 100 direct reports" → likely data quality issue.
Community detection: Identify informal communities (people who collaborate across team boundaries). Use Louvain or Label Propagation algorithms. Helps agent route cross-team requests.
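
The two anomaly examples above can be approximated with plain threshold rules before any model is involved. This sketch assumes edges are available as (type, from, to, properties) tuples; the thresholds come from the examples in the text and the flag names are hypothetical.

```python
from collections import Counter

# Thresholds taken from the examples in the text; tune per tenant.
MAX_ADMIN_APPS = 50
MAX_DIRECT_REPORTS = 100


def find_anomalies(edges: list[tuple[str, str, str, dict]]) -> list[tuple[str, str]]:
    """Flag over-provisioned users and implausibly large reporting lines.

    Each edge is (edge_type, from_id, to_id, properties).
    """
    admin_apps = Counter()
    direct_reports = Counter()
    for edge_type, src, _dst, props in edges:
        if edge_type == "HAS_ACCESS_TO" and props.get("access_level") == "admin":
            admin_apps[src] += 1
        elif edge_type == "MANAGES":
            direct_reports[src] += 1
    flagged = [
        (user, "over_provisioned")
        for user, n in admin_apps.items()
        if n >= MAX_ADMIN_APPS
    ]
    flagged += [
        (manager, "suspicious_report_count")
        for manager, n in direct_reports.items()
        if n >= MAX_DIRECT_REPORTS
    ]
    return sorted(flagged)


edges = [("HAS_ACCESS_TO", "root_user", f"app{i}", {"access_level": "admin"}) for i in range(50)]
edges += [("MANAGES", "mega_manager", f"emp{i}", {}) for i in range(100)]
edges += [("MANAGES", "sarah", "jane", {})]
print(find_anomalies(edges))
```

Fixed thresholds are a baseline; a learned detector would more plausibly compare each user against peers in the same team or role.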

8 Cheat Sheet

Enterprise Knowledge Graph — Key Numbers

17.5M nodes, 175M edges across 350 tenants
1M graph queries/day, <50ms for 2-hop traversal
7 node types: Person, Team, Role, Application, Permission, Document, Ticket
6 edge types: MANAGES, MEMBER_OF, HAS_ROLE, HAS_ACCESS_TO, OWNS, CREATED_BY
Neo4j with per-tenant partitioning
Webhooks (real-time) + nightly batch (completeness)
Conflict resolution: source system of record wins
Agent uses graph for: context, approval routing, permission checks, suggestions
Graph embeddings (Node2Vec) for "similar people" queries
Link prediction for access recommendations
