When a defense contractor starts asking about AI — and they all are now — the first question is almost always about the vendor. Which model? Which platform? What does Microsoft charge? These are the wrong first questions.

The right first question is: which Azure tenant are you running this in? And the second question is: have you drawn the system boundary, and does your SSP reflect it?

I've deployed a production enterprise AI assistant inside a CMMC-scoped GCC High tenant — Azure OpenAI, Azure AI Search, CosmosDB, Bot Framework, the full stack. This is not a POC. It runs in production, it handles queries against a corpus of documents that includes controlled information, and it was built from the ground up to support an SSP entry that could survive a DCSA review. Here is what I learned doing that, and what contractors considering similar deployments need to understand before they start.

The Fundamental Difference Between GCC High and Commercial Azure

GCC High is not a configuration of commercial Azure. It is a separate cloud environment, operated by a separate Microsoft entity — Microsoft Government Cloud — with separate endpoints, separate data centers, a separate Microsoft Entra ID tenant infrastructure, and a separate authorization boundary. When CMMC requirements specify that CUI must be protected in a FedRAMP High-authorized environment, they mean GCC High (or an equivalent DoD Impact Level 4/5 authorized cloud). They do not mean commercial Azure, regardless of how carefully you configure it.

This is the part that catches contractors who have done their Azure research in the commercial world. The FedRAMP authorization that matters is not the FedRAMP Moderate authorization on commercial Azure. It is the authorization on the specific environment where your data lives. Commercial Azure has FedRAMP authorizations. Those authorizations do not extend to your GCC High requirements. You need to be in GCC High — full stop.

The practical consequence: if you built your AI capability in a commercial Azure tenant because that's where your development team already works, and you're now trying to figure out how to make it compliant, the answer is not "configure more security controls." The answer is "migrate to GCC High or build a new deployment there." The environments are not interchangeable.

What's Technically Different About GCC High

Endpoints: GCC High uses *.us endpoints throughout. Authentication is login.microsoftonline.us, not login.microsoftonline.com. Microsoft Graph is graph.microsoft.us, not graph.microsoft.com. Every integration you build — every OAuth flow, every API call, every service connection — needs to be written against GCC High endpoints. Code written for commercial Azure that assumes commercial endpoints will fail in GCC High, often in ways that aren't immediately obvious during testing if you're testing from a commercial context.

Microsoft Graph: The GCC High Graph endpoint has a different API surface than commercial Graph. Some APIs available in commercial are not available in GCC High, or are available with different permissions models. Applications that rely heavily on Graph — Teams integrations, SharePoint document access, user directory queries — need to be validated against GCC High Graph specifically. Don't assume parity.

Service availability: Not every Azure service available in commercial is available in GCC High, and availability in GCC High often lags commercial by months to years for new features and services. This is the most operationally consequential difference. Azure OpenAI is available in GCC High — but the model lineup is narrower than commercial, the deployment regions are limited, and new model releases appear in GCC High after commercial. Azure AI Search (formerly Azure Cognitive Search) is available. Azure Machine Learning has limited availability. Some cognitive services are simply not available.

Data residency: GCC High data is stored in U.S. data centers operated by U.S. persons who have undergone personnel security screening. This is the compliance requirement the environment is built around. Commercial Azure offers data residency configurations, but the personnel controls are different. For CMMC and for any program with ITAR implications, this distinction matters.

Azure AI Services in GCC High: What's Available and What Isn't

Before you architect anything, validate current service availability in GCC High against your specific deployment region. The situation is not static — availability improves over time — but designing for commercial availability and then discovering gaps at deployment time is an expensive mistake.

As of this writing: Azure OpenAI is available in GCC High with GPT-4 model variants (specific deployments vary by region; USGov Virginia and USGov Arizona are the primary regions). Azure AI Search is available and is the correct vector store choice for GCC High RAG deployments. Azure Cosmos DB is available. Azure Blob Storage is available. Azure Monitor and Log Analytics are available. Azure Virtual Networks and private endpoints are available and are required for compliant architectures.

What is restricted or unavailable: DALL-E image generation, some newer GPT-4o variant features, Azure OpenAI Assistants API (availability lags commercial), some Azure Cognitive Services categories (Face API, certain language services). Azure Bot Service has availability in GCC High but the feature set lags commercial.

The practical design rule: architect against GCC High availability from day one. Do not build in commercial and migrate. Build a proof of concept in GCC High — even a minimal one — before committing to an architecture that depends on services you haven't validated in the target environment.

The Compliant RAG Architecture

A compliant retrieval-augmented generation deployment in GCC High for CMMC-scoped work looks like this. Every component is in GCC High. Nothing touches commercial Azure. Nothing is publicly accessible.

Document corpus and storage: Azure Blob Storage in GCC High. Access controlled via Azure RBAC tied to Entra ID (GCC High tenant) groups — no anonymous access, no shared access signatures with broad permissions. Versioning enabled. Soft delete enabled. Containers locked to specific service identities. If the documents contain CUI, the storage account is in scope for every relevant NIST SP 800-171 control family — AC, AU, CM, IA, SC.

Vector store and retrieval: Azure AI Search in GCC High. Private endpoint — the service has no public internet access. The index contains vector embeddings of the document corpus, generated at ingestion time. Search is invoked by the application layer using the service's API key or managed identity — never directly by end users. Index updates are triggered by a pipeline that processes new documents from Blob Storage.

AI model endpoint: Azure OpenAI in GCC High. Private endpoint — no public internet access. The model deployment (GPT-4 variant) receives context from the retrieval layer and generates responses. System prompt enforces scope and behavioral constraints. The Azure OpenAI resource is accessed by the application layer using a managed identity, not an API key stored in code or configuration.

Conversation history: Azure Cosmos DB in GCC High. Conversation history is CUI if the documents being queried contain CUI — it stores the questions users asked and the context retrieved. Treat it accordingly. Retention policy defined and enforced. Access controlled to the application identity only.

Authentication: Azure Entra ID in the GCC High tenant. All users authenticate via Entra before accessing the application. Multi-factor authentication enforced. Role-based access controls determine which users can query which document sets. Service-to-service communication uses managed identities — no credentials in code.

Network: All services on a private virtual network in GCC High. Private endpoints for every service — Azure OpenAI, Azure AI Search, Cosmos DB, Blob Storage. Network Security Groups with explicit allow rules: deny all inbound from internet, allow specific application-tier traffic, allow Azure Monitor traffic for logging. No public internet exposure for any backend service. Application front-end may be exposed, but sits in front of the private network and handles its own authentication before passing requests to backend services.

Audit logging: Azure Monitor and Log Analytics in GCC High. Diagnostic settings enabled on every service — every API call to Azure OpenAI logged, every search query logged, every storage access logged, every authentication event logged. This is required for NIST SP 800-171 AU control family compliance. The AU.3.045 requirement for audit record review means someone needs to actually look at these logs on a defined schedule — not just collect them.

The SSP System Boundary Diagram You Must Draw Before Writing Code

Before writing code, before provisioning resources, draw the boundary. Your SSP needs a system boundary diagram that shows: every Azure service in scope, data flows between services, external interfaces (end-user endpoints, authentication flows, external system integrations), and the network perimeter. The diagram should be specific enough that an assessor looking at it can understand how CUI flows through the system and what controls exist at each junction.

If you cannot draw this diagram, you cannot defend your scope to a C3PAO assessor. "We use Azure OpenAI in GCC High" is not a system boundary. "Azure OpenAI (private endpoint, USGov Virginia, accessed via managed identity from App Service in the same VNet) receives queries from the application layer after user authentication via Entra ID MFA; responses are returned to the application and conversation history is written to Cosmos DB (private endpoint, same VNet); no CUI leaves the GCC High boundary" is a system boundary that can be documented, assessed, and defended.

What the Assessor Will Ask

A C3PAO assessor reviewing an AI deployment in a CMMC Level 2 assessment will look for specific things. Is this system documented in the SSP? Are all Azure services listed in the system component inventory? Are the network controls — private endpoints, NSGs, VNet configuration — documented and consistent with the actual configuration? Is there an audit trail showing who queried what and when? Is there an acceptable use policy specifically covering the AI system? Is there documentation of how the model was evaluated for appropriate use with CUI?

They will also look at whether the SSP was amended when the AI system was deployed or whether it still describes the pre-AI environment. A legacy SSP with a paragraph about "we might use AI in the future" is a finding. The SSP must describe the system as it actually exists, including the AI components, their data flows, and the controls applied to them.

Getting the architecture right is necessary but not sufficient. The documentation needs to match the architecture, and the architecture needs to be defensible under each of the NIST SP 800-171 Rev 3 control families in scope.

Building AI in GCC High for CMMC-scoped work?

Fulcrum Advisory has deployed this architecture in production — and written the SSP amendment, acceptable use policy, and data handling procedures to go with it.

Schedule a Call