In Part 3 we ran the RFP evaluator on Azure OpenAI. Now we point it at Google Vertex AI and Gemini, again by changing only configuration. The interesting difference is authentication: GCP uses Application Default Credentials (ADC), which works quite differently from an Azure API key. The graph stays exactly as it was.
Why Google Vertex AI
Vertex AI is the natural fit for teams already on Google Cloud who want Gemini models, GCP-native IAM, and regional control. Gemini 2.5 Flash is fast and inexpensive for high-volume proposal scoring.
Step 1 — Project and API
gcloud config set project YOUR_PROJECT_ID
gcloud services enable aiplatform.googleapis.com
Step 2 — Authentication (ADC)
You have two clean options. For local development, use your own credentials:
gcloud auth application-default login
For a server or container, create a service account and a key file:
gcloud iam service-accounts create rfpeval-sa \
  --display-name "RFP evaluator"
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member "serviceAccount:rfpeval-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/aiplatform.user"
gcloud iam service-accounts keys create gcp-service-account.json \
  --iam-account rfpeval-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com
# Quote the path so a working directory containing spaces still resolves
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/gcp-service-account.json"
The minimal role is roles/aiplatform.user. Keep the key file out of git (our .gitignore already excludes gcp-service-account*.json).
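Before calling Vertex, it can help to confirm which credential file ADC is likely to pick up. The sketch below is a hypothetical pre-flight helper (`describe_adc_source` is not part of the project or of any Google library) that uses only the standard library. It checks `GOOGLE_APPLICATION_CREDENTIALS` first, then looks for the file written by `gcloud auth application-default login`, assumed here to sit at the default Linux/macOS path. It only inspects local files, so it cannot tell whether the key is still valid or whether the role has been granted.

```python
import json
import os
from pathlib import Path

# Default location written by `gcloud auth application-default login`
# on Linux and macOS (Windows keeps it under %APPDATA% instead).
USER_ADC_PATH = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"


def describe_adc_source(env=None, user_adc_path=USER_ADC_PATH):
    """Return a short description of the credential file ADC would likely use.

    Hypothetical helper for illustration: it reads local files only and
    never contacts Google, so a positive answer does not prove the key works.
    """
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")

    if key_path:
        path = Path(key_path)
        if not path.is_file():
            return f"GOOGLE_APPLICATION_CREDENTIALS is set but {path} does not exist"
        info = json.loads(path.read_text(encoding="utf-8"))
        # Service-account key files carry "type" and "client_email" fields.
        kind = info.get("type", "unknown")
        who = info.get("client_email", "unknown identity")
        return f"key file ({kind}) for {who}"

    if Path(user_adc_path).is_file():
        return "user credentials from gcloud auth application-default login"

    return "no local credentials found"


if __name__ == "__main__":
    print(describe_adc_source())
```

Running this inside the container is a quick way to tell a missing mount apart from a permissions problem before any model call is made.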
Step 3 — Configure .env
LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
GOOGLE_CLOUD_LOCATION=us-central1
VERTEX_CHAT_MODEL=gemini-2.5-flash
VERTEX_EMBEDDING_MODEL=text-embedding-005
# plus: GOOGLE_APPLICATION_CREDENTIALS pointing at the service-account JSON
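The variables above map naturally onto a small settings object. The following is a minimal sketch, not the project's actual settings module: `VertexSettings` and `load_vertex_settings` are hypothetical names, the attribute names are simply the variable names in lower case, and the `TEMPERATURE` variable with its default of 0.0 is an assumption. The fallback values mirror the `.env` example. The loader fails fast when the project id is missing or still a placeholder.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class VertexSettings:
    google_cloud_project: str
    google_cloud_location: str
    vertex_chat_model: str
    vertex_embedding_model: str
    temperature: float


def load_vertex_settings(env=None):
    """Build VertexSettings from environment variables (illustrative sketch)."""
    env = os.environ if env is None else env
    if env.get("LLM_PROVIDER") != "vertex":
        raise ValueError("LLM_PROVIDER must be 'vertex' for this configuration")

    project = env.get("GOOGLE_CLOUD_PROJECT", "")
    if not project or project == "YOUR_PROJECT_ID":
        raise ValueError("GOOGLE_CLOUD_PROJECT is missing or still a placeholder")

    return VertexSettings(
        google_cloud_project=project,
        google_cloud_location=env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        vertex_chat_model=env.get("VERTEX_CHAT_MODEL", "gemini-2.5-flash"),
        vertex_embedding_model=env.get("VERTEX_EMBEDDING_MODEL", "text-embedding-005"),
        # TEMPERATURE is an assumed variable name, not taken from the article.
        temperature=float(env.get("TEMPERATURE", "0.0")),
    )
```

Validating at load time turns a confusing runtime failure into an immediate configuration error.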
The factory from Part 2 returns, when LLM_PROVIDER=vertex:
from langchain_google_vertexai import ChatVertexAI

ChatVertexAI(
    model=settings.vertex_chat_model,
    project=settings.google_cloud_project,
    location=settings.google_cloud_location,
    temperature=settings.temperature,
)
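How the Part 2 factory is implemented is not repeated here. One plausible shape is a dispatch on the provider name with a lazy import, so the Google package is only loaded when Vertex is selected. In the sketch below, `build_chat_model` and `vertex_kwargs` are hypothetical names, and `settings` is any object exposing the attributes used above plus an assumed `llm_provider` attribute. Only the `ChatVertexAI` call itself comes from the article.

```python
from types import SimpleNamespace


def vertex_kwargs(settings):
    """Collect the keyword arguments passed to ChatVertexAI."""
    return {
        "model": settings.vertex_chat_model,
        "project": settings.google_cloud_project,
        "location": settings.google_cloud_location,
        "temperature": settings.temperature,
    }


def build_chat_model(settings):
    """Return a chat model for the configured provider (illustrative sketch)."""
    provider = settings.llm_provider  # assumed attribute mirroring LLM_PROVIDER
    if provider == "vertex":
        # Imported lazily so other providers do not need the Google package.
        from langchain_google_vertexai import ChatVertexAI

        return ChatVertexAI(**vertex_kwargs(settings))
    raise ValueError(f"Unsupported LLM_PROVIDER in this sketch: {provider!r}")


if __name__ == "__main__":
    demo = SimpleNamespace(
        llm_provider="vertex",
        vertex_chat_model="gemini-2.5-flash",
        google_cloud_project="YOUR_PROJECT_ID",
        google_cloud_location="us-central1",
        temperature=0.0,
    )
    # Printing the kwargs needs neither credentials nor the Google package.
    print(vertex_kwargs(demo))
```

Keeping the keyword mapping in its own function makes the provider switch easy to unit test without network access.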
Step 4 — Run it (with Docker)
Mount the service-account key into the container and point ADC at it:
docker compose run --rm \
  -v "$PWD/gcp-service-account.json:/app/sa.json:ro" \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/sa.json \
  --env-file .env \
  app python -m rfpeval.cli evaluate samples/sample_proposal.md --rfp sample_rfp
Same pipeline, same weighted scoring and human shortlist gate — now running on Gemini.
Vertex vs Azure: what actually differs
| Aspect | Azure OpenAI | Google Vertex AI |
|---|---|---|
| Auth | API key or Entra ID | ADC (service account / user creds) |
| Model reference | Deployment name | Model id (e.g. gemini-2.5-flash) |
| Region setting | Resource region | GOOGLE_CLOUD_LOCATION |
| Enablement | Resource + deployments | Enable aiplatform API + IAM role |
Troubleshooting & common errors
| Error | Cause | Fix |
|---|---|---|
| PermissionDenied: 403 | Service account lacks the role | Grant roles/aiplatform.user on the project |
| API [aiplatform.googleapis.com] not enabled | Vertex API disabled | gcloud services enable aiplatform.googleapis.com |
| DefaultCredentialsError | ADC not found in the container | Mount the key and set GOOGLE_APPLICATION_CREDENTIALS |
| Model not found in location | Model unavailable in the region | Use a supported region (e.g. us-central1) |
| 429 / quota exceeded | Vertex quota | Request a quota increase or add backoff |
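
For the 429 case, "add backoff" usually means retrying with exponentially growing, jittered delays. The helper below is a generic standard-library sketch, not the project's code: `call_with_backoff` is a hypothetical name, and the `is_retryable` predicate is left to the caller because the exact exception type raised for a quota error depends on the client library in use.

```python
import random
import time


def call_with_backoff(func, is_retryable, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying retryable failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt or on errors the caller will not retry.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...) up to
            # max_delay; the actual wait is drawn uniformly below that ceiling
            # so parallel workers do not retry in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

A call such as `call_with_backoff(lambda: model.invoke(prompt), is_retryable=...)` would wrap a single model invocation; `model` and `prompt` are placeholders here. Backoff only smooths over short bursts, so a sustained 429 still calls for a quota increase.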
What’s next
Two cloud providers down, one config switch each. In Part 5 we go fully self-hosted: Ollama on an Azure GPU VM, for confidential bids that must never leave your infrastructure.
Frequently asked questions
How is Vertex authentication different from Azure?
Azure uses an API key (or Entra ID); Vertex uses Application Default Credentials — either your gcloud user login locally, or a service-account JSON via GOOGLE_APPLICATION_CREDENTIALS on a server.
Which IAM role does the service account need?
roles/aiplatform.user is sufficient to call Vertex AI models.
Did the application code change versus Azure?
No. Only .env (and mounting the credentials). The factory returns a ChatVertexAI model and the graph is unchanged.
Conclusion
Google Vertex AI joined the system with another configuration-only provider switch; the only real new concept was ADC-based authentication. Continue to Part 5: self-hosting with Ollama on an Azure GPU VM.
Independent educational project; not affiliated with any employer; not procurement or legal advice.
