Run the RFP Evaluator on Google Vertex AI (Part 4)


In Part 3 we ran the RFP evaluator on Azure OpenAI. Now we point it at Google Vertex AI and Gemini, again by changing only configuration. The interesting difference is authentication: GCP uses Application Default Credentials (ADC), which works quite differently from an Azure API key. The graph stays exactly as it was.

Why Google Vertex AI

Vertex AI is the natural fit for teams already on Google Cloud who want Gemini models, GCP-native IAM, and regional control. Gemini 2.5 Flash is fast and inexpensive for high-volume proposal scoring.

Step 1 — Project and API

gcloud config set project YOUR_PROJECT_ID
gcloud services enable aiplatform.googleapis.com

Step 2 — Authentication (ADC)

You have two clean options. For local development, use your own credentials:

gcloud auth application-default login

For a server or container, create a service account and a key file:

gcloud iam service-accounts create rfpeval-sa \
  --display-name "RFP evaluator"

gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member "serviceAccount:rfpeval-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/aiplatform.user"

gcloud iam service-accounts keys create gcp-service-account.json \
  --iam-account rfpeval-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com

export GOOGLE_APPLICATION_CREDENTIALS="$PWD/gcp-service-account.json"

The roles/aiplatform.user role is enough to call Vertex AI models, so the service account needs nothing broader. Keep the key file out of git (our .gitignore already excludes gcp-service-account*.json).
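To see which of the two options ADC would pick up on a given machine, a small stdlib-only check can follow the documented lookup order: the GOOGLE_APPLICATION_CREDENTIALS variable first, then the file written by gcloud auth application-default login, then the metadata server. This is a minimal sketch, not part of the project. adc_source is a hypothetical helper, the user-credentials path is the Linux/macOS default (Windows uses a different location), and it only checks that files exist, not that the credentials inside them are valid.

```python
import os
from pathlib import Path


def adc_source(env=None, home=None):
    """Describe which credential source ADC would try first.

    Hypothetical helper: it mirrors the ADC lookup order but does not
    load or validate any credentials.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. An explicit credential file always wins.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        if Path(key_path).is_file():
            return f"credential file: {key_path}"
        return f"GOOGLE_APPLICATION_CREDENTIALS is set but the file is missing: {key_path}"

    # 2. File written by `gcloud auth application-default login`
    #    (assumed Linux/macOS default location).
    user_file = home / ".config" / "gcloud" / "application_default_credentials.json"
    if user_file.is_file():
        return f"gcloud user credentials: {user_file}"

    # 3. Nothing local: on GCP compute, ADC would ask the metadata server.
    return "no local credentials; ADC would fall back to the metadata server"


if __name__ == "__main__":
    print(adc_source())
```

Running it inside the container before the first evaluation is a cheap way to catch a missing mount early.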

Step 3 — Configure .env

LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
GOOGLE_CLOUD_LOCATION=us-central1
VERTEX_CHAT_MODEL=gemini-2.5-flash
VERTEX_EMBEDDING_MODEL=text-embedding-005
# plus: GOOGLE_APPLICATION_CREDENTIALS pointing at the service-account JSON
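A typo in one of these variables tends to surface late, as a confusing error from the model client. The sketch below fails fast instead by reading the variables into a typed settings object. It is an illustration under assumptions: the real settings class from Part 2 is not shown here, VertexSettings and load_vertex_settings are hypothetical names, and the sketch assumes the .env values are already present in the process environment (as they are when Docker loads the file).

```python
import os
from dataclasses import dataclass

# Variables listed in the .env example above.
REQUIRED = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "VERTEX_CHAT_MODEL",
    "VERTEX_EMBEDDING_MODEL",
)


@dataclass(frozen=True)
class VertexSettings:
    """Hypothetical container for the Vertex-related settings."""
    google_cloud_project: str
    google_cloud_location: str
    vertex_chat_model: str
    vertex_embedding_model: str


def load_vertex_settings(env=None) -> VertexSettings:
    """Read and validate the Vertex settings, raising on anything missing."""
    env = os.environ if env is None else env

    if env.get("LLM_PROVIDER", "").strip().lower() != "vertex":
        raise ValueError("LLM_PROVIDER must be 'vertex' for this configuration")

    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))

    return VertexSettings(
        google_cloud_project=env["GOOGLE_CLOUD_PROJECT"],
        google_cloud_location=env["GOOGLE_CLOUD_LOCATION"],
        vertex_chat_model=env["VERTEX_CHAT_MODEL"],
        vertex_embedding_model=env["VERTEX_EMBEDDING_MODEL"],
    )
```

Passing a plain dict as env keeps the function easy to test without touching the real environment.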

When LLM_PROVIDER=vertex, the factory from Part 2 returns:

from langchain_google_vertexai import ChatVertexAI

ChatVertexAI(
    model=settings.vertex_chat_model,
    project=settings.google_cloud_project,
    location=settings.google_cloud_location,
    temperature=settings.temperature,
)
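For readers who skipped Part 2, the provider switch can be pictured as a small dispatch function. The sketch below is an assumption about its shape, not the project's actual factory: build_chat_model is a hypothetical name, only the Vertex branch is filled in, and the ChatVertexAI arguments are copied from the call above. The import sits inside the branch so that installs targeting another provider would not need the Vertex package.

```python
def build_chat_model(provider: str, settings):
    """Hypothetical dispatch: return a chat model for the configured provider."""
    name = provider.strip().lower()

    if name == "vertex":
        # Imported lazily so other providers do not require this package.
        from langchain_google_vertexai import ChatVertexAI

        return ChatVertexAI(
            model=settings.vertex_chat_model,
            project=settings.google_cloud_project,
            location=settings.google_cloud_location,
            temperature=settings.temperature,
        )

    # Other providers (for example the Azure branch from Part 3) would go here.
    raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
```

Because the graph only ever receives the returned model object, nothing downstream needs to know which branch ran.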

Step 4 — Run it (with Docker)

Mount the service-account key into the container and point ADC at it:

docker compose run --rm \
  -v "$PWD/gcp-service-account.json:/app/sa.json:ro" \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/sa.json \
  --env-file .env \
  app python -m rfpeval.cli evaluate samples/sample_proposal.md --rfp sample_rfp

Same pipeline, same weighted scoring and human shortlist gate — now running on Gemini.

Vertex vs Azure: what actually differs

| Aspect | Azure OpenAI | Google Vertex AI |
| --- | --- | --- |
| Auth | API key or Entra ID | ADC (service account / user creds) |
| Model reference | Deployment name | Model id (e.g. gemini-2.5-flash) |
| Region setting | Resource region | GOOGLE_CLOUD_LOCATION |
| Enablement | Resource + deployments | Enable aiplatform API + IAM role |

Troubleshooting & common errors

| Error | Cause | Fix |
| --- | --- | --- |
| PermissionDenied: 403 | Service account lacks the role | Grant roles/aiplatform.user on the project |
| API [aiplatform.googleapis.com] not enabled | Vertex API disabled | gcloud services enable aiplatform.googleapis.com |
| DefaultCredentialsError | ADC not found in the container | Mount the key and set GOOGLE_APPLICATION_CREDENTIALS |
| Model not found in location | Model unavailable in the region | Use a supported region (e.g. us-central1) |
| 429 / quota exceeded | Vertex quota | Request a quota increase or add backoff |
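The "add backoff" fix for 429 errors can be as simple as retrying with exponentially growing, randomized delays. The following is a generic sketch rather than project code: call_with_backoff and its parameters are hypothetical, and the caller supplies is_retryable because the exact exception type raised for a quota error depends on the client library version and should be verified locally.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying retryable failures with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            out_of_attempts = attempt == max_attempts - 1
            if out_of_attempts or not is_retryable(exc):
                raise
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

A call site might wrap the model invocation in a lambda and pass a predicate that recognizes the quota error. The jitter keeps many parallel proposal evaluations from retrying at the same instant.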

What’s next

Two cloud providers down, one config switch each. In Part 5 we go fully self-hosted: Ollama on an Azure GPU VM, for confidential bids that must never leave your infrastructure.

Frequently asked questions

How is Vertex authentication different from Azure?

Azure uses an API key (or Entra ID); Vertex uses Application Default Credentials — either your gcloud user login locally, or a service-account JSON via GOOGLE_APPLICATION_CREDENTIALS on a server.

Which IAM role does the service account need?

roles/aiplatform.user is sufficient to call Vertex AI models.

Did the application code change versus Azure?

No. Only .env (and mounting the credentials). The factory returns a ChatVertexAI model and the graph is unchanged.

Conclusion

Google Vertex AI joined the system with another one-line provider switch; the only real new concept was ADC-based authentication. Continue to Part 5: self-hosting with Ollama on an Azure GPU VM.

Independent educational project; not affiliated with any employer; not procurement or legal advice.
