Jonatas Silva

Tech Lead · Platform Admin

Recruiter Walkthrough

A guided tour by Jonatas Silva — the problem, the risks, the architecture, and how I'd execute in the first 90 days.

Jonatas Silva · Tech Lead Engineer · Full Stack · Agentic AI & Multi-tenant SaaS

Lexora AI Marketing OS — a production-shaped prototype I designed and built for law-firm AI marketing.

7+ years building production systems · 4+ years leading engineering teams · specialized in tenant isolation, async AI pipelines, and billing at scale.

I treat AI calls as engineered systems, not fetch requests: prompt chaining, fallback logic, output validation, per-client token cost tracking, and observability. This prototype demonstrates those patterns in a compliance-sensitive, multi-tenant context — everything is mock-driven, but the entities, flows and architecture mirror systems I ship in production.

7+ years production · 4+ years tech lead · Multi-tenant SaaS · BullMQ / Redis · Agentic AI

Professional profile

Tech Lead Engineer & Full Stack Developer specialized in multi-tenant SaaS with strict tenant isolation, agentic AI pipelines, async job architectures, and billing systems — owning products end to end from architecture to production.

Multi-tenant SaaS

Strict tenant isolation — scoped queries, middleware resolvers, indexed tenantId, RLS-style policies shipped in production.

Agentic AI systems

Prompt chaining, fallback logic, output validation, and orchestration across Claude, OpenAI, Gemini and Perplexity.

Async job architecture

BullMQ + Redis workers, retries, idempotency, dead-letter handling, and alerting for AI/content pipelines.

Billing & observability

Stripe checkout, webhooks, tier gating, per-client token cost tracking, audit trails, and integration health monitoring.

Languages

Portuguese (native) · English · French

Relevant experience

Background that maps directly to this prototype.

Tech Lead Engineer & Full Stack Developer

2018 — Present

Backend ownership and architecture of multi-tenant SaaS products with AI pipelines, async job systems, and billing — leading delivery from data model to production.

Multi-tenant data models with strict isolation (tenantId scoping + middleware resolver)
BullMQ + Redis async pipelines with retries, idempotent jobs, and failure handling
Agentic AI: OpenAI, Gemini, prompt chaining, fallback, structured output validation
Subscription billing: checkout, tier gating, webhooks, active/overdue/blocked state machines
Per-tenant audit trails, RBAC, rate limiting, and production observability
CI/CD on AWS — Docker, PM2, EC2/S3, zero-downtime releases, Prisma migrations

Tech Lead & Systems Engineer

2022 — 2025

Transmaion Transportes

Technical leadership of the engineering team — architecture, mentoring, and production reliability of internal platforms.

Led multi-developer team: code reviews, onboarding, architectural decisions
Real-time operational systems with async jobs, observability, and retry/alerting
Third-party API integrations with per-source health monitoring

Key projects → Lexora screens

Production systems I've shipped that informed this prototype.

UPVEND

Multi-tenant SaaS · BullMQ · AI

ERP/commerce SaaS — per-subdomain tenant isolation, BullMQ workers, plan checkout + webhooks, AI assistant via secured endpoints.

See related screen

Caixaly

Financial SaaS · OpenAI · Billing

Multi-tenant financial platform with master admin panel, OpenAI assistant with heuristic fallback, plans/tier model — live in production.

See related screen

TouchFind

Granular RBAC · Audit · Tier gating

Industrial SaaS — shared-schema isolation, resource:action permissions, full audit trail, subscription state machine enforced via middleware.

See related screen

Sales Launch

Agentic AI · Prompt chaining · BullMQ

AI sales-training platform — OpenAI + Gemini, dynamic scenarios, streamed responses, provider fallback, output evaluation rubrics.

See related screen

Core stack

Core

TypeScript · Node.js · React · Next.js · PostgreSQL · Supabase · Prisma · BullMQ · Redis · Stripe

AI & observability

Claude · OpenAI · Gemini · Perplexity · Langfuse · LangSmith · n8n

Platform & DevOps

NestJS · Express · AWS EC2/S3/ECS · Docker · CI/CD · WordPress REST

MBA Data Science, AI & Analytics — USP ESALQ (Dec/2026) · BSc Computer Science (2021) · LGPD & GDPR

The walkthrough

Ten talking points, each linked to the live screen that proves it.

1 · The product problem

Law firms publish marketing content that must be jurisdiction-compliant, fast to produce, and cheap. Today that runs through fragmented N8N flows plus an Express/ECS service: hard to version, weak retries, no per-tenant cost visibility, and real legal risk if a bad claim ships. Lexora consolidates this into one observable platform, and it is the same class of problem I've solved across UPVEND, Caixaly, and TouchFind.

See the dashboard

2 · Key technical risks

Non-deterministic AI output (hallucinations, malformed JSON, refusals), runaway token cost, cross-tenant data leakage, provider rate limits, and compliance violations that can't be 'rolled back' once published. Each risk has an explicit mitigation in the system — patterns I've applied in Sales Launch and production AI assistants.

Risk telemetry

3 · Proposed architecture

A Next.js App Router monolith (UI + API + Server Actions) fronting Supabase Postgres with RLS, Redis-backed BullMQ workers for async pipelines, AI observability tracing, Stripe for event-driven billing, and adapters for Claude/Gemini/Perplexity/DALL-E, CourtListener and WordPress.

Architecture map

4 · Replacing N8N with native workers

Every workflow becomes a typed BullMQ job contract: code-reviewed, versioned, retried with backoff, and pushed to a DLQ on exhaustion. Jobs carry tenant_id and emit tokens/cost/trace data — eliminating the visual-flow black box I've replaced in multiple production systems.
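As a rough illustration of a typed job contract, the sketch below pairs a compile-time type with a runtime guard, since queue payloads arrive as untyped JSON. Only tenant_id and the tokens/cost/trace telemetry come from the description above; the job name, the payload fields, and the helper names (parseDraftJob, telemetryFor) are hypothetical.

```typescript
// Hypothetical contract for one pipeline step. Only tenant_id is taken from
// the text; every other field name is an illustrative assumption.
interface DraftJob {
  name: "draft";
  version: number;
  tenant_id: string;
  payload: { topic: string; jurisdiction: string };
}

// Telemetry a job emits on completion (assumed shape).
interface JobTelemetry {
  tenant_id: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  traceId: string;
}

// Runtime guard: re-checks the contract before a worker touches the job.
function parseDraftJob(raw: unknown): DraftJob {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("job must be an object");
  }
  const job = raw as Record<string, unknown>;
  const name = job["name"];
  const version = job["version"];
  const tenantId = job["tenant_id"];
  const payload = job["payload"];
  if (name !== "draft") throw new Error("unexpected job name");
  if (typeof version !== "number") throw new Error("version must be a number");
  if (typeof tenantId !== "string" || tenantId.length === 0) {
    throw new Error("tenant_id is required");
  }
  if (typeof payload !== "object" || payload === null) {
    throw new Error("payload must be an object");
  }
  const p = payload as Record<string, unknown>;
  const topic = p["topic"];
  const jurisdiction = p["jurisdiction"];
  if (typeof topic !== "string" || typeof jurisdiction !== "string") {
    throw new Error("payload.topic and payload.jurisdiction must be strings");
  }
  return { name: "draft", version, tenant_id: tenantId, payload: { topic, jurisdiction } };
}

// Attributes usage to the tenant that owns the job.
function telemetryFor(
  job: DraftJob,
  usage: { inputTokens: number; outputTokens: number; costUsd: number; traceId: string },
): JobTelemetry {
  return { tenant_id: job.tenant_id, ...usage };
}
```

A worker would call parseDraftJob on the incoming job data first, so a malformed or tenant-less payload fails fast instead of reaching a provider call.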

Workers & DLQ

5 · Multi-tenant isolation

Isolation lives in Postgres. Row-level security policies bind every row to auth.jwt() ->> 'tenant_id', and FORCE ROW LEVEL SECURITY applies those policies to the table owner as well. A forgotten WHERE clause can't leak data; cross-tenant attempts return 0 rows and log a critical audit event. It is the same discipline I enforce with tenantId scoping + middleware resolvers.
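The observable behaviour described here (zero rows plus a critical audit event on a cross-tenant attempt) can be modelled in a few lines. This is an in-memory, application-level stand-in for illustration only, not Postgres RLS itself; the class and field names are assumptions.

```typescript
interface Row {
  id: string;
  tenant_id: string;
  title: string;
}

interface AuditEvent {
  severity: "critical";
  actorTenant: string;
  requestedTenant: string;
}

// In-memory stand-in for a tenant-scoped table (hypothetical, for illustration).
class TenantScopedStore {
  readonly audit: AuditEvent[] = [];
  private readonly rows: Row[];

  constructor(rows: Row[]) {
    this.rows = rows;
  }

  // sessionTenant is assumed to come from a verified session claim,
  // never from user input. requestedTenant models what the caller asked for.
  select(sessionTenant: string, requestedTenant: string = sessionTenant): Row[] {
    if (requestedTenant !== sessionTenant) {
      // Cross-tenant attempt: return nothing and record a critical event.
      this.audit.push({ severity: "critical", actorTenant: sessionTenant, requestedTenant });
      return [];
    }
    return this.rows.filter((row) => row.tenant_id === sessionTenant);
  }
}
```

The point of the model is that the filter is applied inside the store, so a caller that forgets to scope its own query still only sees its own rows.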

RLS isolation

6 · Per-tenant AI cost tracking

Each LLM call produces a trace with input/output tokens and cost, attributed to the tenant via the metering queue. Budget guardrails warn at 85% and hard-stop at 100%, reconciled with Stripe usage records — modeled after Caixaly and UPVEND billing flows.
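The guardrail thresholds can be expressed as a small pure function. A minimal sketch, assuming traces carry a cost in USD and each tenant has a monthly budget; the type and function names are hypothetical, and only the 85% and 100% thresholds come from the text.

```typescript
type BudgetDecision = "ok" | "warn" | "stop";

// One trace per LLM call (assumed shape).
interface LlmTrace {
  tenant_id: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Sums cost per tenant across a batch of traces.
function spendByTenant(traces: LlmTrace[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const trace of traces) {
    totals.set(trace.tenant_id, (totals.get(trace.tenant_id) ?? 0) + trace.costUsd);
  }
  return totals;
}

// Warn at 85% of budget, hard-stop at 100%.
function budgetDecision(spentUsd: number, budgetUsd: number): BudgetDecision {
  if (budgetUsd <= 0) return "stop"; // no budget configured: refuse to spend
  const ratio = spentUsd / budgetUsd;
  if (ratio >= 1) return "stop";
  if (ratio >= 0.85) return "warn";
  return "ok";
}
```

A worker would check budgetDecision before each provider call and skip the call on "stop"; in a real system the cost would more likely be tracked in integer cents or micro-dollars to avoid floating-point drift.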

Billing & usage

7 · Failures & retries

Layered resilience: retry with exponential backoff + jitter → fallback to a secondary provider → fall back to cached research and flag for human review. Exhausted jobs land in the DLQ with full payload and stacktrace for replay.
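The three layers can be sketched without any queue library. The version below uses "full jitter" (a random delay between 0 and the exponential ceiling), which is one common variant and an assumption here; the function names, default delays, and result shape are likewise hypothetical.

```typescript
// Full jitter: random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Layer 1: retry the same call, backing off between failed attempts.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts: number, baseMs: number = 500): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
  throw lastError ?? new Error("no attempts were made");
}

interface GenerationResult {
  text: string;
  source: "primary" | "secondary" | "cache";
  needsHumanReview: boolean;
}

// Layers 2 and 3: secondary provider, then cached research flagged for review.
async function generateWithFallback(
  primary: () => Promise<string>,
  secondary: () => Promise<string>,
  cached: () => string | undefined,
  maxAttempts: number = 3,
  baseMs: number = 500,
): Promise<GenerationResult> {
  try {
    return { text: await withRetry(primary, maxAttempts, baseMs), source: "primary", needsHumanReview: false };
  } catch {
    // primary exhausted: fall through to the secondary provider
  }
  try {
    return { text: await withRetry(secondary, maxAttempts, baseMs), source: "secondary", needsHumanReview: false };
  } catch {
    // secondary exhausted: fall through to cache
  }
  const fromCache = cached();
  if (fromCache !== undefined) {
    return { text: fromCache, source: "cache", needsHumanReview: true };
  }
  // Nothing left to try: the caller is expected to route the job to the DLQ.
  throw new Error("all layers exhausted");
}
```

Injecting rand keeps the delay calculation deterministic in tests, while production code uses Math.random by default.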

Retry in action

8 · AI quality & compliance monitoring

Beyond logs: structured traces capture evaluation, compliance and hallucination scores per generation. Test suites designed for non-deterministic output gate publishing; jurisdiction rule packs block prohibited claims before they ship.
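A rule pack can be as simple as a list of named patterns checked before publishing. The sketch below is illustrative only: the jurisdiction code, rule ids, and patterns are invented placeholders, not real advertising rules from any bar association.

```typescript
interface ProhibitedClaim {
  id: string;
  pattern: RegExp;
}

interface RulePack {
  jurisdiction: string;
  prohibited: ProhibitedClaim[];
}

// Placeholder pack; real packs would be authored and reviewed per jurisdiction.
const examplePack: RulePack = {
  jurisdiction: "XX",
  prohibited: [
    { id: "guaranteed-outcome", pattern: /\bguarantee[sd]?\b/i },
    { id: "superlative-claim", pattern: /\bbest (lawyer|attorney|law firm)\b/i },
  ],
};

interface ComplianceCheck {
  publishable: boolean;
  violations: string[];
}

// Returns the ids of every rule the draft violates; empty means publishable.
function checkDraft(text: string, pack: RulePack): ComplianceCheck {
  const violations = pack.prohibited
    .filter((rule) => rule.pattern.test(text))
    .map((rule) => rule.id);
  return { publishable: violations.length === 0, violations };
}
```

Pattern matching like this only catches literal phrasing; the evaluation and hallucination scores mentioned above would cover claims that a regex cannot express.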

Compliance layer

9 · Migrating without breaking prod

Strangler-fig: run BullMQ alongside N8N, migrate one pipeline at a time behind feature flags, shadow-run for parity, then cut over. RLS and observability land before legacy is removed, so we always have a rollback path.
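The flag-gated shadow run can be sketched as a wrapper around the two pipelines. This is a simplified, synchronous model under stated assumptions: the flag names, the result shape, and the JSON-based parity check are all hypothetical.

```typescript
type Pipeline<I, O> = (input: I) => O;

interface MigrationFlags {
  shadow: boolean; // run the new pipeline next to legacy, compare, discard
  cutover: boolean; // serve from the new pipeline
}

interface ShadowResult<O> {
  served: O;
  servedBy: "legacy" | "next";
  parity: boolean | null; // null when no comparison was made
}

function runMigrating<I, O>(
  input: I,
  legacy: Pipeline<I, O>,
  next: Pipeline<I, O>,
  flags: MigrationFlags,
): ShadowResult<O> {
  if (flags.cutover) {
    return { served: next(input), servedBy: "next", parity: null };
  }
  const served = legacy(input);
  if (!flags.shadow) {
    return { served, servedBy: "legacy", parity: null };
  }
  let parity = false;
  try {
    // Structural comparison via JSON: adequate for plain data, an assumption here.
    parity = JSON.stringify(next(input)) === JSON.stringify(served);
  } catch {
    // A crashing shadow run counts as a mismatch, never as an outage.
    parity = false;
  }
  return { served, servedBy: "legacy", parity };
}
```

Turning cutover off again restores the legacy path immediately, which is the rollback path the plan relies on. With AI-generated output, exact equality is usually too strict, so a real parity check would more likely compare structure or scores than raw text.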

Migration plan

10 · First 30/60/90 days

Stabilize and instrument first, migrate the riskiest pipelines next, then harden security, billing and compliance — finishing by decommissioning N8N and consolidating ECS into the monolith.

See the plan

30 / 60 / 90 Day Plan

How I'd de-risk and deliver in the first quarter.

First 30 days

Phase 1

1. Map current pipelines & failure modes
2. Identify critical failures and bottlenecks
3. Stand up operational metrics & dashboards
4. Define typed job contracts
5. Build the base RLS layer

First 60 days

Phase 2

1. Migrate highest-risk pipelines to BullMQ
2. Implement DLQ + retry strategy
3. Integrate AI observability tracing
4. Implement billing metering

First 90 days

Phase 3

1. Remove critical N8N dependencies
2. Compliance hardening per jurisdiction
3. Per-tenant monitoring & alerting
4. Executive dashboards
5. Regression testing for AI outputs

Thanks for reviewing.

This prototype was designed and built by Jonatas Silva. The UI exists to make the architecture legible — I'm happy to walk through any pipeline, the retry/DLQ model, the RLS design, or the AI observability schema in depth.

jonatasfelipe68@hotmail.com · +55 14 99116-4027 · LinkedIn
