Product, not PowerPoint: How to Evaluate Enterprise AI Partners 

A practical framework for enterprise AI vendor selection that prioritizes functional product. 

There is a simple truth in basketball: when someone claims they can dunk, you do not want their biography. You want to see them take off, rise above the rim, and throw it down. Until the ball goes through the hoop, everything else is just pregame chatter. 

Traditional business pitches are no different. Slide after slide explains talent, process, and commitment to excellence. Everyone insists they are fast, strategic, and powered by artificial intelligence. It all blends together. 

And just as in basketball, none of it matters until you see the dunk. 

Why Enterprise AI Partner Evaluation Has Changed 

I have spent the last year watching something shift in how enterprise buyers evaluate technology partners. The change is not subtle. AI collapsed the timeline for what is possible. Engineers use artificial intelligence to automate repetitive tasks, reveal gaps, and support rapid iteration. User experience teams model real behavior and refine interactions in a fraction of the usual time. Designers explore and adapt visual directions quickly while matching a client’s brand and needs. At the strategy level, artificial intelligence helps teams explore concepts, identify edge cases, and clarify problems before anyone designs anything or writes code. 

Teams can now build first versions far earlier than they once could. It is now possible to walk into a meeting with something real, rather than something hypothetical. 

Traditional Evaluation Arrives Too Late  

Yet enterprise evaluation still moves as if early builds take months. Teams can create quickly, but organizations are asked to decide slowly. Forrester’s 2024 Buyers’ Journey Survey reveals the scale of this shift: 92% of B2B buyers now start with at least one vendor in mind, and 41% have already selected their preferred vendor before formal evaluation even begins. Traditional vendor selection leans on slides that outline intent, case studies that point backward, and demos that highlight features. These keep judgment at arm’s length and often arrive too late to matter. 

An early milestone changes that dynamic. A deck explains. A first version proves. 

What Functional Products Reveal About AI Vendors 

A healthcare technology company came to us through a partner referral. They needed to modernize their pharmacy network’s web presence, which included hundreds of independent pharmacy websites, each with unique branding and content, all needing migration into a modern, SEO-optimized content management system. They had already sat through multiple vendor presentations that week. Each promised speed, AI capabilities, and transformation. 

At Robots & Pencils, we stopped presenting what we could do and started showing what we already built. 

Building the Functional Product in 10 Days 

Our team had a week and a half. Our engineers used AI agents to automate content scraping and migration. Our UX team modeled user flows and tested assumptions in days instead of weeks. Our designers explored visual directions that preserved each pharmacy’s brand identity while modernizing the experience. Our strategy team identified edge cases and clarified requirements before a single line of production code was written. 

We walked into the meeting with a functional product. 

The Client Demo: Testing Real Data in Real Time 

The client entered one of their pharmacy’s existing URLs into our interface. They selected brand colors. They watched our AI agents scrape content, preserve branding structure, and generate a modern, mobile-responsive website in real time. Within minutes, they were clicking through an actual functioning site built on a production-grade CMS with an administrative backend. This was not a mockup or a demo, but a working system processing their real data. 

The entire conversation shifted. They immediately started testing edge cases. What about mobile responsiveness? We showed them the mobile view that we had already built based on pre-meeting feedback. What about the administrative interface? We walked them through the CMS backend where content could be updated. They stopped asking, “Can you do this?” and started asking, “What else can we build together?” and “How quickly can we expand this?” 

After the meeting, their feedback was direct: “I appreciate the way you guys approached us. Going through the demo, it wasn’t just this nebulous idea anymore. It was impressive from a build standpoint and from an administration standpoint.” 

Why Early Functional Products Prevent Partnership Failures 

When clients see a working product, even in its earliest form, they lean forward. They explore. They ask questions. They do not want to return to a deck once they have interacted with actual software. And this is precisely why the approach works. 

Most enterprise partnerships that fail do not fail because of weak engineering or design. They fail because teams hold different pictures of the same future, and those differences stay hidden until it is too late to course correct easily. A shared early version fixes that. Everyone reacts to the same thing. Misalignments surface when stakes are low. You learn how a partner listens, how they adjust, and how you work through ambiguity together. No deck presentation can show these things. 

How Early Functional Delivery Transforms Vendor Selection 

The Baseline Iteration Before Contract Signing 

At Robots & Pencils, we think of this functional product as more than a prototype. It is the baseline iteration delivered before contract signing. It shapes how the partnership forms. The client comes into the work from the start. Their data, ideas, and context shape what gets built. 

Why This Approach Stays Selective 

Because this early delivery takes real effort and investment on our part, we keep the process selective. We reserve early functional product development for organizations that show clear intent and strong alignment. The early artifact becomes the first shared step forward, rather than the first sales step. 

The Lasting Impact on Partnership Formation 

When you start by delivering something meaningful, you set the tone for everything that follows. The moment that first version hits the court, the moment you see the lift, the rim, and the finish, the entire relationship changes. 

In the end, the same lesson from basketball holds true. People do not remember the talk. They remember the dunk. And we would rather spend our time building something real than explaining what we could build. 

If you want to explore what it looks like to begin with real work instead of a pitch, we would love to continue the conversation. Let’s talk. 


FAQs

How long does early functional delivery take to create? 

Early functional product delivery typically takes 5-10 days, depending on complexity and data availability. At Robots & Pencils, we focus on demonstrating how we interpret requirements, handle real constraints, and collaborate under actual conditions rather than achieving feature completeness. 

What makes this approach different from a proof of concept? 

Unlike traditional proofs of concept, our baseline iteration is built with the client’s actual data and reflects real-world constraints from day one. It demonstrates partnership dynamics and problem-solving approach, not just technical capability. 

Which types of organizations are best suited for this approach? 

Organizations that show clear intent, strong alignment on objectives, and readiness to engage collaboratively benefit most from early functional delivery. This approach works best when both parties are committed to testing the partnership through real work rather than presentations. 

Can this approach work for regulated industries like healthcare or financial services? 

Yes. We’ve successfully delivered early functional products for healthcare technology companies and financial services organizations. The approach adapts to industry-specific requirements while maintaining rapid delivery timelines. 

Robots & Pencils Opens Studio for Generative and Agentic AI in Bellevue

The Seattle-area AI Studio is live, growing, and hiring engineers and builders ready to deliver impact at velocity. 

Robots & Pencils, an applied AI engineering partner known for high-velocity delivery and measurable business outcomes, today announced the opening of its Studio for Generative and Agentic AI in Bellevue.  

Candidates seeking high-impact engineering, data, and design roles can learn more at robotsandpencils.com/careers. 

A Strategic Expansion to Meet Demand for Rapid Enterprise AI 

The Studio in downtown Bellevue is fully operational and actively building its founding team as enterprise demand accelerates for AI systems that move from experimentation to production with speed, precision, and accountability. 

The Studio expands Robots & Pencils’ AI-native delivery model and represents a significant step in the company’s U.S. growth, supported by global operations in Cleveland, Calgary, Toronto, Bogotá, and Lviv. It adds meaningful capacity to support organizations launching AI-enabled products, platforms, and agentic systems at scale. 

Strong Leadership Driving Focus and Velocity 

The Studio in Bellevue operates under the leadership of Jeff Kirk, Executive Vice President of Applied AI at Robots & Pencils, and reinforces the company’s growing presence in the Pacific Northwest while serving global clients pursuing ambitious AI initiatives. 

“This Studio is designed for builders who want real ownership and real impact,” said Kirk. “We are bringing together experienced teams who move quickly, think clearly, and take responsibility for outcomes. Our Studio model gives people the trust and focus to make strong decisions and deliver AI systems that translate directly into business value.” 

Working with AWS to Accelerate Enterprise AI Delivery 

As an Amazon Web Services Partner located near Amazon headquarters, the Studio in Bellevue supports clients building and scaling AI solutions on Amazon Bedrock, Amazon SageMaker, Amazon Bedrock AgentCore, Amazon Quick Suite, and related AWS services. This proximity strengthens collaboration and supports faster experimentation and production-ready delivery for complex enterprise environments. 

Robots & Pencils was recently selected as one of 11 inaugural partners in the invite-only AWS Pattern Partners program. The program works with a select group of consulting partners to define how enterprises adopt next-generation AI and emerging technologies on AWS through validated, repeatable patterns. 

This recognition acknowledges Robots & Pencils’ experience delivering production-grade AI architectures for enterprise customers. Working with AWS, the company supports secure and scalable AI delivery across regulated and high-impact industries while enabling teams to move with clarity and confidence from design through deployment. 

A Destination for Elite AI Builders 

The Studio for Generative and Agentic AI reflects Robots & Pencils’ long-standing commitment to talent density and engineering craft. Employees average fifteen years of experience and contribute patents, published research, and category-defining products across industries. The Studio in Bellevue offers engineers, applied AI specialists, product leaders, and user experience innovators the opportunity to shape a new hub while influencing high-stakes client work from the ground up. 

“To support our substantial client demand, we need incredible GenAI talent and are significantly investing in how we work with AWS. Our Bellevue AI Studio places our teams in close proximity to AWS, creating an environment that supports knowledge sharing and enables us to tap into the Seattle-area hotbed of incredible, wicked-smart talent,” said Len Pagon Jr., CEO of Robots & Pencils. “The Bellevue location expands our ability to deliver applied AI outcomes at scale while creating an environment where experienced builders can do the most meaningful work of their careers. This expansion reflects confidence in our teams and the direction we are taking the company.” 

Velocity Pods Deliver AI Products in Weeks 

Teams in the Studio operate in industry-focused Velocity Pods supporting Education, Energy, Financial Services, Healthcare, Manufacturing, Transportation, and Retail and CPG. These pods launch generative and agentic AI products to market in 30-to-45-day cycles while addressing complex modernization and intelligent automation programs across the enterprise. 

Now Hiring for AI Engineering Jobs in Bellevue 

Robots & Pencils is actively staffing the Studio for Generative and Agentic AI in Bellevue and invites experienced engineers and builders to apply. Open roles span engineering, applied AI, product, and design. 

Interested candidates can explore opportunities and submit applications at robotsandpencils.com/careers. 

The Studio in Bellevue opens with momentum, leadership, and a clear mandate to build AI solutions that matter.  

The pace of AI change can feel relentless with tools, processes, and practices evolving almost weekly. We help organizations navigate this landscape with clarity, balancing experimentation with governance, and turning AI’s potential into practical, measurable outcomes. If you’re looking to explore how AI can work inside your organization—not just in theory, but in practice—we’d love to be a partner in that journey. Request an AI briefing.

Build vs. Buy for Conversational AI Agents: Why the Future Belongs to Builders 

You can feel the shift the moment you try to deploy a conversational AI agent through an off-the-shelf platform. The experience looks clean and efficient on the surface, yet it rarely creates the natural, personal, assistive interactions customers expect. It routes and deflects with precision, but the user often leaves without real progress. For teams focused on modern customer experience, that gap becomes impossible to ignore. 

Most “buy” options in conversational AI grew out of call center design. Their core purpose supports internal efficiency rather than meaningful customer support. 

The Tools on the Market Prioritize Operations Over Experience 

Commercial conversational AI platforms concentrate on routing, handle time, and contact center workflows. Their architecture directs intelligence toward internal productivity. Customers receive an experience shaped by legacy operational goals, which leads to uniform patterns across organizations. 

Many buyers assume these tools match customer needs. A simple scenario helps reset that assumption. 

A more experience-centric path creates a very different outcome. Picture a manufacturing technician on a production line who notices a calibration issue on a piece of equipment. A contact-center-oriented system assists the internal support team by surfacing documentation, troubleshooting steps, and recommended scripts. The support team responds quickly, but the technician still waits for guidance during a critical moment on the floor. 

A true customer-facing agent, by contrast, engages directly with the technician. It reviews the equipment profile, interprets sensor readings, outlines safe adjustment steps, and highlights the specific parameters that require attention. The technician gains clarity during the moment of need. Production continues with confidence and momentum. 

This direct guidance transforms the experience. The agent participates in the workflow as a real-time partner rather than a relay for internal teams. 

Your Conversational Data Creates the Moat 

Every customer question reflects a need. Every phrasing choice, pause, and follow-up captures intent. These patterns form the foundation of a truly assistive conversational AI system. They reveal friction, opportunity, and the natural language of your specific users. 

SaaS solutions provide insights from these interactions, while the deeper value accumulates inside the vendor’s system. Their product evolves with your customer patterns, while your experience evolves at a slower pace. 

Modern AI creates advantage through data, not through foundation models. Conversation data reinforces your knowledge of customers and shapes your ability to improve rapidly. Ownership of that data creates the moat that strengthens with every interaction. 

Customization Creates the Quality Customers Feel 

The visible layer of an AI agent, including the interface, avatar, or voice, offers the simplest design challenge. Real quality lives underneath. Tone calibration, workflow logic, domain vocabulary, and retrieval strategy shape the accuracy and trustworthiness of every response. 

Generic templates often reach steady performance at a moderate level of accuracy. The shift into high-trust reliability grows from tuning against your specific customer language and your operational context. SaaS platforms hold the data, but they do not hold the lived knowledge required to interpret which interactions reflect success, friction, or emerging need. Your teams understand the nuance, which creates a tuning loop that only internal ownership can support. 

A system that learns within the grain of your business always outperforms a template that treats your conversations as generic. 

Building Thrives Through Modern Ecosystems 

Building once required full-stack engineering and long timelines. Today, teams assemble ecosystems that include hosted models, vector databases, retrieval frameworks, and orchestration layers. This approach delivers speed and preserves data governance.  

 Many buyers assume building is slow. New modular tools make the opposite true.  

Advantage grows from how your system comes together around your data. Lightweight architectures adapt quickly and evolve in rhythm with your customers. 
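The assembly described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not a production recipe: the `VectorStore` class is a hypothetical in-memory stand-in for a managed vector database, retrieval is reduced to crude word overlap in place of a hosted embedding model, and the model call inside `answer` is stubbed rather than a real LLM endpoint.

```python
# Illustrative sketch of the modular "build" stack: a stand-in vector
# store, a retrieval step, and an orchestration layer that grounds a
# (stubbed) hosted-model call in data the organization owns.
# All names here are hypothetical placeholders, not a real vendor API.
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a real build would use a hosted embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class VectorStore:
    """Toy stand-in for a vector database: stores docs, ranks by word overlap."""

    def __init__(self) -> None:
        self.docs: list[str] = []

    def add(self, text: str) -> None:
        self.docs.append(text)

    def search(self, query: str) -> str:
        # Rank documents by how many query words they share (crude retrieval).
        q = tokenize(query)
        return max(self.docs, key=lambda d: len(q & tokenize(d)))


def answer(query: str, store: VectorStore) -> str:
    """Orchestration layer: retrieve owned knowledge, then prompt the model.

    The model call is stubbed; a real system would send the retrieved
    context plus the query to a hosted LLM endpoint.
    """
    context = store.search(query)
    return f"Based on our records: {context}"


store = VectorStore()
store.add("Calibration drift on Line 3 is corrected by adjusting parameter P-204.")
store.add("Holiday shipping cutoffs are posted each November.")
print(answer("How do I fix calibration drift on the line?", store))
```

The point of the sketch is the shape, not the components: each layer is replaceable, and the conversation data flowing through it stays in your hands, which is where the compounding advantage described above comes from.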

The Strategic Equation Favors Builders 

AI-native experience design has reshaped the traditional build vs. buy decision. Modern tooling accelerates internal development, and internal data governance strengthens safety. A build path creates forward momentum without relying on vendor roadmaps. 

Differentiation comes from experience quality. Off-the-shelf bots produce uniform interactions across brands. Custom agents express your language, workflows, and service model. 

Data stewardship defines long-term success in conversational AI. Ownership of the learning loop positions teams to adapt quickly, evolve responsibly, and compound knowledge over time. 

The Organizations That Win Will Be the Ones That Learn Fastest 

In the next wave of digital experience, leaders rise through insight and adaptability. Their advantage reflects what they learn from every conversation, how quickly they apply that learning, and how deeply their AI mirrors the needs of their customers. 

Buying provides a tool. Building creates a learning system. And learning carries the greatest compounding force in customer experience. 




FAQs 

What creates value in a conversational AI agent? 

Value grows from the quality of the interaction. Conversational AI agents reach their potential when they draw from real customer language, understand business context, and evolve through continuous learning. Ownership of conversation data strengthens this process and elevates the customer experience. 

Why do organizations choose to build conversational AI? 

Organizations choose a build strategy to shape every element of the experience. Internal development allows teams to guide tone, safety, workflow logic, and response quality. This alignment creates reliable, natural, and assistive interactions that match customer expectations. 

How does conversation data strengthen an AI agent? 

Every user question reveals intention, preference, and behavior. These signals guide tuning, improve routing, and highlight gaps in knowledge sources. Data ownership empowers organizations to refine the agent with precision and create rapid compound learning. 

How do modern AI tools support faster internal development? 

Hosted large language models, retrieval infrastructures, vector databases, and orchestration frameworks provide ready-to-use building blocks. Teams assemble these components into a modular system designed around their data and their customer experience goals. 

What advantages emerge when teams customize their AI agents? 

Customization aligns the agent with domain language, operational processes, and brand voice. This alignment raises accuracy, builds trust, and creates a conversational experience that feels tailored and assistive. 

How does a build approach create long-term strategic strength? 

A build approach cultivates an internal learning engine. Every conversation sharpens the agent, strengthens customer relationships, and expands organizational knowledge. This compounding effect creates durable advantage in digital experience. 

Accelerating Innovation with AWS: Robots & Pencils Selected as an AWS Pattern Partner 

Today, Robots & Pencils joins AWS as a launch partner in the AWS Pattern Partners program, an invite-only initiative that works with a select cohort of consulting partners to define how enterprises adopt next-generation AI and emerging technologies on AWS. 

As a Pattern Partner, Robots & Pencils brings proven success with emerging technologies on AWS, including AI/ML, Generative/Agentic AI, Robotics, Space Technology, and Quantum. The program focuses on accelerating enterprise adoption through repeatable, scalable patterns that encode tested ways to solve specific business problems, with architecture, controls, and delivery practices that have already been validated with customers. 

For customers, selection of Robots & Pencils into this program signals that AWS has reviewed and endorsed both the outcomes and the operating model behind the work delivered in these domains. Enterprises that face pressure to modernize critical processes, adopt AI safely, and respond to new regulatory and security requirements gain access to patterns that have already delivered measurable results. 

The Pattern Partners program also sets a clear horizon view for emerging technology. In the near term, it concentrates on AI/ML and Generative & Agentic AI patterns, including subdomains such as Process to Agent (P2A), Agent to Agent (A2A), Responsible AI, and RegAI. Over the midterm, the program extends these capabilities into connected environments that use Robotics, IoT, Edge, and Space Technology on AWS. For the long term, it explores Quantum and next-generation enterprise innovations, aligning new capabilities with existing AWS investments in data, AI, and security as they mature into reliable patterns. 

Our Pattern: Enterprise Document Intelligence Platform 

At the heart of Robots & Pencils’ participation in Pattern Partners is a flagship pattern that the company is co-developing and scaling with AWS. 

The Customer Problem 

Organizations in Energy, Manufacturing, and Health & Wellness face a common set of challenges. Data and workflows sit in disconnected systems, which slows AI adoption and creates duplicated effort. Teams find it difficult to govern AI models and agents at enterprise scale, especially when regulations and internal standards move quickly. Talent and process gaps make it hard to adopt new technology in a way that satisfies risk, compliance, and operational leaders. 

Our Joint Approach with AWS 

Together with AWS, Robots & Pencils has designed the Enterprise Document Intelligence Platform. The pattern combines three elements: an architecture built natively on AWS using Amazon Bedrock, Amazon SageMaker, and Amazon Bedrock AgentCore; an operating model with clear roles, runbooks, and guardrails for IT, data, security, and business teams; and accelerators such as pre-built integrations, automations, policies, templates, dashboards, and agents. The pattern is being refined through a time-boxed incubation with a set of lighthouse customers. As it matures, it is packaged as a Pattern Package so that more joint customers can adopt it rapidly with consistent results. 

Early Results 

Early adopters are already reporting tangible outcomes from the Robots & Pencils Enterprise Document Intelligence Platform. With 2 million interactions across 100,000+ users, customers reported a 90% satisfaction score and a 40% improvement in confidence in responses from the pattern. 

As these results are validated across additional lighthouse customers, the Pattern Package becomes available to AWS field teams globally. This enables customers in new regions and sectors to benefit from the same proven approach without restarting design from the beginning. 

How the Pattern Partners Program Works with Customers 

When a customer engages Robots & Pencils through the Pattern Partners program, the engagement starts from a proven blueprint, not from scratch. The Pattern Package already encodes successful implementations, including architectures, guardrails, and playbooks. Customers receive coordinated support from AWS specialists, the AWS Consulting COE Pattern Partner team, and experts from Robots & Pencils across consulting, engineering, and product. 

The program design supports fast yet responsible experimentation. Customers can move from idea to live pilot while maintaining enterprise-grade security, compliance, and governance. The pattern also includes a clear path from pilot to scale, so organizations can extend from initial deployments to cross-region and multi-business-unit rollouts with ongoing optimization. 

Being part of the AWS Pattern Partners program allows Robots & Pencils to bring emerging AWS capabilities such as Generative AI and Agentic applications to customers earlier. Guardrails and controls stay clear and well-defined. The company can turn its strongest customer successes into repeatable assets that benefit a wider set of organizations. Collaboration with AWS field teams, solution architects, and service teams keeps the pattern aligned with the latest platform innovation. Robots & Pencils also contributes back to the broader AWS partner ecosystem by sharing learnings and raising the standard for how emerging technology is adopted. For customers, this approach reduces risk, increases predictability, and accelerates business impact from AWS investments. 

Partner Perspective 

“Joining AWS Pattern Partners is a strategic milestone for Robots & Pencils,” said Jeff Kirk, Executive Vice President of Applied AI, Robots & Pencils. “With our Enterprise Document Intelligence Platform, we turn our strongest customer wins into a clear, repeatable path that reduces onboarding time for customers in need of intelligent search and increases confidence in the accuracy of results, so customers can move from pilots to production with greater speed, control, and confidence.”  

AWS Perspective 

“AWS created Pattern Partners to work with a select cohort of builders who can set the standard for how enterprises adopt emerging technology on AWS. Robots & Pencils brings deep expertise in KnowledgeOps, including RAG and compound systems, and a proven pattern in the Enterprise Document Intelligence Platform that is already delivering measurable outcomes for customers,” said Brian Bohan, Managing Director of Consulting COE, AWS. “We look forward to scaling this work together and bringing these benefits to more joint customers across industries.”  

Next Steps 

Customers interested in these patterns can speak with Robots & Pencils through robotsandpencils.com/contact to review current challenges and identify which patterns are most relevant. 

Those that want to explore the Enterprise Document Intelligence Platform in depth or learn how the AWS Pattern Partners program could support their own roadmap can request a focused discovery session. In that conversation, AWS and Robots & Pencils work with stakeholders to map business challenges to the pattern, estimate potential impact, and define a practical path to adoption. 

Together, AWS and Robots & Pencils look forward to turning critical business challenges into repeatable, scalable patterns for growth. 




FAQs

What is the AWS Pattern Partners program?

It is an invite-only AWS initiative that works with a select group of consulting partners to define how enterprises adopt next-generation AI and emerging technologies through validated, repeatable patterns.

Why was Robots & Pencils selected as a Pattern Partner?

AWS recognized the company’s proven outcomes across AI and emerging technologies, as well as its track record delivering measurable results with scalable architectures and operating models.

What is the Enterprise Document Intelligence Platform?

It is a jointly designed pattern that uses AWS native services and accelerators to help organizations unify data, streamline governance, and deploy Generative and Agentic AI across complex environments.

Which AWS technologies power the pattern?

Key services include Amazon Bedrock, Amazon SageMaker, and Amazon Bedrock AgentCore, along with AWS controls, security practices, and operational frameworks.

Who benefits most from this pattern?

Enterprises in sectors like Energy, Manufacturing, and Health and Wellness that face challenges with disconnected data, evolving regulations, and the need for responsible AI adoption at scale.

What results have early adopters seen?

Customers reported 2 million interactions across more than 100,000 users, a 90 percent satisfaction score, and a 40 percent improvement in confidence in response accuracy.

How does the program support faster innovation?

Organizations begin with a proven blueprint rather than a blank page. This accelerates pilots while maintaining enterprise grade governance and provides a clear pathway to large scale deployment.

How do customers engage?

Teams can connect through robotsandpencils.com/contact to discuss current challenges or request a focused discovery session to understand fit, impact potential, and next steps.

What does this mean for long term innovation?

The program continually extends into new domains, guiding enterprises through emerging capabilities such as Robotics, IoT, Space Technology, and Quantum as they mature into reliable patterns.

Robots & Pencils Plans Seattle-area Expansion with Studio for Generative & Agentic AI 

The Bellevue, Washington investment opens pathways for forward deployed engineers and builders seeking career-defining work in applied AI. 

Robots & Pencils, an applied AI engineering partner known for high-velocity delivery and measurable business outcomes, today announced plans to open a Seattle-area Studio for Generative & Agentic AI in downtown Bellevue in early January 2026. The expansion fuels the next phase of growth for the company’s AI-native Studio and strengthens North American delivery, as demand for AI-enabled product engineering accelerates across the United States. As an Amazon Web Services (AWS) Partner, the Bellevue location, with its proximity to Amazon headquarters, is a natural site to accelerate client AI solutions on Amazon Bedrock, Amazon SageMaker, Amazon Bedrock AgentCore, and more. 

Candidates seeking high-impact engineering roles can learn more at robotsandpencils.com/careers. 

The new Studio reflects a growing U.S. footprint supported by existing global operations in Cleveland, Calgary, Toronto, Bogotá, and Lviv. The Studio organizes cross-functional product, engineering, data, and design talent into vertical industry-focused pods that support sectors such as Education, Energy, Financial Services, Healthcare, Manufacturing, Transportation, and Retail/CPG. The presence in the Seattle area adds meaningful engineering capacity and enhances support for clients pursuing ambitious AI programs and large-scale modernization work. 

“The investment in Bellevue and access to deep talent in the Pacific Northwest gives our teams and our clients a powerful new chapter,” said Len Pagon Jr., CEO of Robots & Pencils. “The engineering expertise in this region aligns perfectly with our Studio strategy. We see tremendous opportunities to grow our talent base, strengthen delivery, and help organizations reach AI outcomes that advance their businesses. Our teams are energized by this expansion and ready for the momentum ahead.” 

Jeff Kirk, Executive Vice President of Applied AI at Robots & Pencils, will lead the Bellevue studio. “The Studio in Bellevue is a pivotal investment in our client and talent strategy,” said Kirk. “Engineers and builders in this region bring the experience and ambition that shape industry-defining solutions. Speed matters, and our Studio structure is designed for launching AI products to market every 30 to 45 days. The Seattle area strengthens the engineering capacity required to deliver that velocity at scale. We look forward to building a team that thrives on complex challenges and produces work that matters.” 

Robots & Pencils continues to invest in environments where elite talent can perform at the highest level. The company is known for its talent density, with teams averaging fifteen years of experience and contributing patents, published research, and category-shaping products across industries. The Studio creates space for engineers, applied AI specialists, product leaders, and user experience innovators to influence major client engagements and shape a new hub from the ground up. It anchors work in AI systems, agents and agentic workflows, digital modernization, intelligent automation, and data-driven product innovation. 

Interested applicants can explore open roles at robotsandpencils.com/careers. The Studio is ready for builders who want to shape the next era of AI solutions with momentum and purpose. 

The pace of AI change can feel relentless with tools, processes, and practices evolving almost weekly. We help organizations navigate this landscape with clarity, balancing experimentation with governance, and turning AI’s potential into practical, measurable outcomes. If you’re looking to explore how AI can work inside your organization—not just in theory, but in practice—we’d love to be a partner in that journey. Request an AI briefing.

The Agentic Trap: Why 40% of AI Automation Projects Lose Momentum

Gartner’s latest forecast is striking: more than 40% of agentic AI projects will be canceled by 2027. At first glance, this looks like a technology growing faster than it can mature. But a closer look across the industry shows a different pattern. Many initiatives stall for the same reason micromanaged teams do. The work is described at the level of steps rather than outcomes. When expectations aren’t clear, people wait for instructions. When expectations aren’t clear for agents, they either improvise poorly or fail to act. 

This is the same shift I described in my previous article, “Software’s Biggest Breakthrough Was Making It Cheap Enough to Waste.” When software becomes inexpensive enough to test freely, the organizations that pull ahead are the ones that work toward clear outcomes and validate their decisions quickly. 

Agentic AI is the next stage of that evolution. Autonomy becomes meaningful only when the organization already understands the outcome it’s trying to achieve, how good decisions support that outcome, and when judgment should shift back to a human. 

The Shift to Outcome-Oriented Programming 

Agentic AI brings a model that feels intuitive but represents a quiet transformation. Traditional automation has always been procedural in that teams document the steps, configure the workflow, and optimize the sequence. Like a highly scripted form of people management, this model is effective when the work is predictable, but limited when decisions are open-ended or require problem solving. 

Agentic systems operate more like empowered teams. They begin with a desired outcome and use planning, reasoning, and available tools to move toward it. As system designers, our role shifts from specifying every step to defining the outcome, the boundaries, and the signals that guide good judgment. 

Instead of detailing each action, teams clarify the desired outcome, the acceptable boundaries, and the signals that guide good judgment. 

This shift places new demands on organizational clarity. To support outcome-oriented systems, teams need a shared understanding of how decisions are made. They need to determine what good judgment looks like, what tradeoffs are acceptable, and how to recognize situations that require human involvement. 

Industry research points to the same conclusion. Harvard Business Review notes that teams struggle when they choose agentic use cases without first defining how those decisions should be evaluated. XMPRO shows that many failures stem from treating agentic systems as extensions of existing automation rather than as tools that require a different architectural foundation. RAND’s analysis adds that projects built on assumptions instead of validated decision patterns rarely make it into stable production. 

Together, these findings underscore a simple theme. Agents thrive when the organization already understands how good decisions are made. 

Decision Intelligence Shapes Agentic Performance  

Agentic systems perform well when the outcome is clear, the signals are reliable, and proper judgment is well understood. When goals or success criteria are fuzzy, or tasks overly complex, performance mirrors that ambiguity. 

In a Carnegie Mellon evaluation, advanced models completed only about one-third of multi-step tasks without intervention. Meanwhile, First Page Sage’s 2025 survey showed much higher completion rates in more structured domains, with performance dropping as tasks became more ambiguous or context heavy. 

This reflects another truth about autonomy. Some problems are simply too broad or too abstract for an agent to manage directly. In such cases, the outcome must be broken into sub-outcomes, and those into smaller decisions, until the individual pieces fall within the system’s ability to reason effectively. 

In many ways, this mirrors effective leadership. Good leaders don’t hand individual team members a giant, unstructured mandate. They cascade outcomes into stratified responsibilities that people can act on. Agentic systems operate the same way. They thrive when the goal has been decomposed into solvable parts with well-defined judgment and guardrails. 

This is why organizational clarity becomes a core predictor of success. 

How Teams Fall Into the Agentic Trap 

Many organizations feel the pull of agentic AI because it promises systems that plan, act, and adapt without waiting for human intervention. But the projects that stall often fall into a predictable trap. 

Teams begin by automating the process instead of automating the judgment behind the decisions the agent is expected to make. They define what a system should do instead of defining how to evaluate the output or what “good” should look like. Vague quality metrics, progress signals, and escalation criteria lead to technically valid but strategically mediocre decisions that erode confidence in the system. 

The research behind this pattern is remarkably consistent. HBR notes that teams often choose agentic use cases before they understand the criteria needed to evaluate them. XMPRO describes the architectural breakdowns that occur when agentic systems are treated like upgrades to procedural automation. RAND’s analysis shows that assumption-driven decision-making is one of the strongest predictors of AI project failure, while projects built on clear evaluation criteria and validated decision patterns are far more likely to reach stable production. 

This is the agentic trap: trying to automate judgment without first understanding how good judgment is made. Agentic AI is more than the automation of steps; it’s the automation of evaluation, prioritization, and tradeoff decisions. Without clear outcomes, criteria, signals, and boundaries to inform decision-making, the system has nothing stable to scale, and its behavior reflects that uncertainty. 

A Practical Way Forward: The Automation Readiness Assessment 
Decisions that succeed under autonomy share five characteristics. When one or more are missing, agents need more support: 

Have all five? Build with confidence. 
Only three or four? Pilot with human review to build up a live data set. 
Only one or two? Go strengthen your decision clarity before automating. 

This approach keeps teams grounded. It turns autonomy from an aspirational leap into a disciplined extension of what already works. 
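The readiness tiers above can be operationalized as a simple scoring rubric. The criterion names in this sketch are illustrative placeholders drawn from the article's themes (outcomes, evaluation criteria, signals, boundaries, validated patterns), not an official list:

```python
# Hypothetical readiness rubric: criterion names are illustrative
# assumptions, not the assessment's official list.
CRITERIA = [
    "clear_outcome",        # the desired result is explicit and measurable
    "evaluation_criteria",  # "good" output can be judged objectively
    "reliable_signals",     # progress and quality signals exist
    "known_boundaries",     # acceptable tradeoffs and guardrails are defined
    "validated_pattern",    # humans have made this decision well, repeatedly
]

def readiness(decision: dict) -> str:
    """Map how many criteria a candidate decision meets to a recommendation."""
    score = sum(1 for c in CRITERIA if decision.get(c, False))
    if score == 5:
        return "build"    # automate with confidence
    if score >= 3:
        return "pilot"    # automate with human review
    return "clarify"      # strengthen decision clarity before automating

example = {"clear_outcome": True, "evaluation_criteria": True,
           "reliable_signals": True, "known_boundaries": False,
           "validated_pattern": False}
print(readiness(example))  # pilot
```

A scoring lens like this keeps the conversation concrete: each candidate decision gets a tier, and the tier dictates the level of human oversight.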

The Path to Agentic Maturity 

Agentic AI expands an organization’s capacity for coordinated action, but only when the decisions behind the work are already well understood. The projects that avoid the 40% failure curve do so because they encode judgment into agents, not just process. They clarify the outcome, validate the decision pattern, define the boundaries, and then let the system scale what works. 

Clarity of judgment produces resilience, resilience enables autonomy, and autonomy creates leverage. The path to agentic maturity begins with well-defined decisions. Everything else grows from there. 



Key Takeaways 


FAQs 

What is the “agentic trap”? 
The agentic trap describes what happens when organizations rush to deploy agents that plan and act, before they have defined the outcomes, decision criteria, and guardrails those agents require. The technology looks powerful, yet projects stall because the underlying decisions were never made explicit. 

How is agentic AI different from traditional automation? 
Traditional automation follows a procedural model. Teams document a sequence of steps and the system executes those steps in predictable conditions. Agentic AI starts from an outcome, uses planning and reasoning to choose actions, and navigates toward that outcome using tools, data, and judgment signals. The organization moves from “here are the steps” to “here is the result, the boundaries, and the signals that matter.” 

Why do so many agentic AI projects lose momentum? 
Momentum fades when teams try to automate decisions that have not been documented, validated, or measured. Costs rise, risk concerns surface, and it becomes harder to show progress against business outcomes. Research from Gartner, Harvard Business Review, XMPRO, and RAND all point to the same pattern: projects thrive when the decision environment is explicit and validated, and they struggle when it is based on assumptions. 

What makes a decision “ready” for autonomy? 
Decisions are ready for agentic automation when they meet five criteria: 

The more of these elements are present, the more confidently teams can extend autonomy. 

How can we use the Automation Readiness Assessment in practice? 
Use the five criteria as a simple scoring lens for each candidate decision: 

This keeps investment aligned with decision maturity and creates a clear path from experimentation to durable production. 

Where should leaders focus first to reach agentic maturity? 
Leaders gain the most leverage by focusing on judgment clarity within critical workflows. That means aligning on desired outcomes, success metrics, escalation thresholds, and the signals that inform good decisions. With that foundation, agentic AI becomes a force multiplier for well-understood work rather than a risky experiment in ambiguous territory. 

Software’s Biggest Breakthrough Was Making It Cheap Enough to Waste 

AI and automation are making development quick and affordable. Now, the future belongs to teams that learn as fast as they build. 

Building software takes patience and persistence. Projects run long, budgets stretch thin, and crossing the finish line often feels like survival. If we launch something that works, we call it a win. 

That rhythm has defined the industry for decades. But now, the tempo is changing. Kevin Kelly, the founding executive editor of Wired Magazine, once said, “Great technological innovations happen when something that used to be expensive becomes cheap enough to waste.” 

AI-assisted coding and automation are eliminating the bottlenecks of software development.  What once took months or years can now be delivered in days or weeks. Building is no longer the hard part. It’s faster, cheaper, and more accessible than ever.  

Now, as more organizations can build at scale, custom software becomes easier to replicate, and its ROI as a competitive advantage grows less predictable. As product differentiation becomes more difficult to maintain, a new source of value emerges: applied learning, or how effectively teams can build, test, adapt, and prove what works. 

This new ROI is not predicted. It depends on the ability to:  

The organizations that succeed will learn faster from what they build and build faster from what they learn. 

From Features to Outcomes, Speculation to Evidence 

Agile transformed how teams build software. It replaced long project plans with rapid sprints, continuous delivery, and an obsession with velocity. For years, we measured progress by how many features we shipped and how fast we shipped them. 

But shipping features doesn’t equal creating value. A feature only matters if it changes behavior or improves an outcome, and many don’t. As building gets easier, the hard part shifts to understanding which ideas truly create impact and why. 

AI-assisted and automated development now make that learning practical. Teams can generate several variations of an idea, test them quickly, and keep only what works best. The work of software development starts to look more like controlled experimentation. 

This changes how we measure success. The old ROI models relied on speculative forecasts and business cases built on assumptions about value, timelines, and adoption. We planned, built, and launched, but when the product finally reached users, both the market and the problem had already evolved. 

Now, ROI becomes something we earn through proof. We begin with a measurable hypothesis and build just enough to test it:  

If onboarding time falls by 30 percent, retention will rise by 10 percent,  
creating two million dollars in annual value.  

Each iteration provides evidence. Every proof point increases confidence and directs the next investment. In this way, value creation and validation merge, and the more effectively we learn, the faster our return compounds. 
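The hypothesis above can be expressed as a small value model. The baseline figures here are invented for illustration, chosen so the arithmetic lands on the article's two-million-dollar example:

```python
# Illustrative value model for the hypothesis:
# "If onboarding time falls by 30%, retention will rise by 10%,
#  creating two million dollars in annual value."
def annual_value(customers: int, revenue_per_customer: float,
                 retention_lift: float) -> float:
    """Incremental annual revenue from the customers retained by the change."""
    return customers * revenue_per_customer * retention_lift

# Hypothetical baseline: 10,000 customers at $2,000/year each.
value = annual_value(customers=10_000, revenue_per_customer=2_000,
                     retention_lift=0.10)
print(f"${value:,.0f}")  # $2,000,000
```

Writing the hypothesis as an explicit formula makes it falsifiable: each iteration updates the measured retention lift, and the projected value updates with it.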

ROI That Compounds 

ROI used to appear only after launch, when the project was declared “done.” It was calculated as an academic validation of past assumptions and decisions. The investment itself remained a sunk cost, viewed as money spent months ago. 

In an outcome-driven model, value begins earlier and grows with every iteration. Each experiment creates two returns: the immediate impact of what works and the insight gained from what doesn’t. Both make the next round more effective. 

Say you launched a small pilot with ten users. Within weeks, they’re saving time, finding shortcuts, and surfacing friction you couldn’t predict on paper. That feedback shapes the next version and builds the confidence to expand to a hundred users. Now, you can measure quantitative impact, like faster response times, fewer manual steps, and higher satisfaction. Payoff scales rapidly as the value curve steepens with each round of improvement. 

Moreover, you are measuring return continuously, using each cycle’s results as evidence to justify the next. In this way, return becomes the trigger for further investment, and the faster the team learns, the faster the return accelerates. 

Each step also leaves behind a growing library of reusable assets: validated designs, cleaner data, modular components, and refined decision logic. Together, these assets make the organization smarter and more efficient with each cycle. 

When learning and value grow together, ROI becomes a flywheel. Each iteration delivers a product that’s smarter, a team that’s sharper, and an organization more confident in where to invest next. To harness that momentum, we need reliable ways to measure progress and prove that value is growing with every step. 

Measuring Progress in an Outcome-Driven Model 

When ROI shifts from prediction to evidence, the way we measure progress has to change. Traditional business cases rely on financial projections meant to prove that an investment would pay off. In an outcome-driven model, those forecasts give way to leading indicators collected in real-time.  

Instead of measuring progress by deliverables and deadlines, we use signals that show we’re moving in the right direction. Each iteration increases confidence that we are solving the right problem, delivering the right outcome, and generating measurable value. 

That evidence evolves naturally with the product’s maturity. Early on, we look for behavioral signals, or proof that users see the problem and are willing to change. As traction builds, we measure whether those new behaviors produce the desired outcomes. Once adoption scales, we track how effectively the system converts those outcomes into sustained business value. 

You can think of it as a chain of evidence that progresses from leading to lagging indicators: 

Behavioral Change → Outcome Effect → Monetary Impact 

The challenge, then, is to create a methodology that exposes these signals quickly and enables teams to move through this progression with confidence, learning as they go. This process conceptually follows agile, but changes as the product evolves through four stages of maturity: 

Explore & Prototype → Pilot & Validate → Scale & Optimize → Operate & Monitor 

At each stage, teams iteratively build, test, and learn, advancing only when success is proven. What gets built, how it’s measured, and what “success” means evolve as the product matures. Early stages emphasize exploration and learning; later stages focus on optimizing outcomes and capturing value. Each transition strengthens both evidence that the product works and confidence in where to invest next. 
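The four-stage progression above can be sketched as a gate check, where a product advances only when the evidence required at its current stage is in hand. The stage names follow the article; the evidence keys are assumptions for illustration:

```python
# Stage gates: a product advances only when the evidence for its
# current stage exists. Evidence keys are illustrative assumptions.
STAGES = ["explore", "pilot", "scale", "operate"]
GATES = {
    "explore": "behavioral_evidence",  # users show willingness to change
    "pilot":   "outcome_evidence",     # measurable progress toward the outcome
    "scale":   "value_evidence",       # sustained impact on business KPIs
}

def next_stage(stage: str, evidence: set) -> str:
    """Return the next stage if its gate is satisfied, else stay put."""
    gate = GATES.get(stage)
    if gate and gate in evidence:
        return STAGES[STAGES.index(stage) + 1]
    return stage

print(next_stage("explore", {"behavioral_evidence"}))  # pilot
print(next_stage("pilot", set()))                      # pilot
```

The point of the gate is discipline: without the named evidence, the product stays where it is, no matter how finished it feels.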

1. Explore & Prototype:  

In the earliest stage, the goal is to prove potential. Teams explore the problem space, test assumptions, and build quick prototypes to expose what’s worth solving. The success measures are behavioral: evidence of user willingness and intent. Do users engage with early concepts, sign up for pilots, or express frustration with the current process? These signals de-risk demand and validate that the problem matters. 

The product moves to the next stage only with a clear, quantified problem statement supported by credible behavioral evidence. When users demonstrate they’re ready for change, the concept is ready for validation. 

2. Pilot & Validate:  

Here’s where a prototype turns into a pilot to test whether the proposed solution actually works. Real users perform real tasks in limited settings. The indicators are outcome-based. Can people complete tasks faster, make fewer errors, or reach better results? Each of these metrics ties directly to the intended outcome that the product aims to achieve. 

To advance from this stage, the pilot must show measurable progress towards the outcome. When that evidence appears, it’s time to expand. 

3. Scale & Optimize:  

As adoption grows, the focus shifts from proving the concept to demonstrating outcomes and refining performance. Every new user interaction generates evidence that helps teams understand how the product creates impact and where it can improve. 

Learning opportunities emerge from volume. Broader usage reveals edge cases, hidden friction points, and variations that allow teams to refine the experience, calibrate models, automate repetitive tasks, and strengthen outcome efficacy. 

At this stage, value indicators connect usage to business KPIs like faster response times, higher throughput, improved satisfaction, and lower support costs. This is where value capture compounds. As more users adopt the product, the value they generate accumulates, proving that the system delivers significant business impact. 

The product reaches the next level of maturity when it shows sustained, reliable impact on outcome measures across widespread usage. 

4. Operate & Monitor:  

In the final stage, the emphasis shifts from optimization to observation. The system is stable, but the environment and user needs continue to evolve and erode effectiveness over time. The goal is twofold: ensure that value continues to be realized and detect the earliest signals of change. 

The indicators now focus on sustained ROI and performance integrity. Teams track metrics that show ongoing return (cost savings, revenue contribution, efficiency gains) while monitoring usage patterns, engagement levels, and model accuracy. 

When anomalies appear (drift in outcomes, declining engagement, or new behaviors), they become the warning signs of changing user needs. Each anomaly hints at a new opportunity and loops the team back into exploration. This begins the next cycle of innovation and validation. 
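Detecting those earliest signals of change can start with something as simple as comparing a recent window of an outcome metric against its historical baseline. A minimal sketch, where the 10 percent tolerance and the metric are assumptions:

```python
# Minimal drift check: flag when the recent mean of an outcome metric
# deviates from its historical baseline by more than a tolerance.
# The 10% tolerance and the sample data are illustrative assumptions.
def drifted(history: list[float], recent: list[float],
            tolerance: float = 0.10) -> bool:
    baseline = sum(history) / len(history)
    current = sum(recent) / len(recent)
    return abs(current - baseline) > tolerance * abs(baseline)

task_success = [0.91, 0.90, 0.92, 0.89, 0.91]  # historical completion rate
last_week = [0.78, 0.80, 0.79]                 # recent window
print(drifted(task_success, last_week))  # True: loop back into exploration
```

Production monitoring would use proper statistical tests and more metrics, but even this crude check turns "the environment evolves" into an actionable signal.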

From Lifecycle to Flywheel: How ROI Becomes Continuous 

Across these stages, ROI becomes a continuous cycle of evidence that matures alongside the product itself. Each phase builds on the one before it.  

Together, these stages form a closed feedback loop—or flywheel—where evidence guides investment. Every dollar spent produces both impact and insight, and those insights direct the next wave of value creation. The ROI conversation shifts from “Do you believe it will pay off?” to “What proof have we gathered, and what will we test next?” 

From ROI to Investment Upon Return 

AI and automation have made building easier than ever before. The effort that once defined software development is no longer the bottleneck. What matters now is how quickly we can learn, adapt, and prove that what we build truly works. 

In this new environment, ROI becomes a feedback mechanism. Returns are created early, validated often, and reinvested continuously. Each cycle of discovery, testing, and improvement compounds both value and understanding, and creates a lasting continuous advantage. 

This requires a mindset shift as much as a process shift: from funding projects based on speculative confidence in a solution to funding them based on their ability to generate proof. When return on investment becomes investment upon return, the economics of software change completely. Value and insight grow together. Risk declines with every iteration. 

When building becomes easy, learning fast creates the competitive advantage. 



The New Equations 


Key Takeaways  


FAQs  

What does “software cheap enough to waste” mean? 
It describes a new phase in software development where AI and automation have made building fast, low-cost, and low-risk, allowing teams to experiment more freely and learn faster. 

Why does cheaper software matter for innovation? 
When building is inexpensive, experimentation becomes affordable. Teams can test more ideas, learn from data, and refine products that actually work for people. 

How does this change ROI in software development? 
Traditional ROI measured delivery and cost efficiency. Evidential ROI measures learning, outcomes, and validated impact: value that grows with each iteration. 

What are Return on Learning and Return on Ecosystem? 
Return on Learning measures how quickly teams adapt and improve through cycles of experimentation. Return on Ecosystem measures how insights spread and create shared success across teams. 

What’s the main takeaway for leaders? 
AI and automation have changed the rules. The winners will be those who learn the fastest, not those who build the most. 

Beyond Wrappers: What Protocols Leave Unsolved in AI Systems 

I recently built a Model Context Protocol (MCP) integration for my Oura Ring. Not because I needed MCP, but because I wanted to test the hype: Could an AI agent make sense of my sleep and recovery data? 

It worked. But halfway through I realized something. I could have just used the Oura REST API directly with a simple wrapper. What I ended up building was basically the same thing, just with extra ceremony. 
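For comparison, the "simple wrapper" alternative is roughly this much code. The endpoint path follows Oura's public v2 API, but treat the details as assumptions to verify against the current docs; only request construction is shown here, with no network call:

```python
import urllib.request

# A bare-bones REST wrapper. The Oura v2 path below is an assumption
# based on its public API documentation; verify before relying on it.
BASE = "https://api.ouraring.com/v2/usercollection"

def build_request(resource: str, token: str) -> urllib.request.Request:
    """Construct an authenticated GET request; callers invoke urlopen themselves."""
    return urllib.request.Request(
        f"{BASE}/{resource}",
        headers={"Authorization": f"Bearer {token}"},
    )

req = build_request("daily_sleep", token="YOUR_TOKEN")
print(req.full_url)  # https://api.ouraring.com/v2/usercollection/daily_sleep
# To fetch: json.load(urllib.request.urlopen(req))
```

An agent with native function calling can use a wrapper like this directly; the MCP version adds a server, a transport, and a schema around the same GET request.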

As someone who has architected enterprise AI systems, I understand the appeal. Reliability isn’t optional, and protocols like MCP promise standardization. To be clear, MCP wasn’t designed to fix hallucinations or context drift. It’s a coordination protocol. But the experiment left me wondering: Are we solving the real problems or just adding layers? 

The Wrapper Pattern That Won’t Go Away 

MCP joins a long list of frameworks like LangChain, LangGraph, SmolAgents, and LlamaIndex, each offering a slightly different spin on coordination. But at heart, they’re all wrappers around the same issue: getting LLMs to use tools consistently. 

Take CrewAI. On paper, it looked elegant with agents organized into “crews,” each with roles and tools. The demos showed frictionless orchestration. In practice? The agents ignored instructions, produced invalid JSON even after careful prompting, and burned days in debugging loops. When I dropped down to a lower-level tool like LangGraph, the problems vanished. CrewAI’s middleware hadn’t added resilience; it had hidden the bugs. 

This isn’t an isolated frustration. Billions of dollars are flowing into frameworks while fundamentals like building reliable agentic systems remain unsettled. MCP risks following the same path. Standardizing communication may sound mature, but without solving hallucinations and context loss, it’s just more scaffolding on shaky foundations. 

What We’re Not Solving 

The industry has been busy launching integration frameworks, yet the harder challenges remain stubbornly in place: 

As CData notes, these aren’t just implementation gaps. They’re fundamental challenges. 

What the Experiments Actually Reveal 

Working with MCP brought a sharper lesson. The difficulty isn’t about APIs or data formats. It’s about reliability and security. 

When I connected my Oura data, I was effectively giving an AI agent access to intimate health information. MCP’s “standardization” amounted to JSON-RPC endpoints. That doesn’t address the deeper issue: How do you enforce “don’t share my health data” in a system that reasons probabilistically? 
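One partial answer is to enforce such rules outside the model entirely: a deterministic filter that redacts sensitive fields before tool output ever reaches the agent's context, rather than asking a probabilistic system to honor a policy. A sketch, with the field names as assumptions:

```python
# Deterministic policy layer: strip sensitive fields before the model
# sees them, instead of prompting it to "please not share" them.
# Field names are illustrative assumptions, not a real Oura schema.
SENSITIVE_FIELDS = {"heart_rate", "hrv", "sleep_stages"}

def redact(record: dict, allowed: set) -> dict:
    """Drop sensitive keys unless the user has explicitly allowed them."""
    return {k: v for k, v in record.items()
            if k not in SENSITIVE_FIELDS or k in allowed}

raw = {"date": "2025-06-01", "score": 82, "heart_rate": 47}
print(redact(raw, allowed=set()))  # {'date': '2025-06-01', 'score': 82}
```

Code like this guarantees what a prompt can only request, which is the architectural distinction protocols by themselves do not address.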

To be fair, there’s progress. Auth0 has rolled out authentication updates, and Anthropic has improved Claude’s function-calling reliability. But these are incremental fixes. They don’t resolve the architectural gap that protocols alone can’t bridge. 

The Evidence Is Piling Up 

The risks aren’t theoretical anymore. Security researchers keep uncovering cracks. 

Meanwhile, fragmentation accelerates. Merge.dev lists half a dozen MCP alternatives. Zilliz documents the “Great AI Agent Protocol Race.” Every new protocol claims to patch what the last one missed. 

Why This Goes Deeper Than Protocol Wars 

The adoption curve is steep. Academic analysis shows MCP servers grew from around 1,000 early this year to over 14,000 by mid-2025. With $50B+ in AI funding at stake, we’re not just tinkering with middleware; we’re building infrastructure on unsettled ground. 

Protocols like MCP can be valuable scaffolding. Enterprises with many tools and models do need coordination layers. But the real breakthroughs come from facing harder questions head-on: 

These problems exist no matter the protocol. And until they’re addressed, standardization risks becoming a distraction. 

The question isn’t whether MCP is useful; it’s whether the focus on protocol standardization is proportional to the underlying challenges. 

So Where Does That Leave Us? 

There’s nothing wrong with building integration frameworks. They smooth edges and create shared patterns. But we should be honest about what they don’t solve. 

For many use cases, native function calling or simple REST wrappers get the job done with less overhead. MCP helps in larger enterprise contexts. Yet the core challenges, reliability and security, remain active research problems. 

That’s where the true opportunity lies. Not in racing to the next protocol, but in tackling the questions that sit at the heart of agentic systems. 

Protocols are scaffolding. They’re not the main event. 

Learn more about Agentic AI. 

The pace of AI change can feel relentless with tools, processes, and practices evolving almost weekly. We help organizations navigate this landscape with clarity, balancing experimentation with governance, and turning AI’s potential into practical, measurable outcomes. If you’re looking to explore how AI can work inside your organization—not just in theory, but in practice—we’d love to be a partner in that journey. Request a strategy session.  

Stop Measuring AI Success by Lines of Code: The Real ROI is in the Boring Stuff 

The headlines are hard to miss: “AI-powered code generation boosting developer velocity by 30%.” Lines of code written per hour skyrocketing. Teams shipping features faster than ever. 

Yet the most significant returns aren’t showing up in those flashy metrics. The real ROI is emerging in places far less glamorous: the work that usually gets postponed, rushed, or quietly skipped. 

The Quality Underground 

While much attention is placed on code generation speed, something more consequential is happening behind the scenes. AI is proving most valuable when it tackles the tedious but essential work developers often deprioritize. 

Test creation. Documentation updates. Boilerplate scaffolding. The quiet foundations of reliable software. 

When testing becomes easier, teams actually do it. When documentation updates itself, it actually stays current. Organizations using AI-augmented testing report 50% lower costs and 60% faster test cycles¹. That’s more than efficiency. It’s a shift in quality assurance discipline. 

A clear pattern is emerging: the less exciting the task, the greater the AI payoff. 

The Multiplier Effect 

This is where traditional measurements fall short. Counting lines of code tells us little about stability. Shipping features faster is less impressive if those features fail in production. 

By contrast, metrics like test coverage and documentation completeness tell a different story. They reveal AI as a speed accelerator and a quality multiplier. 

Some organizations are already seeing dramatic improvements, with test coverage climbing from 60% to 85%, documentation kept current for the first time in years, and edge cases automatically captured. 

The takeaway is straightforward. AI makes developers quicker, and it makes the software they build more reliable. 

The Tasks That Actually Matter 

Consider the flow of software development. Writing business logic is often the easy part. The heavier lift comes in the margins: building robust test suites, maintaining documentation, handling edge cases thoroughly. 

These are the tasks that are critical for quality, slow to complete, and frequently sacrificed under pressure. They are also the exact tasks where AI thrives. 

Take test generation. Creating comprehensive tests often takes longer than the code itself, demanding developers think through failures and integration scenarios. AI can analyze code patterns, detect gaps, and generate tests that human teams might overlook. The result is not just faster coverage, but broader and more consistent coverage. 
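As a concrete, hedged illustration of that gap analysis, consider a small hypothetical helper (`safe_divide` is invented for this sketch). The tests below are the kind an AI assistant might propose: the happy path plus the failure modes that tend to get skipped under deadline pressure.

```python
# Hypothetical helper under test; not from any real codebase.
def safe_divide(numerator, denominator, default=None):
    """Divide two numbers, returning `default` when division is undefined."""
    if denominator == 0:
        return default
    return numerator / denominator

# The kind of edge-case suite an AI assistant might generate:
# the obvious case plus the boundaries humans often overlook.
def test_safe_divide():
    assert safe_divide(10, 2) == 5.0           # happy path
    assert safe_divide(10, 0) is None          # division by zero -> default
    assert safe_divide(10, 0, default=0) == 0  # caller-supplied fallback
    assert safe_divide(-9, 3) == -3.0          # negative numerator
    assert safe_divide(0, 5) == 0.0            # zero numerator

test_safe_divide()
print("all edge-case tests passed")
```

None of these cases is hard to write; the point is that a generator proposes all of them consistently, every time, for every function.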

The Measurement Revolution 

This shift creates an opening to rethink how AI success is measured. Instead of tracking raw velocity, organizations are following quality indicators: 

Test coverage across the codebase. Documentation completeness and currency. Edge cases captured by generated tests. Production defect rates. 

These indicators surface AI’s true value: not simply producing more code but producing better software. 
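A minimal sketch of what tracking such indicators might look like in practice. The coverage figures echo the example earlier in this piece; the other field names and numbers are assumptions for illustration, not output from any real tool.

```python
# Illustrative quality-indicator snapshot; field names and most figures
# are assumptions for this sketch (only the coverage jump from 60% to 85%
# comes from the article's example).
indicators = {
    "test_coverage_pct":        {"before": 60,  "after": 85},
    "docs_up_to_date_pct":      {"before": 40,  "after": 95},
    "edge_cases_in_test_suite": {"before": 120, "after": 310},
}

# Report the before/after movement for each indicator.
for name, values in indicators.items():
    delta = values["after"] - values["before"]
    print(f"{name}: {values['before']} -> {values['after']} ({delta:+d})")
```

The mechanics are trivial; the discipline is in collecting these numbers every sprint instead of counting lines of code.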

The Compound Returns 

Quality improvements have a different kind of payoff: they compound. 

Faster code generation saves time today. Stronger test coverage prevents costly failures tomorrow. Automated documentation reduces onboarding time next quarter. Better quality controls fuel faster iteration next year. 

Measured through this lens, AI’s impact becomes clearer. A 50% drop in production bugs delivers far greater financial benefit than a 50% increase in code generation speed. 
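To make that comparison concrete, here is a back-of-envelope sketch. Every number in it (bug counts, bug cost, hours, rates) is a deliberately invented assumption, not a benchmark; the point is the shape of the math, not the figures.

```python
# Back-of-envelope ROI comparison; every number below is an assumption.
bugs_per_quarter = 40
cost_per_production_bug = 5_000   # triage + fix + customer impact ($)
dev_hours_on_codegen = 500        # hours/quarter spent writing new code
hourly_rate = 100                 # fully loaded developer cost ($/hour)

# Scenario A: 50% fewer production bugs.
savings_fewer_bugs = 0.5 * bugs_per_quarter * cost_per_production_bug

# Scenario B: 50% faster code generation.
# Same output in 1/1.5 of the time, so one third of those hours are saved.
savings_faster_codegen = dev_hours_on_codegen * hourly_rate * (1 - 1 / 1.5)

print(f"50% fewer bugs:     ${savings_fewer_bugs:,.0f} per quarter")
print(f"50% faster codegen: ${savings_faster_codegen:,.0f} per quarter")
```

Under these (invented) assumptions, the bug reduction is worth several times the speed gain, and the gap widens as the cost of a production failure grows.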

The Quality Advantage 

Teams focusing here are building something rare: systematic quality improvement woven into the development process itself. 

Others may continue to compete on speed, but organizations that compete on reliability are building resilience. They’re lowering technical debt instead of accumulating it. They’re creating the conditions for sustainable experimentation. 

Over time, that advantage compounds into a moat that’s hard to cross. 

Reframing Success 

When the next report touts impressive AI coding velocity, a different question is worth asking: “What is happening to quality?” 

Because real AI transformation isn’t about developers typing faster. It’s about software that’s more dependable, because the unglamorous work is finally being done. 

Organizations that see this are measuring the right outcomes. They’re finding that the “boring” tasks create the most durable advantages. Those are often the ones that matter most when customers decide whose product they trust. 


Sources: 

  1. Unisys, ROI of Generative AI in Software Testing, 2024