Jacob Bank is the CEO of Relay.app, a modern agent builder platform that makes it simple and easy for anyone to build AI agents.
As part of eating his own dog food he’s personally deployed hundreds of agents within Relay to run his marketing, support, and executive assistant workflows.
This article explains how to think about agents as employees, how to start building them today, and how to create trust in agents across your organization while mitigating risk.
It is a tutorial from a recent live class given by Jacob.
Let’s get into it now.
What you’ll learn from Jacob Bank
👉Sign up for more live classes
📽️Follow us on YouTube
Choosing between chatbot, co‑pilot, and agent
There are three distinct AI modalities you will use daily.
Each modality fits different responsibilities and failure modes.
Learn the differences and pick the right one for each task.
- Chatbot: interactive, synchronous, text-based. Use it for one-off research, ideation, or drafting tasks.
- Co‑pilot: an assistant inside a tool that helps create or edit artifacts. Use it when you are present and shaping the work in a document, prototype, or IDE.
- Agent: an autonomous doer. Use it when work must be proactive, integrated into systems, and repeated.
Use agents only when three conditions are true.
First, the agent must act proactively.
Second, it must integrate directly with your systems.
Third, the task must be repeated over days or weeks and improve with iteration.
“If you have something that’s proactive, integrated, and repeated, designing an AI agent is likely the right tool for the job.”
Examples that do not need agents include one-off PRDs and isolated drafting tasks.
Examples that do need agents include ongoing Reddit listening, automated competitor surveillance, or daily meeting follow-ups.
Checklist: Agent readiness and rollout
This checklist can help as a rubric when evaluating whether to build an agent or use a chatbot/co‑pilot.
- Repetitive task: the task repeats weekly or daily and would benefit from automation.
- Integration needed: the task requires integration with internal or public systems.
- Tightly scoped: you can specify triggers and outputs in the what / when / how format.
- Low-risk pilot: you can safely pilot the agent using public data or a limited internal scope.
- Plan B: you have a clear rollback plan and logging for every agent action.
- Accountable human: you identify an owner to monitor performance and iterate on prompts.
- Human supervisor: you define human-in-the-loop checkpoints for high-risk outputs.
- Metrics in place: you measure success criteria tied to time savings or revenue impact.
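As a minimal sketch, the checklist above can be expressed in code. The names here (`TaskCandidate`, `agent_ready`) are hypothetical, not part of any platform: the first three items decide whether to build an agent at all, while the rest gate the rollout.

```python
from dataclasses import dataclass

@dataclass
class TaskCandidate:
    """A candidate task, scored against the agent-readiness checklist."""
    repeated_weekly: bool     # Repetitive task
    needs_integration: bool   # Integration needed
    tightly_scoped: bool      # what / when / how can be specified
    low_risk_pilot: bool      # can pilot on public or limited data
    has_owner: bool           # accountable human identified

def agent_ready(task: TaskCandidate) -> bool:
    # Repeated + integrated + tightly scoped are hard requirements;
    # the remaining items shape the rollout, not the build decision.
    return task.repeated_weekly and task.needs_integration and task.tightly_scoped

reddit_listening = TaskCandidate(True, True, True, True, True)
one_off_prd = TaskCandidate(False, False, True, True, True)
print(agent_ready(reddit_listening))  # a repeated, integrated task qualifies
print(agent_ready(one_off_prd))       # a one-off drafting task does not
```

A failing hard requirement points you back to a chatbot or co‑pilot instead.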
Pick your agent platform based on the technical skills of your team and your integration needs. There is no perfect product, only the one that best matches your constraints.
- n8n: best for technical teams who want open source, self-hosting, and custom integrations.
- Zapier: broadest integration coverage and enterprise trust profile. Good for conservative organizations.
- Make: flexible visual builder with many prebuilt connectors; review pricing and fit for scale.
- Relay, Relevance, Lindy: modern agent builders focused on teams that want to create autonomous agents quickly.
Many teams succeed by starting with Relay or Zapier to validate the value, then migrating to more flexible platforms once the workflows become core to operations.
How to design your org chart
It’s best to think of agents as teammates who act for you, not tools you visit.
An agent is like a human hire, in that it runs autonomously, integrates into your systems, and repeats tasks over time.
Start with: “If I could hire 5 great juniors tomorrow, what would I ask them to own?”
Look for work that is:
- High volume
- Boring for humans, but valuable for the business
- Clearly definable: you can describe what “good” looks like
Then literally sketch an org chart:
- Executive Assistant
- Support Triage
- Competitive Analyst
- Social Media Marketer
- Data Analyst
- User Researcher
- ….
Each bullet or box on that org chart is an AI agent you’ll eventually build.
Designing teams this way changes staffing decisions and unlocks large productivity gains.
- Why that matters: agents provide leverage that drastically scales the work each human employee can produce.
- Real example: Jacob runs marketing, support, HR, and recruiting internally at Relay with AI agents while a small core team of humans focuses on product development.
- Organizational impact: teams can either do more with the same headcount or maintain output with fewer people.
“I am all of those departments at our company.”
Similarly, when you’re onboarding, deploying, and running your agents, it’s best to adopt the mental model you use for hiring humans.
Give agents clear roles and permissions.
Treat them like contractors during onboarding and like teammates when they consistently deliver high quality work.
Write a task-oriented agent job description
You want to hire your agent the way you hire a human.
Write a clear, task-focused job description that the agent can execute.
Keep responsibilities short and deterministic.
What do you want it to do?
When should it wake up?
How should it do the work?
Examples:
Competitive analyst
- What: Track competitor activity
- When: When a competitor posts a new YouTube video, LinkedIn post, or blog article
- How: Summarize the content, tag themes and post in a Slack channel
Social media marketer
- What: Repurpose my best content
- When: every week
- How: Find top-performing posts from ~6 months ago that can be reposted
The JD becomes your build spec:
What: the goal of your agent
When: the trigger for your agent
How: the steps that follow the trigger
Use the what / when / how pattern for every responsibility. This pattern maps directly to triggers and actions inside automation tools.
- What: name the task in two to five words.
- When: describe the trigger or timing condition that causes the agent to act.
- How: describe the output or steps the agent should perform.
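Because every responsibility follows the same three-part shape, the JD maps cleanly onto a data structure. This is an illustrative sketch (the `Responsibility` class is my own naming, not a platform API): an agent is just a named list of what / when / how entries, which is exactly the spec you paste into a builder.

```python
from dataclasses import dataclass

@dataclass
class Responsibility:
    what: str   # the task, named in two to five words
    when: str   # the trigger or timing condition
    how: str    # the output or steps the agent performs

# An agent's job description is a list of responsibilities -- the build spec.
executive_assistant = [
    Responsibility("Triage email", "on new incoming email",
                   "decide if reply is required and apply labels"),
    Responsibility("Meeting follow-ups", "after meetings end",
                   "draft follow-up email and post internal Slack summary"),
]

for r in executive_assistant:
    print(f"{r.what}: WHEN {r.when} -> {r.how}")
```

Keeping the spec this structured is what makes each line translate directly into a trigger plus actions inside an automation tool.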
More examples:
- Follow up after meetings: When a meeting ends — Use AI to review transcript and send a follow-up email.
- Analyze YouTube videos: When a competitor posts — Use AI to summarize and send the summary to Slack.
- Triage support email: When a new support email arrives — Categorize, classify as bug or feature, and respond if confident.
Limit each agent to three or four responsibilities during initial rollout.
“If you can’t build a workflow for it in 15 minutes, move on to the next one.”
Smaller agents are easier to reason about, test, and iterate. You can increase scope once trust and telemetry exist.
Six high-value agent job templates and responsibilities
Below are concrete job descriptions you can copy, adapt, and deploy today. Each description gives four responsibilities expressed in the what / when / how format.
Executive Assistant
- What: Triage email — When: on new incoming email — How: decide if reply is required and apply labels.
- What: Meeting briefings — When: before each meeting — How: summarize attendees, past correspondence, and talking points.
- What: Meeting follow‑ups — When: after meetings end — How: draft follow-up email and post internal Slack summary.
- What: Task reminders — When: daily — How: identify overdue tasks and notify via Slack.
Competitive Analyst
- What: Pricing tracker — When: monthly — How: capture competitor pricing changes and report deltas.
- What: Launch digest — When: weekly — How: summarize competitor product launches and implications.
- What: Review and sentiment scraping — When: monthly — How: search G2, Reddit, and other review sites and extract quotes and sentiment.
- What: Target customer analysis — When: monthly — How: scan case studies and social posts to infer competitor customer focus.
Social Media Marketer
- What: Brand mentions — When: continuously — How: capture mentions, classify sentiment, and escalate issues.
- What: Content repurposing — When: new long-form content published — How: create clips and posts for other channels.
- What: Topic scouting — When: weekly — How: find emerging topics and high-performing formats in your niche.
- What: Evergreen recycling — When: monthly — How: reshare top-performing posts from six months ago with minimal edits.
Data Analyst
- What: Weekly metrics report — When: weekly — How: compile activation, churn, and sales metrics and highlight anomalies.
- What: Alerts — When: metric threshold breached — How: send change summary, root-cause hypotheses, and suggested next steps.
- What: Benchmarking — When: quarterly — How: compare your metrics to public benchmarks and highlight gaps.
- What: Customer segmentation — When: monthly — How: profile top-performing customers by industry, company size, and usage.
User Researcher
- What: Support-ticket insights — When: daily — How: synthesize recurring usability issues and tag severity.
- What: Forum listening — When: daily — How: report organic customer feedback from forums and social media.
- What: Interview synthesis — When: after each scheduled user interview — How: summarize themes, sentiment, and representative quotes.
- What: PRD impact review — When: PRD published — How: note which customer problems the PRD addresses and which it misses.
Project Manager
- What: Weekly status reports — When: weekly — How: compile progress, blockers, and next steps from ticketing systems.
- What: Overdue alerts — When: issue passes due date — How: notify owners and suggest re-prioritization.
- What: Action‑item logging — When: meeting ends — How: capture action items and assign follow-ups automatically.
- What: Progress analytics — When: weekly — How: generate burn-down charts and variance analysis.
Use these to seed early pilots that build credibility inside your organization.
Example: Building an Executive Assistant Agent in Relay
This follows the job description approach and translates responsibilities into workflows.
- Open the agent builder and click “Create agent”.
- Provide a short title, for example “Executive Assistant”.
- Enter responsibilities using the what / when / how pattern. Keep lines concise and deterministic.
- Create a first skill by copying a responsibility verbatim from your job description into the workflow co‑pilot. This becomes the zero state, or foundational prompt.
- Confirm the trigger, AI step, and actions the co‑pilot suggests. Edit conditionals as needed.
- Test the skill with representative events and iterate on the prompt.
- Turn the skill on once tests are reliable, and capture telemetry on every run.
Key developer details to check during build:
- Confirm triggers use the correct calendar, inbox, or public feed.
- Verify the AI step has access to necessary inputs from earlier steps.
- Implement conditional paths to avoid unwanted actions.
- Add a human approval step where actions carry high risk.
- Record outputs so you can evaluate agent decisions later.
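The developer checklist above can be sketched as a single testable function. This is a hand-rolled illustration, not Relay's internals: the trigger event, AI step, conditional path, approval gate, and action log are all injected, so you can run the skill against historical events before turning it on.

```python
def run_skill(event, classify, act, approve, log):
    """One agent run: trigger event -> AI step -> conditional action.

    classify(event) -> (decision, rationale); act(event) performs the action;
    approve(event) is the human-in-the-loop gate for high-risk paths;
    log records every run so decisions can be evaluated later.
    """
    decision, rationale = classify(event)
    log.append({"event": event, "decision": decision, "rationale": rationale})
    if decision == "act" and approve(event):  # conditional path + approval gate
        act(event)
        return "acted"
    return "skipped"

log = []
result = run_skill(
    event={"subject": "pricing question"},
    classify=lambda e: ("act", "customer email needs a reply"),
    act=lambda e: None,
    approve=lambda e: True,  # in production: a real human approval step
    log=log,
)
print(result, len(log))
```

Because every dependency is a parameter, the same skill runs unchanged against dummy data during testing and live systems after rollout.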
“An agent works on your behalf on its own.”
Example prompt and workflow templates
Best practice: keep prompts short and precise.
Examples:
- Meeting briefing zero state prompt: “Every morning, find meetings for the day, summarize attendees and recent correspondence, highlight decisions and actions, and send as a Slack DM.”
- Email triage zero state prompt: “When a new email arrives, determine whether a reply is required. If yes, prepare a draft and label it ‘to reply’. If no, categorize and file.”
- Competitor monitoring prompt: “Weekly, scan competitor sites for pricing or feature changes. Summarize differences and potential impact on our product.”
Modify these prompts with guard rails for privacy, rate limits, or format constraints.
Add test examples in the builder and refine your prompts until the test outputs match your expectations.
Testing, evals, and human‑in‑the‑loop patterns
Testing agents requires both simulations and human review.
Build tests from historical examples to validate agent behavior before live rollout, and always test before you deploy:
- Dummy data: Test prompts with representative historical events or sample messages.
- Evals: Create small eval suites of ten to twenty examples that capture edge cases.
- Metrics: Measure both classification accuracy and the appropriateness of generated content.
- Iterate: Use evaluation results to update prompts and rules iteratively.
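A small eval suite needs very little machinery. This sketch (function names are my own) scores an agent against ten to twenty labeled historical examples and returns the misses, which is what you iterate on when refining the prompt.

```python
def run_evals(agent, cases):
    """Score an agent against a small labeled eval suite.

    `cases` is a list of (event, expected_label) pairs drawn from history;
    `agent` maps an event to a label. Returns accuracy plus the misses.
    """
    misses = [(e, exp, agent(e)) for e, exp in cases if agent(e) != exp]
    accuracy = 1 - len(misses) / len(cases)
    return accuracy, misses

# Hypothetical triage agent: classify a support email as bug vs feature.
toy_agent = lambda email: "bug" if "crash" in email else "feature"
cases = [
    ("app crash on login", "bug"),
    ("please add dark mode", "feature"),
    ("export to CSV would help", "feature"),
    ("crash when saving", "bug"),
]
accuracy, misses = run_evals(toy_agent, cases)
print(accuracy, len(misses))
```

Classification accuracy is only half the picture; for generated content, pair this with human review of appropriateness, as noted above.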
Human-in-the-loop patterns you should implement for any case that carries risk:
- Inline approval: the AI drafts content and asks a human to accept, edit, or request refinement.
- Gate approval: a named approver must approve the draft before release.
- Preview-only: send outputs to a private channel for observation during early runs.
Example:
“When the AI produces the daily briefing, it’s going to ask the human to make one of three choices…”
Example human choices you could include are send, edit, or retry with refinement prompts.
Use these choices to train the agent and capture preferred patterns over time.
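The send / edit / retry choice is a small loop. This is an illustrative sketch, assuming a `decide` callable that stands in for the human and a `refine` callable that stands in for the AI step:

```python
def inline_approval(draft, decide, refine):
    """Inline approval loop: the AI drafts, a human chooses
    'send', 'edit', or 'retry' until a draft is released.

    decide(draft) -> ('send', None) | ('edit', edited_text)
                   | ('retry', refinement_prompt)
    refine(draft, prompt) -> a new draft from the AI step (stubbed here).
    """
    while True:
        action, payload = decide(draft)
        if action == "send":
            return draft
        if action == "edit":
            return payload
        draft = refine(draft, payload)  # retry with a refinement prompt

# Simulated run: the human first asks for a refinement, then sends.
choices = iter([("retry", "make it shorter"), ("send", None)])
final = inline_approval(
    "Long daily briefing...",
    decide=lambda d: next(choices),
    refine=lambda d, p: "Short daily briefing.",
)
print(final)
```

Logging which choice was made on each run is what lets you capture preferred patterns over time.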
Creating trust & confidence: safety, permissions, and controls
Trust is one of the biggest barriers to deploying agents at scale. But it can be managed by treating agent security the same way you treat a human hire.
Just as you would with an intern, or new hire, limit your agent’s access strictly to the systems required for the responsibilities assigned:
- Least privilege: grant only the minimal permissions required for the task.
- Dedicated accounts: create agent-specific credentials when you need fine-grained control.
- Mandate OAuth for mailbox and calendar access: Never paste raw credentials into a third-party form.
- Operate with a public data first mentality: if IT resists, start with agents that only use public social and web data.
Adopting agents is a major organizational shift. Recommended strategies to manage risk, create confidence and get buy in:
- Start with public-data agents to build a business case for internal access.
- Document responsibilities, triggers, and failure modes for each agent.
- Offer training sessions for colleagues who will collaborate with agents.
- Assign an owner for each agent who monitors logs and runs weekly reviews.
Where there is risk, focus your security review on operational failure modes rather than model training concerns.
The largest risk is bad instructions that cause mass actions.
Example risk: “Could my AI agent send 1,000 emails to the wrong person?”
Mitigating that sort of risk:
- Conditions: Use conditional logic to prevent broad broadcasts until the agent is well‑trusted.
- Human in the loop: Add human approvals for high-stakes messages, such as customer-facing mass emails.
- Tests: Run initial pilots on low-risk channels like internal Slack or private mailing lists.
- Logs: log every action with rationale to enable rapid rollback and root-cause analysis.
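Those four mitigations compose naturally. As a minimal sketch (the threshold and function names are assumptions for illustration), a conditional guard plus an action log is enough to stop the 1,000-wrong-emails failure mode:

```python
def guarded_send(recipients, message, send, log, max_recipients=5):
    """Conditional guard + action log: block mass broadcasts until the
    agent is trusted, and record a rationale for every run so a bad
    instruction can be rolled back and root-caused."""
    allowed = len(recipients) <= max_recipients
    log.append({
        "n_recipients": len(recipients),
        "allowed": allowed,
        "rationale": "within limit" if allowed
                     else "blocked: route to human approval",
    })
    if allowed:
        for r in recipients:
            send(r, message)
    return allowed

sent, log = [], []
ok = guarded_send(["a@example.com"], "hi",
                  send=lambda r, m: sent.append(r), log=log)
blocked = guarded_send([f"u{i}@example.com" for i in range(1000)], "hi",
                       send=lambda r, m: sent.append(r), log=log)
print(ok, blocked, len(sent))
```

In a real deployment the blocked branch would notify a human approver rather than silently drop the send.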
These actions reduce risk more effectively than debating data locality or third-party training policies alone. They address the operational harms that cause real business damage and erode faith in agents as colleagues.
Useful metrics to track for each agent
- Run frequency and latency: Track how often and how quickly the agent executes tasks.
- Success rate and human edits: Measure how often outputs are accepted, edited, or rejected.
- False positives and false negatives: For triage agents, capture classification error rates.
- Time saved per run: Estimate human hours saved by automating the task.
- Cost per run: Monitor compute credits and compare against value delivered.
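If every run is logged, the metrics above fall out of a simple roll-up. This sketch assumes a log of `{'latency_s', 'outcome'}` records; the minutes-saved and cost-per-run figures are illustrative estimates you would replace with your own:

```python
def agent_metrics(runs, minutes_saved_per_accept=15, cost_per_run=0.30):
    """Roll run logs into the per-agent tracking metrics.

    Each run is {'latency_s': float,
                 'outcome': 'accepted' | 'edited' | 'rejected'}.
    """
    n = len(runs)
    accepted = sum(r["outcome"] == "accepted" for r in runs)
    edited = sum(r["outcome"] == "edited" for r in runs)
    return {
        "runs": n,
        "avg_latency_s": sum(r["latency_s"] for r in runs) / n,
        "success_rate": accepted / n,      # accepted without edits
        "edit_rate": edited / n,           # human edits required
        "hours_saved": accepted * minutes_saved_per_accept / 60,
        "total_cost": n * cost_per_run,
    }

runs = [
    {"latency_s": 20, "outcome": "accepted"},
    {"latency_s": 40, "outcome": "edited"},
    {"latency_s": 30, "outcome": "accepted"},
    {"latency_s": 30, "outcome": "rejected"},
]
m = agent_metrics(runs)
print(m["runs"], m["success_rate"], m["avg_latency_s"])
```

Comparing `total_cost` against `hours_saved` is the apples-to-apples ROI check discussed below.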
Scaling agents, economics, and operational change
Agents change how you staff and prioritize.
They introduce a third option beyond hiring or reprioritizing: create automation to increase capacity.
- Small pilots often deliver outsized ROI: agents can produce high-quality work cheaply.
- Don’t hyper-optimize credits early: Focus on correctness, not micro-cost savings.
- Examples of scale: agents that synthesize support themes, schedule rotations, and follow up with customers.
Cost of running agents – is it worth it?
- Apples to apples: Compute credits are inexpensive versus the cost of human labor or external agencies.
- When to seek cost efficiencies: Optimize only when workflows run at high volume and incremental savings matter.
- Measuring ROI: Measure agent impact in saved hours, faster response times, improved quality, and revenue effects.
“300 AI credits is like 30 cents, and this wrote a fifteen-page report for you.”
What worked for Jacob
Examples you can emulate:
- Marketing org chart: teams built per channel, each with three to ten agents handling creation, repurposing, and listening tasks.
- Support workflows: agents triage tickets, classify issues as bugs or feature requests, summarize daily themes, and schedule frontline rotations.
- Executive assistant: before meetings, agents prepare briefings. After meetings, agents draft follow-ups and post summaries to Slack.
- Sick day automation: a founder sends “sick day” in Slack and an agent informs all upcoming meeting participants with a polite reschedule message.
These examples show that agents can execute complex, multi-step processes without human intervention once trust is established.
Wrap up
Start small, iterate quickly, and focus on trust.
Design agents with precise job descriptions, rigorous tests, and human approvals for risky actions.
Deploy agents on public data to build momentum with conservative teams. Scale agents as trust and value become clear.
“Start using public data. Build confidence that this thing is actually valuable.”
Getting started today:
- Write one agent job description using what / when / how format.
- Choose a platform and build a single workflow in under 15 minutes.
- Run the workflow in preview mode against historical data and refine prompts.
- Add one human-in-the-loop gate for the first production run.
- Measure outcomes and iterate monthly.
More Hustle Badger Resources
More live classes
Articles
Cohorts
On demand courses
More Resources from Jacob