AI Practical Lab | AI In The Loop: Coaching

"There were lawsuits because we didn't have human judgment in the loop around governance." — Marty Murrilo

The real work in the room

Coaches and people leaders came into this working exchange carrying a question that doesn't have a clean answer yet: where exactly does AI make coaching sharper, where does it surface the biases we can't see in ourselves, and where does it need to be kept out entirely to protect the professional integrity that makes coaching worth paying for?

The work was getting stuck in the middle of all three. Not because the tools aren't capable — but because most practitioners and leaders haven't yet built the discipline to tell the difference between those three zones in real time. And with low-cost AI coaching products already in market, and hiring algorithms already producing lawsuits, the window for getting this right is narrowing.

People shaping the work

This live interactive discussion was facilitated by Marty Murrilo and Mike Hruska, joined by Mary Ann Kennedy, alongside a room of approximately 20 certified practitioners, people leaders, and learning professionals, all navigating the AI-in-coaching question from inside active practice.

Where the conversation started: human or reverse centaur?

Mike opened with a question the room kept returning to all morning.

"Are you working for it, or is it working for you?" — Mike Hruska

He framed it through what he called the Technology Centaur model: in a healthy human-AI relationship, the human stays in the lead and AI functions as a power multiplier. The reverse — where the algorithm manages the human, the way a delivery driver is routed and monitored by an app — is already happening in workplaces, often without people noticing.

The practical question for coaches and people leaders isn't whether AI is useful. It's whether you're directing it or being directed by it. Most people in the room were somewhere in between, which is exactly where the blind spots live.

The context problem: why most AI output is mediocre

Mike Hruska brought a live example that reframed how the room was thinking about AI quality. He described a group of executives at a global company who asked AI for ideas about AI in manufacturing. They got generic output and assumed the tool was the problem.

It wasn't. The problem was context.

Mike asked them to start over — this time giving the AI a name, a role, a personal backstory, and specific details about the people in the room.

"Pretend a 20-year-old intern named Sam. Tell Sam your name, age, degree, tenure — and ask Sam to just ask you questions." — Mike Hruska

The same group came back with 157 ideas. People said they had done more forward-looking strategic thinking in that working exchange than they typically did in dedicated planning time.

The signal for people leaders: the quality gap most teams are experiencing with AI isn't a tool problem. It's a context investment problem. The people getting the best output are the ones who spend time upfront building rich personal and organizational context before asking for anything.
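For readers who script their AI prep, the context-first habit can be sketched in a few lines of Python. This is a minimal illustration, not the method Mike used in the room: the `Persona` and `Participant` classes, the `build_context_first_prompt` function, and the sample participant are all hypothetical names and placeholder data invented for this sketch. The only details taken from the session are the intern named Sam and the four facts Mike asked people to share (name, age, degree, tenure).

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Who the AI is asked to be (hypothetical structure for this sketch)."""
    name: str
    role: str
    backstory: str


@dataclass
class Participant:
    """The four facts the session suggested sharing with the AI persona."""
    name: str
    age: int
    degree: str
    tenure_years: int


def build_context_first_prompt(persona, participants, topic):
    """Assemble a prompt that states the setup before the request."""
    lines = [
        f"You are {persona.name}, {persona.role}. {persona.backstory}",
        "",
        "People in the room:",
    ]
    for p in participants:
        lines.append(
            f"- {p.name}, age {p.age}, {p.degree}, "
            f"{p.tenure_years} years at the company"
        )
    lines += [
        "",
        f"Topic: {topic}",
        # The request comes last, and it asks for questions rather than answers.
        "Before offering any ideas, ask us questions, one at a time.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    sam = Persona(
        name="Sam",
        role="a 20-year-old intern",
        backstory="You are curious and new to the company.",  # placeholder
    )
    room = [Participant("Alex", 45, "MBA", 12)]  # invented sample data
    print(build_context_first_prompt(sam, room, "AI in manufacturing"))
```

The design choice worth noticing is the ordering: persona and people first, topic next, and the request last, phrased so the AI interviews the group instead of producing a list straight away.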

AI as a mirror: what forensic transcript analysis actually reveals

Vilson Simon introduced the concept that generated the most sustained discussion of the morning — using AI not to summarize a coaching conversation, but to forensically analyze what the coach was missing during it.

"AI is augmenting my capacity to listen — and giving me feedback on what I'm missing." — Vilson Simon

He described a specific case: a CFO client he was ready to give up on after her second coaching exchange. His own judgment bias against the client was too strong, and she knew it. With the client's permission, he ran the transcript through Manus AI and asked for a forensic analysis of what was said and, more importantly, what wasn't being said clearly.

The analysis surfaced something he had filtered out entirely: the client was carrying unresolved childhood trauma, and was unconsciously treating the coaching relationship as therapy. Vilson knew he wasn't the right resource for that. But he couldn't see it in the room — because his bias was in the way.

The broader signal: coaches and managers have blind spots that live precisely in the places they're most confident. AI forensic analysis doesn't replace the judgment call. It creates the data that makes a better judgment call possible.

Vilson also named a subtler version of this problem, one rooted in language and culture rather than personal bias. As a native Portuguese speaker practicing in English, he discovered through AI analysis that he had been using a phrase that reads as gentle reassurance in Portuguese but lands as confrontational in English. The tool caught the drift his own fluency had hidden from him.

This matters for any people leader working across cultures, languages, or demographic contexts where the same words carry different social weight.

The governance gap: where human judgment isn't optional

Marty grounded the bias discussion in a harder, more consequential example. Several companies — Workday among those named in legal proceedings — fed years of successful hiring data into AI talent acquisition systems without auditing what values and assumptions were embedded in that data. The result was algorithms that systematically downgraded candidates based on gender markers in their applications, producing real legal liability.

The lesson isn't that AI can't support talent decisions. It's that AI will amplify whatever is already in your data — including the biases you haven't examined. Human judgment in the governance layer isn't a nice-to-have. It's what keeps the amplification pointed in the right direction.

The same logic applies to coaching. AI can surface patterns, flag language, and analyze what was missed. It cannot determine whether a client needs a coach, a therapist, or a direct conversation with their manager. That call belongs to the human in the room.

Protect or evolve: where the profession is actually heading

The most direct tension in the working exchange was the one between coaches who feel the pressure to protect professional standards and those who have already moved through that question toward deliberate integration.

One participant, a veteran coach about three years from her expected retirement, described the shift plainly:

"I'm offloading the stuff I don't like doing — and I am completely re-engaged."

She had been planning to coast. Instead, by automating the administrative work she disliked (feedback synthesis, session prep documentation, review drafts), she freed herself to do more of the high-value human work she had entered the profession to do. She was no longer sure she would retire on the schedule she had planned.

The room's consensus, stated plainly by another participant: "Part of the way that we protect is by evolving."

That's the practical frame. Not protectionism. Not passive adoption. Deliberate integration — where AI handles the work that doesn't require you, and you bring full human judgment to the work that does.


What to try next

The closing of the working exchange used a Head / Heart / Pocket structure — one new thought, one new feeling, one new piece of tradecraft. Here's how that maps to practical moves worth testing this week.

Head — change how you set context before the first prompt. Before your next AI interaction — whether it's a coaching prep note, a feedback synthesis, or a planning prompt — spend three minutes building context first. Name the role, the person, the history, the outcome you're working toward. Don't start with the request. Start with the setup. Measure whether output quality changes.

Heart — run one piece of your own work through forensic analysis. With appropriate permissions in place, upload one coaching transcript, team debrief, or feedback conversation to an AI tool and ask it to surface what was not said clearly — not just a summary of what was. Look specifically for where your own framing or language may have shaped the conversation in ways you didn't intend. The early evidence signal: you find at least one thing you would have handled differently.

Pocket — map where human judgment is non-negotiable in your current workflow. Draw a simple line for your team or practice. On one side: the work AI can handle (admin compression, feedback drafts, prep notes, pattern analysis). On the other: the work that requires human judgment (cultural nuance, imposter syndrome, governance calls, relationship repair). If you can't draw that line clearly right now, that's the first piece of tradecraft to build.
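If it helps to make the Pocket exercise concrete, the line can be written down as a small lookup. The sketch below is one possible shape, under stated assumptions: the two task lists are copied from the examples above, while the `Owner` enum, the `classify` and `unmapped_tasks` helpers, and the sample workflow are hypothetical names introduced only for illustration. Treating anything not yet on either list as "unmapped" is also an assumption of this sketch, chosen so that gaps in the line show up rather than defaulting silently to AI.

```python
from enum import Enum


class Owner(Enum):
    AI = "ai"
    HUMAN = "human"
    UNMAPPED = "unmapped"  # not yet placed on either side of the line


# Task lists taken from the examples in the text above.
AI_CAN_HANDLE = {
    "admin compression",
    "feedback drafts",
    "prep notes",
    "pattern analysis",
}
HUMAN_JUDGMENT = {
    "cultural nuance",
    "imposter syndrome",
    "governance calls",
    "relationship repair",
}


def classify(task):
    """Return which side of the line a task sits on."""
    key = task.strip().lower()
    # Check the human side first so a task listed on both sides stays human.
    if key in HUMAN_JUDGMENT:
        return Owner.HUMAN
    if key in AI_CAN_HANDLE:
        return Owner.AI
    return Owner.UNMAPPED


def unmapped_tasks(workflow):
    """List the tasks in a workflow that still need a decision."""
    return [task for task in workflow if classify(task) is Owner.UNMAPPED]


if __name__ == "__main__":
    # Invented sample workflow; "client intake" is deliberately unlisted.
    workflow = ["Prep notes", "Governance calls", "Client intake"]
    for task in workflow:
        print(f"{task}: {classify(task).value}")
    print("Still to decide:", unmapped_tasks(workflow))
```

Any task that comes back as unmapped is a place where the line has not been drawn yet, which is the gap the exercise is meant to expose.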

Bring your version of this work into the ELE community

If you're navigating where AI fits in your coaching practice, your manager development work, or your talent decisions — and you're not sure where the line is — that's exactly the kind of real work the ELE community exists for.

Bring the challenge. Compare signals with trusted peers. Leave with practical next moves you can use.

Submit My Challenge Now: https://www.ele.llc/faqs/share-top-of-mind-talent-challenges
