Key Points (For Readers in a Rush)

  • Benchmarks don’t tell you whether an AI is good for your work—they measure abstract tasks, not school psychology practice.

  • Interview your AI the way you would an intern: try realistic tasks like summaries, parent letters, intervention ideas, or translations.

  • Use mock or synthetic data only—never input PII or PHI unless you have a FERPA DUA or HIPAA BAA.

  • Evaluate accuracy, clarity, tone, cultural responsiveness, consistency, and practical utility.

  • Different models excel at different tasks—there is no universal “best” AI.

  • Document your evaluation process so your district or agency understands how you vetted the tool.

  • A downloadable AI “Job Interview” Rating Form is available to guide your evaluation.

Before you experiment with any AI tools—especially if you plan to use them in clinical, counseling, or school-based practice—take a moment to review this resource:

🔗 AI Tool Evaluation Checklist

This checklist helps you determine whether a tool meets ethical, legal, and professional standards. It provides practical prompts to evaluate transparency, data handling, reliability, and alignment with NASP’s guidance on AI in school psychology.

And remember:

  • Never enter any personally identifiable information (PII) into an AI tool unless it is protected under a FERPA-compliant Data Use Agreement (DUA) with your district.

  • If you work in clinical or health-related settings, never enter protected health information (PHI) into a tool unless it is HIPAA-compliant and your agency has a current Business Associate Agreement (BAA) with the vendor.

  • Even “redacted” case details can be risky—AI tools can infer or reconstruct missing information.

  • When in doubt, always use mock or synthetic data for testing.

Once you’ve confirmed that a tool is appropriate for evaluation, you’re ready to “interview” your AI—just as you would a practicum student or colleague joining your team.

Why Benchmarks Aren’t Enough

AI companies love to advertise benchmark results—scores on exams like AIME (math), MMLU (general knowledge), coding challenges, or abstract reasoning tasks. These benchmarks matter to AI developers, but they don’t tell school psychologists what they really need to know:

Can this AI help with the real tasks I face every day?

As Ethan Mollick explains in One Useful Thing, AI performance varies significantly across tasks. One model may excel at writing, another at analysis, another at generating intervention ideas. A high benchmark score in abstract reasoning doesn’t guarantee clear parent communication or appropriate tone in a sensitive email.

In other words: Benchmarks won’t tell you whether you should use the model.

That’s where the concept of “giving your AI a job interview” becomes so powerful.

Benchmarking in the Real World

Mollick highlights a large OpenAI experiment in which human experts created realistic job tasks—multi-step projects in finance, law, retail, and more. AI systems and human professionals completed the same tasks, and independent evaluators graded the results blindly.

The findings were clear:

  • Different AI models perform differently depending on the domain.

  • Some AIs consistently take more risks; others are more conservative.

  • No single model is “best” across all real-world tasks.

For school psychologists, this matters.

The AI that produces great counseling scripts may not be the best at drafting a psychoeducational summary. The model that writes a warm parent letter might be unreliable when generating progress-monitoring guidance.

So instead of relying on public benchmarks, school psychologists should evaluate AIs based on their own tasks, using criteria that reflect their own professional standards.

How to “Interview” an AI for School Psychology

Here’s a practical, step-by-step approach adapted for school psychology.

Step 1. Define the Job Description

What do you actually want the AI to do?

Examples:

  • Draft a parent-friendly explanation of assessment results

  • Suggest evidence-based classroom supports

  • Rewrite a paragraph in clear, culturally responsive language

  • Generate a counseling activity aligned with a student goal

  • Summarize open-ended teacher input

  • Draft a communication that aligns with confidentiality expectations

Each task represents a different “role” the AI is interviewing for.

Step 2. Create Realistic Interview Tasks

Give the AI prompts similar to your everyday work, but use mock data, at least at first, and apply sound prompting techniques.

Try things like:

  • Report Summary Task

    “Write a three-sentence summary of cognitive results for a student (mock data provided). The audience is a parent with no psychology background.”

  • Intervention Task

    “Generate three classroom supports for a student with difficulty sustaining attention during independent work.”

  • Professional Communication Task

    “Draft a brief email explaining what happens next in the evaluation process.”

  • Translation Task

    “Rewrite this technical paragraph so a 6th-grade student can understand it.”

Run the same prompt through several models if you have access to more than one. You’ll quickly see differences.
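If you want to make this comparison systematic, the idea can be sketched in a few lines of code. This is a minimal illustration, not a real integration: the `model_a` and `model_b` functions below are hypothetical stand-ins, and in practice each would call a different AI tool's interface with your mock-data prompt.

```python
# Sketch: run one "interview task" through several models and collect the
# outputs side by side for review. The model functions are stubs; in real
# use, each would wrap a call to a different AI tool.

def model_a(prompt: str) -> str:
    # Placeholder for a real model call (hypothetical)
    return f"[Model A response to: {prompt}]"

def model_b(prompt: str) -> str:
    # Placeholder for a second model (hypothetical)
    return f"[Model B response to: {prompt}]"

def run_interview_task(prompt: str, models: dict) -> dict:
    """Send the same mock-data prompt to each model; return outputs keyed by name."""
    return {name: fn(prompt) for name, fn in models.items()}

task = ("Write a three-sentence, parent-friendly summary of these "
        "mock cognitive results: ...")
results = run_interview_task(task, {"Model A": model_a, "Model B": model_b})
for name, output in results.items():
    print(f"--- {name} ---\n{output}\n")
```

The point of the sketch is simply that identical prompts, held constant across tools, let you compare outputs fairly, just as you would give each job candidate the same interview questions.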

Step 3. Evaluate the Results Like a Supervisor

Treat this like reviewing a trainee’s work. Look for:

  • Accuracy

    Does it reflect accepted school psychology practice?

  • Clarity

    Is the language accessible to families and teachers?

  • Tone

    Does it sound professional, respectful, and neutral?

  • Cultural/Linguistic Responsiveness

    Does it avoid stereotypes and use inclusive examples?

  • Consistency

    Repeat the same prompt—does the model stay stable?

  • Practical Utility

    Would you actually use or adapt the output?

A downloadable AI “Job Interview” Rating Form is available for this purpose.

Step 4. Rerun the Interview

AI models can produce different answers each time.

Run each task multiple times to see:

  • whether the model changes tone

  • whether its reasoning is stable

  • whether it introduces errors inconsistently

Reliability is part of competence.
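For readers who like to quantify this, the rerun step can also be sketched in code. This is an illustrative toy, assuming a stub model and a deliberately simple shared-word overlap measure (not a validated similarity metric); real AI tools are nondeterministic, which is exactly why repeated runs matter.

```python
# Sketch: rerun the same mock-data prompt several times and estimate how
# stable the outputs are. The stub model varies its wording across runs to
# mimic real model nondeterminism.

import random

def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a real AI call; phrasing varies per run
    openers = ["Overall,", "In summary,", "Broadly,"]
    return f"{random.choice(openers)} the results suggest age-typical performance."

def consistency_check(model, prompt: str, runs: int = 5) -> float:
    """Average pairwise word-overlap (0 to 1) across repeated runs."""
    outputs = [model(prompt) for _ in range(runs)]
    word_sets = [set(o.lower().split()) for o in outputs]
    pairs, total = 0, 0.0
    for i in range(len(word_sets)):
        for j in range(i + 1, len(word_sets)):
            total += len(word_sets[i] & word_sets[j]) / len(word_sets[i] | word_sets[j])
            pairs += 1
    return total / pairs if pairs else 1.0

score = consistency_check(stub_model, "Summarize these mock results for a parent.")
print(f"Average output similarity across runs: {score:.2f}")
```

A score near 1.0 means the model is saying essentially the same thing each time; a low score flags the kind of instability you would want to catch before relying on the tool.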

Step 5. Make Your Hiring Recommendation

After reviewing accuracy, clarity, tone, and reliability:

  • Would you “hire” this AI tool for this specific task?

  • Would you use it only under close supervision?

  • Or is it not a good fit?

Different tasks may lead you to different conclusions.

Some AIs will be your best “assistant” for writing.

Others may shine in intervention brainstorming.

Others may be good only for outlining.

This is exactly the point:

There is no universal best AI—only the right AI for the right job.

A Final Word of Caution

Even the best AI tools require professional judgment and oversight.

To use AI ethically and responsibly:

  • Never input real student information into a system without FERPA- or HIPAA-compliant agreements.

  • Use mock or synthetic data for all evaluations.

  • Review and revise all AI-generated content before use.

  • Document your evaluation process so your district or supervisor understands how you vetted the tool.

When school psychologists treat AI as an intern whose work must be reviewed—not a black box to trust blindly—we move toward a model of practice where technology truly enhances, rather than undermines, our ethical and professional responsibilities.

AI Use Disclosure

Portions of this blog post were generated with the assistance of artificial intelligence tools to support drafting, organization, and editing. All content was reviewed, revised, and approved by the author to ensure accuracy, ethical alignment, and professional relevance. No student, client, or confidential information was entered into any AI system during the creation of this post.
