AI for Program Evaluation: How School Psychologists Can Build Better Surveys Faster


Jun 30

Key Points (For Readers on the Go)

School psychologists often know they should conduct more program evaluation, but time and technical barriers get in the way.
AI tools can help draft survey items, organize response options, and prepare materials for survey-building platforms.
Codex, OpenAI's coding agent, may be able to help with the implementation layer, including logic, validation, response requirements, and quality checks.
This workflow is not limited to Qualtrics. It may apply to tools like Google Forms, SurveyMonkey, Microsoft Forms, REDCap, Jotform, or other approved platforms, depending on the features available.
When no sensitive or identifiable information is involved, Codex may also be able to help download survey results, analyze the data, and create simple visual summaries.
AI does not replace survey design, privacy review, ethical judgment, or careful testing.

Introduction: Why This Matters

AI for program evaluation may sound like a niche topic, but I think it has real implications for school psychologists.

School psychologists are constantly asked to make data-informed decisions. We support consultation, evaluate services, help design interventions, contribute to grants, participate in school improvement efforts, and support family-school collaboration. In theory, program evaluation should be a regular part of that work.

In practice, it often gets pushed aside.

Not because school psychologists do not value data. We do. The problem is time. Building a survey, setting up logic, testing pathways, and making sure the data will actually be usable can take hours. For many practitioners, that is enough of a barrier to stop the project before it starts.

A couple of months ago, I had one of those moments where I realized the workflow may be changing faster than I expected.

I had already been using ChatGPT to help create text files that could be uploaded into a survey platform. That alone was useful. But then it hit me: if ChatGPT can help create the survey file, why couldn’t Codex help finish more of the actual survey build?

So I tried it.

And honestly, it was one of those “wait, this changes things” moments.

The Program Evaluation Barrier Is Often Not the Idea

Most school psychologists can identify useful program evaluation questions quickly.

For example:

Did a professional learning session actually change staff confidence or practice?
Are families experiencing the evaluation process as clear and respectful?
Do caregivers understand their rights and next steps after an evaluation meeting?
Are teachers finding consultation useful?
What training do staff need next?

These are good questions. They are also practical questions.

The barrier is rarely the idea. The barrier is usually the build.

A survey has to be written, organized, uploaded, formatted, tested, and sometimes approved. Then it has to work correctly. Respondents should only see the questions that apply to them. Required items need to be required. Optional items need to remain optional. “Select up to three” should mean up to three, not exactly three.

That sounds small until you are 90 minutes into clicking through survey settings one item at a time.

What I Had Already Been Doing With ChatGPT

Earlier, I was using ChatGPT to help generate uploadable survey files.

That was already a major time saver.

Instead of manually entering every question, every response option, and every block, I could develop the survey content in ChatGPT, revise it, and generate a file that could be imported into the survey platform.
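To make the idea of an uploadable file concrete, here is a minimal Python sketch that turns a short list of items into plain import text. The tag names (`[[AdvancedFormat]]`, `[[Block:...]]`, `[[Question:MC]]`, `[[Choices]]`) are an assumption based on the Qualtrics Advanced Format TXT import and should be checked against the platform's current documentation. The items and the `build_import_text` function are hypothetical.

```python
# Hypothetical survey content; replace with your own items.
ITEMS = [
    {"block": "Training Feedback",
     "text": "How useful was the training?",
     "choices": ["Not useful", "Somewhat useful", "Very useful"]},
    {"block": "Training Feedback",
     "text": "What is your role?",
     "choices": ["Teacher", "Administrator", "Related-service provider"]},
]

def build_import_text(items):
    """Render items as import text (tag names are assumed, not verified)."""
    lines = ["[[AdvancedFormat]]"]
    current_block = None
    for item in items:
        # Start a new block only when the block name changes.
        if item["block"] != current_block:
            current_block = item["block"]
            lines += ["", f"[[Block:{current_block}]]"]
        lines += ["", "[[Question:MC]]", item["text"], "[[Choices]]"]
        lines += item["choices"]
    return "\n".join(lines) + "\n"

survey_text = build_import_text(ITEMS)
print(survey_text)
```

Saving `survey_text` to a .txt file gives something to try importing into a test survey, never directly into a live one.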

For long surveys, that can save a tremendous amount of time.

But creating the survey items is only the first layer.

The harder part often comes next.

This Is Not Really About One Survey Platform

In my case, I was working in Qualtrics. But the bigger point is not about Qualtrics.

The same general workflow could apply to many survey tools, including Google Forms, SurveyMonkey, Microsoft Forms, REDCap, Jotform, or other district- or university-approved platforms. The specific features vary by tool. Some platforms allow more complex branching, validation, embedded data, or carry-forward choices than others.

But the larger idea is the same.

AI can help move a survey from concept to implementation by supporting the tedious setup work that often slows people down.

That may include:

organizing questions into sections
setting required or optional responses
adding branching or skip logic
configuring “select up to” limits when the platform supports them
creating follow-up questions based on prior responses
checking whether the survey pathway makes sense
identifying places where the survey might confuse respondents
helping prepare a testing checklist before launch

The exact process will depend on the platform. A sophisticated research survey in Qualtrics or REDCap may require more complex logic than a quick staff feedback form in Google Forms. But for many school-based program evaluation tasks, the practical benefit is similar: AI can reduce the setup burden.

The Part I Had Been Missing

After the questions are in the survey platform, there is still a lot of setup work to do.

That includes:

setting response requirements
adding display logic
adding skip logic
validating “select up to three” or “select up to five” items
showing follow-up questions only when they are relevant
checking that consent and eligibility items are handled correctly
confirming that participants cannot accidentally skip critical items
testing whether each survey pathway works as intended

This is exactly the kind of work that is easy to underestimate.

It is also exactly the kind of work that can quietly introduce errors.

For example, if a professional development feedback survey includes role-specific questions, teachers, administrators, and related service providers may need to see different follow-up items. If a family feedback survey asks whether a caregiver attended an eligibility meeting, only those who attended should receive detailed questions about the clarity of that meeting.
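Written down as rules, that kind of display logic might look like the sketch below. It folds the two examples into one toy survey. The question IDs, answer labels, and the `visible_followups` helper are hypothetical, and a real platform stores these conditions in its own settings rather than in Python.

```python
# Each rule maps a follow-up question to the condition that makes it relevant.
# Question IDs and answer labels are hypothetical.
DISPLAY_RULES = {
    "meeting_clarity": lambda a: a.get("attended_meeting") == "Yes",
    "classroom_implementation": lambda a: a.get("role") == "Teacher",
    "systems_support": lambda a: a.get("role") == "Administrator",
}

def visible_followups(answers):
    """Return the follow-up questions a respondent should see, sorted by ID."""
    return sorted(q for q, applies in DISPLAY_RULES.items() if applies(answers))

print(visible_followups({"role": "Teacher", "attended_meeting": "No"}))
# ['classroom_implementation']
print(visible_followups({"role": "Administrator", "attended_meeting": "Yes"}))
# ['meeting_clarity', 'systems_support']
```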

That logic is not conceptually difficult. But in a survey platform, it can become tedious fast.

That is where Codex became surprisingly useful.

How Codex Helped With the Survey Build

In my case, I gave Codex access to the survey-builder link and provided a detailed prompt explaining what needed to happen. The platform happened to be Qualtrics, but the broader workflow is not limited to Qualtrics.

The same concept may apply to other survey tools, depending on what the platform allows and what kind of access Codex has.

I asked Codex to inspect the survey and configure the response requirements, display logic, skip logic, carry-forward choices, and validation rules.

For example, I told Codex that:

consent and eligibility items should use Force Response
most other answerable items should use Request Response
“select up to three” items should allow no more than three selections
“select up to five” items should allow no more than five selections
follow-up questions should appear only when they apply
role-specific questions should display only to the relevant respondent groups
disclosure or explanation items should appear only when a prior response makes them relevant
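One way to keep instructions like these unambiguous is to write them down as a small spec before handing them to any tool. The sketch below is not what Codex produced. It is a hypothetical illustration of how Force Response, Request Response, and a "select up to three" limit differ, using made-up question IDs.

```python
# Hypothetical build spec: "force" blocks submission, "request" only prompts.
SPEC = {
    "consent": {"requirement": "force"},
    "confidence": {"requirement": "request"},
    "top_needs": {"requirement": "request", "max_choices": 3},
}

def check_response(response):
    """Return (errors, reminders) for one respondent's answers."""
    errors, reminders = [], []
    for qid, rule in SPEC.items():
        answer = response.get(qid)
        if answer in (None, "", []):
            if rule["requirement"] == "force":
                errors.append(f"{qid}: answer required")
            else:
                reminders.append(f"{qid}: unanswered (allowed)")
            continue
        # "Up to N" means zero through N selections, never exactly N.
        limit = rule.get("max_choices")
        if limit is not None and len(answer) > limit:
            errors.append(f"{qid}: at most {limit} selections")
    return errors, reminders

errors, reminders = check_response(
    {"consent": "Yes", "top_needs": ["a", "b", "c", "d"]}
)
print(errors)     # ['top_needs: at most 3 selections']
print(reminders)  # ['confidence: unanswered (allowed)']
```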

Codex was able to work through a large amount of the build.

To be clear, I would not blindly trust it. I still needed to preview the survey, test each pathway, and verify the logic. But Codex handled a lot of the tedious implementation work that would otherwise require clicking through the platform item by item.

That is a big deal.

The Next Step: From Survey Build to Data Review

There is another part of this workflow that may be just as important.

When there is no sensitive or identifiable information involved, Codex may also be able to help with the next stage: downloading survey results, analyzing the data, and representing findings graphically.

For example, after a professional development feedback survey, the data may not include student names, family information, disability status, health information, or educational records. In that kind of lower-risk context, Codex may be able to help export the results, summarize response patterns, identify common themes in open-ended comments, and create simple charts for a team meeting or report.
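As a rough illustration of what "summarize response patterns" can mean, here is a small standard-library Python sketch that counts responses to one item and prints a text bar chart. The rows are made up to stand in for a de-identified export and are not real results. The column names and response scale are also assumptions.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for a de-identified export; not real results.
EXPORT = """role,usefulness
Teacher,Very useful
Teacher,Somewhat useful
Administrator,Very useful
Teacher,Very useful
"""

SCALE = ["Not useful", "Somewhat useful", "Very useful"]

def summarize(csv_text, column):
    """Count how often each response option was chosen in one column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in rows if row[column])

counts = summarize(EXPORT, "usefulness")
total = sum(counts.values())
for option in SCALE:
    n = counts[option]  # Counter returns 0 for options nobody chose
    print(f"{option:<16} {'#' * n} {n} ({n / total:.0%})")
```

Printing every option on the scale, including those nobody chose, makes it easier to spot a miscoded or missing response option during review.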

That could be useful for school psychologists who want to move from “we collected feedback” to “here is what the feedback shows.”

But this distinction matters.

I would not use this workflow casually with identifiable student, family, staff, health, disability, special education, or education record information. In those cases, privacy rules, district policies, data-use agreements, and ethical responsibilities come first.

AI can help with analysis and visualization when the data context is appropriate. It should not be used as a shortcut around confidentiality. Nor should its output be accepted without review.

Why This Matters for School Psychologists

This is not just a research shortcut.

This matters for school psychologists because so much of our work would benefit from small, practical data collection.

A school psychologist might use this kind of workflow for:

program evaluation
needs assessment
staff training feedback
parent or caregiver surveys
teacher consultation feedback
professional learning evaluations
grant reporting
service delivery planning

These do not always need to be large research studies. Sometimes the goal is simply to collect better information so a team can make a better decision.

For example, after a professional development session, a school psychologist might want to know whether staff found the training useful, whether they feel more confident, and what support they need next.

Or after an evaluation process, a school psychology team might want to know whether families understood the process, felt respected, and knew what would happen next.

These are manageable program evaluation questions.

But if building the survey takes too long, they may never happen.

This is where AI can reduce friction. A school psychologist could draft the purpose, generate survey items, revise the wording, create an uploadable survey file, and then use Codex to help configure the logic.

That does not remove the need for expertise. It makes it more likely that the evaluation actually gets done.

Practical Examples for School Psychologists

Example 1: Professional Development Feedback

After a training on behavior intervention planning, a school psychologist wants feedback from staff.

The survey might include:

confidence before and after the training
usefulness of specific training components
open-ended feedback
requests for follow-up coaching
role-specific questions for teachers, administrators, and related-service providers

Codex could help route staff to questions that match their role.

For example, teachers might receive questions about classroom implementation. Administrators might receive questions about systems-level support. Related-service providers might receive questions about collaboration and feasibility.

If the survey does not include sensitive or identifiable information, Codex may also be able to help after the survey closes. It could assist with downloading the data, summarizing response patterns, creating simple graphs, and preparing a brief summary for the team.

That does not mean the results should be accepted without review. The school psychologist still needs to check the data, interpret the findings, and make sure the conclusions are reasonable. But the workflow could reduce the time between collecting feedback and actually using it.

Example 2: Family Feedback After Evaluations

A school psychology team wants to improve the evaluation process for families.

The survey might ask caregivers whether:

the evaluation process was explained clearly
they understood their rights
the team communicated respectfully
the report was understandable
they knew what would happen next

This kind of survey must be handled carefully, especially around confidentiality, consent, language access, and family trust. AI can help with drafting and formatting, but the school psychologist still needs to ensure the survey is appropriate, respectful, accessible, and aligned with district expectations.

Codex might help organize the survey, build the sections, add skip logic, and create a testing checklist. For example, caregivers who did not attend a meeting should not receive detailed questions about the meeting itself. Caregivers who request follow-up should be routed to an appropriate contact process, depending on district policy.

However, this is also the kind of survey where privacy concerns are much higher. If responses could be linked to a student, caregiver, disability status, evaluation process, or educational record, the data should not be casually downloaded into or analyzed by an AI system without appropriate safeguards.

In this example, AI can help with structure and setup, but confidentiality and governance have to come first.

Demonstration Video

A short demonstration video showing part of this workflow in action accompanies the original post.

Why This Matters for Researchers

This is also highly relevant for researchers.

Survey development is often described as if it were a small technical step. In reality, the build process can become a major time burden.

Complex surveys often require:

branching logic
eligibility screening
embedded variables
validation rules
carry-forward choices
multiple respondent pathways
careful testing before launch
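The number of respondent pathways grows quickly as branch points are added, which is part of why testing takes so long. The sketch below lists every combination of branching answers for a hypothetical survey so that each one can be previewed. The questions, options, and the rule that declining consent ends the survey are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical branching questions and their answer options.
BRANCH_POINTS = {
    "consent": ["Yes", "No"],
    "role": ["Teacher", "Administrator", "Related-service provider"],
    "wants_coaching": ["Yes", "No"],
}

def preview_paths(branch_points):
    """List every combination of branching answers as a preview test case."""
    names = list(branch_points)
    paths = []
    for combo in product(*branch_points.values()):
        path = dict(zip(names, combo))
        if path["consent"] == "No":
            # Declining consent is assumed to end the survey immediately.
            path = {"consent": "No"}
        if path not in paths:
            paths.append(path)
    return paths

paths = preview_paths(BRANCH_POINTS)
for number, path in enumerate(paths, start=1):
    print(number, path)
```

With these three branch points the list contains seven pathways: six for consenting respondents and one for those who decline.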

AI assistance does not replace research design. It does not replace construct clarity, sampling decisions, IRB review, item development, accessibility review, or psychometric thinking.

But it may reduce the mechanical burden of implementation.

That matters because researcher time is limited. If AI can help with the repetitive setup work, researchers can spend more time on the parts that actually require professional judgment:

clarifying constructs
improving item wording
reducing respondent burden
aligning items with research questions
reviewing response options
checking for bias
planning analyses
improving recruitment materials
testing data quality

That is where human expertise still matters most.

Ethical Considerations

This workflow is promising, but it also needs guardrails.

Privacy and Confidentiality

School psychologists should not paste identifiable student, family, staff, health, disability, special education, or education record information into general-purpose AI tools unless the tool has been approved for that use and appropriate agreements are in place.

For program evaluation, this means using hypothetical examples, de-identified content, or general survey structures whenever possible. If a survey involves student services, disability, mental health, discipline, family experiences, or special education processes, privacy review becomes especially important.

The same caution applies to data analysis. Codex may be able to help analyze and graph non-sensitive feedback data, but that does not mean it should be used with confidential or identifiable information.

Human Review Is Required

Codex may configure survey logic, but the school psychologist or researcher remains responsible for the final product.

Every pathway should be tested. Every required item should be checked. Every skip pattern should be previewed. Every validation rule should match the wording of the item.

A simple mistake can create real problems.

For example, if an item says “select up to five,” the validation should allow five or fewer selections. It should not require exactly five selections. That difference matters.
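A check like that can be partly automated. The sketch below compares an item's "select up to N" wording with the minimum and maximum selections configured for it. The `audit_limit` function and its `min_choices` and `max_choices` parameters are hypothetical stand-ins for however a given platform exposes those settings.

```python
import re

NUMBER_WORDS = {"two": 2, "three": 3, "four": 4, "five": 5}

def audit_limit(item_text, min_choices, max_choices):
    """Compare a "select up to N" wording with the configured limits."""
    match = re.search(r"select up to (\w+)", item_text.lower())
    if not match:
        return []  # the item does not state a limit
    word = match.group(1)
    stated = int(word) if word.isdigit() else NUMBER_WORDS.get(word)
    if stated is None:
        return [f"could not read a limit from '{word}'"]
    problems = []
    if max_choices != stated:
        problems.append(f"max is {max_choices}, wording says {stated}")
    if min_choices == stated:
        problems.append("min equals max, so exactly that many are required")
    return problems

print(audit_limit("Select up to five supports you need.", 5, 5))
# ['min equals max, so exactly that many are required']
print(audit_limit("Select up to five supports you need.", 0, 5))
# []
```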

Governance and Approval

School psychologists should follow district policies, IRB requirements when applicable, and professional ethical standards.

Some program evaluation activities may not be research, but they still require thoughtful planning. If findings may be shared outside the district, published, or used beyond internal improvement, additional review may be needed.

AI can support the workflow, but it does not determine whether a project is program evaluation, quality improvement, research, or something else.

The Caution: Do Not Skip Testing

I do not want to oversell this.

Codex can help build a survey. That does not mean the survey is ready to launch.

There is also an important platform-specific caution. Not every survey tool supports the same features. Google Forms, for example, can handle sections and branching, but it does not offer the same level of advanced survey logic as Qualtrics or REDCap. SurveyMonkey and Microsoft Forms have their own strengths and limitations.

So the question is not “Can AI do everything in every platform?”

The better question is: “Can AI help reduce the setup burden in the platform I already use?”

The workflow should not be:

“Let AI do everything and move on.”

A better workflow is:

Use AI to draft.
Use AI to build.
Use human expertise to verify.
Use AI to summarize or visualize only when the data context is appropriate.

Before sending a survey, preview it as different types of respondents. Test the consent flow. Test the eligibility criteria. Test the branching logic. Test the validation. Submit sample responses and inspect the dataset.

That last step is important. A survey can look correct on screen but still produce messy or unusable data.
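Here is a minimal sketch of what inspecting a test export could involve, using only the Python standard library. The rows are made-up sample submissions, not real responses. The column names, the semicolon separator for multi-select answers, and the limit of three are assumptions that will differ by platform.

```python
import csv
import io

# Made-up sample submissions entered during testing; not real responses.
SAMPLE_EXPORT = """consent,role,top_needs
Yes,Teacher,Coaching;Templates
Yes,,Coaching;Templates;Modeling;Time
No,Teacher,
"""

EXPECTED_COLUMNS = ["consent", "role", "top_needs"]
MAX_NEEDS = 3  # assumed limit for a "select up to three" item

def inspect_export(csv_text):
    """Return human-readable problems found in a test export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = [f"missing column: {c}" for c in EXPECTED_COLUMNS
                if c not in (reader.fieldnames or [])]
    if problems:
        return problems
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["consent"] != "Yes" and (row["role"] or row["top_needs"]):
            problems.append(f"row {line_no}: answers recorded without consent")
        picks = [p for p in row["top_needs"].split(";") if p]
        if len(picks) > MAX_NEEDS:
            problems.append(
                f"row {line_no}: {len(picks)} selections, limit {MAX_NEEDS}"
            )
    return problems

for problem in inspect_export(SAMPLE_EXPORT):
    print(problem)
```

In this sample, the second submission exceeds the selection limit and the third has a role recorded even though consent was declined. Both problems are easy to miss when previewing the survey on screen.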

The Bigger Point

What surprised me was not that AI could help write survey questions.

I already knew that.

What surprised me was that Codex could help with the implementation layer: display logic, response requirements, validation, carry-forward choices, and survey quality control.

And in some lower-risk cases, it may also help with the next step: summarizing and graphing non-sensitive results.

That is a different kind of usefulness.

For school psychologists, this could reduce the friction between “we should collect data on this” and actually having a working survey.

For researchers, it could reduce the time spent on mechanical setup and increase the time available for design, analysis, interpretation, and dissemination.

That is the part that feels meaningful to me.

Not because AI replaces professional judgment. It does not.

But because it can lower the barrier to doing work that many school psychologists already know is important.

And in a field where time is always limited, reducing that barrier matters.

Final Takeaways

Program evaluation is important in school psychology, but survey-building often creates a practical barrier.
ChatGPT and Codex can help draft, organize, and build surveys. I tested this in Qualtrics, and the same workflow may carry over to other commonly used platforms, depending on the tool’s features and permissions.
AI may be especially useful for the implementation layer: logic, validation, response requirements, and testing checklists.
When there is no sensitive or identifiable information, Codex may also help analyze survey results and create simple graphical summaries.
Human review remains essential. Every pathway, validation rule, response option, dataset, and graph must be checked.
Used carefully, this workflow could make small-scale program evaluation more realistic for school psychologists and researchers.

AI Use Disclosure - Portions of this post were drafted with the assistance of an AI writing tool and revised by the author for accuracy, clarity, and professional judgment.


Adam Lockwood

Adam B. Lockwood, PhD, NCSP, LP, is a school psychologist, researcher, and consultant focused on the responsible use of artificial intelligence in education and psychology.

https://lockwoodconsulting.net/about