Toward AI-Resilient Assessment in Computer Science Courses in an AI-Native World

arXiv CS.AI


CC BY


Abstract

AI-native course assessments in senior computer science courses and related fields should grade students by AI-resilient skill: the ability to achieve outcomes beyond a strong AI baseline.

Such assessments should allow students to use AI freely, while reducing the extent to which a larger private AI budget or more intensive AI use, by itself, becomes a grading advantage.

This paper proposes a minimal formal framework for this goal.

The framework specifies a real task, an executable evaluator, a declared AI-native Pareto frontier, and a grading rule based on Pareto surplus.

The central claim is simple: Pareto surplus provides a measurable, protocol-relative certificate that a submitted artifact achieves a tradeoff not already supplied by the declared AI baseline, and grading by this surplus is AI-resilient with respect to that baseline.
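
One way to make the idea of grading by Pareto surplus concrete is the sketch below. The abstract does not give the paper's formal definition, so everything here is an assumption for illustration: objectives are treated as costs (lower is better) already normalized to comparable scales, the declared AI baseline is a list of such cost vectors, and surplus is measured as the smallest margin by which the submission beats every baseline frontier point on at least one objective. The names `dominates`, `pareto_frontier`, and `pareto_surplus` are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]  # assumption: every coordinate is a cost, lower is better


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_frontier(points: Sequence[Point]) -> list[tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    pts = [tuple(p) for p in points]
    return [p for p in pts if not any(dominates(q, p) for q in pts)]


def pareto_surplus(submission: Point, baseline: Sequence[Point]) -> float:
    """Illustrative surplus measure (an assumption, not the paper's definition).

    For each baseline frontier point, take the submission's best per-objective
    improvement over it; the surplus is the smallest such improvement, floored
    at zero. It is positive only if every frontier point is beaten on at least
    one objective, i.e. the baseline does not already supply this tradeoff.
    """
    frontier = pareto_frontier(baseline)
    if not frontier:
        raise ValueError("the declared baseline must contain at least one point")
    margin = min(
        max(b - s for s, b in zip(submission, point)) for point in frontier
    )
    return max(0.0, margin)
```

Under these assumptions, a submission that merely matches or is dominated by a baseline point receives zero surplus, regardless of how much AI effort produced it. Only an artifact that reaches a tradeoff outside the declared frontier earns a positive score.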

Interpreting surplus as evidence of student skill requires the surrounding assessment protocol (for example, design reports, ablations, prompt traces, oral checks, or reproducibility explanations), but the grading certificate itself is behavioral and executable.

The framework is then extended to practical complications, including self-improving AI loops, budget neutrality, server-mediated feedback, and prompt-based red teaming.

As a concrete instantiation, we describe an AI-resilient approximate-membership assignment centered on Bloom filters for COMP 480/580 at Rice University, designed to test whether students can improve beyond AI-generated implementations.
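
For readers unfamiliar with the data structure, a minimal Bloom filter might look like the sketch below. It is not the course's reference implementation or the AI-generated baseline. The class name `BloomFilter`, the use of SHA-256, and the double-hashing scheme for deriving probe positions are assumptions chosen for brevity. The helper `expected_fpr` uses the standard approximation (1 - e^(-kn/m))^k for m bits, k hash functions, and n inserted items.

```python
import hashlib
import math
from typing import Iterator


class BloomFilter:
    """Approximate-membership set: no false negatives, tunable false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        if num_bits <= 0 or num_hashes <= 0:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # packed bit array

    def _positions(self, item: str) -> Iterator[int]:
        # Assumption: derive all probe positions from one SHA-256 digest
        # using double hashing, position_i = h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd stride
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def expected_fpr(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation of the false-positive rate after num_items inserts."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes
```

In an assessment of the kind described, an executable evaluator could measure quantities such as the empirical false-positive rate and memory per key for a submitted filter, giving the objective values that are compared against the declared baseline.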

