ETS Automated Scoring and NLP Technologies


Using natural language processing (NLP) and psychometric methods to develop innovative scoring technologies

The growth of the use of constructed-response tasks (test questions that elicit open-ended responses, such as short written answers, essays and recorded speech), along with the ongoing need to report test results in a timely fashion, spurred the development of innovations in scoring. ETS began conducting research on automated scoring of constructed-response tasks in the 1980s and expanded this research to incorporate NLP technologies in the mid-1990s. This line of research ultimately resulted in multiple types of automated scoring technologies for fields and areas such as architectural design, mathematics, essays, typed answers, computer science and spoken responses.

Today, the e-rater engine is used to assist human raters in scoring academic essays on the GRE General Test and the TOEFL test. The e-rater engine reliably predicts human scores, as indicated by more than 10 years of system evaluations. Over the years, ETS researchers have extended their work in NLP research and developed other automated scoring technologies: the c-rater™ system, the m-rater engine and the SpeechRater℠ engine. ETS has incorporated these technologies into many of its testing programs, products and services, including the Criterion Online Writing Evaluation Service and TOEFL Practice Online. ETS also uses NLP to develop learning tools and test development applications, such as Language Muse™. Examples of ETS's NLP capabilities are available at www.ets.org/research.

What Is Automated Scoring?

At ETS, we have made a substantial investment in research on the automated scoring of open-ended tasks for more than a decade. Our goal is to improve the validity of score results while creating methods and computer applications that reduce the cost and effort involved in using human graders. We believe that scores should support the uses of an assessment regardless of the role computers play in creating them.

We briefly describe here the automated scoring applications we have developed. These applications (the e-rater engine, the c-rater system, the m-rater engine and the SpeechRater engine) help evaluate responses to tasks that require test takers to write essays, fill in blanks, write math equations or give oral responses. We also describe Language Muse, which teachers can use to make tests and other classroom materials more understandable to English-language learners in situations where knowledge of English is not an instructional goal.

We continuously refine each of these applications based on the best available definitions of skill and proficiency, as well as state-of-the-art psychometrics, NLP and speech science. We are also able to provide quick results through the use of web-based services. To learn more about how your program can benefit from ETS's automated scoring capabilities, contact RDWeb@ets.org.

The c-rater System

The c-rater™ system is ETS's technology for the automatic analytic-based content scoring of short free-text responses, ranging in length from a few words to approximately 100 words.

Analytic-based content is content that a test developer predefines in terms of main ideas or concepts. These concepts form the evidence of knowledge that a student needs to demonstrate in a response. The following example shows a test item with the expected analytic-based content and one way of assigning score points.

Test Item (full credit: 2 points)
Stimulus: A reading passage
Prompt: In the space provided, write the question that Alice was most likely trying to answer when she performed Step B.
Concepts or main/key points:
C1: How does rain formation occur in winter?
C2: How is rain formed?
C3: How do temperature and altitude contribute to the formation of rain?
Scoring rules:
• 2 points for C1
• 1 point for C2 (only if C1 is not present) or C3 (only if C1 and C2 are not present)
• Otherwise, 0 points

Four Main Processes

There are four main processes in the c-rater system:

1. First, in the Sample Responses (SR) process, a set of model responses is generated, either manually or automatically.
2. Second, the c-rater system automatically processes the model responses and the students' responses using a set of NLP tools and extracts their linguistic features.
3. Third, a matching algorithm uses the linguistic features extracted in the SR and NLP steps to automatically determine whether a student's response states or implies the expected concepts.
4. Fourth, the c-rater system applies the scoring rules to produce a score and individualized instructional feedback that justifies the score to the student (a sketch of this step appears below).

The c-rater system has been used in many domains, including biology, English, mathematics, information technology literacy, business, psychology and physics. Assessment and learning work in tandem, in a literal sense, in the c-rater system.
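As a concrete illustration of the fourth step, the following minimal sketch applies the example item's scoring rules once concept detection has already happened. The concept labels and rubric come from the example above; the detection itself (c-rater's NLP matching) is simply assumed, and the function name is invented for illustration.

```python
# Illustrative sketch of applying the example item's scoring rules (step 4).
# The concept-detection step is stubbed out: real c-rater derives concept
# matches from its NLP pipeline, not from a precomputed set.

def score_response(detected_concepts: set[str]) -> tuple[int, str]:
    """Apply the example rubric to the set of concepts detected in a response."""
    if "C1" in detected_concepts:
        return 2, "Full credit: response asks how rain formation occurs in winter (C1)."
    if "C2" in detected_concepts:
        return 1, "Partial credit: response asks how rain is formed (C2)."
    if "C3" in detected_concepts:
        return 1, "Partial credit: response asks about temperature and altitude (C3)."
    return 0, "No credit: none of the expected concepts were found."

# Example: suppose the matching algorithm decided the response expresses only C2.
points, feedback = score_response({"C2"})
print(points, feedback)  # -> 1 Partial credit: response asks how rain is formed (C2).
```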

The e-rater Engine

ETS first deployed the e-rater automated essay evaluation and scoring engine in 1999 to provide one of two scores for essays on the writing section of a graduate admissions program.

The e-rater engine predicts essay scores based on features related to writing quality, including grammar, usage, mechanics, style, organization and development. The computational methodology underlying the system is NLP, which identifies and extracts linguistic features from stored, electronic text or speech. The engine's score predictions have been shown to be comparable to human reader scores, and its additional capabilities can automatically flag or detect off-topic responses.

The e-rater engine has gone through many changes since its first release. The most notable are increased coverage of the writing construct and enhancements to the set of linguistic features extracted for use in e-rater model building and scoring. The ability to develop more features relevant to the writing construct was a direct result of advances in the field of NLP.

The e-rater engine is also the scoring engine behind test-preparation products, including TOEFL Practice Online and practice tests for high-stakes writing tasks such as those that appear on the GRE and TOEFL exams.

Using NLP methods, the e-rater engine identifies and extracts the following features for model building and essay scoring:

• Grammatical, word usage or mechanical errors
• Presence and development of essay-based discourse elements
• Style weaknesses
• Statistical analysis that compares word use in an essay with word use in training essays at different score points
• Two measures of essay content

The e-rater engine is used in both high-stakes and low-stakes settings. In high-stakes settings, the engine is used operationally for both the Issue and Argument prompts of the Writing section of the GRE General Test, resulting in increased quality and faster score reporting, and for the Independent prompt of the Writing section of the TOEFL iBT test. In low-stakes applications, the engine is integrated into the Criterion Online Writing Evaluation Service. This web-based essay evaluation service is widely used as an instructional writing application in K–12 and community college settings. Using the e-rater system, the Criterion service offers immediate, individualized feedback about errors in grammar, usage and mechanics; the presence or absence of discourse structure elements (i.e., thesis statement, main points, supporting ideas and conclusion statements); and style advice. The engine also provides advisories if an essay is irrelevant to the topic, has discourse structure problems, or contains a disproportionately large number of grammatical errors (given essay length). Students can use all of the diagnostic feedback to revise and resubmit an essay; resubmissions receive additional feedback.
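To make the model-building idea concrete, here is a hedged sketch of feature-based essay scoring in the spirit described above: a few surface features are extracted and combined by a linear model fit to human scores. The specific features, weights and score scale are invented for illustration and are not ETS's actual feature set or model.

```python
# A minimal sketch of feature-based essay scoring: extract features,
# then combine them with weights as if fit by regression against human
# scores. All names and numbers here are illustrative stand-ins.

import math
import re

def extract_features(essay: str) -> dict[str, float]:
    words = essay.split()
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return {
        "log_word_count": math.log(len(words) + 1),                      # crude development proxy
        "avg_sentence_len": len(words) / max(len(sentences), 1),         # crude syntax proxy
        "type_token_ratio": len({w.lower() for w in words}) / max(len(words), 1),  # vocabulary variety
    }

# Hypothetical weights, as if estimated from a training set of human-scored essays.
WEIGHTS = {"log_word_count": 0.15, "avg_sentence_len": 0.02, "type_token_ratio": 1.5}
INTERCEPT = 1.0

def predict_score(essay: str) -> float:
    feats = extract_features(essay)
    raw = INTERCEPT + sum(WEIGHTS[name] * value for name, value in feats.items())
    return min(max(raw, 0.0), 6.0)  # clamp to a 0 to 6 holistic scale

print(round(predict_score("The essay text would go here. " * 40), 2))
```

The design point this sketch captures is that the features are interpretable writing-quality measures rather than opaque signals, which is what lets such an engine also drive the kind of targeted feedback the Criterion service provides.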

The m-rater Engine

The m-rater automated scoring engine scores computer-delivered constructed-response mathematics items for which the response is either a mathematical expression or equation, or a graph. When the response is an expression or equation, the m-rater engine is used in conjunction with ETS's Equation Editor, which allows a student to enter an equation, inequality or other expression in standard format, with exponents, radical signs and so on. When the response is a graph, the m-rater engine is used in conjunction with ETS's Graph Editor, which allows the student to enter a graph consisting of one or more points, lines, broken lines or curves. The m-rater engine can also score responses to items in a set conditional on the student's responses to previous items in the set.

[Screenshots: Equation Editor and Graph Editor]

When the response is an expression or equation, the m-rater engine determines whether the student's response is mathematically equivalent to the correct response. It is therefore not necessary, when writing an m-rater engine scoring model, to list all acceptable versions of the correct response. The m-rater engine determines mathematical equivalence by numerically evaluating the two expressions or equations at many points, until it is sufficiently confident that they are equivalent (or it finds a counterexample showing that they are not). The m-rater engine selects the points to be evaluated at random; in addition, the content specialist writing the scoring model can specify further points to be evaluated. Research has shown that numerical evaluation has roughly the same level of accuracy as symbolic manipulation.

When the response is a graph, the student enters the graph in the Graph Editor by selecting points in the coordinate plane and selecting a button to indicate how the points are to be connected: with a straight line, with a curve, with broken line segments, or not connected at all but left as points. The m-rater engine then scores the response based on the points the student selected.

Both the Equation Editor and the Graph Editor can be configured by content specialists for individual items. For example, specialists can specify the letters a student can enter as variables in an equation, or the scale and grid interval of the axes in a graph.
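The numerical-equivalence idea lends itself to a short sketch. The function below, an assumption about how such a check could look rather than the m-rater engine's actual implementation, samples random points and declares two expressions in x equivalent only if they agree at every point where both can be evaluated. (Real m-rater consumes Equation Editor input, not raw Python expressions.)

```python
# Illustrative equivalence check by numerical evaluation at random points.

import math
import random

def equivalent(expr_a: str, expr_b: str, trials: int = 50, tol: float = 1e-9) -> bool:
    """True if the expressions agree (within tol) at every sampled point."""
    env = {"__builtins__": {}, "sqrt": math.sqrt, "sin": math.sin, "cos": math.cos}
    compared = 0
    for _ in range(trials):
        x = random.uniform(-10, 10)
        try:
            a = eval(expr_a, env, {"x": x})
            b = eval(expr_b, env, {"x": x})
        except (ValueError, ZeroDivisionError, OverflowError):
            continue  # evaluation failed at this point (e.g., domain issue); try another
        if abs(a - b) > tol * max(1.0, abs(a), abs(b)):
            return False  # counterexample found: not equivalent
        compared += 1
    return compared > 0  # agreed everywhere we could compare

# (x+1)^2 expanded differently is still equivalent; a wrong expansion is not:
print(equivalent("(x+1)**2", "x**2 + 2*x + 1"))  # True
print(equivalent("(x+1)**2", "x**2 + 1"))        # False (almost surely)
```

This mirrors the trade-off the brochure describes: sampling cannot prove equivalence the way symbolic manipulation can, but with enough points it reaches comparable accuracy at much lower implementation cost.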
The SpeechRater Engine

The SpeechRater℠ engine provides automated scoring of spoken English proficiency, as demonstrated through spontaneous speaking tasks like those found on the TOEFL test. It has been used to score the TOEFL Practice Online Speaking test since 2006.

Most other automated capabilities for assessing English learners' spoken responses are limited to tasks for which the responses are predictable, such as reading a passage aloud or repeating a sentence. The SpeechRater engine is not limited in this way: it can score spontaneous responses, in which the range of valid responses is very broad. The engine thus allows the advantages of automated scoring (reliability, flexibility, reduced cost and speed) to be applied even to very naturalistic tasks.

How the SpeechRater engine works

The SpeechRater engine processes each response with an automated speech recognition system specially adapted for use with nonnative English. Based on the output of this system, natural language processing is used to calculate a set of features that define a "profile" of the speech on a number of linguistic dimensions, including fluency, pronunciation, vocabulary usage and prosody. A model of speaking proficiency is then applied to these features to assign a final score to the response. While the structure of this model is informed by content experts, it is also trained on a database of previously observed responses scored by human raters, so that the engine's scoring emulates human scoring as closely as possible. Furthermore, if a response is found to be unscoreable due to audio quality or other issues, the SpeechRater engine can set it aside for special processing.

Currently, the SpeechRater engine uses a subset of the information that trained human raters use to score spoken responses. Because automated analysis of speech from nonnative English speakers at varying proficiency levels is challenging, many of the engine's features focus on speech delivery rather than higher-level aspects of language use or topic development. Ongoing research, however, is gradually reducing the differences between the criteria human raters use and those the SpeechRater engine applies.
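As an illustration of the delivery-oriented side of such a profile, the sketch below computes a few generic fluency measures from time-stamped recognizer output. The (word, start, end) input format and the specific measures are assumptions made for illustration; they are not the SpeechRater engine's actual feature set.

```python
# Illustrative fluency features computed from ASR word timestamps.

def fluency_features(words: list[tuple[str, float, float]], total_dur: float) -> dict[str, float]:
    """words: (token, start_sec, end_sec) triples from a speech recognizer;
    total_dur: total length of the recorded response in seconds."""
    if not words:
        return {"speaking_rate": 0.0, "mean_pause": 0.0, "pause_ratio": 1.0}
    # Gaps between consecutive words, ignoring micro-gaps under 150 ms.
    pauses = [nxt_start - cur_end
              for (_, _, cur_end), (_, nxt_start, _) in zip(words, words[1:])
              if nxt_start - cur_end > 0.15]
    speech_time = sum(end - start for _, start, end in words)
    return {
        "speaking_rate": len(words) / total_dur * 60,             # words per minute
        "mean_pause": sum(pauses) / len(pauses) if pauses else 0.0,
        "pause_ratio": 1.0 - speech_time / total_dur,             # fraction of time silent
    }

# Tiny fabricated example transcript:
asr = [("the", 0.0, 0.3), ("library", 0.35, 0.9), ("is", 1.6, 1.75), ("open", 1.8, 2.2)]
print(fluency_features(asr, total_dur=3.0))
```

Features like these would then be one slice of the profile that the trained proficiency model combines into a final score.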

Using the SpeechRater engine in low-stakes settings

The most recent version of the SpeechRater engine (2.0, released in 2009) shows a correlation of 0.73 with human scores from the operational TOEFL test. Based on the engine's current construct coverage and agreement with human scores, it is suitable for use on assessments used to make low-stakes decisions.

The features the SpeechRater engine uses to establish its profile for a response are not limited to the types of items found on the TOEFL test. Because they target the underlying speaking proficiency construct, they could also be applied to other item types that address a similar construct using speaking tasks.

Language Muse

Language Muse™ is a web-based application intended to support teachers' coverage of language objectives, especially for English learners, to ensure that all students have equal access to the content in classroom reading materials. The application supports the development of language-centered instructional scaffolding to facilitate students' content learning and language skills development. Language Muse contains two core modules: (a) an automated linguistic feedback module and (b) a lesson planning module. The feedback module supports teachers in efficiently identifying vocabulary, sentence complexity and discourse features in classroom readings that might be unfamiliar to English learners.

Teachers can use the identified features to inform lesson planning and to cover language objectives required by state curriculum standards, consistent with critical goals of the Common Core State Standards Initiative. Language Muse has been used in teacher professional development programs at Stanford University, George Washington University and Georgia State University. The system is being piloted with teachers in California, New Jersey and Texas for use in classrooms with high English-learner populations.
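The kind of flagging the feedback module performs can be illustrated with a small sketch: the function below marks overlong sentences and words missing from a common-word list. The word list, threshold and function name are stand-ins for illustration, not Language Muse's actual linguistic resources.

```python
# Illustrative flagging of vocabulary and sentence-complexity features
# that might challenge English learners in a classroom reading.

import re

COMMON_WORDS = {"the", "a", "is", "of", "and", "to", "in", "water", "rain"}  # stand-in list

def flag_reading(text: str, max_sentence_len: int = 20) -> dict[str, list[str]]:
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    long_sentences = [s for s in sentences if len(s.split()) > max_sentence_len]
    tokens = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    unfamiliar = sorted(tokens - COMMON_WORDS)
    return {"long_sentences": long_sentences, "unfamiliar_vocabulary": unfamiliar}

report = flag_reading("Rain is condensed water. Precipitation intensifies in winter.")
print(report["unfamiliar_vocabulary"])  # e.g. ['condensed', 'intensifies', ...]
```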

About ETS

At ETS, we advance quality and equity in education for people worldwide by creating assessments based on rigorous research. ETS serves individuals, educational institutions and government agencies by providing customized solutions for teacher certification, English language learning, and elementary, secondary and post-secondary education, as well as by conducting education research, analysis and policy studies. Founded as a nonprofit in 1947, ETS develops, administers and scores more than 50 million tests annually, including the TOEFL and TOEIC tests, the GRE tests and The Praxis Series assessments, in more than 180 countries at over 9,000 locations worldwide.

Copyright © 2012 by Educational Testing Service. All rights reserved. ETS, the ETS logo, LISTENING. LEARNING. LEADING., CRITERION, E-RATER, GRE, TOEFL, TOEFL IBT and TOEIC are registered trademarks of Educational Testing Service (ETS). C-RATER, LANGUAGE MUSE and THE PRAXIS SERIES are trademarks of ETS. SPEECHRATER is a service mark of ETS.
