DUC 2004 Summary Assessment

Login by typing your last name as the userid and then the password Lori will give you. Left mouse click on the little icon of a window and shell near the lower left corner of your screen to open a window you can type commands in.

---------------------------------------------------------------------
Part I - Checking and correcting the manual summaries

Over the next weeks you will be judging the quality of automatically produced summaries of newspaper articles against summaries of the same documents created manually. To make the judging more precise, the summaries have been divided into units. This was done automagically and may need some correction before the actual assessment begins.

To start the correction program, type the following three lines:

   cd
   cd duc-models
   chunkapp-driver.sh

Maximize the program window by left mouse clicking once on the tiny square icon near the upper right-hand corner of the program window.

You will see a number of text units - each with a number at the end. Some units may take more than one line to display. Most units will be parts of sentences. That is OK.

Your job is to read through the units. If you find one that does not make much sense on its own, see if it would make sense if you made it part of the preceding or following unit. If it would, then left mouse click on the number that separates the two units you want to join. Remember, only join if it helps make sense of a unit that makes little sense on its own. Resist the temptation to join just to form sentences.

If you make one or more changes, the "Save" button will be activated - it will turn dark. When you have made all the changes you need to, left mouse click on the "Save" button and then on the "Quit" button. Be careful to Save BEFORE you Quit - otherwise your changes will be lost and you may not realize it. If there are no changes to be made, just click on "Quit".
When you click on "Quit", the program window will disappear, a new one will appear, and you start the reading and joining process all over again. You will have a little less than 60 files to edit. Let Lori know when you are finished.

---------------------------------------------------------------------
Part II - Assessment

To start the assessment program (SEE), type the following three lines (hitting the RETURN key after each):

   cd
   cd duc
   SEE

SEE has a list of model summary - peer summary pairs for you to compare. One summary, the manually produced one, is called the MODEL and appears on the right. The other is the summary you will be judging. It is called the PEER and it appears on the left.

The pair you last worked on will be displayed when you start SEE, or the first pair in your list if this is the first time you start SEE. As you work your way through the list, you will see the same model summary in a series of pairs, since there are a number of peer summaries to judge against each model.

- In the upper right-hand corner are the buttons (Previous Summary Pair, Next Summary Pair) you use to move to the next or previous summary pair in your list. You can go back and change your judgments until you finish with the summaries for a particular set of documents.

- The peer summary (the one produced by the system) appears on the left. The model will appear on the right when you need it.

- At the top left of the bottom 1/3 of SEE are 4 tabs - 4 sets of questions you are to answer for each summary pair. Work on the tabs/questions in the following order:

   1. Overall Quality (part 1)
   2. Overall Quality (part 2)
   3. Per Unit Content
   4. Unmarked Peer Units (PUs)

  Click on a tab to see the questions associated with it.

Once you have answered the 4 sets of questions for a peer-model summary pair, you are finished with that pair.
Click on the Next Summary Pair button in the upper right-hand corner of the screen to go to the next pair and repeat the comparison process.

Now for some details about the questions associated with each tab:

1. & 2. Overall Quality:

Here you evaluate the peer summary as a whole. Read the peer summary and then answer the questions by clicking on the small circle near the text that seems the best answer. Be careful to answer all the questions. In the lower left-hand corner of the SEE window, you will see a count of how many of the 7 questions you have answered so far.

3. Per Unit Content:

Here you will do a piece-by-piece evaluation of how well the peer summary expresses the ideas in the model. Each summary has been broken into units. The units in the model are called model units - MUs for short. The units in the peer are called peer units - you guessed it, PUs for short.

You should step through the units in the model - the current one is highlighted in green. You can use the Next or Prev buttons in the middle of the screen on the right to move through all of the model units. The text of the current model unit appears below the Per Unit Content tab.

For EACH model unit:

- First, find all the peer units that tell you at least some of what the current model unit tells you, i.e., peer units which express at least some of the same facts as the current model unit. When you find such a peer unit, click on it. That underlines the peer unit in red.

- When you have underlined all such peer units for the current model unit, think about the WHOLE SET of marked (i.e., underlined) peer units together and answer the question once for the current model unit.

- Once you have answered the question, move on to the next model unit by clicking on the Next button in the middle of the screen to the right.

Repeat these 3 steps for EACH model unit until you have worked your way through all the model units.
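The bookkeeping behind these three steps can be pictured with a small sketch. It is only an illustration and not part of SEE: the unit labels, the `marks` dictionary, and the function name `unmarked_peer_units` are all hypothetical. It assumes that a peer unit counts as "unmarked" when you did not underline it for any model unit, which is how we read the description of the Unmarked Peer Units tab.

```python
def unmarked_peer_units(peer_units, marks):
    """Return the peer units that were not marked for any model unit.

    peer_units: list of peer unit labels, in summary order (hypothetical).
    marks: dict mapping each model unit label to the set of peer units
           underlined for it (the set may be empty).
    """
    marked = set()
    for underlined in marks.values():
        marked |= set(underlined)
    # Keep the original order of the peer summary.
    return [pu for pu in peer_units if pu not in marked]


# Hypothetical example: 3 model units, 4 peer units.
peer_units = ["PU1", "PU2", "PU3", "PU4"]
marks = {
    "MU1": {"PU1", "PU3"},  # two peer units cover part of MU1
    "MU2": set(),           # nothing in the peer covers MU2
    "MU3": {"PU3"},         # a peer unit may be marked for several MUs
}
print(unmarked_peer_units(peer_units, marks))  # ['PU2', 'PU4']
```

Note that the same peer unit can be underlined for more than one model unit, and a model unit can end up with no underlined peer units at all.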
When you are done with all the model units, you are almost done with this summary pair. Click on the next tab to the right: Unmarked PUs.

4. Unmarked Peer Units (PUs):

Here you will see only the units in the peer summary that don't tell you any of what the model units tell you. (Other units have been blued out.) Look at all of the peer units, if any, which are visible and answer the questions.

Depending on how many PUs there are, only some answers to the first question are possible. Here are the guidelines:

   if no unmarked PUs exist, then the answer must be 0%
   if only 1 unmarked PU exists, then the answer must be 100% or 0%
   if only 2, then the answer must be 100%, 60%, or 0%
   if only 3, then the answer must be 100%, 60%, 40%, or 0%
   if only 4, then the answer must be 100%, 80%, 60%, 20%, or 0%

5. Next summary pair:

If you have done everything described above, you are finished judging the current summary pair! Congratulations. Click on the Next Summary Pair button in the upper right-hand corner of the screen to see the next peer-model pair on your list and start the process over again at step 1. above.

Assessing the very short summaries

The rules for assessing the very short (~10-word) summaries are rather different. Please read the following carefully and don't let yourself

a) The 7 quality questions will NOT BE ASKED.

b) Each model/peer will contain only ONE model/peer unit.

c) Neither the model nor the peer needs to be in any special format. Some may look like headlines but that is only one possibility. They may also look like a list of words or phrases in no particular order. This is OK, but will mean some of the very short summaries will be easier to interpret than others.

d) Because of the lack of required structure, you should apply a looser definition of overlap in comparing the content of the model and peer. DO NOT REQUIRE they express some of the same complete FACTs.
Look instead at the degree of OVERLAP IN REFERENCES to people, objects, events, locations, times, or abstract concepts. Remember, all the references in the model are equally important when figuring out the degree of overlap.

e) Under "Unmarked PUs" you will answer either 100% or 0% since there will be only one peer unit.
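The answers allowed under "Unmarked PUs", both in the guidelines for step 4 of Part II and in the one-unit case in e), follow a pattern that can be sketched in a few lines. This is an inference from the listed cases, not a rule stated in these instructions: the sketch assumes the answer scale is 0%, 20%, 40%, 60%, 80%, 100%, and that each exact fraction k/n is moved to the nearest point on that scale, with ties (such as 50%) going up. The function name `allowed_answers` is hypothetical.

```python
from fractions import Fraction

# Assumed answer scale (inferred from the listed guidelines).
SCALE = (0, 20, 40, 60, 80, 100)


def allowed_answers(n_unmarked):
    """List the answers that are possible when n_unmarked PUs are visible."""
    if n_unmarked == 0:
        return [0]  # stated directly in the guidelines
    answers = set()
    for k in range(n_unmarked + 1):
        exact = Fraction(100 * k, n_unmarked)  # exact percentage, no float error
        # Nearest scale point; on a tie, prefer the higher one.
        nearest = min(SCALE, key=lambda s: (abs(s - exact), -s))
        answers.add(nearest)
    return sorted(answers, reverse=True)


for n in range(5):
    print(n, allowed_answers(n))
# 0 [0]
# 1 [100, 0]
# 2 [100, 60, 0]
# 3 [100, 60, 40, 0]
# 4 [100, 80, 60, 20, 0]
```

The printed lists match the five cases given in the guidelines. The instructions do not say what to do with more than 4 unmarked PUs, so any result for larger counts rests entirely on the assumed rounding rule.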