In the ground truth files (citydata.txt, countrydata.txt, filmdata.txt, mayorofdata.txt), each example occupies one tab-delimited line. Each line gives the extraction count for each pattern, followed by the tag (1 = correct, 0 = error) and the extraction string.

The patterns were grouped into extraction mechanisms for the purpose of using multiple urns, as given below. Each combination of label and left/right-handedness defines an urn. Every class has four urns except MayorOf, which has only two (left and right).

The indexes of the fields in the data file that make up each mechanism are listed below (fields are numbered from 1 at the start of each line).

(* --------City-----------*)
label1 = {1,2,3,6,7,8,9,10,11,21,22,25,26};
lefthand = {3,7,10,12,13,14,15,21,23,24,25,26,27};

(* --------Film-----------*)
label1 = {1,2,6,7,8,9,12,13,15,16,19,24,25,28};
lefthand = {6,7,8,9,10,11,15,16,17,18,20,21,22,24};

(* --------Country-----------*)
label1 = {1,2,3,4,5,6,7,8,10,11,21,22,25};
lefthand = {3,4,5,6,10,14,15,16,17,18,19,20,21,22};

(*---------MayorOf-----------*)
lefthand = {1,2,5,6,7,8};
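
As an illustration of how the index sets above might be applied, here is a minimal Python sketch that parses one line and totals the pattern counts per urn. It is not the original processing code. It assumes that the last two fields of a line are the tag and the extraction string, that every earlier field is an integer count, that fields absent from label1 belong to a second label, and that fields absent from lefthand are right-handed. The function names and the urn names ("label1/left", "other/right", and so on) are hypothetical, and summing the counts within an urn is only one possible way to combine them.

```python
# Index sets copied from the lists above (1-based field numbers).
CITY_LABEL1 = {1, 2, 3, 6, 7, 8, 9, 10, 11, 21, 22, 25, 26}
CITY_LEFTHAND = {3, 7, 10, 12, 13, 14, 15, 21, 23, 24, 25, 26, 27}
MAYOR_LEFTHAND = {1, 2, 5, 6, 7, 8}


def parse_line(line):
    """Split one tab-delimited line into (counts, tag, extraction string)."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: the final two fields are the tag and the extraction string;
    # every field before them is the integer count for one pattern.
    counts = [int(f) for f in fields[:-2]]
    tag = int(fields[-2])
    text = fields[-1]
    return counts, tag, text


def urn_name(field, lefthand, label1=None):
    """Return a hypothetical urn name for a 1-based pattern field index."""
    hand = "left" if field in lefthand else "right"
    if label1 is None:
        # Classes with no label split (MayorOf) have only two urns.
        return hand
    label = "label1" if field in label1 else "other"
    return label + "/" + hand


def urn_counts(counts, lefthand, label1=None):
    """Sum the pattern counts that fall into each urn."""
    totals = {}
    for field, count in enumerate(counts, start=1):  # fields are 1-based
        name = urn_name(field, lefthand, label1)
        totals[name] = totals.get(name, 0) + count
    return totals
```

For a City line, one would call parse_line on the line and then urn_counts(counts, CITY_LEFTHAND, CITY_LABEL1). For MayorOf, the label1 argument is omitted, so only the left and right urns appear.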