In this project, you will build a program used by an admissions office to help process
applicants for a graduate degree in engineering. Your program will read several database
files containing applicant information and other information relevant to the applications
process. Based on an admissions "formula," you will compute a score for each applicant,
and output a file with all the applicants and their scores. The output of your program
will aid the admissions office in their admissions decision.
The database your program will process consists of 3 files: applicants.txt, ranking.txt,
and journals.txt. (Samples of these files are available on the web site: http://www.enee.umd.edu/class/enee114/projects/pr3/.) All database files are text files, containing
sequences of ASCII characters. Below, we provide detailed information about the contents and format of each database file. This information will help you write code to correctly read the information from each file into your program.
1.1 Applicants File
The applicants file, named applicants.txt, lists all the student applicants along with their
personal and academic information. There is one line of text for each applicant in this file. In the sample applicants.txt file that we provide, there are 810 applicants. In general, you can assume there are never more than 1000 applicants. Each line of text contains at least 7 fields. These 7 mandatory fields provide the following information, in the order listed below:
1. Last Name
2. First Name
3. GPA
4. GRE Verbal
5. GRE Quantitative
6. Publications
7. Undergraduate Institution
Each field is a sequence of characters, i.e. a string. The first two fields are strings containing the applicant's last and first name, respectively. The next 4 fields are strings that specify numbers. Your program should convert these strings into the corresponding numbers for the purposes of scoring the applicant. The GPA field should be converted into a floating point number between 0.0 and 4.0. The GRE Verbal and GRE Quantitative fields report the applicant's scores on 2 parts of the GRE standardized test (the graduate school equivalent of SATs). Your program should convert each of these two strings into integers between 0 and 800. The Publications field is a count of the number of technical papers the student has published in journals. Your program should convert this string into an integer between 0 and 2. Finally, the last field is a string containing the name of the student's undergraduate institution.
In addition to these 7 mandatory fields, there are up to 2 additional optional fields appearing after the Undergraduate Institution field. These optional fields are strings that
specify the journals in which the student has published technical papers. The number
of optional fields is determined by the Publications field: if the student has published 0
papers, there are no optional fields; if the student has published 1 paper, there is 1 optional field; and if the student has published 2 papers, there are 2 optional fields.
Fields are delimited by a single comma character, with 0 or more white space characters
before and after the comma. For example, an applicant with zero publications would have
an entry with the following format:
<f1><W>,<W><f2><W>,<W><f3><W>,<W><f4><W>,<W><f5><W>,<W><f6><W>,<W><f7>
where "<fi>" denotes the string from the ith field, and "<W>" denotes 0 or more white
space characters. The white space and comma characters allow you to parse the sequence
of fields in the following manner. The first field always starts with the first character in the line, and ends the first time you encounter either a white space or comma character. The second field starts when you encounter the first non-white space character after the first comma, and ends the first time you encounter either a white space or comma character. The third through sixth fields are delimited in exactly the same way as the second field. (Note, the first 6 mandatory fields are guaranteed to never contain a white space or comma character.) The seventh field starts when you encounter the first non-white space character after the sixth comma, and ends when you encounter either a comma character or a '\n' character. The two optional fields (if present) are delimited in exactly the same way as the seventh field. (Note, the seventh field and two optional fields may contain white space characters, but are guaranteed to never contain a comma character.)
In general, you can assume that a line in the applicants.txt file will never exceed 2000
characters, and any single field will never exceed 256 characters.
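The parsing rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the required implementation: the function name parse_applicant and the dictionary keys are hypothetical, the field order follows the description in this section, and white space directly before a comma or newline is assumed not to belong to a field. Because no field contains a comma, splitting on commas and trimming white space gives the same fields as the character-by-character scan.

```python
def parse_applicant(line):
    """Split one applicants.txt line into its fields (hypothetical helper)."""
    # No field contains a comma, so splitting on commas is safe.
    # strip() drops the optional white space around each comma.
    fields = [f.strip() for f in line.rstrip("\n").split(",")]
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields, got %d" % len(fields))

    publications = int(fields[5])
    journals = fields[7:]
    # The Publications count determines how many optional fields follow.
    if not 0 <= publications <= 2 or len(journals) != publications:
        raise ValueError("publication count does not match optional fields")

    return {
        "last": fields[0],
        "first": fields[1],
        "gpa": float(fields[2]),         # between 0.0 and 4.0
        "gre_verbal": int(fields[3]),    # between 0 and 800
        "gre_quant": int(fields[4]),     # between 0 and 800
        "publications": publications,    # 0, 1, or 2
        "school": fields[6],             # may contain white space
        "journals": journals,            # 0 to 2 journal names
    }
```

Raising an error on a mismatched publication count is a design choice of this sketch; the format rules guarantee the counts agree, so the check only guards against a parsing mistake.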
1.2 School Rankings File
The school rankings file, ranking.txt, contains a list of the top 25 engineering programs
from the 2006 U.S. News rankings. Each line of this file lists one ranked school. The
format of each line is:
<rank>. <school>
where "<rank>" is a string representing an integer between 1 and 25, and "<school>" is
a string representing the name of a school. Between the ranking and school name, there
is exactly 1 period character followed by 1 white space character. Many of the applicants
in the applicants.txt file received their undergraduate degrees from schools in this list,
but not all of them. You can assume that this file always contains 25 lines. You can also
assume that each school name will never exceed 256 characters.
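One possible way to read a ranking line, given the "<rank>", period, single white space, "<school>" layout described above, is sketched below in Python. The helper names parse_ranking_line and load_rankings are hypothetical, and the lookup table keyed by school name is an assumption about how the ranks might later be matched against the Undergraduate Institution field.

```python
def parse_ranking_line(line):
    """Return (rank, school) from one ranking.txt line (hypothetical helper)."""
    # Split only at the first ". " so that any later periods
    # inside the school name are left untouched.
    rank_text, school = line.rstrip("\n").split(". ", 1)
    return int(rank_text), school


def load_rankings(lines):
    """Build a school-name -> rank lookup from the lines of ranking.txt."""
    rankings = {}
    for line in lines:
        rank, school = parse_ranking_line(line)
        rankings[school] = rank
    return rankings
```

With such a table, rankings.get(school) returns None for an applicant whose school is not in the list, which makes the "not ranked" case easy to detect.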
1.3 Journal Impact Factors File
The journal impact factors file, journals.txt, contains a list of computer science and
engineering journals and conference proceedings, with one journal or conference proceeding listed per line. For each publishing venue, the file specifies an "impact factor," which is a numeric score that reflects the quality of the journal or conference proceeding. The larger the impact factor, the higher the quality of the journal or conference proceeding. The format of each line in the journals.txt file is:
<journal>, <impact factor>
where "<journal>" is a string representing the name of a journal or conference proceeding,
and "<impact factor>" is a string representing a floating point number between 1.12 and
3.31. Between the journal name and impact factor, there is exactly 1 comma character
followed by 1 white space character. Note, journal and conference proceeding names may
contain white space characters. All the publications listed in the applicants.txt file are
covered by this list of journals and conference proceedings. You can assume that each
journal or conference proceeding name will never exceed 256 characters.
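A journals.txt line can be split in a similar way. The Python sketch below is an illustration only; the helper name parse_journal_line is hypothetical. It splits at the last ", " in the line, on the assumption that the impact factor is always the final item on the line.

```python
def parse_journal_line(line):
    """Return (journal name, impact factor) from one journals.txt line."""
    # rpartition splits at the LAST ", " so the name, which may contain
    # white space, is kept whole and the number is what follows.
    name, _, factor_text = line.rstrip("\n").rpartition(", ")
    return name, float(factor_text)
```

Storing the results in a table keyed by journal name would let the optional fields from applicants.txt be looked up directly, since every publication listed there is covered by this file.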
2 Applicant Score
Your program should read the contents of all three database files described in Section 1.
You will need to create the appropriate arrays of strings, integers, and floating point
values to store the database contents, and follow the format rules described in Section 1
to correctly extract all the data.
Once you have loaded the database into your program, you will compute a score for each
student that reflects the quality of his/her academic record. To compute the score, begin
by averaging each student's normalized GPA and GRE scores. This is known as the
"baseline score," and is computed using the following formula:
baselinescore = (((GPA / 4.0) + (GREVerbal / 800.0) + (GREQuantitative / 800.0)) / 3.0) * 100.0
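As a quick check of the formula, here is a direct Python transcription with one worked case. The function name baseline_score is hypothetical. An applicant with a GPA of 3.0 and GRE scores of 600 and 600 has three normalized values of 0.75 each, so the average is 0.75 and the baseline score is 75.

```python
def baseline_score(gpa, gre_verbal, gre_quant):
    """Average the normalized GPA and GRE scores, scaled to 0..100."""
    # Dividing by 4.0 and 800.0 maps each input onto the range 0.0 to 1.0.
    normalized_sum = (gpa / 4.0) + (gre_verbal / 800.0) + (gre_quant / 800.0)
    return (normalized_sum / 3.0) * 100.0
```

A perfect record (4.0, 800, 800) gives the maximum baseline score of 100.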
In addition to this baseline score, add 5 points if the student graduated from a school
ranked between 11th and 20th, and add 10 points if the student graduated from a school in the top 10. Finally, if the student has publications, find the corresponding impact factor
for each published paper, multiply the impact factor by 10.0, and add the result to the
student's baseline score. The baseline score, with the school ranking and impact factor
adjustments, represents the student's final applicant score.
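The adjustments described above can be sketched in Python as shown below. The function name final_score and its parameters are hypothetical, and passing None for a school that does not appear in ranking.txt is an assumption of this sketch. Schools ranked 21st through 25th, like unranked schools, receive no bonus under the rules as stated.

```python
def final_score(baseline, school_rank, impact_factors):
    """Apply the school ranking and publication adjustments to a baseline."""
    score = baseline
    if school_rank is not None:
        if 1 <= school_rank <= 10:
            score += 10.0      # top 10 school
        elif 11 <= school_rank <= 20:
            score += 5.0       # ranked 11th through 20th
    # Each published paper adds 10 times its journal's impact factor.
    for factor in impact_factors:
        score += factor * 10.0
    return score
```

For example, a baseline of 75 with a school ranked 5th and one paper in a venue with impact factor 2.0 gives 75 + 10 + 20 = 105.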
Your program should create an output file, called scores.txt. This file should contain 1 line for each of the applicants in the applicants.txt file. For each applicant, you should print the last and first name of the applicant, the applicant's computed score from Section 2, and the ranking of the undergraduate institution that the applicant attended. For the last
name and first name fields, you should pad each field with blank space characters so that
the entire field is exactly 15 characters wide. (You can assume that all first and last names are fewer than 15 characters long.) Between the applicant's score and the undergraduate institution ranking, you should print a single tab character, '\t'. Finally, the order of the applicants in the scores.txt file should be identical to the corresponding order in the applicants.txt file.
I have provided the scores.txt file that your program should generate from
the sample database files I supplied (applicants.txt, ranking.txt, and journals.txt). The
scores.txt file generated by your program should match the provided scores.txt file exactly.
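The layout of one output line might be produced along the lines of the Python sketch below. Several details here are assumptions rather than part of the specification: the function name format_score_line is hypothetical, names are padded on the right, the score is printed with two decimal places, and the ranking is passed in as already-formatted text so that the caller decides what to print for an unranked school. The provided scores.txt file is the authority on all of these points.

```python
def format_score_line(last, first, score, rank_text):
    """Format one scores.txt line (sketch; check against the sample file)."""
    # ljust(15) pads with blanks on the right to exactly 15 characters.
    # ASSUMPTION: two decimal places for the score.
    return "%s%s%.2f\t%s\n" % (last.ljust(15), first.ljust(15), score, rank_text)
```

Comparing the generated file with the provided one, for example with a file comparison tool, will show quickly whether the padding and number format agree.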