Project Report on Information System and Data Warehouse

Description
In computing, a data warehouse or enterprise data warehouse (DW, DWH, or EDW) is a database used for reporting and data analysis. It is a central repository of data which is created by integrating data from one or more disparate sources.

STUDENT INFORMATION SYSTEM AND DATA WAREHOUSE PROJECT REPORT

Prepared by Deborah Lauriano November 20, 2002

SIS DATA WAREHOUSE PROJECT REPORT

Executive Summary Purpose The purpose of this project was to investigate queries processed against the SIS/Banner System, the Campus Data Warehouse, and the Student Extract, with these general goals: • • • • • categorize the queries according to business functions analyze which each system is used for query purposes analyze the effect of third-party queries processed against the Banner System explain the uses of the Payroll ID in Banner if the findings indicate, propose a technical architecture which is a coordinated approach to fulfilling query requirements that provides a uniform and consistent view of the data.

Primary Recommendations Many account holders of each of these systems were contacted by phone, email, and in person to discuss how data is used, and to identify the systems chosen for query purposes. Campus units have invested significant resources to develop many unique queries over the last ten years. These queries and the common queries were found to use significant resources in the Banner System. A primary recommendation in the following report is establish a mirrored instance of the Banner System, and to direct third party query users to this system. This would protect the Banner transaction processing system from resource intensive queries, would preserve the investment made by campus units, and could be used to also provide a minimal disaster recovery system. There are quite a few common uses of the data, the largest of which is course enrollment totals, and specific student enrollment. Major, GPA, and level data comprise the second most common data request. All systems are used to gather this data, but the Student Extract and the Campus Data Warehouse are the favored sources of these data requests. These common uses, and others such as Undergraduate Admissions data, minimum progress, progress toward degree, filing to graduate, and commencement, are performed by each deans office using locally developed queries, extracts, and databases. A primary recommendation in the following report is to present information supporting these processes in a uniform and consistent manner through an SIS Data Warehouse. Graduate Admissions data should be included in this warehouse. The existing Oracle database in the Campus Data Warehouse should be used as the basis for this development. These recommendations are not dependent on each other. However, the recommended approach is to establish a mirrored system first, and the SIS Data Warehouse second. At some point it could become possible to direct all query users to the SIS Data Warehouse, allowing direct access for unique usages, and a common web interface for similar requests.

1

SIS DATA WAREHOUSE PROJECT REPORT 1.0 Background The Banner Student Information System was implemented in the early 1990’s, and was delivered with virtually no reports and no coordinated mechanism to fulfill the report requirements of Student Affairs or academic departments. In order to fill this gap, several uncoordinated activities took place. The Office of the Registrar took on the burden of attempting to fulfill reporting needs of units that contacted them directly. Deans’ offices and some academic units added technical staff, in part, to fulfill reporting requirements. A Student Extract was developed to provide very basic enrollment and undergraduate admissions data. The Campus Data Warehouse was developed by the Data Administration unit, and contains general student, undergraduate admissions, enrollment, and payroll data. Historically, there has been no coordinated effort to direct query users of student information to the most appropriate of these alternatives. In an effort to understand how these systems are being used, to reduce duplication of effort, and to discuss models of how student data could be provided in a coordinated fashion, a charge was developed by Vice Provost Bruno and Associate Vice Chancellor Franks to investigate and report on third-party query users of Banner, users of the Student Extract, and users of the Campus Data Warehouse. 2.0 Summary of Approach The third-party queries to Banner are defined to be those reports or data extracts that are generated outside of the Banner application forms and menus. These include data requests from web-services, such as MyUCDavis and Grad Track$, and local reports generated using SQL or other tools such as Access, Brio, and Crystal Reports. One goal of this project was to categorize all third-party queries to Banner and all uses of the Campus Data Warehouse. Given the time limitation of this project, this was deemed impossible to achieve given that an informal survey revealed that the Registrar’s Office maintains over 1600 queries, the Graduate Division over 400, and Letters and Science over 200. It became necessary to take a more conservative approach to this project. Users of the three systems were contacted, and information was gathered via in person interviews, email, and phone. This current exploration of the query usage of Banner, the Campus Data Warehouse, and the Student Extract (referred to as Antares, and more currently as Aedes) focuses on these areas: • • • • Functional needs of query users and choice of source systems Use of employment data Impact of third-party query users on the Banner System Proposed future query architecture.

3.0 Summary of functional needs of query users and choice of source systems. The following campus units were contacted in order to gather information regarding the query usages of SIS Banner and the Campus Data Warehouse:

2

SIS DATA WAREHOUSE PROJECT REPORT

• • • • • • • • • • • • • • • • • •

College of Letters and Science Division of Biological Sciences College of Engineering School of Law School of Veterinary Medicine College of Agriculture and Environmental Sciences Languages and Literature Graduate School of Management Graduate Studies Undergraduate Admissions Office of the Registrar Financial Aid Office Student Affairs Research and Information Electrical Engineering and Computer Science Physics Chemical Engineering and Material Science Math Department Subject A.

Staff members in these units were asked to categorize the types of queries and reports that are generated from the Banner transaction processing system, the Campus Data Warehouse, and the Student Extract. Their answers demonstrate that their usage of the data falls into a number of similar and unique categories. 3.1 Similar usages The deans offices, the Office of the Registrar, and some departments, have several similar functional needs that are assisted by queries to Banner, the Campus Data Warehouse, and the Student Extract: • • • • • • • • • • Course enrollment data through enrollment and drop/add periods Probation Minimum Progress Progress towards degree Commencement General information regarding majors, class standing, and level Historical information regarding majors, GPA, and demographic data Admissions profiles Email addresses for students and staff Facilitation of authorization to local web services.

Current course enrollment data represents the highest volume of similar usage among the units. This represents course enrollment totals by section, and course enrollment data by student. Graduate Studies, College of Engineering, Languages and Literature, Division of Biological Sciences, use the Campus Data Warehouse to produce the course enrollment

3

SIS DATA WAREHOUSE PROJECT REPORT data, while the Registrar’s Office, Graduate School of Management, School of Veterinary Medicine, and the College of Letters and Science use Banner.1 Graduate Studies and other academic departments have expressed the need to access Graduate admissions data in a simplified form, similar to the undergraduate admission data that is available from the Student Extract or Campus Data Warehouse. There are enough similar uses of data by these departments to strongly suggest that a common rather than a distributed approach to providing this data would be beneficial. 3.2 Unique usages The Admissions Office uses the Banner system to assist the admissions decision cycle. Admissions reports are produced representing different admission criteria for the different colleges and programs. It is an iterative process by which groups or individuals are selected for admission to the university. Approximately fifty parameterized reports are generated reflecting applicants from feeder schools, applicants from specific geographic regions, advanced standing students, re-entry students, high achieving students, international and non-immigrant students, athletes, music students, EAOP students, etc. The Early Academic Outreach Program in the Admissions office uses the Campus Data Warehouse, and specifically the Hyperion Analyzer tool, to understand the application yield, and acceptance rates from specific populations. These reports use historical information to inform program managers how to focus the activities of the EAOP program. The Director of Admissions describes this information as critical to the unit’s planning process. The Financial Aid Office produces reports to assist with the Financial Aid applicant and award cycles. Much of the interaction with students has moved to the web, and an increased number of reports have been necessary to make this process effective. Reports are used to identify populations that are ready for processing, and to move applicants to the next phase of processing. If anomalies are found with one student, reports are generated to identify others who may have the same data issues. Reports are also generated to monitor fund balances and fund usage. The College of Engineering uses the Campus Data Warehouse to assist in reporting to outside agencies, some of which affect the rating of the graduate program. These include U.S. News and Report, the Engineering Workforce Commission, and the American Society of Engineering Educators.

1

Degree Navigator does not produce reports that can be used for probation, minimum progress, progress towards degree, or commencement, according to the Registrar’s Office, which is why units are dependent on Banner and the Data Warehouse for these reports.

4

SIS DATA WAREHOUSE PROJECT REPORT The Graduate Studies and the College of Engineering have similar requirements to track graduate fellowship, block grant funding, work-study eligibility and appointments, and aggregate loan amounts on each student. This requires access to student data on Banner (this is not available on the Campus Data Warehouse) and payroll data that are gathered from the Campus Data Warehouse. There is interest from these units to put the appropriate Financial Aid information on the Campus Data Warehouse to make this process more effective. The current process is described in more detail in the section 3.0. Graduate Studies has a web application, Grad Track$, that produces on-demand reports. The reports are parameterized to allow different selection criteria, and is focused on providing information on fund balances, and populations of students who are receiving awards, selected by term, major, and fund. Graduate Studies has used the Campus Data Warehouse to identify students who are late paying registration fees. Departments are contacted to follow up with students to determine if they need assistance. The Office of the Registrar maintains a large inventory of reports. The office produces reports for ad hoc requests from departments, and produces a large volume of mailing labels for honor societies, academic departments, and other requestors. The office provides reports to the colleges for the graduation certification process as well as commencement. The Law School extracts data for the currently enrolled student body, importing into their own local student database to aid their processing for grades, class scheduling, and graduation. Resulting data are uploaded to Banner to keep the systems synchronized. Student Affairs Research and Information merges demographic data from the Campus Data Warehouse with survey results that the unit conducts. The Physics Department uses the Student Extract for recruiting purposes. The Electrical Engineering and Computer Sciences department uses the Student Extract for populating a local database of undergraduates. The database is used for setting up computer accounts, establishing files for advising, producing email notifications, and tracking for graduate admissions and enrollment. Chemical Engineering and Material Science uses the Student Extract to populate a local database so that advisors can track and keep notes on students. Subject A tracks all students who have to fulfill the Subject A requirement. They use the Student Extract for this purpose. The Department of International Studies uses the Campus Data Warehouse to generate new faculty position requests.

5

SIS DATA WAREHOUSE PROJECT REPORT 3.3 Source systems and methods of delivery UNIT Admissions Office – admissions cycle Admissions Office - EAOP Office of the Registrar Financial Aid Office Graduate Studies College of Letters and Science College of Agriculture and Environmental Sciences Division of Biological Sciences College of Engineering Graduate School of Management School of Veterinary Medicine School of Law Student Affairs Student Affairs Research and Information Office of Resource Management and Planning Language and Literature International Studies Physics Electrical Engineering and Computer Science Chemical Engineering and Material Science SOURCE SYSTEM Banner and Campus Data Warehouse Campus Data Warehouse Banner Banner Banner and Campus Data Warehouse Banner and Campus Data Warehouse Banner Campus Data Warehouse Oracle database and Banner Banner, Campus Data Warehouse Oracle database, and Student Extract Banner and Campus Data Warehouse Oracle database Banner Banner Campus Data Warehouse Banner, Campus Data Warehouse Oracle database Campus Data Warehouse Campus Data Warehouse Campus Data Warehouse Oracle database Student Extract Student Extract Student Extract METHOD OF DELIVERY SQL and Crystal Reports, Hyperion Analyzer. Hyperion Analyzer, and multi-dimensional cubes SQL and Oracle reports SQL and Crystal Reports SQL. Provides extracts to Access databases for further refinement. SQL. Provides extracts to Access databases for further refinement. Hyperion Analyzer. SQL. Provides extracts to local database. May develop web-interface for users. Hyperion analyzer. SQL. SQL. Provides extracts to local database for further refinement. SQL. Provides local web interface to data. SQL SQL. Provides extract to populate local SIS database. Hyperion Analyzer SQL. Provides extract to populate local databases. Hyperion Analyzer. Hyperion Analyzer. SQL. Provides local web interface to data. Hyperion Analyzer. Hyperion analyzer. SQL Brio. Brio. Provides extracts to local database. Provides extract to local database.

6

SIS DATA WAREHOUSE PROJECT REPORT Mathematics Department History Subject A 4.0 Use of Employment Data The Payroll/Personnel ID has been loaded into the Banner system for the last two years. It is primarily used to provide a method to reconcile payroll data for Graduate students with fellowships and block grants. Students are awarded fellowships and block grants through the Financial Aid module in Banner but are paid through the Payroll System. A data feed of payment amounts is generated from the Payroll System and is matched, through the Payroll ID, to the student so that paid-to-date amounts associated with the award can be updated. Graduate Studies also uses the Campus Data Warehouse to assist in this process for individual students. The Payroll/Personnel ID feed to Banner results in the creation of a suspense record when an individual cannot be matched to a student with confidence2. Graduate Studies uses the Campus Data Warehouse, which supplies a crossreference between student and people on the Payroll System, to research the suspense records so that students can be definitively matched to Payroll feed records. In addition to the managing the suspense process, Graduate Studies runs bi-weekly jobs to extract data from Banner and payroll data on the Campus Data Warehouse to identify fellowship and block grant funding payment reconciliation problems. The extracted data are loaded to local databases to produce the comparison reports. On average, there are about one dozen students who have payment recording issues that have to be resolved every pay period. Graduate Studies also extracts payroll data from the Campus Data Warehouse and student data from Banner to check employment eligibility. Graduate students are required to advance to candidacy after nine quarters of employment. If they do not advance, they lose employment. Graduate Students cannot hold student titles for more than 12 quarters. Graduate Studies currently builds a local database with this employment history, as the Campus Data Warehouse does not currently store enough historical information to assist in this process. Banner also uses the Payroll ID to help in setting up the authorization schema for employees who use Banner. Users of Banner are required to have a corresponding entry in the table that holds information about people. The Payroll feed, and by implication, the Payroll ID, is used to keep this table current. Student Extract and Campus Data Warehouse Campus Data Warehouse Student Extract SQL. Hyperion Analyzer Hyperion Analyzer SQL (query no extract).

2

The size of the suspense file has grown to the degree that managing it has become a challenge. This summer the Banner team met with the affected units to provide training in the mechanics of reducing the suspense file. The implementation of a Person Directory will have a positive affect on the campuses ability to manage this process because the matching algorithms will be managed outside the Banner system with clearer ownership and management processes. Until the Person Registry is established, there does not appear to be a method that improves on the current one of reconciling student and payroll data.

7

SIS DATA WAREHOUSE PROJECT REPORT The School of Veterinary Medicine uses enrollment data to enable the student/instructor evaluation process. The College of Letters and Science uses the Campus Data Warehouse for graduate students in the payroll system. They retrieve the number of units to determine the correct FICA and retirement codes for these students. The Language and Literature Department uses payroll data to track faculty merits and promotions, and in the hiring process, to report on titles, steps, employment history, and department. The Office of Resource Management and Planning uses employee data in the Campus Data Warehouse for LRDP analysis. An example of this is analyzing where staff reside and salary ranges by job title and organization. It also uses the data to analyze age groups of ladder-rank faculty in order to plan for retirement and recruitment. Several units use the Campus Data Warehouse personnel data to secure email address for departmental staff. 5.0 Impact of Third-Party Query Users on the Banner System In order to understand the third-party query usage on Banner, querying activity and CPU usage was monitored over a three-day period coinciding with the day before Fall 2002 classes started, and the first, second, and third day of classes. These are days that typically reflect the highest usage of Banner from all perspectives. These two measures give the clearest picture of system resource usage. The following shows the percentage of third-party tool users to all query activities for each day: • • • • 1 day before classes started – 22% of query usage due to third party tool users. 1st day of classes – 24% of query usage due to third party tool users. 2nd day of classes – 23% of query usage due to third party tool users. 3rd day of classes – 37% of query usage due to third party tool users.

The CPU usage of third-party tool users consistently ranged from 5% to 10% during this period.3 While the query activity and CPU usage of third-party tools is not critically high, the Banner technical team feels that usage of this sort is rising, as especially evidenced by the increased data requests from web services. The Banner technical team does not proactively review the third-party queries for efficiency or effectiveness. The volume of unique requests makes such a review prohibitive. The queries are reviewed only when
3

There were two users that consistently ranked high in each category, the data requests from the Campus Events room scheduling web-server, and the Financial Aid award process.

8

SIS DATA WAREHOUSE PROJECT REPORT their resource usage becomes excessive, which occasionally happens. Consideration of the trend of growing query usage, the goals of managing query usage, and isolating the transaction processing system from resource intensive queries, suggests that it would be beneficial to separate query users from the transaction processing system. 6.0 Proposed Future Query Architecture 6.1 Findings and Recommendations Finding There is no comprehensive mechanism that optimizes the performance of third party queries processing against the Banner system. Occasionally there are individual queries that negatively impact performance of the transaction processing system. This is especially critical during peak usage, the start of classes, course enrollment, the add/drop periods, admissions processing, and aid processing. The collection of third party queries against the Banner system, while not a critical factor in system performance, use significant resources. Directing query users to another system offers the highest degree of protection of the Banner transaction processing system against rogue queries, and would return a significant amount of system resources to the transaction processing system. Many of the query users require real time information, such as admit status of applicants, course enrollment counts, financial aid processing status, student class schedules, course enrollment, etc. This suggests that a separate system for query users should mirror the data on the Banner system in real time. Many of the third-party query users have invested ten years of effort developing queries against the Banner system. A significant number of these queries are unique. In order to leverage this investment, these users should have access to an environment that is a replica of the Banner database. Recommendation Provide an exact replica of the Banner transaction processing system to query users on a separate server. Use Oracle 9i to provide a mirror of the Banner database, which is updated in real time. Consider using this server to implement another instance of the database for limited disaster recovery purposes. Finding The deans’ offices and academic departments have a number of similar information requirements as previously noted. Extracts and reports of this information have been developed independently by units who have the resources to do so, and do not provide a unified view of the data.4
4

Many units go to the Office of the Registrar to provide these reports. There is one staff member in the Office of the Registrar who is primarily responsible for developing and providing this information. His role is critical to the delivery of information to campus units. To provide this function in his absence could pose a challenge.

9

SIS DATA WAREHOUSE PROJECT REPORT

There are only a few users of the Student Extract. This data could easily be provided in another venue. Recommendation Provide a data warehouse environment for commonly required information. Develop a portfolio of web-based parameterized queries, which target a rationalized database developed for this purpose, and provide a unified view of the data. Include payroll data to facilitate graduate student block grant, fellowship, and employment eligibility processes. Consider using the Oracle back-end database that feeds the Campus Data Warehouse as a basis for this implementation. Consider retiring the Student Extract and providing the information through the SIS Data Warehouse. Establish a warehouse users group comprised of the query users/developers from Student Affairs and the deans offices to develop the requirements of these commonly used reports and to guide the overall development of this component.5 See the figure below for a pictorial representation of the proposed architecture. As depicted, the Banner transaction processing system would feed the mirrored system. The SIS Data Warehouse system would be fed by the mirrored system and the Payroll/Personnel system. If both a mirrored system and a SIS Data Warehouse are established, it is logical that a mirrored system be implemented before the SIS Data Warehouse.

5

This approach allows a centralized approach to administering security, which should be considered an advantage of this model. The current model delegates control to departments, and is not centrally monitored.

10

SIS DATA WAREHOUSE PROJECT REPORT

PROPOSED SIS/DATA WAREHOUSE QUERY ARCHITECTURE
PROPOSED: MIRRORED SYSTEM Mirrored systems for query and disaster recovery. Real time update

CURRENTLY IN PLACE SIS Transaction Processor

.

PROPOSED: SIS DATA WAREHOUSE SIS Data Warehouse for optimized queries serving common uses, including crossreference to payroll data. Enrollment data updated hourly, other data updated nightly.

Access granted to Banner Application Forms, SISWEB, RSVP.

Access granted to 3rd party query tools, MyUCDavis, Grad Track$, read-only web applications.

Access through webenabled pre-defined reports, and possibly direct access through 3rd party query tools

Payroll / Personnel

6.2

Estimated Resources and dependencies

6.2.1 Mirrored System The establishment of a mirrored system, including a database for disaster recovery, could be accomplished by leveraging existing database administration, system administration, and programming staff. A separate Oracle license, and hardware would be required for the mirrored database. Ongoing costs include operations support, database administration, system administration, Oracle licensing, and equipment replacement. On-going maintenance should be minimal. Set-up and testing could take as little as 2-3 months. Communication to the third-party tool users, and assistance in redirection of their queries 11

SIS DATA WAREHOUSE PROJECT REPORT to the mirrored database will take several months to be effective. A transition period where both the transaction processor and the mirrored database are available to query users is advisable. 6.2.2 SIS Data Warehouse Although recommended, the establishment of a SIS Data Warehouse is not dependent on the existence of a mirrored system. The SIS Data Warehouse could draw data directly from the SIS Transaction Processing System. However, this would negate the mitigation of performance risk to the SIS Transaction Processing System that comes as a result of redirecting many third-party tool users to a mirrored system. The SIS Data Warehouse will require hardware for the database and web-server, an Oracle license, and a web presentation tool license (for example Cold Fusion). A full time data analyst and a minimum of two programmers will be required for a minimum of one year. This staff could be incrementally reduced or shared with other efforts after the initial development. Ongoing costs will include operational support, database administration, system administration, Oracle licensing, web-tool licensing, and equipment replacement. The existing Campus Data Warehouse Oracle back-end database should be used as a starting point for the SIS Data Warehouse. It collects much of the data that is commonly required and also integrates payroll data. As previously recommended, the direction and focus of the SIS Data Warehouse should defined by an oversight committee comprised of key staff from Student Affairs and the deans offices.

12

SIS DATA WAREHOUSE PROJECT REPORT

Appendix I Units and Employees Who Provided Information for this report Chemical Engineering and Material Science College of Agricultural and Environmental Science College of Engineering College of Letters and Science Christopher Derr Michael Ranson Andy Richmond Donna Davies Suzy Wolf Andy Richmond Dan Trask Randi Thompson Everett Wilson Dax Prather Ellen Tani Karen Gurley Paul Drobny Liz Webb Trina Wilson Bradford Luten Barbara Breeden Rose Mary Miller Hebert Diaz-Flores Cortney Midla Patrick Kelly Ann Leamon Debi Lorraine Jamila Sahakian Dave Squire Todd Van Zandt Valerie Vaughn John Alden Katherine Perrone Jeremy Smith Nancy Davis Zach Johnson Tracy Ligtengerg Don Barclift Keitha Hunter Chris Redder Bob Agee Julie Saylor Joe Kiskis Anthony Stewart Daniel Yang Lauren Snow

Division of Biological Sciences Electrical and Computer Engineering Financial Aid Office Graduate School of Management Graduate Studies History Informational and Educational Technology

International Studies Languages and Literatures Mathematics Office of the Registrar Office of Resource Management and Planning Physics School of Law School of Medicine 13

SIS DATA WAREHOUSE PROJECT REPORT School of Veterinary Medicine Student Affairs Student Affairs Research and Information Subject A Office Undergraduate Admissions Ray Thai Bill Cooper Nancy Flagg Gillian Butler Antionette Davis Carol Beck Michael Dang Paul Doty Darlene Hunter Dona Owfook Karen Rendig Gary Tudor

14

SIS DATA WAREHOUSE PROJECT REPORT APPENDIX II SURVEY OF ALL KNOWN HYPERION ANALYZER USERS, FREQUENCY, AND USE6 Division of Biological Sciences Frequently in Fall, once per quarter thereafter Admissions numbers - by division and by major; Student numbers – by division and by major; student credit hours Undergraduate Admissions data To compare student enrollments within departments and generate new faculty position requests Deans need centralized, consistent and easily accessible student-related data to answer many ad hoc questions To determine future course enrollments based on historical data To develop undergraduate applicant survey samples based on demographics Student and faculty headcounts by different brackets, including age; various employee statistics for budget planning Student and course enrollment numbers Admissions, student and employee data Admissions statistics for transfer students

College of Engineering International Studies

Quarterly Once or twice a year

College of Letters and Science

Once a month to several times a week

Languages and Literature Student Affairs Research and Information Office of Resource Management and Planning

Several times a month Several times a month Several times a month

History Student Affairs Undergraduate Admissions

2-3 times a month Once a month Infrequently

6

All known users were contacted. All but two provided information. Hyperion analyzer allows users to access the Essbase “cubes” through an intuitive front end tool.

15



doc_908334439.pdf
 

Attachments

Back
Top