Written permission required for all reproduction and/or adaptation of C4H published and presented works. Please contact us at
[email protected].
Intermediate Data Management and Analysis Data Management is the way we organize our data or information. Good organization allows us to more easily analyze our data and understand what it can tell us about our programs, what is working, and what to improve.
Basic Steps for Data Management:
ENTER DATA: Organize your data into a spreadsheet. Columns=variables, Rows=records, Cells=values. TRACK YOUR DATA: If you are collecting data from multiple sources (such as different kinds of client data from separate data systems or from multiple program sites) you may want to use a data tracker. A data tracker is a spreadsheet that shows each kind of data you are receiving from each site. You can check off when you receive the data. This will help you keep track of what you have already (or should have) and what you are waiting for. CHECK YOUR DATA: Review data in a systematic way to look for data entry errors. If you find the same errors over and over again, figure out what is the cause (e.g. a database problem, training problem, etc.) and fix it. PREPARE FOR ANALYSIS: This might include matching data on the same person from multiple sources or creating new variables you will need for later analysis.
Tools for Data Management in Excel: Excel isn’t the only spreadsheet software for managing data, but it is a common software that many people already have available to them. All spreadsheet software will have some version of the following tools. For more information about how to use these Tools in Excel, see the recorded webinar and online training modules that accompany this information sheet. If you use something other than Excel, check out the Help features and Tutorials for whatever software you use.
Data Validation: allows you to restrict the “correct” answers that can be entered into a spreadsheet, or check for “incorrect” answers after data entry. Ex. If one of your questions is a 7 point scale, you should only have values of 1-7 (plus 99 for any missing data). Using the List option in the Data Validation Tool, you should be able to easily spot any incorrect data that may have been incorrectly entered, such as “0” or “44”.
VLookUp: is a formula to match data. If you receive data on the same person from multiple sources (different kinds of data or data from different locations) you can match all of their data and combine it into one place. VLookUp works based on there being something similar in each record such as a name or Unique Identifier. The formula is =VLOOKUP(value,data range,col no.,FALSE)
Concatenate: is a formula that brings together text from different sources (different cells or even from within the formula itself) into a new value in a new cell. For example, you can merge first name and middle initial to start constructing a Unique Identifier. The formula is =CONCATENATE(Text1,Text2,Text3,…,TextN)
Text To Columns: select this tool when you need to separate data that is currently placed together in one cell, but should be in 2 or more separate cells for your analysis. For example, if your data entry person entered first and last name into one cell, but you want them separated out, you can use Text to Columns to make each one its own variable.
Data Analysis is the process of looking at data and figuring out what is important about it.
Important depends on:
What we are trying to do in our programs (our goals and objectives); The context of our programs (how easy it is to accomplish our goals and objectives; what supports us in reaching them; what are the obstacles to reaching them); Our reporting requirements; What is important to our stakeholders: funders, community partners, clients, and the community at large.
www.capacity4health.org
1
Written permission required for all reproduction and/or adaptation of C4H published and presented works. Please contact us at
[email protected].
Hot Tips for Data Analysis Formulas allow us to quickly and easily do all the calculations we need in order to analyze our data. Common Excel formulas include SUM, AVERAGE, MEDIAN, MODE, COUNT, COUNTIF, and IF. Check out the Insert Function feature in Excel for full explanations of different Excel formulas. A trick for using formulas in Excel is that because of Excel’s relationality, you can enter a formula once, then copy and paste it down an entire column or across an entire row, and Excel will automatically adjust the cell references.
Dummy Variables are new pieces of information that we create in new columns. They are useful for assessing whether we met our specific program objectives. For example, if our objective was “80% of our participants will show a 5-percentage-point gain from pre-test to post-test,” we create a new column “5-percentage-point gain”. In each cell we put either: 1= “Yes, has a 5-percentage-point gain or greater”; 0= “No, has less than a 5-percentage-point gain”. To find out whether 80% of our participants met the criteria we take the average of the Dummy Variable column. If the average of all those 1s and 0s is .80 or greater, then 80% or more of our participants showed a 5percentage-point gain and we’ve met our program objective.
Pivot Tables are a tool in Excel that allow us to compare frequencies, averages, changes, etc. for different subgroups of our data: Pivot Tables work well in Excel 2010 (or 2011 for Macs) but are much less user friendly in Example of a Pivot Table earlier versions. Pivot Tables allow you to “fold up” a large spreadsheet so you can see comparisons more easily. In the example pivot table above, you can see the gender of clients at 4 different clinics. You can easily see, for example, that Clinic D was the only clinic that served more women than men.)
Heat Mapping uses conditional formatting to visually show us what our data contains. In this example, we’ve selected Conditional Formatting, selected a 3-color scale with green=high and red=low. This allows us to see at a glance the distribution of high and low values.
www.capacity4health.org
2
Written permission required for all reproduction and/or adaptation of C4H published and presented works. Please contact us at
[email protected].
Additional Resources for Data Management and Analysis Here are some additional resources you may find helpful for Managing and Analyzing Data:
•
A 1 hour recorded webinar, along with 2 Online Training Modules accompany this Information Sheet and are available online in the Data Management and Analysis section of Capacity 4 Health’s Online Resource Library. http://library.capacity4health.org/category/topics/monitoring-and-evaluation-me/data-management-and-analysis
“Excel Basics” (You Tube) A fun series of YouTube webinars on how to use Excel. http://www.youtube.com/user/ExcelIsFun#p/c/3FBEE51974F03CCF/21/XmSp2-Fa4rg
“Excel Training” (Microsoft Training) Microsoft Office has a good, basic training series on using Excel. http://office.microsoft.com/en-us/support/training-FX101782702.aspx
Lynda.com online paid resource for in depth Microsoft Excel tutorials http://lynda.com
“Overview of PivotTables” (Microsoft) http://office.microsoft.com/en-us/excel-help/overview-of-pivottable-and-pivotchart-reportsHP010342752.aspx?CTT=1
“Pivot Tables” (Excel for Dummies online) http://www.dummies.com/how-to/content/the-essentials-of-excel-2010-pivot-tables-and-pivo.html
You can find many other Data Management, Analysis and Excel resources also in the Capacity for Health Online Resource Library.
•
Capacity for Health offers FREE individualized capacity-building assistance. For more information, contact Sonya Dublin, Capacity for Health, at 415-568-3335,
[email protected] www.capacity4health.org
3