FRHD notes for exam

Lecture – Process evaluation

Types of evaluation research:

Formative evaluation:
- E.g. someone learning how to drive – the instructor is not interested in getting to a location; they are concerned with the process.
- E.g. making a soup without instructions – tasting and adjusting as you go.

Summative evaluation:
- E.g. whether or not people like the soup you've made.

Hypothetical scenario:
- Prof developed an integrated weight-training program that students implemented.
- Students' fitness levels improved.
- You want to learn more about this program to better understand why it was successful. What would you ask?
  o How long / what type of activities?
Black box evaluation:
- Pretest -> DO NOT KNOW WHAT CAUSED CHANGE (black box) -> effects
- Must open the black box to understand the cause.

Logic model:
- Why needed? Target groups, strategies, program activities ******** look at slides
Purpose of process evaluation:
- Program monitoring
- Program improvement
- Accountability
- Explain observed program outcomes
  o Why the program had a positive effect
  o What worked? Disaggregation – take the group who received the intervention, split it into those who did well and those who didn't, and find out why they differ
  o Why the intervention had no effects:
    • Was the intervention implemented as intended?
    • Was the intervention strong enough to make a difference? (dose-response relationship)
    • Did the control group receive a similar intervention from another source? (contamination)
    • Did external events weaken the intervention's impact?
    • Did the program's theory of cause and effect work as expected?
    • Are there problems in the organization implementing the intervention? (staff not trained well enough / not inspired)
  o Why the intervention had unintended consequences
Components of process evaluation

1. Recruitment
- Procedures used to approach and attract prospective participants
- Operates at organizational/community and individual levels
- Resources used
- Reasons for non-participation
2. Reach
- Percent of the intended target audience that participates in the intervention
  o Also calculated per component – e.g. 50% for one, 40% for another…
  o Attendance
  o Which sub-groups / certain people?
- Barriers to participation
- Characteristics of participants
3. Maintenance
- Keeping participants involved in the program and in data collection
- Assess dropout rate
4. Dose delivered
- Amount/percent of the intended intervention, or of each component, actually delivered to participants
- A function of the efforts of the intervention providers
5. Dose received
- Extent to which participants engage with the intervention
  o People can be given a pill, but do they actually TAKE the pill?
- "Exposure"
- Characteristics of the target audience (who did less/more)
6. Fidelity (honesty)
- Extent to which the intervention was delivered as planned
- Whether the intended and implemented intervention are congruent
- A function of the intervention providers
7. Implementation
- Composite score that indicates the extent to which the intervention has been implemented and received by the target audience
- Combination of reach, dose delivered, dose received and fidelity
- Multiplicative vs. averaging approach:
  o Multiplicative – multiply the percentages of the above 4 components to get the implementation score; the maximum score is 1 (100% x 100% x 100% x 100%)
  o Averaging – add the percentages and divide by the number of indicators
- Continuous analysis (%s) vs. category analysis (splitting scores in two)
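As a rough sketch, the two composite approaches are just arithmetic on the four indicator proportions; the percentages below are made-up illustrations, not values from the lecture:

```python
# Hypothetical process-evaluation indicators, expressed as proportions.
indicators = {
    "reach": 0.80,           # 80% of the target audience participated
    "dose_delivered": 0.90,  # 90% of intended components delivered
    "dose_received": 0.70,   # 70% of participants engaged
    "fidelity": 0.85,        # 85% delivered as planned
}

# Multiplicative approach: multiply the four proportions;
# a perfect program scores 1 (100% x 100% x 100% x 100%).
multiplicative = 1.0
for value in indicators.values():
    multiplicative *= value

# Averaging approach: add the proportions, divide by the number of indicators.
average = sum(indicators.values()) / len(indicators)

print(round(multiplicative, 4))  # 0.4284
print(round(average, 4))         # 0.8125
```

Note how one weak component (dose received) drags the multiplicative score far below the average: that is the practical difference between the two approaches.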
8. Adaptation
- Changes/modifications made to the original intervention during implementation
- Fidelity-adaptation debate: maximize fidelity vs. permit adaptation
  o Adaptation as an implementation failure
  o vs. adaptation as inevitable
9. Contamination
- Extent to which participants receive interventions from outside the program, and extent to which the control group receives the intervention

10. Context
- Aspects of the larger physical, social, political and economic environment that may influence intervention implementation
Illustration of process evaluation – Wang et al. (2013) HBHS study
- Healthy Bodies and Souls
- Church-based health intervention
- Promoted healthy food
- Randomized controlled trial (some churches received the intervention and some didn't)
- Purpose:
  o How well was it implemented in terms of reach, dose delivered and fidelity?
- Had different phases with different components/strategies
- Process evaluation:
  o In-church measures to assess delivery:
    • Reach – percent of people in contact with the intervention in each session
    • Dose delivered – how many intervention components were distributed per person in one visit
    • Fidelity – percent of effective visits out of total visits per program phase
  o Post-intervention exposure survey:
    • Assessed individual-level dose received
    • Interviewed a sample of congregants
    • Dose received – percent of people who successfully recalled exposure to any intervention component
Lecture – Focus groups
- Research method for collecting qualitative data
- Group discussions
- Focused

Strengths:
  o Exploring – inductive
  o Context and depth (e.g. what are the barriers?)
  o Interact directly with respondents
  o Respondents answer in their own words
  o Respondents react to and build on others' responses
  o Can collect data from children or illiterate individuals
  o Helps you understand results
- DO NOT quantify the comments – use a survey or another method for that. Responses may also be influenced by other group members.
Uses:
  o Needs assessment
  o Program development
  o Process evaluation (understand what happened during the program)
  o Outcome evaluation
Myths:
- Quick and cheap – not as easy a process as it looks (lots of organizing of people)
- Require professional moderators – the moderator doesn't need to be a professional, just skilled (use whatever resources are available)
- Require special facilities – can be held anywhere, e.g. a restaurant
- Must consist of strangers – no
- Will not work for sensitive topics – they will work; just be mindful that people don't disclose too much
- Produce conformity – no, people do not all need to come to the same conclusion; sometimes you want to maximize variation (hear all views)
- Must be validated by other methods – no, another method (e.g. a survey) is not required; focus groups are good at producing data on their own
Ethical issues:
- Informed consent
- Confidentiality
- Privacy
- Dealing with stressful/emotional topics – e.g. take a break, make resources for clients clear; therapists may be involved
How structured should they be?
- More structured:
  o If there are pre-determined issues/barriers
  o Questions specific to those barriers
- Less structured:
  o If exploratory
  o Don't use prompts; draw out ideas without having a preset idea in mind
Steps in a focus group

1. State the research purpose

2. Identify a moderator
- Skilled; train if needed
- Compatibility (e.g. a female moderator with female interviewees)

3. Develop an interview guide
- Provides direction for the discussion
- It is only a guide, so you do not have to ask all the questions – you can take different angles if that seems more appropriate
- Types of questions:
  o Opening questions (ice breaker)
  o Introductory questions
  o Transition questions
  o Key questions
  o Ending questions
- Use open-ended questions
- Unstructured vs. structured questions
- Order of questions:
  o Less structured questions first
  o Funnel (general to specific)
  o Among the key questions, the most important first (time constraints)
- Number of questions (hard to choose; depends on the number of participants, time constraints, complexity)
- Probes/prompts
- Pilot test
4. Recruit the sample
- Purposive sampling
- Convenience sampling
- Deciding on group composition:
  o Homogeneous participants (relatively similar) – reduces conflicts
  o Segmentation
  o Strangers or acquaintances (strangers are usually best, to reduce bias)
- Deciding on group size:
  o 7-10 individuals
  o Consider the amount of time each person will get to talk
  o Larger vs. smaller (larger groups can bring in more opinions but can be harder to manage)
- Deciding on the number of groups:
  o Diversity of comments
  o 3-5 groups
  o Theoretical saturation – stop when no new information emerges
5. Conduct the focus group
- 1.5-2.5 hrs
- "Small talk" beforehand
- Observe interactions (how people feel about others in the group)
- Inform the group that the session is recorded and whether it is "observed"; put name tags in front of people. Seat dominant people beside the interviewer (less eye contact), introverted people directly in front (more eye contact, so they may talk more)
- Ground rules
- Problems:
  o (Self-proclaimed) experts
  o People interrupting others – if it gets bad, you may need to ask the person to leave
  o Friends participating (chatting with each other)
6. Analyze and interpret the data
- Moderator and assistant do a preliminary analysis
- Focus on one question at a time:
  o Themes
  o Quotes
- Different analysis strategies:
  o Based on memories – efficient
  o Based on notes – more systematic
    • Field notes taken during the session
  o Based on the tape recording
    • Researcher listens to the tape
    • Brief summary
  o Based on transcripts
    • Transcribe the interviews, then work out the themes
    • Open coding – open mind; jot down themes/ideas; a first stab at it
    • Focused coding – revisit the transcripts; merge themes; naming, description, segments of data
- Content analysis:
  o Cut and paste
  o Find option in a word processor
  o Software packages:
    • NVivo
    • NUD*IST
    • Ethnograph – each line gets a code for each theme so you can look it up efficiently (themes are identified, not "emerging" – they do not magically emerge)
Lecture – Quasi-experimental research designs

Questions of interest:
  o Is the program effective in achieving its intended objectives?
  o How do you know it was the program and not an alternative explanation?

Criteria for objectives:
  o Specific, measurable, purposeful standards, realistic, time frame

Internal validity:
  o The validity of inferences about whether the relationship between two variables is causal
  o Threats to internal validity: plausible alternative explanations for what causes the observed effect
- E.g. frog-jumping competition – a study designed to determine how far frogs can jump:
  o Needs a starting point – the researcher yells "jump" and the frog jumps 10 ft = baseline
  o 1 leg weight = 5 ft
  o 2 leg weights = 3 ft
  o 3 leg weights = 1 ft
  o 4 leg weights = the frog doesn't move
  o The obvious reason for not moving is that the frog is weighed down, BUT the novice researcher concludes "when you strap weights on a frog, the frog becomes deaf"
History
- Any event that coincides with the independent variable and could affect the dependent variable (you don't know whether it was the historical event or the program causing the effect)
- E.g. a drug makes a rat excited, BUT the rat is looking at a pretty girl at the same time = alternative cause
- Must rule out other possible causes

Maturation
- The threat that some biological, psychological, or emotional process within the person, separate from the intervention, will change over time (growth and developmental transitions – learning, aging…)
- Can be challenging to run a program with children because they are at an age where they change anyway

Testing
- Effects due to repeated testing of participants over time
- E.g. give the same survey twice = participants remember their pretest responses at posttest. Could broaden the time period, or give 2 different tests

Instrumentation
- Any change that occurs over time in measurement procedures or devices
- A change in instrument/measurement (e.g. survey, researcher, taking the test at home vs. at the doctor's)
- Should calibrate (e.g. a scale) and standardize

Statistical regression
- Extreme scores regress toward the mean no matter what (pennies example)
- E.g. athletes decline after appearing on the cover of Sports Illustrated – yes, because they were already at the top. COIN FLIP EXAMPLE
- If you select groups very different from the mean, they will "improve" no matter what
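The COIN FLIP EXAMPLE can be simulated to show regression toward the mean: flippers picked for an extreme first score drift back to chance on a retest with no intervention at all. The sample size, cutoff and seed below are arbitrary choices for the sketch:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def heads_count(n_flips=10):
    """Number of heads in n_flips fair coin tosses (the 'skill' score)."""
    return sum(random.random() < 0.5 for _ in range(n_flips))

first_scores = [heads_count() for _ in range(1000)]

# Select the "top performers": 9 or 10 heads out of 10 on the first try.
extreme = [i for i, s in enumerate(first_scores) if s >= 9]

# Retest only the extreme group - no training, no intervention.
retest_scores = [heads_count() for _ in extreme]

first_mean = sum(first_scores[i] for i in extreme) / len(extreme)
retest_mean = sum(retest_scores) / len(retest_scores)

print(first_mean)   # at least 9, by construction
print(retest_mean)  # falls back toward the chance level of 5
```

This is why picking a group only because it is far from the mean guarantees apparent "change" at follow-up, program or no program.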
Attrition
  o The threat that arises when some participants don't continue throughout the study
  o Example: effect of a health education program on physical activity patterns:
    • Previous research: males are more active
    • Assign 100 M and 100 F to both the intervention and comparison groups
    • Results: the intervention group is more active
    • Dropout: 50 F in the intervention group and 50 in the comparison group
    • How would you interpret these results? Mostly males are left in the intervention group, so it is automatically more likely to be active
    • Need to look at dropout AND describe the characteristics of those who dropped out and those who stayed
Selection
  o Any pre-existing differences between individuals in the different experimental conditions that can influence the dependent variable – comparing apples to oranges
  o Interacts with other threats

Diffusion or imitation of the intervention
  o The program diffuses/spreads from the intervention group to the comparison group
  o E.g. Dartmouth health promotion study

Compensatory equalization of the intervention
  o People think the two groups are being treated unjustly, so they share information with the comparison group – you lose the comparison group

Compensatory rivalry
  o The comparison group feels disadvantaged, so they overcompensate to outdo the positive effects – you lose the comparison group

Resentful demoralization
  o The comparison group feels there is no change for them, so they change their moods/behaviors in a negative way

Experimenter expectancy
  o The experimenter conveys the expected results to participants and skews the results

Mediator vs. moderator
  o A correlation between an independent and a dependent variable can be explained by a mediator (a third variable – it must occur before the dependent variable)
  o Independent -> mediator -> dependent
  o E.g. the dependent variable is healthy-eating behavior; the mediator is the intention to change (you need intention before behavior changes) – like a domino effect
  o A moderator MODIFIES the relationship between the independent and dependent variables (it can strengthen or weaken the relationship, or show that there is none)
  o E.g. the relationship between alcohol use and physical activity – the moderator may be the person's gender: the relationship is stronger/weaker depending on M/F
Research designs

Notation:
  X = intervention
  x = removal of intervention
  O = observation
  R = randomly assigned
  --- = groups not randomly assigned
Quasi-experimental designs

Single-group, posttest-only design:
  Group A:  X O1
- E.g. people want to quit smoking; at the end of the program see how many are still smoking – weak, too many internal validity risks

Single-group, pretest-posttest design:
  Group A:  O1 X O2
- Still weak, because there are too many risks

Nonequivalent, posttest-only design:
  Group A:  X O1
  Group C:    O2
- Still weak because there is no pretest – you only have measurements after the intervention

Pretest-posttest, nonequivalent comparison group design:
  Group A:  O1 X O2
  Group B:  O3   O4
- Stronger
Comparing the nonequivalent posttest-only design vs. the pretest-posttest, nonequivalent comparison group design:
- The pretest lets you know how people were before the intervention, so you can be more confident about how the intervention worked
- The groups may not be equivalent at baseline – one group could be higher and therefore may change more
- If the intervention group changes a lot and the comparison group doesn't, it could be regression to the mean
Cohort design:
  Earlier cohort:      X O1
  Later cohort:    O2  X O3
- Everyone eventually receives the intervention – you can't leave people out
- Combines two group designs – compare O2 (the later cohort's pretest) with O1 (the earlier cohort's posttest)
- Only as strong as the evidence that both cohorts are the same
Time series designs
- Collect observations/measures before, during and after the intervention
- E.g. a simple "interrupted" time series:
    O1 O2 O3 X O4 O5 O6
- At the point of interruption you are trying to see what the change looks like
- Increases knowledge of the intervention's effect because you can look at a larger time frame and multiple data points (you can rule out a pre-existing pattern as the explanation instead of the intervention)
- Threat if you give questionnaires all the time (testing effect)
- Used when the data you have are archival – e.g. health records
- X can be a one-shot or a continuous intervention (a tornado vs. a new health campaign that keeps being shown over months – O X O X O X O)
- If the effects are only immediate and do not last, the intervention was not overly effective
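A minimal sketch of reading a simple interrupted time series (O1 O2 O3 X O4 O5 O6): compare the level of the series before and after the interruption. The observation values are invented for illustration:

```python
# O1-O3 are observations before the intervention X, O4-O6 after it.
pre = [52, 54, 53]   # stable baseline - rules out a pre-existing upward trend
post = [61, 63, 62]  # level jumps right at the interruption point

pre_mean = sum(pre) / len(pre)
post_mean = sum(post) / len(post)

# A flat baseline followed by a sustained jump supports the intervention;
# an effect that faded back to baseline by O6 would suggest it did not last.
print(post_mean - pre_mean)  # 9.0
```

With only one observation on each side (a pretest-posttest design) this distinction between a jump and a trend would be invisible, which is the advantage of the multiple data points.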
Interrupted time series with a nonequivalent control group:
  Group A:  O1 O2 X O3 O4
  Group B:  O1 O2   O3 O4
- The groups must be similar for this to work

Interrupted time series with a removed intervention:
  O1 O2 X O3 O4 x O5 O6
- When the intervention is removed, outcomes should get worse
- E.g. gym fees paid for a year; after the year they are no longer paid
- Threat: resentment/demoralization if someone with more power took something away from you

Interrupted time series with multiple replications:
  O1 O2 X O3 O4 x O5 O6 X O7 O8 x O9
Experimental randomized designs ** notes
- Randomization – e.g. 50 get the intervention, 50 don't
- NOT random selection (selection is a sampling issue – you use the sample to generalize to the whole population)
- Tossing a coin
- Table of random numbers
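Random assignment (as opposed to random selection) can be sketched like this; the participant labels are hypothetical, and shuffling stands in for the coin toss or table of random numbers:

```python
import random

random.seed(42)  # fixed seed so the split is reproducible

participants = [f"P{i:03d}" for i in range(100)]  # 100 recruited people
random.shuffle(participants)  # stands in for coin toss / random-number table

intervention = participants[:50]  # 50 get the intervention (X)
control = participants[50:]       # 50 do not

print(len(intervention), len(control))  # 50 50
```

Because group membership is decided by chance alone, pre-existing differences (the selection threat above) are spread evenly across the two groups on average.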
Between-participants, randomized, posttest-only design:
  o No pretest, so nothing to compare against at baseline

Between-participants, matched, randomized, posttest-only design:
  o Match on measures that correlate with the topic

Between-participants, randomized, pretest-posttest design:
- Statistical comparisons:
  o Pretests
  o Posttests
  o Pretest and posttest for each participant
  o Compare the groups on the difference between pre- and posttest
Solomon 4-group design:
- Combines the different methods
- Statistical comparisons:
  o O1 and O3 – should be similar
  o O2 and O4 – different (group A had the intervention)
  o O1-O2 vs. O3-O4 – different
  o O5 and O6 – different
  o O4 and O6 – same (neither had the intervention; both are control groups)
  o O2 and O6 – different
  o O2 and O5 – same (both had the intervention and there is no interaction); if different, the pretest probably interacted with the intervention
Between-participants factorial design:
- E.g. FITT – duration and frequency of workouts: you could do two individual studies, one on each, or study a combination of both (e.g. 30 min/5 days a week vs. 60 min/3 days a week)
- Each factor (duration/frequency) can have different levels, e.g. duration = 0, 30 or 60 min; frequency has 2 levels – 3x/week or 5x/week
- Main effects: differences among groups for a single independent variable that are significant, temporarily ignoring all other independent variables
- Interaction: the influence of one independent variable on a dependent variable depends on the level of another independent variable
  o E.g. no movie if you have an exam, but if there is no exam it DEPENDS on which movie (the effect of one variable depends on another variable)
  o E.g. 2 x 3 factorial design: reaction time of people, with an alcohol/caffeine interaction
  o No caffeine / low caffeine / moderate caffeine VS no alcohol / alcohol
  o If you focus only on alcohol OR caffeine, that is its MAIN EFFECT; the two together are the INTERACTION
  o If the lines on the diagram are not parallel = interaction
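The 2 x 3 alcohol/caffeine example can be sketched with invented cell means to show the difference between a main effect and an interaction:

```python
# Mean reaction times (ms) for each cell of a 2 (alcohol) x 3 (caffeine)
# design. All numbers are made up for illustration.
means = {
    ("no alcohol", "none"): 250, ("no alcohol", "low"): 240, ("no alcohol", "moderate"): 230,
    ("alcohol", "none"): 330, ("alcohol", "low"): 290, ("alcohol", "moderate"): 260,
}

def marginal_mean(level, factor_index):
    """Average over all cells at one level of one factor (the main-effect view)."""
    cells = [v for k, v in means.items() if k[factor_index] == level]
    return sum(cells) / len(cells)

# Main effect of alcohol: collapse over the caffeine levels.
print(marginal_mean("no alcohol", 0), marginal_mean("alcohol", 0))

# Interaction: the simple effect of alcohol at each caffeine level.
# Unequal differences = non-parallel lines on the plot = interaction.
alcohol_effects = {
    caffeine: means[("alcohol", caffeine)] - means[("no alcohol", caffeine)]
    for caffeine in ("none", "low", "moderate")
}
print(alcohol_effects)  # the alcohol effect shrinks as caffeine increases
```

Here the alcohol effect is 80 ms with no caffeine but only 30 ms with moderate caffeine, so the lines would not be parallel: an interaction, not just two main effects.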
Research designs and statistics

"An ounce of design is worth a pound of analysis." The design or plan for gathering the evidence does more to determine the quality of a research project than the statistical analysis does. Often one can fix a mistake in the stats, but not in the design. RESEARCH DESIGN TRUMPS STATS

Lecture – Cross-sectional and longitudinal designs

Cross-sectional designs:
  o Data collected at 1 point in time
- E.g. effect of children on marital satisfaction:
  o Measure couples' marital satisfaction, whether they have any children, and if so the age of the oldest
  o Then create groups and compare marital satisfaction:
    • Couples without children
    • Couples with the oldest child under 1 y old
    • Couples with the oldest child between 1 and 5 y
  o Could the relationship be causal? marital satisfaction <-> children
- Relationship between age and physical activity:
  o E.g. select 3 groups of participants from the population: 20 y today, 30 y today, 40 y today
  o Similar to a nonequivalent posttest-only design
  o The 3 "interventions" = the ages
Repeated cross-sectional design:
  o Snapshots taken a number of times
  o Different snapshots of different samples (same questions asked)

Prospective cross-sectional:
  o Begin now and repeat the study at different points in the future (forwards)

Retrospective cross-sectional:
  o Draw on existing data sets to examine patterns of change up to the present time (backwards)
Pros of cross-sectional:
- Ideal for descriptive analysis
- Get results quickly
- Relatively cost effective
- Easier to ensure anonymity

Cons of cross-sectional:
- Internal validity – not good
- External validity – can be good if it's a random sample
Longitudinal research designs:
  o Effect of children on marital satisfaction
  o Start with married couples with no kids
  o Follow the sample 5 years later – couples end up in 1 of 3 groups (no kids, kid <1, kid